19Nov

Junior Software Tester at Logistics Management Institute – Remote, United States


Overview

LMI is a consultancy dedicated to powering a future-ready, high-performing government, drawing from expertise in digital and analytic solutions, logistics, and management advisory services. We deliver integrated capabilities that incorporate emerging technologies and are tailored to customers’ unique mission needs, backed by objective research and data analysis. Founded in 1961 to help the Department of Defense resolve complex logistics management challenges, LMI continues to enable growth and transformation, enhance operational readiness and resiliency, and ensure mission success for federal civilian and defense agencies.

 

LMI has been named a 2022 #TopWorkplace in the United States by Top Workplaces! We are honored to be recognized as a company that values a people-centered culture, and we are grateful to our employees for making this possible!

 

LMI is seeking a software tester to be part of a team delivering comprehensive software testing support for the products developed within the USPS Artificial Intelligence Virtual Assistant (AIVA) Team. All software testing candidates must have strong communication skills, a passion for testing, and a desire to learn new technologies on high-profile projects.

Responsibilities

  • Collaborate with a team of seasoned professionals to craft, implement, and evaluate test scenarios for USPS Innovation Lab’s cutting-edge, customer-facing hardware and software technologies.
  • Interact professionally with customers and team members to solve problems.
  • Execute test cases and test scripts to identify defects.
  • Present test results to customers, developers, and others.
  • Support root-cause analysis of defects to identify and mitigate project risks and provide feedback to developers.
  • Support comprehensive testing of the software, ensuring all features function as designed and intended, by running manual and automated tests that assess performance, compatibility, and the user interface.
  • Support testers, developers, and business teams in identifying testing approaches and infrastructure.
  • Rapidly test designs with users using both passive and active methods.
  • Provide input to front-end application developers on how to effectively deliver model outcomes to users.
  • Ensure compliance with industry standards, best practices, and regulatory requirements, including accessibility (Section 508 compliance), reducing rework and cost where possible.

Qualifications

MINIMUM REQUIREMENTS:

  • Minimum bachelor’s degree in computer science, management information systems, engineering, or a related field.
  • Two (2) or more years of professional experience.
  • Demonstrated experience in agile software testing.
  • Ability to execute test cases based on business requirements.
  • Strong verbal and written communication skills and the ability to work tactfully with a team of software developers and domain experts.
  • Ability to professionally interact with customers and team members to solve problems.
  • Presentation skills to regularly present test results.
  • Ability to manipulate and analyze data using Excel or other tools.
  • Ability to obtain a USPS security clearance.

PREFERRED EXPERIENCE/SKILLS:

  • Basic understanding of Python, C#, HTML, ASP.NET, SQL and JavaScript desired.
  • Experience testing various types of software, such as web and native mobile applications desired.
  • Experience using formal task and ticketing applications such as Atlassian JIRA or Bugzilla desired.
  • Knowledge of information systems and experience working with federal agencies desired.
  • Experience working in a consultant/client environment is desired.
  • Prior experience with Dialogflow, Google Cloud services, ALM, and VersionOne desired.
  • Familiarity with AI and machine learning concepts and their applications in user experience design.



Source link

18Nov

Backend distributed systems Engineer-SMTS/MTS at Salesforce – India – Hyderabad


To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category

Software Engineering

Job Details

About Salesforce

We’re Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And, we empower you to be a Trailblazer, too — driving your performance and career growth, charting new paths, and improving the state of the world. If you believe in business as the greatest platform for change and in companies doing well and doing good – you’ve come to the right place.


Role Description

Salesforce has immediate opportunities for software developers who want their lines of code to have significant and measurable positive impact for users, the company’s bottom line, and the industry. You will be working with a group of world-class engineers to build the breakthrough features our customers will love, adopt, and use while keeping our trusted CRM platform stable and scalable. The software engineer role at Salesforce encompasses architecture, design, implementation, and testing to ensure we build products right and release them with high quality.

Depending on the seniority level, the role also includes code review, mentoring junior engineers, and providing technical guidance to the team. We pride ourselves on writing high-quality, maintainable code that strengthens the stability of the product and makes our lives easier. We embrace the hybrid model and celebrate the individual strengths of each team member while encouraging everyone on the team to grow into the best version of themselves. We believe that autonomous teams with the freedom to make decisions will empower the individuals, the product, the company, and the customers they serve to thrive.

Your Impact

Responsibilities

  • Build new and exciting components in an ever-growing and evolving technology market to provide scale and efficiency.

  • Develop high-quality, production-ready code that millions of users of our cloud platform can use.

  • Design, implement, and tune robust APIs and API framework-related features that perform and scale in a multi-tenant environment.

  • Work in a Hybrid Engineering model and contribute to all phases of SDLC including design, implementation, code reviews, automation, and testing of the features.

  • Build efficient components and algorithms in a microservice-based, multi-tenant SaaS cloud environment.

  • Review code, mentor junior engineers, and provide technical guidance to the team (depending on the seniority level).

 

Required Skills/Experience

  • Industry Experience: 2–8 years, including:

    • Experience in designing, implementing and operating large scale distributed systems in public cloud environments (AWS, GCP or Azure)

    • Experience with Containers and orchestration technologies (e.g. Docker, Kubernetes)

    • Familiarity with DevOps practices, CI/CD tools, configuration management, and Infrastructure as Code (IaC)

  • Programming: Proficiency in object-oriented and multi-threaded programming in Java and/or Python

  • Software Design: Demonstrable understanding of design patterns, distributed systems, data structures and algorithms

  • Operating System: Development and software management on Linux (e.g., CentOS, RHEL) and Windows

  • Agile: Experience with Agile development methodology

  • Communication: Excellent oral and written communication skills

  • Team: Work as a team player and value team success beyond personal contributions

  • Strong debugging and troubleshooting skills

  • Education: Bachelor’s or Master’s degree in Computer Science or an equivalent field

Desired Skills/Experience

  • Security: Strong fundamental knowledge of security concepts: authentication/authorization frameworks (e.g., SSO, SAML, OAuth), secure transport (e.g., SSL, TLS), and identity management (e.g., certificates, PKI)

  • Platform development: Proven track record of designing and coding large-scale PaaS or IaaS systems.

  • Big Data: Proficiency and experience with relational and NoSQL databases, message queues (Kafka, SNS/SQS, etc.), Splunk, and other components in data pipelines

  • Previous experience in MLOps

Accommodations

If you require assistance due to a disability applying for open positions please submit a request via this Accommodations Request Form.

Posting Statement

At Salesforce we believe that the business of business is to improve the state of our world. Each of us has a responsibility to drive Equality in our communities and workplaces. We are committed to creating a workforce that reflects society through inclusive programs and initiatives such as equal pay, employee resource groups, inclusive benefits, and more. Learn more about Equality at www.equality.com and explore our company benefits at www.salesforcebenefits.com.

Salesforce is an Equal Employment Opportunity and Affirmative Action Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Salesforce does not accept unsolicited headhunter and agency resumes. Salesforce will not pay any third-party agency or company that does not have a signed agreement with Salesforce.

Salesforce welcomes all.



Source link

18Nov

Senior Data Engineer – 1971 at CES – Chennai, India


CES has 26+ years of experience in delivering Software Product Development, Quality Engineering, and Digital Transformation Consulting Services to global SMEs and large enterprises. CES has been delivering services to some of the leading Fortune 500 companies in sectors including automotive, AgTech, bioscience, EdTech, FinTech, manufacturing, online retail, and investment banking. These are long-term relationships of more than 10 years, nurtured not only by our commitment to timely delivery of quality services but also by our investments and innovations in their technology roadmaps. As an organization, we are in an exponential growth phase, with a consistent focus on continuous improvement, a process-oriented culture, and a true partnership mindset with our customers. We are looking for the right qualified and committed individuals to play an exceptional role and to support our accelerated growth. You can learn more about us at: http://www.cesltd.com/

Job Role: We are looking for a seasoned Senior Data Engineer to join our team.

Roles & Responsibilities:

  • Hands-on experience in designing and developing data integration and ETL solutions for the Cloud.
  • Hands-on experience with the AWS suite: AWS Glue, Glue Studio, Redshift, Kinesis, Step Functions, Lambda, SES.
  • Additional technology exposure: SQL, PySpark, Python.
  • Versed in designing synchronous as well as asynchronous data integration approaches.
  • Experience in developing CI/CD pipelines and Infrastructure as Code using Terraform.
  • Excellent verbal and written communication skills.
  • Strong management skills and the ability to work independently and as part of a team.

Why CES?

  • Flexible working hours to create a work-life balance.
  • Opportunity to work on advanced tools and technologies.
  • Global exposure to not only collaborate with the team, but also to connect with the client portfolio and build professional relationships.
  • Innovative ideas and thoughts are highly encouraged, and we support you in executing them.
  • Periodic and on-the-spot rewards and recognition for your performance.
  • A strong platform for enhancing your skills via a range of L&D programs.
  • An enabling and empowering atmosphere to work in.



Source link

18Nov

Revolutionising AI Agents With Computer Use Tools | by Cobus Greyling | Nov, 2024


Anthropic’s new framework for seamless computer use is a good example of what is to come.

The future of AI Agents isn’t just about better language models — it’s about integrating intelligent agents into your computing environment.

Anthropic has taken a bold step forward with its new framework for computer use, designed to work locally on your system while leveraging reasoning and task decomposition capabilities.

Breaking it down:
🌟 It’s Not Just a Language Model
– This framework isn’t a typical AI chatbot or model — it’s a powerful tool that lives within your local computing space.

– At its core is the AI Agent, running within a Docker container, that:
— Decomposes tasks into manageable steps.
— Reasons logically to solve problems effectively.
— Interfaces directly with your computer to get things done.

– Think of it as a local, reasoning-driven AI agent, where the ability to navigate your computer is just one of the tools in its arsenal.

🛠️ Three Tool Types
– The framework features a robust tool system to handle diverse tasks:
1. Computer Tools
2. Text Editor Tool
3. Bash Tool
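
For concreteness, here is a minimal sketch of how these three tool types are declared through Anthropic’s beta Messages API as released in October 2024; the model name, display size, and prompt are illustrative assumptions, not part of the framework itself:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[
        # 1. Computer tool: screenshots plus mouse and keyboard control
        {"type": "computer_20241022", "name": "computer",
         "display_width_px": 1024, "display_height_px": 768},
        # 2. Text editor tool: view and edit files
        {"type": "text_editor_20241022", "name": "str_replace_editor"},
        # 3. Bash tool: run shell commands
        {"type": "bash_20241022", "name": "bash"},
    ],
    messages=[{"role": "user", "content": "Open the report and summarize it."}],
    betas=["computer-use-2024-10-22"],
)
print(response.stop_reason)  # "tool_use" when the agent wants to act

The agent loop around this call — executing the requested action inside the Docker container and returning a screenshot or command output as a tool result — is what the local framework provides.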

🌍 How the Anthropic Model Fits In
While the framework operates locally, it integrates seamlessly with Anthropic’s state-of-the-art language model (which includes vision capabilities) over the internet.

This model provides:
– Deep language understanding for advanced reasoning.
– Vision processing for interpreting images and other visual data.
– This modular approach ensures local control and performance while accessing cloud-powered intelligence only when necessary.

Why It Matters

Anthropic’s framework isn’t just about answering questions — it’s about giving you an AI Agent that understands your environment and helps you tackle complex tasks end-to-end.

This blend of local computing and cloud-powered reasoning opens up new possibilities for developers, researchers, and businesses looking to integrate AI into their workflows in a meaningful way.

💡 Imagine having an AI agent that doesn’t just think — it acts, navigates, and adapts within your computing space.

Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.



Source link

18Nov

Diagnostics Architect at Zoox – Foster City, CA


About the Role

Safe, stable, and reliable transportation is the key promise of autonomous mobility. You will architect, implement, and refine the central vehicle stability framework that fulfills this promise. You will partner with Zoox’s Software, Hardware, Safety, Data, and Operational teams to develop the robot’s Fault Management System. You will support our stability analytics teams and use their results to optimize the on-robot code for rapid fault triage at scale.
You will join the Platform Stability team inside of Software Core, a team of system-oriented C++ engineers developing frameworks to track robot instability in real time.  You will provide technical leadership to the team through architectural reviews and data-driven development.  You will evangelize safe, sustainable coding practices and work collaboratively with other teams to gather and implement software requirements.
Ultimately, we seek a leader with previous experience delivering mission-critical software that supports large teams.  We desire a team member who can visualize an ideal final design, but who also maintains the strategic mindset to deliver the software incrementally.

Responsibilities

  • You will provide technical leadership to the Platform Stability team
  • You will drive the development of the robot’s Fault Management System, in collaboration with Safety and Firmware teams
  • You will help improve the performance of the on-bot stability framework and off-bot stability analysis tools
  • You will champion the adoption of the stability framework across all software teams, including requiring new diagnostics when novel failures occur on-vehicle
  • You will monitor metrics about the health and adoption of the stability framework across the Zoox fleet

Qualifications

  • Experience tech leading teams of 4+ engineers
  • Experience using metrics and statistical methods to drive decision-making at various levels in an organization
  • Strong knowledge of C++ and experience in large code bases
  • Good communication and collaboration skills

Bonus Qualifications

  • Experience with large-scale robot, vehicle, or device health telemetry, and familiarity with Big Data technologies like Databricks, Spark, and Redshift
  • Experience with reliability engineering or related systems engineering fields
  • Experience in Autonomous Vehicle domain

Compensation

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. The salary range for this position is $217,000 to $360,000. A sign-on bonus may be offered as part of the compensation package. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate’s relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position. Zoox also offers a comprehensive package of benefits including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.
About Zoox

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.
Follow us on LinkedIn
Accommodations

If you need an accommodation to participate in the application or interview process, please reach out to ac************@zo**.com or your assigned recruiter.
A Final Note

You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.



Source link

18Nov

Commercial Data Analyst – Georgia (They/She/Her) at Glovo – Tbilisi, Georgia


If you’re here, it’s because you’re looking for an exciting ride

A ride that will fuel up your ambitions to take on a new challenge and stretch yourself beyond your comfort zone. 

We’ll deliver a non-vanilla culture built on talent, where we work to amplify the impact on millions of people, paving the way forward together. 

 

Not your usual app. We are the fastest-growing multi-category app connecting millions of users with businesses and couriers, offering on-demand services from more than 170,000 local restaurants, grocers and supermarkets, and high street retail stores. We operate in more than 1500 cities across 25 countries.

Together we revolutionize the way people connect with their everyday needs, from delivering essentials to connecting our ecosystem of users through innovative solutions powered by technology. For us, every day is filled with purpose.

What makes our ride unique? 

🤝 Our culture and strong values. 

💪 Our career development philosophy.  

🤝 Our commitment to being a force for good. 

 

We have a vision: To give everyone easy access to anything in their cities. And this is where your ride starts.

YOUR MISSION

Glovo is looking for a world-class Data Analyst to join our Local, Partners & Brands team in Georgia. The team is deployed in the top cities in Georgia, leading our growth and working to deliver on our vision of giving everyone easy access to anything in every city in Georgia.

THE JOURNEY

  • Develop and own scalable and reliable reporting tools for the Local team, enabling the team to make decisions based on up-to-date and insightful data with a 360° view, including commercial, operational, and financial KPIs.
  • Support the definition of targets and calculation of achievements both for the team and for single individuals, ensuring fair but challenging targets.
  • Track KPIs, identifying trends and factors that represent risks and opportunities, across areas such as Sales, Marketing & Growth, Operations, and Profitability.
  • Analyze KPIs, evaluate historical data and trends, propose new thresholds or criteria, and recommend strategies to optimize the performance of these KPIs.
  • Perform ad hoc analyses and deep dives as required, owning them end to end: frame the problem, identify and gather data, perform the analysis, and present insights and results with appropriate visualizations.
  • Actively collaborate with your colleagues and learn from each other in a supportive environment that allows you to grow, develop, and make a difference.

WHAT YOU WILL BRING TO THE RIDE

  • At least 2 years of experience in an analytical role, working with data to generate insights and make the right decisions.
  • Experience working with ETL processes and gathering data from different sources (APIs, Excel, and other databases).
  • Advanced spreadsheet/modeling skills.
  • Expertise in SQL – Redshift experience a plus.
  • Knowledge of visualization tools like Looker, Tableau, QlikView, etc.
  • Previous experience with Python and analysis tools such as Pandas, SciPy, scikit-learn, Jupyter/IPython notebooks, or R is a plus.
  • Bachelor’s degree in a scientific or business discipline.
  • Project management/coordination skills. Determined to get things done.
  • You are eager to learn, hungry for knowledge, and want to work towards growing in your career from day one.
  • You are independent in solving problems within your capabilities and proactive in asking for support when the task is beyond your expertise.
  • You are not afraid of giving/receiving constant and honest feedback.
  • Excellent spoken and written English.
  • Based in Georgia.

 

Individuals representing diverse profiles and abilities, encompassing various genders, ethnicities, and backgrounds, are less likely to apply for this role if they do not possess solid experience in 100% of these areas. Even if it seems you don’t meet our musts, don’t let that stop you: we are all about finding the best talent out there! Skills can be learned, and embracing diversity is invaluable.

We believe driven talent deserves:

  • 💪 Top-notch private health insurance to keep you at your peak.
  • 🍔 Monthly Glovo credit to satisfy your cravings!
  • 🏊 Discounted gym memberships to keep you energized.
  • 🏖️ Extra time off, the freedom to work from home two days a week, and the opportunity to work from anywhere for up to three weeks a year!
  • 👪 Enhanced parental leave, and office-based nursery.
  • 🧠 Online therapy and wellbeing benefits to ensure your mental well-being.

Here at Glovo, we thrive on diversity; we believe it enhances our teams, products, and culture. We know that the best ideas come from a mashup of brilliant diverse minds. This is why we are committed to providing equal opportunities to talent from all backgrounds – all genders, racial/diverse backgrounds, abilities, ages, sexual orientations, and all other unique characteristics that make you YOU. We will encourage you to bring your authentic self to work, fostering an inclusive environment where everyone feels heard.

Feel free to note your pronouns in your application (e.g., she/her/hers, he/him/his, they/them/theirs, etc).

So, ready to take the wheel and make this the ride of your life? 

Delve into our culture by taking a peek at our Instagram and check out our Linkedin and website!





Source link

17Nov

R&D Engineer Internship: Development of a system for analyzing the CRC results databases (M/F) at Framatome – France, Auvergne-Rhône-Alpes, Savoie (73)



General Information


Legal Entity

At Framatome, a subsidiary of EDF, we design and supply equipment, services, fuel, and instrumentation and control systems for nuclear power plants worldwide.

Every day, our 18,000 employees enable our customers to produce an ever cleaner, safer, and more economical low-carbon energy mix.

Our teams also develop solutions for the defense, nuclear medicine, and space sectors.

Established in some twenty countries, Framatome brings together the expertise of men and women who are passionate about nuclear energy and convinced that it is an energy of the future.

As a responsible company, we run programs to train and support first professional experiences (Happy Trainees label), integrate all talents, including people with disabilities, work toward professional equality and gender diversity in our professions (94/100 on the gender equality index), and support work-life balance.

To follow our news, find us on www.framatome.com, LinkedIn, Instagram, and X.


Reference

2024-17371


Publication Date

23/10/2024

Job Description


Field

R – RESEARCH & DEVELOPMENT – R3 – Delivery of R&D projects


Job Title

R&D Engineer Internship: Development of a system for analyzing the CRC results databases (M/F)


Reference Position

R31-005 R&D Engineer level 1 (cl. F-11)


Contract

Internship


BU Description

Within Framatome, the Fuel Business Unit designs, manufactures, and sells nuclear fuel for power plants as well as for research reactors.
The Components Operations Directorate (DOC) masters every step of zirconium metallurgy, from the ore to the production of zirconium alloy components: flat products, bars, and tubes intended for nuclear fuel manufacturing; some derivatives are also used in the aeronautics and medical industries.
The Components Operations Directorate has recognized expertise and the largest production capacity in the world, with its 5 plants (Jarrie, Ugine, Rugles, Montreuil-Juigné, and Paimboeuf). It also has a Research Center specialized in the metallurgy and transformation processes of zirconium alloys.


Mission Description

As part of R&D projects on the in-service properties of zirconium alloys, you will create an application for exploiting the results databases of the Components Research Center (CRC).

Various databases (Access, Excel files, etc.) exist, recording data obtained mostly during laboratory tests. The goal of the internship is to harmonize the formalism of these databases and to create a tool for analyzing their content from a business point of view by cross-referencing data (for example: Power BI reports, a Python interface, etc.).
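
Purely as an illustration of the kind of harmonization involved (this sketch is not part of the posting; file names and column mappings are hypothetical), each source could be normalized onto a common schema with pandas before cross-referencing:

import pandas as pd

COMMON_COLUMNS = ["specimen_id", "alloy", "test_type", "value", "test_date"]

def load_excel_source(path, mapping):
    # Read one Excel base and rename its columns to the common schema
    return pd.read_excel(path).rename(columns=mapping)[COMMON_COLUMNS]

sources = [
    load_excel_source("creep_tests.xlsx",
                      {"Ref": "specimen_id", "Alliage": "alloy",
                       "Essai": "test_type", "Resultat": "value",
                       "Date": "test_date"}),
    load_excel_source("corrosion_tests.xlsx",
                      {"ID": "specimen_id", "Grade": "alloy",
                       "Type": "test_type", "Mesure": "value",
                       "Jour": "test_date"}),
]

results = pd.concat(sources, ignore_index=True)
# Cross-reference results by alloy and test type for business analysis
print(results.pivot_table(index="alloy", columns="test_type",
                          values="value", aggfunc="mean"))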


Profile

Degree in preparation: Bac+4 / Bac+5 (Master’s level).

Area of expertise: Computer science / Programming / Python

Job Location


Job Location

France, Auvergne-Rhône-Alpes, Savoie (73)


Site

Ugine


Travel

No


Contract Duration in Months

6


BU

FL – FLDOC

Candidate Criteria


Minimum Level of Education Required

Bac+4


Minimum Level of Experience Required

Student


Employment Level

Student

Managed By


Recruiter

Nadege DJOUKOUO

Additional Information


Position Subject to Administrative Screening

Yes


Position Subject to Export Control Authorization

No



Source link

17Nov

Spoiler Alert: The Magic of RAG Does Not Come from AI | by Frank Wittkampf | Nov, 2024


Why retrieval, not generation, makes RAG systems magical

Quick POCs

Most quick proofs of concept (POCs) that let a user explore data with the help of conversational AI simply blow you away. It feels like pure magic when you can all of a sudden talk to your documents, or data, or code base.

These POCs work wonders on small datasets with a limited count of docs. However, as with almost anything you bring to production, you quickly run into problems at scale. When you do a deep dive and inspect the answers the AI gives you, you notice:

  • Your agent doesn’t reply with complete information. It missed some important pieces of data
  • Your agent doesn’t reliably give the same answer
  • Your agent isn’t able to tell you how and where it got which information, making the answer significantly less useful

It turns out that the real magic in RAG does not happen in the generative AI step, but in the process of retrieval and composition. Once you dive in, it’s pretty obvious why…

* RAG = Retrieval Augmented Generation — Wikipedia Definition of RAG

RAG process — Illustration

A quick recap of how a simple RAG process works:

  1. It all starts with a query. The user asked a question, or some system is trying to answer a question. E.g. “Does patient Walker have a broken leg?”
  2. A search is done with the query. Mostly you’d embed the query and do a similarity search, but you can also do a classic elastic search or a combination of both, or a straight lookup of information
  3. The search result is a set of documents (or document snippets, but let’s simply call them documents for now)
  4. The documents and the essence of the query are combined into some easily readable context so that the AI can work with it
  5. The AI interprets the question and the documents and generates an answer
  6. Ideally this answer is fact checked, to see if the AI based the answer on the documents, and/or if it is appropriate for the audience
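
To make steps 2–5 concrete, here is a minimal sketch with an in-memory similarity search; embed() and generate() are stand-ins for whatever embedding model and LLM you use, and the documents are assumed to be pre-embedded dicts:

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rag_answer(query, docs, embed, generate, k=3):
    # Step 2: similarity search with the embedded query
    q_vec = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q_vec, d["embedding"]), reverse=True)
    # Step 3: the search result is a set of document snippets
    snippets = ranked[:k]
    # Step 4: combine the documents and the query into readable context
    context = "\n\n".join(f"[{d['id']}] {d['text']}" for d in snippets)
    prompt = f"Answer using only these documents:\n{context}\n\nQuestion: {query}"
    # Step 5: the AI interprets the question and documents and answers
    return generate(prompt), snippets  # snippets returned for step 6 fact-checking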

The dirty little secret of the RAG process is that you have to provide the answer to the AI (before it even does anything) so that it is able to give you the reply you’re looking for.

In other words:

  • the work that the AI does (step 5) is to apply judgement and properly articulate the answer
  • the work that the engineer does (steps 3 and 4) is to find the answer and compose it such that the AI can digest it

Which is more important? The answer is, of course, it depends: if judgement is the critical element, then the AI model does all the magic. But for an endless number of business use cases, finding and properly composing the pieces that make up the answer is the more important part.

The first set of problems to solve when running a RAG process are the data ingestion, splitting, chunking, and document interpretation issues. I’ve written about a few of these in prior articles, but I’m ignoring them here. For now, let’s assume you have properly solved your data ingestion, and you have a lovely vector store or search index.

Typical challenges:

  • Duplication — Even the simplest production systems often have duplicate documents. More so when your system is large, you have many users or tenants, you connect to multiple data sources, or you deal with versioning, etc.
  • Near duplication — Documents which largely contain the same data, but with minor changes. There are two types of near duplication:
    — Meaningful — E.g. a small correction, or a minor addition, e.g. a date field with an update
    — Meaningless — E.g. minor punctuation, syntax, or spacing differences, or just differences introduced by timing or intake processing
  • Volume — Some queries have a very large relevant response data set
  • Data freshness vs quality — Which snippets of the response data set have the highest-quality content for the AI to use, and which snippets are most relevant from a time (freshness) perspective?
  • Data variety — How do we ensure a variety of search results such that the AI is properly informed?
  • Query phrasing and ambiguity — The prompt that triggered the RAG flow might not be phrased in such a way that it yields the optimal result, or might even be ambiguous
  • Response Personalization — The query might require a different response based on who asks it
This list goes on, but you get the gist.

Will ever-larger context windows solve all of this? Short answer: no.

The cost and performance impact of using extremely large context windows shouldn’t be underestimated (you easily 10x or 100x your per-query cost), not including any follow-up interaction that the user/system has.

However, putting that aside. Imagine the following situation.

We put Anne in a room with a piece of paper. The paper says: *patient Joe: complex foot fracture.* Now we ask Anne, does the patient have a foot fracture? Her answer is “yes, he does”.

Now we give Anne a hundred pages of medical history on Joe. Her answer becomes “well, depending on what time you are referring to, he had …”

Now we give Anne thousands of pages on all the patients in the clinic…

What you quickly notice is that how we define the question (or the prompt, in our case) starts to get very important. The larger the context window, the more nuance the query needs.

Additionally, the larger the context window, the more the universe of possible answers grows. This can be a positive thing, but in practice it invites lazy engineering behavior and is likely to reduce the capabilities of your application if not handled intelligently.

As you scale a RAG system from POC to production, here’s how to address typical data challenges with specific solutions. Each approach has been adjusted to suit production requirements and includes examples where useful.

Duplication

Duplication is inevitable in multi-source systems. By using fingerprinting (hashing content), document IDs, or semantic hashing, you can identify exact duplicates at ingestion and prevent redundant content. However, consolidating metadata across duplicates can also be valuable; this lets users know that certain content appears in multiple sources, which can add credibility or highlight repetition in the dataset.

# Fingerprinting for deduplication
import hashlib

def fingerprint(doc_content):
    return hashlib.md5(doc_content.encode()).hexdigest()

# Store fingerprints and filter duplicates, while consolidating metadata
fingerprints = {}
unique_docs = []
for doc in docs:
    fp = fingerprint(doc['content'])
    if fp not in fingerprints:
        # First occurrence of this content: keep the document
        fingerprints[fp] = [doc]
        unique_docs.append(doc)
    else:
        # Exact duplicate: record it so source metadata can be consolidated
        fingerprints[fp].append(doc)

Near Duplication

Near-duplicate documents (similar but not identical) often contain important updates or small additions. Given that a minor change, like a status update, can carry critical information, freshness becomes crucial when filtering near duplicates. A practical approach is to use cosine similarity for initial detection, then retain the freshest version within each group of near-duplicates while flagging any meaningful updates.

from sklearn.cluster import DBSCAN

# Cluster embeddings with DBSCAN (cosine distance) to find near duplicates;
# eps controls how close two documents must be to count as near duplicates
clustering = DBSCAN(eps=0.1, min_samples=2, metric="cosine").fit(doc_embeddings)

# Organize documents by cluster label
clustered_docs = {}
for idx, label in enumerate(clustering.labels_):
    if label == -1:
        continue  # label -1 is DBSCAN noise: a unique document, kept via another path
    if label not in clustered_docs:
        clustered_docs[label] = []
    clustered_docs[label].append(docs[idx])

# Filter clusters to retain only the freshest document in each cluster
filtered_docs = []
for cluster_docs in clustered_docs.values():
    # Choose the document with the most recent timestamp or highest relevance
    freshest_doc = max(cluster_docs, key=lambda d: d['timestamp'])
    filtered_docs.append(freshest_doc)

Volume

When a query returns a high volume of relevant documents, effective handling is key. One approach is a layered strategy:

  • Theme Extraction: Preprocess documents to extract specific themes or summaries.
  • Top-k Filtering: After synthesis, filter the summarized content based on relevance scores.
  • Relevance Scoring: Use similarity metrics (e.g., BM25 or cosine similarity) to prioritize the top documents before retrieval.

This approach reduces the workload by retrieving synthesized information that’s more manageable for the AI. Other strategies could involve batching documents by theme or pre-grouping summaries to further streamline retrieval.
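
As one hedged example of the relevance-scoring step, the rank_bm25 package can score pre-extracted theme summaries and keep only the top k; the summaries and k below are placeholders:

from rank_bm25 import BM25Okapi

def top_k_by_bm25(query, summaries, k=5):
    # Tokenize naively; a production system would use a proper analyzer
    tokenized = [s.lower().split() for s in summaries]
    bm25 = BM25Okapi(tokenized)
    scores = bm25.get_scores(query.lower().split())
    ranked = sorted(zip(scores, summaries), key=lambda p: p[0], reverse=True)
    return [s for _, s in ranked[:k]]

# e.g. top_k_by_bm25("Does patient Walker have a broken leg?", theme_summaries)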

Data Freshness vs. Quality

Balancing quality with freshness is essential, especially in fast-evolving datasets. Many scoring approaches are possible, but here’s a general tactic:

  • Composite Scoring: Calculate a quality score using factors like source reliability, content depth, and user engagement.
  • Recency Weighting: Adjust the score with a timestamp weight to emphasize freshness.
  • Filter by Threshold: Only documents meeting a combined quality and recency threshold proceed to retrieval.

Other strategies could involve scoring only high-quality sources or applying decay factors to older documents.
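
For illustration, here is one possible composite score with exponential recency decay; the weights, the 30-day half-life, and the document fields are assumptions, not a prescription:

import time

def composite_score(doc, w_quality=0.7, w_recency=0.3, half_life_days=30.0):
    # doc["quality"]: precomputed in [0, 1] from source reliability,
    # content depth, and user engagement; doc["timestamp"]: epoch seconds
    age_days = (time.time() - doc["timestamp"]) / 86400
    recency = 0.5 ** (age_days / half_life_days)  # halves every 30 days
    return w_quality * doc["quality"] + w_recency * recency

# Only documents meeting the combined threshold proceed to retrieval
docs_to_retrieve = [d for d in docs if composite_score(d) >= 0.5]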

Data Variety

Ensuring diverse data sources in retrieval helps create a balanced response. Grouping documents by source (e.g., different databases, authors, or content types) and selecting top snippets from each source is one effective method. Other approaches include scoring by unique perspectives or applying diversity constraints to avoid over-reliance on any single document or perspective.

# Ensure variety by grouping and selecting top snippets per source

from itertools import groupby

k = 3 # Number of top snippets per source
docs = sorted(docs, key=lambda d: d['source'])

grouped_docs = {key: list(group)[:k] for key, group in groupby(docs, key=lambda d: d['source'])}
diverse_docs = [doc for docs in grouped_docs.values() for doc in docs]

Query Phrasing and Ambiguity

Ambiguous queries can lead to suboptimal retrieval results. Using the exact user prompt is mostly not the best way to retrieve the results the user requires. E.g. there might have been an information exchange earlier in the chat which is relevant, or the user may have pasted a large amount of text with a question about it.

To ensure that you use a refined query, one approach is to give the model a RAG tool whose definition asks it to rephrase the question into a more detailed search query, similar to how one might carefully craft a search query for Google. This approach improves alignment between the user’s intent and the RAG retrieval process. The phrasing below is suboptimal, but it provides the gist of it:

tools = [{
    "name": "search_our_database",
    "description": "Search our internal company database for relevant documents",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "A search query, like you would for a google search, in sentence form. Take care to provide any important nuance to the question."
            }
        },
        "required": ["query"]
    }
}]

Response Personalization

For tailored responses, integrate user-specific context directly into the RAG context composition. By adding a user-specific layer to the final context, you allow the AI to take into account individual preferences, permissions, or history without altering the core retrieval process.
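
A small sketch of what that user-specific layer could look like; the profile fields here are hypothetical:

def personalize_context(base_context, user):
    # Append a user-specific layer without altering the retrieval output
    user_layer = (
        f"Reader profile: role={user['role']}, "
        f"reading_level={user['reading_level']}, "
        f"allowed_sources={', '.join(user['permissions'])}"
    )
    return f"{base_context}\n\n{user_layer}\nTailor the answer to this reader."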



Source link

17Nov

Reference Data Specialist at JPMorgan Chase & Co. – Bengaluru, Karnataka, India


You are a strategic thinker passionate about driving solutions in the Static (Receivables) team. You have found the right team.

Job Responsibilities

  • Task Execution: Each staff member is responsible for executing assigned tasks, which include setting up client records, maintaining work logs, and investigating exceptions. This requires attention to detail and organizational skills.
  • Data Management: Responsible for accurately adding and updating client information in various systems and mainframes. This involves ensuring data integrity and accuracy.
  • Workflow Management: Manage the workflow of Business As Usual (BAU) tasks, handling requests from Customer Service Officers (CSOs), the Implementation Team, and other internal departments. This requires effective time management and prioritization skills.
  • Timely Processing: Ensure that work is processed within established timeframes and meets necessary controls and targets. This involves adhering to deadlines and quality standards.
  • Communication: Interact with Lines of Business and internal clients, including senior personnel, via telephone or email. Strong communication skills are essential for this aspect of the role.
  • Query Resolution: Respond to queries received via email and prioritize resolving client issues. This requires problem-solving skills and a customer-focused approach.
  • Project Work: Work on various BAU projects as assigned, ensuring timely completion. This involves flexibility and the ability to manage multiple tasks.
  • Process Improvement: Identify ways to improve current work practices and handle any required backup tasks related to production. This requires a proactive approach to process optimization.
  • Investigation: Conduct extensive investigations at the transaction and aging level. This involves analytical skills and attention to detail.
  • Team Contribution: While the role has individual responsibilities, there is an expectation to contribute to the team’s goals and Service Level Agreements (SLAs).

Required qualifications, capabilities and skills

  • Strong English written and verbal communication skills are essential. This is crucial for effective interaction with clients and colleagues, as well as for documentation and reporting.
  • A graduate with 0-1 years of experience in a bank operations environment is required. While this is an entry-level position, knowledge of cash and trade products is advantageous, suggesting a preference for candidates with some familiarity with banking products.
  • Ability to analyze information and pay close attention to detail is important. This ensures accuracy and thoroughness in handling banking operations.
  • Being highly organized and able to meet deadlines are key strengths. This is important for managing tasks efficiently in a fast-paced environment.
  • A high level of client centricity, sound communication, and solution orientation are required. This indicates a focus on providing excellent service and resolving client issues effectively.
  • Technical Skills: Excellent PC skills, particularly proficiency with the MS Office suite, especially MS Excel, are necessary. This suggests that data analysis and reporting are part of the role.
  • Excellent interpersonal skills are needed to work effectively with colleagues and clients, fostering positive relationships.
  • The ability to ask questions and pursue open issues assertively until resolution is important. This requires a proactive and professional approach to problem-solving.
  • Willingness to work in WHEM Shift (7:30 PM to 4:30 AM).

JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world’s most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.

We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.



Source link

17Nov

Data Engineer Intern (M/F) at Devoteam – Nantes, France


Company Description

A Cloud, the AWS entity of the Devoteam group, is a leading consulting firm that supports companies in successfully adopting the Cloud. A strategic AWS partner since 2013, we help our clients become data-centric companies by harnessing the power of the cloud. To achieve this goal, data engineering plays a key role.

By building and optimizing data pipelines, data engineers enable companies to fully exploit their analytical potential. Their crucial role in collecting, storing, and processing data guarantees a solid foundation for strategic decisions and continuous innovation at our clients.

Job Description

To strengthen our Data teams in the west of France, we are looking for a Data Engineer consultant intern to join a collective of 30 consultants specialized in data and cloud engineering in Nantes.

👉🏻 Missions

  • Build up your skills on our clients’ data stacks, supported by our more senior consultants,

  • Collaborate with business teams to design data transformations relevant to the business,

  • Industrialize and manage data processing pipelines in production,

  • Take part in implementing evolutions of our clients’ data platforms,

  • Take part in Agile ceremonies at client sites under the direction of the Product Owner.

👉🏻 Internship Structure

Your primary objective will be to build up your skills. To do so, we practice a learning-by-doing methodology.

During this 6-month internship, you will have:

  • 1 day per week dedicated to self-training by earning certifications on some of the technologies mentioned above.

  • 4 days per week dedicated to carrying out tasks for a targeted client project.

To structure your skills development, a tutor will follow up with you on your daily missions.

Qualifications

👉🏻 Skills

As a Data Engineer intern, you have one or more of the following skills:

  • SQL

  • An object-oriented language (Python, Java, Scala)

  • A distributed computing framework (Spark, Databricks)

  • Continuous integration (Git, SonarQube, Jenkins)

  • Infrastructure as Code (Terraform, GitHub Actions)

  • A cloud provider (AWS, GCP, Azure)

👉🏻 Profile Sought

We are looking for a student in the final year of an engineering program or an equivalent Master’s-level (Bac+5) degree.

👉🏻 Compensation

We pay our end-of-studies interns €1,500 gross per month.

Additional Information

👉🏻 Benefits

To make our collaboration a success, we offer you:

  • 🤓 A dedicated AWS account per consultant, to experiment as your imagination sees fit

  • 📈 A tailor-made career path that we build together

  • 🎓 Free, unlimited access to e-learning platforms to build your skills on the technologies you consider promising (Udemy / Cloud Guru / WhizLabs)

  • 🏡 A flexible remote-work policy with reimbursement of certain expenses

  • 💻 Your choice of laptop (Mac or Windows)

👉🏻 Recruitment Process

The recruitment process takes place in four steps:

  1. An initial phone call to discuss the terms of the position.

  2. An HR interview to get to know you and understand your professional project, as well as to present the company and our vision.

  3. A technical interview to challenge your skills and your fit for this position.

  4. An interview with the Tribe Manager and the sales team to plan your professional trajectory together.

 

What makes the difference at Devoteam is the way we:

  • Share information openly, widely, and deliberately.

  • Encourage autonomous decision-making by our employees.

  • Are especially candid with one another.

  • Only collaborate long-term with highly competent colleagues who have a positive impact on the collective.

  • Always look for a “better way” of doing things.

  • Stay open-minded to changing ideas and new approaches.

Devoteam is committed to promoting diversity and is proud to foster equal opportunity within the company. Every application is considered without regard to origin, skin color, religion, gender, gender identity, sexual orientation, disability, genetic characteristics, or age.

Because we want knowledge to be useful to as many people as possible, we believe in the inclusion of everyone.



Source link
