Put your RAG to the test: Component-per-component evaluation of our LLM-powered airplane manufacturing assistant

Nataliia Kees

Monday 16:10 in A1

Type: Talk
Track: pydata-generative-ai

Your RAG-powered LLM application might look convincing at first glance, but how do you really know whether it’s any good? And how do you justify the design choices you make? In this talk, you will learn about the RAG evaluation concept we developed at Airbus to assess each component of our digital engineering assistant, how we implemented it with open-source tools paired with Google Vertex AI, and what we learnt in the process.

Domain Expertise Level: Intermediate
Python Skill Level: Novice

Nataliia Kees

Affiliation: Airbus GmbH

I am a Data Scientist at Airbus, where I am part of the Digital team, building AI products that support the company's engineering, manufacturing, sales and other business activities. I enjoy diving deep into natural language processing and am passionate about MLOps, good coding practices and deploying AI applications in the cloud. I also teach Python, and in my free time I enjoy hiking and learning new languages.
