(Un)leashed potential of AI in Government
Rosa Marie Keller

As the world is being reshaped through the rise of advanced AI technologies, governments seek their place in the arena. This talk explores how German government institutions adapt to these changes by focusing on three key areas of action: Adoption, Regulation, and Up-/Reskilling.

525 days working full-time on FOSS: lessons learned
Rodrigo Girão Serrão

Rodrigo has been working full-time on FOSS for 525 days. Join him as he shares some of the key lessons he learned during that time.

A conceptual and practical introduction to Hilbert Space Gaussian Process (HSGP) approximation methods
Dr. Juan Orduz

In this talk, we explore a new method to approximate Gaussian processes using spectral analysis methods, known as the Hilbert Space Gaussian process (HSGP) approximation.

A Retrieval Augmented Generation system to query the scikit-learn documentation
Guillaume Lemaitre

A Retrieval Augmented Generation system to query the scikit-learn documentation

Advanced Observability with OpenTelemetry and Python
Anton Caceres

Facing observability challenges in Python microservices? 🐍 ☁️ Learn how #OpenTelemetry can streamline your system monitoring in cloud environments. We're diving into its Python SDK integration, showcasing real-world implementation for enhanced efficiency.

AI, SQL, and GraphQL Walk into a Fertility Clinic… LLM-based Medical feature development
Shirli Di-Castro Shashua

Excited to share my journey at the intersection of healthcare and AI! Join me as I unveil the game-changing 'chatting with my medical database' feature, powered by LLMs. Discover how AI, SQL, and GraphQL team up to revolutionize doctors' access to crucial patient data.

Analyzing COVID-19 Protest Movements: A Multidimensional Approach Using Geo-Social Media Data
Nefta Kanilmaz

Understanding protest movements through multi-dimensional geo-social media data analysis.

Async Awaits: Mastering Asynchronous Python in FastAPI
Bojan Miletic

Gear up for 'Async Awaits: Mastering Asynchronous Python in FastAPI' 🚀. Discover the power of async/await in #Python and learn how to supercharge your web apps with #FastAPI. Perfect for developers eager to excel in modern web development! 🌐 #TechTalk #AsyncPython #WebDev

AsyncApp. My contribution to hype Python's asyncio a bit more
Jens Nie

Asyncio-based code is still not the go-to solution for many when starting new Python projects. Let me show you a simple and attractive asyncio-based approach for console applications to address that.

Automate Distributed ML Pipelines with DVC and Ray for Computer Vision and Generative AI Applications
Mikhail Rozhkov

Join our talk at PyCon DE & PyData Berlin to master 'Automating Distributed ML Pipelines with DVC & Ray' for advanced applications in Computer Vision and Generative AI.

Best of both worlds - How we built an AI-aided content creation tool for language learning
Hector Hernandez, Pascal Wenker

Dive into the fusion of human intelligence & AI at Babbel! Explore our journey in crafting an AI-aided content creation tool for personalized language learning. #LanguageLearning #AI #LLMs

Better safe than sorry: Threat Modeling for Python Developers
Clemens Hübner

Is your code secure enough? Find out by doing Threat Modeling!

Beyond Deployment: Exploring Machine Learning Inference Architectures and Patterns
Tim Elfrink

"Beyond Deployment: Exploring Machine Learning Inference Architectures and Patterns" - uncover the ML inference strategies that power StepStone's success and learn to scale your models with confidence!

Boost your app to Flash speed by mastering performance tricks
Laysa Uchoa, Yuliia Barabash

If you're keen to move beyond basic optimizations and truly understand what happens under Python's hood during application execution, this session is for you. Boost your app to Flash speed by mastering performance tricks by @laysauchoa and @yuliia_barabash

Breaking AI Boundaries: Fairness Metrics in Unstructured Data Domains
Daniel Klitzke

Exploring the need for fairness in machine learning in indirect human impact areas, proposing solutions for challenges in unstructured data.

Bridging the worlds: pixi reimplements pip and conda in Rust
Wolf Vollprecht, Ruben Arts

pixi is an awesome new package manager that combines the power of conda & pip.

Building accessible documentation sites
Dr. Tania Allard

Come learn how you can make your tools and documentation more accessible and usable by disabled end-users and contributors

Building Professional Voice AI with Vocode
Lev Konstantinovskiy

Meet Vocode, an open-source framework for AI voice agents. We'll cover its integration of speech APIs, LLMs, and conversation etiquette in real-world applications. #OpenSource #AI #VoiceAgents

Can ChatGPT convince you to get a COVID19 vaccine? Comparing ChatGPT to an expert system - which one is more convincing?
Dr. Lisa Andreevna Chalaguine

A comparison between ChatGPT and a domain-specific expert system - which is more convincing in getting people to vaccinate against COVID-19?

Content Recommendation with Graphs: From Basic Walks to Neural Networks
Dr. Mirza Klimenta

Content Recommendation with Graphs: From Basic Walks to Neural Networks

Data valuation for machine learning
Miguel de Benito Delgado

pyDVL is the library for data valuation in machine learning. Use it to clean, prune and select your data to improve model performance.

Deploying your Python application to Android
Shyamnath Premnadh

Current state of deploying Python applications to Android, with a comparison of the currently available frameworks.

Drive performance with Automation : Unleash the Power of Python CDK on AWS for Infrastructure Transformation
Amogha Kancharla

Deploy your Infrastructure error-free on AWS using Python CDK.

Encoding Charactersets - may the force be with you
Martin Hoermann

Understanding and repairing garbled text (Mojibake) with Python

Enhance your balcony power plant with Python
Jannis Lübbe

Improve the efficiency of your balcony power plant with Python

Everything you need to know about change-point detection
Charles Truong

How do you detect an activity change from smartwatch data, abrupt climate transitions, or server failures? If you work with long time series, you will inevitably have to detect changes. This talk describes how to do that using ruptures (https://github.com/deepcharles/ruptures).

Exploring Zarr: From Fundamentals to Version 3.0 and Beyond
Sanket Verma

Hi all! I’ll discuss Zarr, an open-source data format for storing chunked, compressed N-dimensional arrays. We’ll explore the Zarr ecosystem from fundamentals to V3.0 and beyond. If you’re interested in storing massive datasets, please attend my talk. Thanks!

Flix CitySnap: How we use GenAI and not only to collect captivating images for cities and confirm their locations
Andrei Chernov

Flix’s buses operate in more than 5,000 cities around the world. In this talk, we will demonstrate how we leverage state-of-the-art models, including Generative AI, to automatically collect images for thousands of cities and verify their locations. Our comprehensive end-to-end pi

From idea to production in a day: Leveraging AzureML and Streamlit to build and user test machine learning ideas quickly
Florian Roscheck

How to leverage AzureML, automated machine learning, and Streamlit to build and test machine learning apps quickly? Find out about our favorite Hackathon stack and walk away with some code to build and user-test your own machine learning ideas fast.

From LLM as oracle to LLM as translator - our journey from theory to everyday’s practice in a corporate setting with dmGPT (and python)
Emma Haley, Niklas Lederer

Learn how dm-drogeriemarkt put LLMs in production and implemented a day-to-day assistant for everyone.

Going beyond Parquet's default settings – be surprised what you can get
Uwe L. Korn

Only ever used pandas.to_parquet? Would you like to know what it does and how you could make it even more efficient? Find out about Parquet's newest features in this talk.

Green Software Engineering

Green Software Engineering and the concept of sustainability

High Performance Data Visualization for the Web
Tim Paine

Building a high performance streaming data website with Perspective

How Python helped us uncover secrets of protein motion
Zoran Štefanić, Boris Gomaz

Uncovering protein motion by leveraging awesome Python tools.

How to Do Monolingual, Multilingual, and Cross-lingual Text Classification in April, 2024
Daryna Dementieva

If I want a text classifier in 2024, what should I choose -- LLMs or a pre-LLM-era classifier? Is the answer the same for English and other languages? We will provide a recipe for finding your classifier depending on the target language and data availability.

How to embrace your Leadership role as a Data Nerd (or other creative types)
Paula Gonzalez Avalos

Is it possible to find a Leadership role rewarding as a creative person driven by hands-on work? In this talk I’ll talk about my journey from Data Scientist into management and how - after a transition period - I learned to embrace and ultimately really like the new Head roles.

How to Improve the Python Development Experience for Millions of Ubuntu Users
Jürgen Gmach

Have you ever tried to install a different Python version on Ubuntu or tried to upgrade your current one? This talk will explain why this is hard, introduce the available options in depth, and give an outlook on what Ubuntu could do in the future to make our lives easier.

I achieved peak performance in python, here's how ...
Dishant Sethi

In the ever-evolving landscape of software development, crafting code that not only functions flawlessly but also operates at peak performance is a skill that sets exceptional developers apart. This talk delves into the art of optimizing Python code, exploring techniques and stra

Improve LLM-based Applications with Fallback Mechanisms
Bilge Yücel

RAG handles common issues in LLM applications, but a dependable system requires one more step: a fallback mechanism. Explore the implementation of LLM applications with diverse fallback techniques using Haystack in Bilge's insightful talk.

Improving LLM Math Reasoning with Agent-Based Systems
Chris Hoge

Learn how agent-based reasoning can improve the baseline performance of LLMs on math benchmarks by 30%.

Is GenAI All You Need to Classify Text? Some Learnings from the Trenches
Marc Palyart, Kateryna Budzyak

GenAI is sometimes touted as the panacea for all natural language processing (NLP) tasks. This presentation explores a practical text classification scenario at Malt, highlighting the practical hurdles encountered when employing GenAI and how we overcame these obstacles.

Jupyter Notebooks for Print Media
Tim Paine

Jupyter Notebooks as a platform to create books, magazine and newspaper articles, and other print media

Lessons learned from deploying Machine Learning in an old-fashioned heavy industry
Robert Meyer

Cement is responsible for about 8% of worldwide carbon emissions. Let me tell you about lessons learned decarbonizing the industry with Machine Learning.

Leveraging the Art of Parallel Unit Testing in Django
Azan Bin Zahid, Syed Ansab Waqar Gillani

Unlocking the power of parallel unit testing with Python and Django! 🚀

Machine Learning on microcontrollers using MicroPython and emlearn
Jon Nordby

Deploy ML models to microcontrollers - using just the Python you already know! A practical presentation on how to use the emlearn Machine Learning package and MicroPython to build smart sensor systems.

Marketing Media Mix Models with Python & PyMC: a Case Study
Emanuele Fabbiani

Discover how Italy's fastest-growing tour operator unlocked transformative marketing insights using Bayesian models, domain knowledge, Python, and PyMC. Gain valuable tips to develop similar models for your business.

Missing Data, Bayesian Imputation and People Analytics with PyMC
Nathaniel Forde

Hierarchical structures are everywhere in business! Ever wondered how trickle-down management missteps drive non-response bias in Employee Engagement? Model the hierarchy, model the missing-ness with PyMC!

Mojo 🔥 - Is it Python's faster cousin or just hype?
Jamie Coombes

"Chris Lattner's Mojo promised to revolutionize AI dev with 68k times speed & Python ease. One year later, we dissect its reality—can it outshine Rust & Julia, or is it just hype? #PyData #MojoLanguage #PythonCousin"

Mostly Harmless Fixed Effects Regression in Python with PyFixest
Alexander Fischer

"Discover PyFixest, a Python library inspired by R's 'fixest'! 🐍📊 It speeds up regression model estimation with high-dimensional fixed effects, offering tools for robust inference and efficient post-processing. Perfect for AB Tests and event studies! #Python #DataScience #PyDat

Moving from Offline to Online Machine Learning with River
Tun Shwe

Learn the differences between online and offline ML and get started on your online ML journey today with River, an open source Python ML library

Next Stop: Insights! How Streamlit and Snowflake Power Up Data Stories
Marie-Kristin Wirsching

Data stories are the bridge between complex data insights and business impact! Transforming data into clear, actionable narratives is no easy task. That's where Streamlit and Snowflake come in - a duo for creating visually engaging, interactive data applications.

No more robots, no more glowing brains. It's time for Better Images of AI.
Rens Dimmendaal

No more robots. No more glowing brains. It's time for Better Images of AI

Pandas + Dask DataFrame 2.0 - Comparison to Spark, DuckDB and Polars
Florian Jetter, Patrick Hoefler

Dask DataFrame is fast now - The re-implementation of DataFrames in Dask is fast, reliable and fun.

Personalizing Carousel Ranking on Wolt's Discovery Page: A Hierarchical Multi-Armed Bandit Approach
Marcel Kurovski, Steffen Klempau

Personalizing Carousel Ranking on Wolt's Discovery Page with a Hierarchical Multi-Armed Bandit Approach

Polars and Time Series: what it can do, and how to overcome any limitation
Marco Gorelli

Learn how to use Polars for time series: what it does, what it doesn't do (and what to do about that!)

Power structures. The fair advantage
Anja Kunkel

Humans are complex. As developers, we wanna ignore that ... but to do our job right, we cannot.

Public Money, Public Experiment - open source processes in the public administration
Lisa Reiber

Imagine a data lab in a federal ministry wants to publish code and share it - how long could it possibly take? Take a guess and come to the talk to find out.

Put your RAG to the test: Component-per-component evaluation of our LLM-powered airplane manufacturing assistant
Nataliia Kees

This talk discusses component-wise evaluation of RAG-based applications, using the example of a digital shopfloor assistant developed at Airbus with open source Python libraries paired with Google Vertex AI.

PyCon Community Backstage: A Decade of Growth and Lessons Learned
Alexander CS Hendorf

A joyful journey through a decade in the Python community: contributing, leadership, magic, personal and professional growth!

Python 3.12's new monitoring and debugging API
Johannes Bechberger

Python 3.12 got a new debugging and monitoring API. Learn in this talk why it will change debugging forever.

Python Monorepos: The Polylith Developer Experience
David Vujic

What if writing software were more like building with LEGO bricks, with a more playful developer experience? Polylith solves this in a nice and simple way. I’ll walk through the simple architecture & the developer-friendly tooling for a joyful Python experience.

RAG for a medical company: the technical and product challenges
Noé Achache

While developing a Proof-Of-Concept RAG is widely accessible, creating a performant version that truly adds value remains a challenge. We will share our learnings from building a RAG for a medical company, aiding doctors with drug documentation.

Reinforcement Learning: Bridging The Gap Between Research and Applications
Michael Panchenko

Reinforcement learning (RL) has untapped potential for industry. This talk presents Tianshou, an open-source library with interfaces facilitating both industrial RL applications and new algorithm research, with the dual goals of accelerating progress and adoption.

Replacing Callbacks with Generators: A Case Study in Computer-Assisted Live Music
Matthieu Amiguet

How we made our code more readable by replacing intricate callback-based code with much more readable generators. Also a great example of using Python in an unexpected domain: realtime audio processing for live music!

Robust Configuration Management with Pydantic's Data Validation
Philipp Stephan

How Pydantic's strong data validation based on type annotations can help build a strict spec for your configuration format, catch misconfiguration early, and mitigate the aforementioned problems with a non-formalized configuration management system.

Safeguarding Privacy and Mitigating Vulnerabilities: Navigating Security Challenges in Generative AI
John Robert

How to protect and secure your data while using LLMs and Generative AI. Your data privacy and security are important.

Select ML from Databases
Gregor Bauer

Select ML from Databases: New workflow for building your machine learning models using the capabilities of modern databases

Streamlining Python Development: A Guide to a Modern Project Setup
Florian Wilhelm

🚀 Streamlining Python Development: A Guide to a Modern Project Setup. We'll explore tools like Hatch, mypy, and ruff, and dive into efficient project setups. Perfect for Python beginners!

Streamlining Python Development: A Practical Approach to CI/CD with GitHub Actions
Artem Kislovskiy

Learn how continuous integration/delivery boosts project resilience to Python updates and packaging changes. Automate for peace of mind, better code, and seamless collaboration.

Tackling the Cold Start Challenge in Demand Forecasting
Alexander Meier, Daria Mokrytska

Exploring the Cold Start problem in Demand Forecasting. Overcoming difficulties faced by Time Series and ML models. Uncover practical techniques and a systematic evaluation framework for effective forecasting.

Tailored and Trending: Key learnings from 3 years of news recommendations
Dr. Christian Leschinski

Diving into the world of recommendations! Learn how we overcome the special challenges of recommending news at Axel Springer NMT by using simple statistics.

That’s it?! Dealing with unexpected data problems
Simon Pressler

That’s it?! How to deal with unexpected data quality and quantity issues

The AI Revolution Will Not Be Monopolized: How open-source beats economies of scale, even for LLMs
Ines Montani

Are we heading further into a black box era with larger and larger models, obscured behind APIs controlled by big tech monopolies? I don’t think so, and in this talk, I’ll show you why.

The evolution of Feature Stores
Olamilekan Wahab

Feature Stores have become an important component of the machine learning lifecycle. They have been particularly pivotal in bridging the gap between data engineering and machine learning workflows (experimentation, training and serving). This talk will explore Feature Stores with

The key to reliability - Testing in the field of ML-Ops
Gunar Maiwald, Tobias Senst

idealo.de presents its holistic approach for testing in machine learning

The pragmatic Pythonic data engineer
Robson Junior

Learn to make practical decisions in data engineering with Python's vast ecosystem. Avoid blindly following market guidelines and consider the reality of your situation for better performance and architecture

The Struggles We Skipped: Data Engineering for the TikTok Generation
Anuun, Hiba Jamal

A new wave in data engineering! From tangled tasks to sleek, plug-and-play magic in data pipelines. 🚀

There is a Better Way to Automate and Manage Your (Fluid) Simulations
Julian Wagenschütz

Exploring the integration of Python into Computer Aided Engineering (CAE) workflows: While shell scripts are ubiquitous, they face challenges in CAE, particularly in Computational Fluid Dynamics (CFD). Python + DVC provides a robust alternative to manage simulations at scale.

Unleashing Confidence in SQL Development through Unit Testing
Tobias Lampert

Confidently ship changes to your SQL data model by validating logic with a SQL unit testing framework. Our framework, powered by pytest, ensures robust deployments, making data model evolution a breeze.

Unlock the Power of Dev Containers: Build a Consistent Python Development Environment in Seconds!
Thomas Fraunholz

Unlock the Power of Dev Containers: Say goodbye to the hassle and build a Consistent Python Development Environment in Seconds!

Using LLMs to Create Knowledge Graphs From a Large Corpus of Parliamentary Debates

This talk demonstrates how we can intuitively analyze political debates using knowledge graphs created using LLMs.

When and how to start coding with kids
Anna-Lena Popkes

Have you always wondered when and how to start coding with your kid? This talk will shed light on all your questions, giving concrete advice on how to get started.

Which kind of software tests do I really need?
Pascal Puchtler

Explore a variety of software testing methodologies, from Manual and A/B Testing to Unit and Performance Tests. Learn how to make informed decisions for enhanced software delivery, matching the unique needs of your projects.

Whispered Secrets: Building An Open-Source Tool To Live Transcribe & Summarize Conversations
John Sandall

🕵️ Calling all Spythonistas: Do you need a live speech transcription and summarization "secret agent" that works offline by running on your own hardware? Learn about the latest trends in open-source GenAI tools and how to build your own in this light-hearted talk.

Would you rely on ChatGPT to dial 911? A talk on balancing determinism and probabilism in production machine learning systems
Nicolas Guenon des Mesnards

Combining deterministic and probabilistic models to boost ML system robustness. Learn their benefits and applications in AI, backed by NLP case studies. #AIInnovation #MLTech #RobustAI

You shall not pass! 🧙 Strengthen your python code against attacks.
Antonia Scherz, Roman Krafft

You shall not pass! Make your Python code strong against attacks.

Your Model _Probably_ Memorized the Training Data
Katharine Jarmul

So, just how much data did ChatGPT memorize? Let's find out!

µDjango, an asynchronous microservices technique.
Maxim Danilov

Django - the new trend in creating asynchronous Python microservices.

🌳 The taller the tree, the harder the fall. Determining tree height from space using Deep Learning and very high resolution satellite imagery 🛰️
Ferdinand Schenck

🌳 The taller the tree, the harder the fall. Measuring tree height from space using Deep Learning 🛰️