Domain Expertise Intermediate | PyConDE & PyData Berlin 2024

Talk general-ethics-privacy

(Un)leashed potential of AI in Government

Rosa Marie Keller

As the world is being reshaped through the rise of advanced AI technologies, governments seek their place in the arena. This talk explores how German government institutions adapt to these changes by focusing on three key areas of action: Adoption, Regulation, and Up-/Reskilling.

Talk pydata-machine-learning-deep-learning-stats

A conceptual and practical introduction to Hilbert Space Gaussian Process (HSGP) approximation methods

Dr. Juan Orduz

In this talk, we explore a new method to approximate Gaussian processes using spectral analysis methods, known as the Hilbert Space Gaussian process (HSGP) approximation.

Tutorial pydata-pydata-scientific-libraries-stack

A deep dive into the Arrow Columnar format with pyarrow and nanoarrow

Joris Van den Bossche, Raúl Cumplido, Alenka Frim

Apache Arrow has become a de-facto standard for efficient in-memory columnar data representation. But what is this format exactly? This tutorial will dive deep into the details of the Arrow columnar format and explore interactively the different types and buffer layouts.

Talk pydata-generative-ai

A Retrieval Augmented Generation system to query the scikit-learn documentation

Guillaume Lemaitre

A Retrieval Augmented Generation system to query the scikit-learn documentation

Talk pycon-programming-software-engineering

Advanced Observability with OpenTelemetry and Python

Anton Caceres

Facing observability challenges in Python microservices? 🐍 ☁️ Learn how #OpenTelemetry can streamline your system monitoring in cloud environments. We're diving into its Python SDK integration, showcasing real-world implementation for enhanced efficiency.

Talk general-industry-academia-use-cases

Best of both worlds - How we built an AI-aided content creation tool for language learning

Lea Petters, Hector Hernandez

Dive into the fusion of human intelligence & AI at Babbel! Explore our journey in crafting an AI-aided content creation tool for personalized language learning. #LanguageLearning #AI #LLMs

Talk pycon-security

Better safe than sorry: Threat Modeling for Python Developers

Clemens Hübner

Is your code secure enough? Find out by doing Threat Modeling!

Sponsored sponsor

Better search relevance using Learning to Rank at mobile.de

Manish Saraswat

This talk will discuss our current search relevance ranking framework and how it ranks millions of searches daily.

Talk pycon-mlops-devops

Beyond Deployment: Exploring Machine Learning Inference Architectures and Patterns

Tim Elfrink

"Beyond Deployment: Exploring Machine Learning Inference Architectures and Patterns" - uncover the ML inference strategies that power StepStone's success and learn to scale your models with confidence!

Talk pydata-machine-learning-deep-learning-stats

Breaking AI Boundaries: Fairness Metrics in Unstructured Data Domains

Daniel Klitzke

Exploring the need for fairness in machine learning in indirect human impact areas, proposing solutions for challenges in unstructured data.

Sponsored sponsor

Bridging the Gap: From Analytical Models to Operational Success

Ignacio Vergara, Nick Harmening

Putting ML models into production is not as simple as it sounds. Learn how we bridge the chasm between analytics and production.

Sponsored sponsor

Build a personalized Bitcoin (BTC) virtual assistant in Python with Hopsworks and LLM function calling

Javier de la Rúa Martínez

TBD

Tutorial pydata-machine-learning-deep-learning-stats

Build TikTok's Personalized Real-Time Recommendation System in Python with Hopsworks

Jim Dowling

TikTok's real-time recommendation engine, Monolith, is so good it has been described as "digital crack". In 1 hr, we will build Monolith in Python as 3 ML pipelines that run on Hopsworks.

Tutorial pycon-testing

Bulletproof Python - Property-Based Testing with Hypothesis

Michael Seifert

Less is more! Rather than working harder and writing more test code, property-based testing forces you to work smarter and cover more cases with fewer tests. Join Michael for his tutorial "Bulletproof Python - Property-Based Testing with Hypothesis".

Sponsored pydata-generative-ai

Cloud? No Thanks! I’m Gonna Run GenAI on My AI PC

Adrian Boguszewski, Dmitriy Pastushenkov

Join this talk to learn that cloud is no longer needed for GenAI. All you need is an AI PC.

Talk pydata-machine-learning-deep-learning-stats

Content Recommendation with Graphs: From Basic Walks to Neural Networks

Dr. Mirza Klimenta

Content Recommendation with Graphs: From Basic Walks to Neural Networks

Talk pydata-data-handling-engineering

Data valuation for machine learning

Miguel de Benito Delgado

pyDVL is the library for data valuation in machine learning. Use it to clean, prune and select your data to improve model performance.

Talk pycon-programming-software-engineering

Deploying your Python application to Android

Shyamnath Premnadh

The current state of deploying Python applications to Android, with a comparison of the currently available frameworks.

Tutorial pycon-django-web

Django loves strawberries

Arthur Bayr

Explore the dynamic duo of GraphQL Strawberry and Django in an immersive workshop! Discover the seamless integration of Strawberry with Django, mastering type definitions, queries and mutations.

Sponsored sponsor

Documenting R&D Progress using jupyter-book - and feel safe for the next performance audit

Jens Nie

Rosenxt has just been founded, yet we're already very busy creating the next best thing. Let me show you how we document our R&D progress using the jupyter-book ecosystem to be safe for the next performance audit.

Talk pycon-python-language-ecosystem

Encoding Charactersets - may the force be with you

Martin Hoermann

Understanding and repairing garbled text (Mojibake) with Python

Talk pydata-data-handling-engineering

Exploring Zarr: From Fundamentals to Version 3.0 and Beyond

Sanket Verma

Hi all! I’ll discuss Zarr, an open-source data format for storing chunked, compressed N-dimensional arrays. We’ll explore the Zarr ecosystem from fundamentals to V3.0 and beyond. If you’re interested in storing massive datasets, please attend my talk. Thanks!

Talk pydata-machine-learning-deep-learning-stats

From idea to production in a day: Leveraging Azure ML and Streamlit to build and user test machine learning ideas quickly

Florian Roscheck

How can you leverage Azure ML, automated machine learning, and Streamlit to build and test machine learning apps quickly? Find out about our favorite Hackathon stack and walk away with some code to build and user-test your own machine learning ideas fast.

Talk pydata-generative-ai

From LLM as oracle to LLM as translator - our journey from theory to everyday’s practice in a corporate setting with dmGPT (and python)

Emma Haley, Niklas Lederer

Learn how dm-drogeriemarkt put LLMs in production and implemented a day-to-day assistant for everyone.

Talk pycon-programming-software-engineering

Green Software Engineering

Farah

Green Software Engineering and the concept of sustainability

Talk pydata-natural-language-processing-computer-vision

How to Do Monolingual, Multilingual, and Cross-lingual Text Classification in April, 2024

Daryna Dementieva

If I want a text classifier in 2024, what should I choose: LLMs or a pre-LLM-era classifier? Is the answer the same for English and other languages? We will provide a recipe for finding the right classifier depending on the target language and data availability.

Talk pycon-programming-software-engineering

I achieved peak performance in python, here's how ...

Dishant Sethi

In the ever-evolving landscape of software development, crafting code that not only functions flawlessly but also operates at peak performance is a skill that sets exceptional developers apart. This talk delves into the art of optimizing Python code, exploring techniques and stra

Talk pydata-natural-language-processing-computer-vision

Is GenAI All You Need to Classify Text? Some Learnings from the Trenches

Marc Palyart, Kateryna Budzyak

GenAI is sometimes touted as the panacea for all natural language processing (NLP) tasks. This presentation explores a practical text classification scenario at Malt, highlighting the practical hurdles encountered when employing GenAI and how we overcame these obstacles.

Talk pydata-visualisation-jupyter

Jupyter Notebooks for Print Media

Tim Paine

Jupyter Notebooks as a platform to create books, magazine and newspaper articles, and other print media

Talk pycon-testing

Leveraging the Art of Parallel Unit Testing in Django

Azan Bin Zahid, Syed Ansab Waqar Gillani

Unlocking the power of parallel unit testing with Python and Django! 🚀

Talk general-industry-academia-use-cases

Marketing Media Mix Models with Python & PyMC: a Case Study

Emanuele Fabbiani

Discover how Italy's fastest-growing tour operator unlocked transformative marketing insights using Bayesian models, domain knowledge, Python, and PyMC. Gain valuable tips to develop similar models for your business.

Talk pycon-python-language-ecosystem

Mojo 🔥 - Is it Python's faster cousin or just hype?

Jamie Coombes

"Chris Lattner's Mojo promised to revolutionize AI dev with 68k times speed & Python ease. One year later, we dissect its reality—can it outshine Rust & Julia, or is it just hype? #PyData #MojoLanguage #PythonCousin"

Talk pydata-data-handling-engineering

Next Stop: Insights! How Streamlit and Snowflake Power Up Data Stories

Marie-Kristin Wirsching

Data stories are the bridge between complex data insights and business impact! Transforming data into clear, actionable narratives is no easy task. That's where Streamlit and Snowflake come in - a duo for creating visually engaging, interactive data applications.

Tutorial pycon-programming-software-engineering

Performant, scientific computation in Python and Rust

Stefan Ulbrich

A tutorial session on how to build a scientific package for numerical calculus algorithms in Python and Rust.

Talk pydata-machine-learning-deep-learning-stats

Personalizing Carousel Ranking on Wolt's Discovery Page: A Hierarchical Multi-Armed Bandit Approach

Marcel Kurovski, Steffen Klempau

Personalizing Carousel Ranking on Wolt's Discovery Page with a Hierarchical Multi-Armed Bandit Approach

Talk pydata-data-handling-engineering

Polars and Time Series: what it can do, and how to overcome any limitation

Marco Gorelli

Learn how to use Polars for time series: what it does, what it doesn't do (and what to do about that!)

Talk general-ethics-privacy

Power structures. The fair advantage

Anja Kunkel

Humans are complex. As developers, we wanna ignore that ... but to do our job right, we cannot.

Talk pydata-generative-ai

Put your RAG to the test: Component-per-component evaluation of our LLM-powered airplane manufacturing assistant

Nataliia Kees

This talk discusses component-wise evaluation of RAG-based applications, using the example of the airplane manufacturing assistant developed at Airbus with open-source Python libraries paired with Google Vertex AI.

Tutorial pycon-testing

pytest tips and tricks for a better testsuite

Florian Bruhin

pytest lets you write simple tests fast - but also scales to very complex scenarios: Beyond the basics of no-boilerplate test functions, this training will show various intermediate/advanced features, as well as gems and tricks.

Talk pycon-programming-software-engineering

Python Monorepos: The Polylith Developer Experience

David Vujic

What if writing software were more like building with LEGO bricks, with a more playful developer experience? Polylith solves this in a nice and simple way. I’ll walk through the simple architecture and the developer-friendly tooling for a joyful Python experience.

Tutorial pycon-programming-software-engineering

Refactoring Large Programs

Dr. Kristian Rother

Refactor a large Python program that is undocumented, unstructured and untested

Talk pydata-machine-learning-deep-learning-stats

Reinforcement Learning: Bridging The Gap Between Research and Applications

Michael Panchenko

Reinforcement learning (RL) has untapped potential for industry. This talk presents Tianshou, an open-source library with interfaces facilitating both industrial RL applications and new algorithm research, with the dual goals of accelerating progress and adoption.

Talk pydata-generative-ai

Safeguarding Privacy and Mitigating Vulnerabilities: Navigating Security Challenges in Generative AI

John Robert

How to protect and secure your data while using LLMs and generative AI. Your data privacy and security are important.

Tutorial pycon-security

Securing Python: Race Condition Vulnerabilities

Shahriyar Rzayev

Explore and secure Python code against race condition vulnerabilities

Talk pydata-machine-learning-deep-learning-stats

Select ML from Databases

Gregor Bauer

Select ML from Databases: New workflow for building your machine learning models using the capabilities of modern databases

Talk pycon-mlops-devops

Streamlining Python Development: A Practical Approach to CI/CD with GitHub Actions

Artem Kislovskiy

Learn how continuous integration/delivery boosts project resilience to Python updates and packaging changes. Automate for peace of mind, better code, and seamless collaboration.

Talk pydata-machine-learning-deep-learning-stats

Tackling the Cold Start Challenge in Demand Forecasting

Alexander Meier, Daria Mokrytska

Exploring the Cold Start problem in Demand Forecasting. Overcoming difficulties faced by Time Series and ML models. Uncover practical techniques and a systematic evaluation framework for effective forecasting.

Talk pydata-machine-learning-deep-learning-stats

That’s it?! Dealing with unexpected data problems

Simon Pressler

That’s it?! How to deal with unexpected data quality and quantity issues

Talk pydata-natural-language-processing-computer-vision

The AI Revolution Will Not Be Monopolized: How open-source beats economies of scale, even for LLMs

Ines Montani

Are we heading further into a black box era with larger and larger models, obscured behind APIs controlled by big tech monopolies? I don’t think so, and in this talk, I’ll show you why.

Talk pycon-mlops-devops

The key to reliability - Testing in the field of ML-Ops

Gunar Maiwald, Tobias Senst

idealo.de presents its holistic approach for testing in machine learning

Talk pydata-data-handling-engineering

The Struggles We Skipped: Data Engineering for the TikTok Generation

Anuun, Hiba Jamal

A new wave in data engineering! From tangled tasks to sleek, plug-and-play magic in data pipelines. 🚀

Talk general-industry-academia-use-cases

There is a Better Way to Automate and Manage Your (Fluid) Simulations

Julian Wagenschütz

Exploring the integration of Python into Computer Aided Engineering (CAE) workflows: While shell scripts are ubiquitous, they face challenges in CAE, particularly in Computational Fluid Dynamics (CFD). Python + DVC provides a robust alternative to manage simulations at scale.

Talk pycon-programming-software-engineering

Unleashing Confidence in SQL Development through Unit Testing

Tobias Lampert

Confidently ship changes to your SQL data model by validating logic with a SQL unit testing framework. Our framework, powered by pytest, ensures robust deployments, making data model evolution a breeze.

Talk pydata-natural-language-processing-computer-vision

Using LLMs to Create Knowledge Graphs From a Large Corpus of Parliamentary Debates

Usman

This talk demonstrates how we can intuitively analyze political debates using knowledge graphs created with LLMs.

Talk pydata-machine-learning-deep-learning-stats

Your Model _Probably_ Memorized the Training Data

Katharine Jarmul

So, just how much data did ChatGPT memorize? Let's find out!

Talk pydata-machine-learning-deep-learning-stats

🌳 The taller the tree, the harder the fall. Determining tree height from space using Deep Learning and very high resolution satellite imagery 🛰️

Ferdinand Schenck

🌳 The taller the tree, the harder the fall. Measuring tree height from space using Deep Learning 🛰️
