Using ML to find out the "Why"? A Tutorial in Causal Machine Learning
Oliver Schacht, Jan Teichert-Kluge
Machine learning is mostly used for predicting outcome variables. But in many cases, we are interested in causal questions: Why do customers churn? What is the effect of a price change on sales? How can we optimize personalized marketing campaigns or medical treatments?
This tutorial introduces participants to the field of Causal Machine Learning (Causal ML). We will start with a basic motivation for causal analysis and share insights on how to recognize causal questions in data science. We will then dive into the basics of Causal ML: Why can't we simply use off-the-shelf ML methods to answer causal questions? The tutorial will focus on the Double Machine Learning approach and demonstrate the use of Causal ML with the Python library DoubleML (Bach et al., 2022). The general introduction will be complemented by hands-on data examples, interactive discussions, and Q&A sessions. The tutorial is a great starting point for participants to discover Causality/Causal ML and start their own causal data science projects.
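To give a flavor of the idea, the following is a minimal sketch of the "partialling-out" form of Double Machine Learning with cross-fitting, written with NumPy only. It is not the DoubleML API and not the tutorial's own material. The simulated data, the function names (`simulate`, `fit_predict`, `dml_plr`), and the nuisance learner (ridge regression on cubic polynomial features, standing in for any ML learner) are all assumptions made for illustration.

```python
import numpy as np


def simulate(n=2000, theta=0.5, seed=0):
    """Hypothetical toy data: X confounds both the treatment D and the outcome Y."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 3))
    d = x[:, 0] + rng.normal(size=n)        # treatment depends on X
    g = np.sin(x[:, 0]) + x[:, 1] ** 2      # nonlinear effect of X on Y
    y = theta * d + g + rng.normal(size=n)  # true causal effect of D is theta
    return x, d, y


def fit_predict(x_train, t_train, x_test, ridge=1e-3):
    """Stand-in nuisance learner: ridge regression on cubic polynomial features."""
    def basis(x):
        return np.column_stack([np.ones(len(x)), x, x ** 2, x ** 3])

    a = basis(x_train)
    coef = np.linalg.solve(a.T @ a + ridge * np.eye(a.shape[1]), a.T @ t_train)
    return basis(x_test) @ coef


def dml_plr(x, d, y, n_folds=5, seed=0):
    """Partialling-out estimate of the effect of D on Y with cross-fitting."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Predict Y and D from X on held-out data, then keep the residuals.
        y_res[test_idx] = y[test_idx] - fit_predict(x[train], y[train], x[test_idx])
        d_res[test_idx] = d[test_idx] - fit_predict(x[train], d[train], x[test_idx])
    # Regress the outcome residuals on the treatment residuals.
    theta_hat = (d_res @ y_res) / (d_res @ d_res)
    psi = (y_res - theta_hat * d_res) * d_res
    se = np.sqrt(np.mean(psi ** 2) / n) / np.mean(d_res ** 2)
    return theta_hat, se


x, d, y = simulate()
naive = np.polyfit(d, y, 1)[0]  # slope of Y on D, ignoring the confounders X
theta_hat, se = dml_plr(x, d, y)
print(f"naive slope: {naive:.3f}")
print(f"DML estimate: {theta_hat:.3f} (std. error {se:.3f}), true effect: 0.5")
```

In this simulated setting, the naive slope absorbs part of the effect of X on Y, whereas the residual-on-residual regression targets the effect of D alone. In practice, the DoubleML library covers these steps with general ML learners and additional model classes.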
References
Bach, P., Chernozhukov, V., Kurz, M. S., and Spindler, M. (2022), DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python, Journal of Machine Learning Research, 23(53): 1-6, https://www.jmlr.org/papers/v23/21-0862.html
Oliver Schacht
I am a PhD candidate at the University of Hamburg, where I am passionate about my research in the field of Causal Machine Learning. As part of my research activities, I am also a contributing developer of DoubleML, a toolbox for causal predictions with ML.
Jan Teichert-Kluge
My name is Jan and I work as a research associate at the University of Hamburg, where I am studying for my PhD in statistics and data science. I hold a master's degree in industrial engineering, which, together with my industry experience, gives me a strong application-oriented background. I have contributed to the DoubleML package for Python, and my research focuses on Causal ML for unstructured data such as text and images.