Data Science Decoded

Author: Mike E

We discuss seminal mathematical papers (some of them very old) that have shaped and established the fields of machine learning and data science as we know them today. The goal of the podcast is to introduce you to the evolution of these fields from a mathematical and slightly philosophical perspective. We discuss the contribution of each paper, not just from a pure math standpoint but also in terms of how it influenced the discourse in the field, which areas it opened up, and so on. Episodes are also available on YouTube: https://youtu.be/wThcXx_vXjQ?sivnMfs

Language: en

Genres: Mathematics, Science

Data Science #31 - Correlation and Causation (1921), Sewall Wright
Episode 4
Saturday, 26 July, 2025

On the 31st episode of the podcast, we add Liron to the team and review a gem from 1921, in which Sewall Wright introduced path analysis: mapping hypothesized causal arrows into simple diagrams and showing that a correlation between two variables can be written as a sum, over the paths connecting them, of products of "path coefficients." By treating each arrow as a standardised regression weight, he showed how to split the variance of an outcome into direct, indirect, and joint pieces, then solve for unknown paths from an ordinary correlation matrix, turning the slogan "correlation ≠ causation" into a workable calculus for observational data.

Wright's algebra and diagrams became the blueprint for modern graphical causal models, structural-equation modelling, and DAG-based inference that power libraries such as DoWhy, Pyro and CausalNex. The same logic underlies feature-importance decompositions, counterfactual A/B testing, fairness audits, and explainable-AI tooling, making a century-old livestock-breeding study a foundation stone of present-day data-science and AI practice.
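For listeners who want to see the path-tracing idea in action, here is a minimal Python sketch. It is not taken from Wright's paper: the three-variable diagram (X -> M -> Y plus a direct arrow X -> Y), the coefficient values, and the helper name `paths_into` are all assumptions chosen for illustration. It builds the correlation matrix implied by the diagram, then recovers the path coefficients from that matrix alone by solving the standardised regression equations.

```python
import numpy as np

# Hypothetical standardised path coefficients (illustrative values only):
# a: X -> M, b: M -> Y, c: X -> Y
a, b, c = 0.5, 0.4, 0.3

# Path tracing: each correlation is the sum, over connecting paths,
# of the products of the coefficients along each path.
r_xm = a              # single path X -> M
r_xy = c + a * b      # direct path plus the path through M
r_my = b + a * c      # direct path plus the path via the common cause X

# Implied correlation matrix, variables ordered X, M, Y.
R = np.array([
    [1.0,  r_xm, r_xy],
    [r_xm, 1.0,  r_my],
    [r_xy, r_my, 1.0],
])

def paths_into(R, predictors, outcome):
    """Standardised regression weights of `outcome` on `predictors`,
    computed from the correlation matrix alone."""
    R_pp = R[np.ix_(predictors, predictors)]  # predictor intercorrelations
    r_po = R[predictors, outcome]             # predictor-outcome correlations
    return np.linalg.solve(R_pp, r_po)

# Recover the paths from correlations only.
a_hat = paths_into(R, [0], 1)[0]
c_hat, b_hat = paths_into(R, [0, 1], 2)

# Split the variance of Y (equal to 1 after standardisation) into
# direct pieces, a joint piece from the correlated causes, and a residual.
direct = c_hat**2 + b_hat**2
joint = 2 * a_hat * b_hat * c_hat
residual = 1.0 - direct - joint

print(f"a={a_hat:.3f}, b={b_hat:.3f}, c={c_hat:.3f}")
print(f"direct={direct:.3f}, joint={joint:.3f}, residual={residual:.3f}")
```

With real data, `R` would come from something like `np.corrcoef` on the observed variables, and the recovered coefficients would only be as trustworthy as the assumed diagram.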