Causal inference is not about methods
Solo post: Academic's take
Causal modeling is becoming more popular after some notorious failures of overly optimistic reliance on black-box predictive models to solve business problems (e.g., Zillow's iBuying). This is great. We are also increasingly seeing new methods introduced as if they "solve" causal inference. This is not so good, because it misdirects attention.
Causal inference has more to do with data and assumptions than it does with methods. No method can "solve" the causal inference problem, although a method can help by reducing bias. If anything, regression is one of the most common methods for causal inference, and it is an effective causal inference method when everything else is in order. This is different from predictive modeling, where brute-force bias reduction with the most complex method may succeed.
Price elasticity of demand problem
Simply put, we want to know how demand will change if we change the price by a certain amount in a certain direction. More specifically, we want to know the percentage change in quantity demanded for a one percent increase or decrease in price. This is essentially a causal problem, although it has not always been treated as such. To get the elasticity, we need some bespoke data, ideally from an experiment.
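To make the target quantity concrete, here is a minimal sketch in Python with made-up numbers; the function name price_elasticity is purely illustrative:

```python
# Price elasticity of demand: percentage change in quantity demanded
# per one percent change in price (numbers below are made up).

def price_elasticity(pct_change_quantity: float, pct_change_price: float) -> float:
    """Elasticity = %change in quantity / %change in price."""
    return pct_change_quantity / pct_change_price

# A 10% price increase that reduces units sold by 15% implies
# an elasticity of -1.5 (demand for this product is elastic).
print(price_elasticity(-15.0, 10.0))  # -1.5
```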
Let's say the product is sold in a physical store, so a fully controlled experiment (an ideal A/B test) is not feasible. Even if the product is digital, differential pricing is not the best lever to experiment on (localized pricing is considered more acceptable). So we need an experimental design in which we change the price of the product in one store, estimate the change in demand, and compare it to the demand in a comparable store where the price remains the same. Such an experiment would generate the data we need. We could also use historical data if it fits the purpose.
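One way such a two-store experiment could be analyzed is a simple difference-in-differences comparison. The sketch below uses hypothetical store labels and numbers, and glosses over the inference details a real analysis would need:

```python
import pandas as pd

# Hypothetical two-store experiment: store A gets a 10% price increase,
# store B keeps the old price. "units" is weekly quantity sold.
df = pd.DataFrame({
    "store":  ["A", "A", "B", "B"],
    "period": ["before", "after", "before", "after"],
    "price":  [10.0, 11.0, 10.0, 10.0],
    "units":  [500, 430, 480, 470],
})

units = df.set_index(["store", "period"])["units"]

# Difference-in-differences: the change in the treated store minus the change
# in the comparison store, netting out whatever affected both stores over time.
did = (units["A", "after"] - units["A", "before"]) - (
    units["B", "after"] - units["B", "before"]
)

pct_dq = did / units["A", "before"] * 100   # about -12% change in demand
pct_dp = (11.0 - 10.0) / 10.0 * 100         # +10% change in price
print(pct_dq / pct_dp)                      # rough elasticity estimate (about -1.2)
```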
Data and assumptions first, methods later
We start with data that is collected (or generated) for the causal estimation of the target effect in the first place. Then we move on to the assumptions: positivity, consistency, exchangeability, and others such as no interference. These assumptions are critical to identifying a causal effect (you can find a one-pager with brief definitions here).
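As a rough illustration of taking the assumptions seriously before modeling, simple overlap checks can probe whether positivity is even plausible in the data at hand; the column names below are hypothetical:

```python
import pandas as pd

# Hypothetical weekly store-level data with a binary "high price" indicator.
# Positivity requires that every relevant combination of confounder values
# has a real chance of occurring under both price regimes.
df = pd.DataFrame({
    "high_price": [0, 0, 1, 1, 0, 1, 0, 1],
    "promotion":  [1, 0, 1, 0, 1, 0, 0, 1],
    "traffic":    [820, 640, 790, 600, 870, 615, 700, 755],
})

# Crude overlap checks: do both price regimes occur within each promotion
# status, and do the traffic distributions overlap across regimes?
print(pd.crosstab(df["promotion"], df["high_price"]))
print(df.groupby("high_price")["traffic"].describe())
```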
Only when we have the right data and reasonably sound modeling assumptions do we need a method to estimate the effect. So the order is data, assumptions, and method. Does it matter which method we use? Not so much.
We can use regression (or partial correlation, as I saw popularized recently, 118 years after its introduction) [1]. We can also use a doubly robust estimator with nonparametric nuisance models. We can use any other method that helps us isolate the effect of price from the confounders (promotions on the item, store traffic, etc.) while avoiding colliders.
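A minimal regression sketch, assuming a log-log specification is correct and that promotions and traffic are the only confounders, might look like this; the data are simulated purely for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Simulated store-week data: promotions and traffic drive both price and demand,
# so they are confounders. The true elasticity is set to -1.5.
promo = rng.integers(0, 2, n)
traffic = rng.normal(1000, 150, n)
log_price = np.log(10) + 0.10 * promo - 0.0002 * (traffic - 1000) + rng.normal(0, 0.05, n)
log_qty = 2.0 - 1.5 * log_price + 0.30 * promo + 0.001 * traffic + rng.normal(0, 0.1, n)

df = pd.DataFrame({"log_qty": log_qty, "log_price": log_price,
                   "promo": promo, "traffic": traffic})

# Log-log regression: the coefficient on log_price is the elasticity,
# provided the confounders are included with the right functional form.
fit = smf.ols("log_qty ~ log_price + promo + traffic", data=df).fit()
print(fit.params["log_price"])  # should land near the true -1.5
```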
Do the latest and greatest methods not help at all?
Using Double ML instead of regression can help with model misspecification bias, but that's about it. For example, when estimating the price elasticity of demand with a regression, we need the correct model specification, i.e., we need to know exactly how promotions on the product or store traffic affect demand. Is the effect of store traffic non-linear? Does increasing traffic increase demand up to a point and then level off due to crowding out as the store gets packed? If we do not know and specify the correct functional form in a regression, the regression estimate will be biased. If we use Double ML instead, we don't have to worry about this as much; the doubly robust estimator takes care of it. However, we still cannot omit variables that may be confounders, and we still cannot violate any of the identification assumptions.
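For illustration, here is a stripped-down partialling-out sketch of the idea behind Double ML, written with scikit-learn rather than a dedicated package such as DoubleML or EconML; it uses out-of-fold predictions as the cross-fitting step and omits standard errors and other refinements a real analysis would need:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
n = 2000

# Hypothetical confounders: a promotion flag and store traffic.
promo = rng.integers(0, 2, n)
traffic = rng.normal(1000, 150, n)
W = np.column_stack([promo, traffic])

# Price responds to the confounders; demand depends on traffic non-linearly.
# The true elasticity is set to -1.5.
log_price = 2.3 + 0.10 * promo - 0.0002 * (traffic - 1000) + rng.normal(0, 0.05, n)
log_qty = -1.5 * log_price + 0.30 * promo + np.log1p(traffic) + rng.normal(0, 0.1, n)

# Partialling out, the core of Double ML: flexibly predict the outcome and the
# treatment from the confounders, then regress residual on residual.
ml = GradientBoostingRegressor()
qty_resid = log_qty - cross_val_predict(ml, W, log_qty, cv=5)
price_resid = log_price - cross_val_predict(ml, W, log_price, cv=5)

elasticity = LinearRegression().fit(price_resid.reshape(-1, 1), qty_resid).coef_[0]
print(elasticity)  # close to -1.5 without specifying how traffic enters the model
```

The same sketch with an omitted confounder (say, dropping promo from W) would still return a biased number, which is the point of the paragraph above.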
Modeling price elasticity with Double ML therefore does not magically turn the estimate into a causal effect unless we have the right data and assumptions. Likewise, saying that partial correlation helps causal inference is the same as saying that linear regression helps causal inference. Both help, but so does any other consistent and unbiased estimator.
Bottom line
Causal inference is largely method agnostic. Assumptions about data make it possible to estimate a causal effect. Methods serve the data and assumptions. If the assumptions needed to identify causality hold for the given data and problem, then "Causal inference using Double ML in Python" can easily be translated into "Causal inference using OLS in Stata" while keeping the estimated causal effect intact. As we quote in our work on data centricity:
Assumptions simplify the complex world and make it easier to understand. The art in scientific thinking - whether in physics, biology, or economics - is deciding which assumptions to make.
[1] Apparently, partial correlation was first introduced in a paper by Udny Yule in 1907, building on earlier work in correlation theory by Galton and Pearson. It is basically just another way of standardizing the coefficient in a multiple linear regression.
Podcast-style discussion of the article
The raw/unedited podcast discussion produced by NotebookLM (proceed with caution):