Are fixed effects really fixed?
Image courtesy of the authors: Bias vs. the standard deviation of temporal unobserved heterogeneity where the heterogeneity follows a random walk
Academic's take
An interesting recent paper titled "Fixed Effects and Causal Inference" by Millimet and Bellemare (2023) discusses the feasibility of assuming fixed effects are fixed over long periods in causal models. The paper highlights the rather obvious but usually overlooked fact that fixed effects may fail to control for unobserved heterogeneity over long time periods.
This makes perfect sense: any effects that are assumed to be fixed (firm characteristics, store attributes, consumer demographics, artistic talent) are likely to be constant over shorter periods but may well vary over longer ones. The paper refers to a critical point made by Mundlak (1978):
"It would be unrealistic to assume that the individuals do not change in a differential way as the model assumes [...] It is more realistic to assume that individuals do change differentially but at a pace that can be ignored for short time intervals."
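To see why this matters for the fixed effects estimator, here is a stylized sketch. The notation is an assumption of this note and is not taken from the paper: a single regressor x, an outcome y, and unobserved heterogeneity that is allowed to vary over time.

```latex
% Outcome equation with time-varying unobserved heterogeneity \alpha_{it}
y_{it} = \beta x_{it} + \alpha_{it} + \varepsilon_{it}

% Within (fixed effects) transformation: subtract unit-level means
y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + (\alpha_{it} - \bar{\alpha}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)

% If \alpha_{it} = \alpha_i for all t, the middle term is zero and drops out.
% Otherwise it stays in the error, and as the number of units grows
\hat{\beta}_{FE} - \beta \;\to\; \frac{\operatorname{Cov}(x_{it} - \bar{x}_i,\; \alpha_{it} - \bar{\alpha}_i)}{\operatorname{Var}(x_{it} - \bar{x}_i)}
```

When the heterogeneity barely moves within the sample window, the leftover term is small and can be ignored, which is Mundlak's point about short time intervals. The longer the panel, the more room the heterogeneity has to drift away from its unit-level mean.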
This is essentially a bias-variance tradeoff: over longer time periods, the variance of a fixed effects estimator gets smaller, but its bias is expected to get larger. The paper runs a series of simulations to test the robustness of the fixed effects estimator and offers an alternative rolling-estimator approach for causal identification in panel data models. The figure shown above compares the bias of the proposed rolling estimator with that of the fixed effects model.[1]
One important takeaway for causal identification is to think carefully before using fixed effects to assume away unobserved heterogeneity as a panel gets longer. There are more detailed insights in the paper, which can be accessed here.
[1] In the figure, the data generation process behind the unobserved heterogeneity is simulated as a random walk. If the unobserved heterogeneity follows unit-specific time trends instead (as is commonly modeled, for example in Mundlak (1978), Autor (2003), and other recent papers), the increase in bias takes a concave shape. See the appendix of the paper for more details.
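The intuition can be checked with a small simulation. The sketch below is not the paper's simulation design: the data generating process, the parameter values (`beta`, `rho`, `sigma`, the window length), and all function names are assumptions made for illustration. In particular, `rolling_within_estimate` is a simple average of within estimates over short windows, which may differ from the rolling estimator the authors propose.

```python
import numpy as np


def simulate_panel(n_units, n_periods, sigma, beta=1.0, rho=0.5, seed=0):
    """Simulate a panel whose unobserved heterogeneity follows a random walk.

    sigma is the standard deviation of the random-walk innovations;
    sigma = 0 gives heterogeneity that is truly fixed over time.
    """
    rng = np.random.default_rng(seed)
    alpha0 = rng.normal(size=(n_units, 1))                  # initial unit-level effect
    shocks = sigma * rng.normal(size=(n_units, n_periods))  # random-walk innovations
    alpha = alpha0 + np.cumsum(shocks, axis=1)              # time-varying heterogeneity
    # The regressor is correlated with the heterogeneity, so ignoring alpha biases beta.
    x = rho * alpha + rng.normal(size=(n_units, n_periods))
    y = beta * x + alpha + rng.normal(size=(n_units, n_periods))
    return x, y


def within_estimate(x, y):
    """Fixed effects (within) estimate of beta: demean by unit, then pooled OLS."""
    x_dm = x - x.mean(axis=1, keepdims=True)
    y_dm = y - y.mean(axis=1, keepdims=True)
    return float((x_dm * y_dm).sum() / (x_dm ** 2).sum())


def rolling_within_estimate(x, y, window):
    """Average of within estimates over all consecutive windows of the given length."""
    n_periods = x.shape[1]
    estimates = [
        within_estimate(x[:, start:start + window], y[:, start:start + window])
        for start in range(n_periods - window + 1)
    ]
    return float(np.mean(estimates))


def bias_by_panel_length(lengths, sigma, beta=1.0, n_units=500, window=5):
    """Return {T: (full-panel FE bias, rolling-window bias)} for each panel length T."""
    results = {}
    for n_periods in lengths:
        x, y = simulate_panel(n_units, n_periods, sigma, beta=beta, seed=n_periods)
        results[n_periods] = (
            within_estimate(x, y) - beta,
            rolling_within_estimate(x, y, window) - beta,
        )
    return results


if __name__ == "__main__":
    for t, (fe_bias, rolling_bias) in bias_by_panel_length([5, 10, 20, 40], sigma=0.5).items():
        print(f"T={t:>2}  FE bias={fe_bias:+.3f}  rolling bias={rolling_bias:+.3f}")
```

Under these assumptions, the full-panel fixed effects bias should grow with the panel length, while the short-window estimate should stay close to its short-panel value. Setting `sigma=0` recovers the textbook case in which the heterogeneity is truly fixed and the within estimator is approximately unbiased.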
Director's cut
The demographics of a market can affect customer demand, response to advertising, or willingness to pay. In analyzing the impact of an intervention, such as a marketing campaign, models usually assume that the demographics of customers within a market remain fairly constant. This is typically a valid assumption because the analysis period rarely extends to years. Since Covid in particular, the disruptions have limited how much historical data can be used without corrections.
With that said, can we really assume that the demographics of a market do not change over time? Today, rapid gentrification in urban areas is changing the demographics of entire neighborhoods faster than ever before. In a store-level analysis, for example, using fixed effects to control for customer demographics at the neighborhood level is challenging: any observed change in the outcome may be due to the intervention as well as to the changes in customer demographics.
Implications for data centricity
References
- Autor, D. H. (2003). Outsourcing at will: The contribution of unjust dismissal doctrine to the growth of employment outsourcing. Journal of Labor Economics, 21(1), 1-42.
- Millimet, D., & Bellemare, M. F. (2023). Fixed Effects and Causal Inference.
- Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, 46(1), 69-85.