Posts

Measuring long-term outcomes using short-term data and surrogates

Image courtesy of Cai et al. (2023)

Solo post: Director's cut

When measuring the outcomes of an intervention, organizations usually observe and quantify immediate or short-term results. For example, marketing could drive additional traffic, a discounted shipping rate could increase conversion rates, and a price promotion or a loyalty program could drive sales. In most cases, however, these interventions have effects that materialize over a longer period of time. After being exposed to a promotion, customers may become more price sensitive and start buying cheaper products, or strategically time their purchases to take advantage of the next promotion. In general, companies will not conduct multi-month (or even multi-year) experiments to compare alternatives and find the option that optimizes long-term return on investment (ROI). Decisions must be made in the absence of long-term results. To address this shortcoming, in 2019, Susan Athey et al. published a paper on combining short...
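To make the surrogate idea concrete, here is a minimal sketch in the spirit of a surrogate index: learn to predict the long-term outcome from short-term surrogates on historical data, then compare treatment arms on the imputed long-term outcome. The column names and the simulated data are hypothetical illustrations, not the estimator or setting from the paper.

```python
# Minimal sketch of the surrogate-index idea (illustrative, not the
# estimator from the paper). Assumes a historical dataset where the
# long-term outcome IS observed, and a new experiment where only
# short-term surrogates are observed. Column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Historical data: short-term surrogates and the observed long-term outcome.
n_hist = 5000
hist = pd.DataFrame({
    "engagement_3m": rng.normal(1.0, 0.3, n_hist),
    "spend_3m": rng.normal(50, 10, n_hist),
})
hist["revenue_24m"] = (100 + 40 * hist["engagement_3m"]
                       + 2 * hist["spend_3m"] + rng.normal(0, 20, n_hist))

# Step 1: learn the surrogate index, i.e., predict the long-term outcome
# from the short-term surrogates on historical data.
surrogate_model = LinearRegression().fit(
    hist[["engagement_3m", "spend_3m"]], hist["revenue_24m"]
)

# New experiment: treatment assignment and surrogates only, no long-term data yet.
n_exp = 2000
exp = pd.DataFrame({"treated": rng.integers(0, 2, n_exp)})
exp["engagement_3m"] = rng.normal(1.0 + 0.1 * exp["treated"], 0.3, n_exp)
exp["spend_3m"] = rng.normal(50 + 3 * exp["treated"], 10, n_exp)

# Step 2: impute the long-term outcome with the surrogate index and
# compare treated vs. control on the imputed outcome.
exp["revenue_24m_hat"] = surrogate_model.predict(exp[["engagement_3m", "spend_3m"]])
effect = (exp.loc[exp.treated == 1, "revenue_24m_hat"].mean()
          - exp.loc[exp.treated == 0, "revenue_24m_hat"].mean())
print(f"Estimated long-term treatment effect: {effect:.2f}")
```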

Eat Mor Chikin, and fast? The story of Chick-fil-A’s multimodal data analysis and optimization

Image courtesy of Imago / Pubity Illustration - pubity.com

Introduction to the business case

The Wall Street Journal article reveals how Chick-fil-A is using innovative data collection and analysis methods to optimize its drive-through operations. The company has developed a "Film Studies" unit that combines drone footage with security camera data to create comprehensive "game films" of its restaurant operations. This multimodal data collection approach, inspired by NFL game analysis, allows the company to model traffic patterns, identify operational bottlenecks, and analyze service efficiency in drive-through operations. Read the article here. The data-centric insights have led to significant operational improvements, including the development of new restaurant designs with elevated kitchens and multiple drive-through lanes capable of serving 700 cars per hour. The analysis also helped optimize staffing patterns, identify gaps in Wi-Fi coverage, and improve o...

Causal inference is not about methods

Image courtesy of Eleanor Murray - epiellie.com

Solo post: Academic's take

Causal modeling is becoming more popular after some notorious failures of overly optimistic reliance on black-box predictive models to solve business problems (e.g., Zillow's iBuying). This is great. We are also increasingly seeing the introduction of new methods that "solve" causal inference. This is not so good, because it misdirects attention. Causal inference has more to do with data and assumptions than it does with methods. No method can "solve" the causal inference problem (although it can help by reducing bias). If anything, regression is one of the most common methods for causal inference, and it is an effective causal inference method when all else is in order. This is different from predictive modeling, where brute-force bias reduction using the most complex method may succeed.

Price elasticity of demand problem

Simply put, we want to know how demand will change ...
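A minimal sketch of the point, using simulated data and hypothetical variable names (not the post's own example): the method is plain log-log regression, and whether the elasticity estimate is causal depends on whether the confounder is measured and included, not on the method itself.

```python
# Illustrative sketch: price elasticity via a log-log regression.
# Simulated data with a seasonal demand confounder; variable names are
# hypothetical. The same OLS is biased or not depending on the data and
# assumptions, not on anything special about the method.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000

season = rng.normal(0, 1, n)                        # confounder: demand shocks
log_price = 0.5 * season + rng.normal(0, 0.2, n)    # prices rise when demand is high
log_demand = 2.0 + 0.8 * season - 1.5 * log_price + rng.normal(0, 0.3, n)

# Naive regression: omits the confounder, so the elasticity is biased.
naive = sm.OLS(log_demand, sm.add_constant(log_price)).fit()

# Adjusted regression: includes the confounder, recovers roughly -1.5.
X = sm.add_constant(np.column_stack([log_price, season]))
adjusted = sm.OLS(log_demand, X).fit()

print("naive elasticity:   ", round(naive.params[1], 2))
print("adjusted elasticity:", round(adjusted.params[1], 2))
```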

How a supposedly data-centric decision cost Walgreens $200 million and how to avoid it

Image courtesy of Rob Klas - bloomberg.com

Introduction to the business case

The business case discussed in this post was reported by Bloomberg under the title "Walgreens Replaced Fridge Doors With Smart Screens. It’s Now a $200 Million Fiasco". You can find the article here. In summary, a startup promised Walgreens that its high-tech fridges would track shoppers and spark an in-store advertising revolution. Then the project failed miserably for a number of reasons. Most importantly, Walgreens faced a backlash when customers ended up seeing their reflections in darkened screens instead of seeing the drinks through glass doors. Store associates rushed to put signs on the coolers to explain which drinks were in which cooler. The project went so badly that it ended in a lawsuit between the startup and Walgreens; not only did it fail to deliver business value, it resulted in losses in customer satisfaction, employee morale, and revenue. But why was this allowe...

Explaining the unexplainable Part II: SHAP and SAGE

Image courtesy of iancovert.com

Academic's take

This post is an exception, made to close the loop we opened with our post on LIME: Explaining the unexplainable Part I: LIME. Starting in 2025, we're changing the scope of Data Duets to focus on business cases (as opposed to methods). Having discovered an excellent write-up that explains both SHAP (SHapley Additive exPlanations) and SAGE (Shapley Additive Global importancE), I will focus on the why questions, possible links to counterfactuals and causality, and the implications for data centricity.

Why do we need SHAP and SAGE?

The need for explainability methods stems from the fact that most ensemble methods are black boxes by nature: they minimize prediction error, but they obscure how the predictions are made. This is problematic because some predictions hold up only as long as the underlying conditions remain favorable. For example, Zillow's iBuying was a notorious failure, likely due to a lack of cla...
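As a quick illustration, here is a sketch using the open-source shap package with a toy gradient-boosting model; the dataset, features, and model are stand-ins rather than the write-up's own example. SHAP attributes a single prediction to features (local), while averaging absolute attributions gives a rough global view (SAGE instead defines global importance via loss in predictive power).

```python
# Minimal sketch using the open-source `shap` package with a toy model.
# Data and model are hypothetical stand-ins, not the write-up's example.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
n = 1000
X = pd.DataFrame({
    "sqft": rng.normal(1500, 300, n),
    "age": rng.integers(0, 50, n),
    "rooms": rng.integers(2, 7, n),
})
y = 50 * X["sqft"] - 1000 * X["age"] + 5000 * X["rooms"] + rng.normal(0, 10000, n)

model = GradientBoostingRegressor().fit(X, y)

# Local explanation: SHAP values attribute one prediction to each feature.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
print("Attribution for first prediction:",
      dict(zip(X.columns, shap_values[0].round(1))))

# A rough global view: mean absolute SHAP value per feature.
# (SAGE computes global importance differently, via predictive power loss.)
print("Mean |SHAP| per feature:",
      dict(zip(X.columns, np.abs(shap_values).mean(axis=0).round(1))))
```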

Explaining the unexplainable Part I: LIME

Image courtesy of finalyse.com

Academic's take

Every model is a simplified version of reality, as it must be. That is fine as long as we know and understand how reality is simplified and reduced to the parameters of a model. In predictive analytics, where nonparametric models are heavily used with a kitchen-sink approach of adding any and all features to improve predictive performance, we don't even know how a model simplifies reality. So, what if we use another model to simplify and explain the nonparametric predictive model? This other model is called a surrogate model, and it is designed to be interpretable. In short, surrogate models explore the boundary conditions of decisions made by a predictive model.

What is a surrogate model?

Surrogate models can help us understand (i) the average prediction of a model (global surrogate) or (ii) a single prediction (local surrogate). The quest then becomes finding surrogate models that can explain the predictions ...
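Here is a minimal sketch of a global surrogate (not LIME itself, which is a local surrogate and is the subject of the post): fit an interpretable model to the black box's predictions and check how faithfully it reproduces them. The data and models below are toy stand-ins under that assumption.

```python
# Sketch of a global surrogate: approximate a black-box model's predictions
# with an interpretable decision tree. Data and models are toy stand-ins.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text
from sklearn.metrics import r2_score

X, y = make_regression(n_samples=2000, n_features=5, noise=10, random_state=3)

# The "unexplainable" black box.
black_box = RandomForestRegressor(n_estimators=200, random_state=3).fit(X, y)
bb_pred = black_box.predict(X)

# The surrogate is trained on the black box's PREDICTIONS, not on y,
# because its job is to explain the model, not the outcome.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=3).fit(X, bb_pred)

# Fidelity: how well does the surrogate mimic the black box?
print("Surrogate fidelity (R^2 vs. black-box predictions):",
      round(r2_score(bb_pred, surrogate.predict(X)), 3))
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(5)]))
```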

How to (and not to) log transform zero

Image courtesy of the authors: Survey results of the papers with log zero in the American Economic Review

Academic's take

Log transformation is widely used in linear models for several reasons: making data "behave" or conform to parametric assumptions, calculating elasticities, and so on. The figure above shows that nearly 40% of the empirical papers in a selected journal used a log specification, and 36% had the problem of the log of zero. When an outcome variable naturally has zeros, however, log transformation is tricky. In most cases, the instinctive solution is to add a positive constant to each value of the outcome variable. One popular idea is to add 1 so that raw zeros remain zeros after the log transformation. Another is to add a very small constant, especially if the scale of the variable is small. Well, the bad news is that these are all arbitrary choices that bias the resulting estimates. If a model is correlational, a small bias due to the transformation may not be a big con...
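A small simulation makes the arbitrariness concrete. This is a hedged sketch with simulated data, not the paper's analysis: the log(y + c) estimate moves with the choice of c, while a Poisson (pseudo-maximum-likelihood) regression, a commonly recommended alternative, handles the zeros without any added constant.

```python
# Sketch: the log(y + c) estimate depends on the arbitrary constant c,
# while Poisson regression accommodates zeros directly. Simulated data only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 20000
x = rng.normal(0, 1, n)
y = rng.poisson(np.exp(0.5 + 1.0 * x))   # coefficient on x in log E[y|x] is 1.0
X = sm.add_constant(x)

# log(y + c) for a few arbitrary constants: the estimate shifts with c.
for c in (1.0, 0.1, 0.01):
    b = sm.OLS(np.log(y + c), X).fit().params[1]
    print(f"log(y + {c}) estimate: {b:.2f}")

# Poisson regression (pseudo-maximum likelihood): no added constant needed.
b_poisson = sm.GLM(y, X, family=sm.families.Poisson()).fit().params[1]
print(f"Poisson estimate: {b_poisson:.2f}")
```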