Thursday, November 7, 2019
Defining Geographic Markets from Probabilistic Clusters: A Machine Learning Algorithm Applied to Supermarket Scanner Data
Stephen Bruestle, Federal Maritime Commission, Luca Pappalardo, Institute of Information Science and Technologies (ISTI), Consiglio Nazionale delle Ricerche (CNR), and Riccardo Guidotti, Institute of Information Science and Technologies (ISTI), Consiglio Nazionale delle Ricerche (CNR) are Defining Geographic Markets from Probabilistic Clusters: A Machine Learning Algorithm Applied to Supermarket Scanner Data.
ABSTRACT: We propose that we estimate geographic markets in two steps. First, estimate clusters of transactions interchangeable in use. Second, estimate markets from these clusters. We argue that these clusters are subsets of markets. We draw on both antitrust cases and economic intuition. We model and estimate these clusters using techniques from machine learning and data science. We model these clusters using Blei et al.’s (2003) Latent Dirichlet Allocation (LDA) model. And, we estimate this model using Griffiths and Steyvers’s (2004) Gibbs Sampling algorithm (Gibbs LDA). We apply these ideas to a real-world example. We use transaction-level scanner data from the largest supermarket franchise in Italy. We find fourteen clusters. We present strong evidence that LDA fits the data. This shows that these interchangeability clusters exist in the marketplace. Then, we compare Gibbs LDA clusters with clusters from the Elzinga-Hogarty (E-H) test. We find similar clusters. LDA has a few identifiable parameters. The E-H test has too many parameters for identification. Also, Gibbs LDA avoids the silent majority fallacy of the E-H test. Then, we estimate markets from the Gibbs LDA clusters. We use consumption overlap and price stationarity tests on the clusters. We find four grocery markets in Tuscany.
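For readers unfamiliar with Gibbs LDA, the following is a minimal sketch of a collapsed Gibbs sampler of the kind Griffiths and Steyvers describe. It is not the authors' implementation. It assumes, purely for illustration, that each shopper plays the role of a "document" and each store visit the role of a "word", so that the sampled "topics" correspond to probabilistic clusters of stores. The function name `gibbs_lda`, the hyperparameter values, and the toy visit data are all hypothetical.

```python
import random


def gibbs_lda(docs, n_topics, n_words, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA on integer-coded documents.

    docs: list of lists; docs[d][i] is the id (0..n_words-1) of the i-th item.
    Returns (phi, theta): per-cluster item distributions and
    per-document cluster mixtures, as smoothed point estimates.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]              # n_dk counts
    topic_word = [[0] * n_words for _ in range(n_topics)]   # n_kw counts
    topic_total = [0] * n_topics                            # n_k counts
    assign = []

    # Random initial cluster assignment for every observation.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Full conditional: p(z = k | everything else), up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                # Record the new assignment.
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + n_words * beta)
         for w in range(n_words)]
        for k in range(n_topics)
    ]
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return phi, theta


# Invented toy data: each inner list is one shopper's sequence of store ids.
# Stores 0-2 and stores 3-5 are meant to look like two separate areas.
docs = [
    [0, 1, 2, 0, 1],
    [1, 2, 0, 2],
    [3, 4, 5, 3],
    [4, 5, 3, 5, 4],
    [0, 1, 3],
]
phi, theta = gibbs_lda(docs, n_topics=2, n_words=6)
for k, row in enumerate(phi):
    print("cluster", k, [round(p, 2) for p in row])
```

Each row of `phi` is a probability distribution over stores for one cluster, and each row of `theta` gives one shopper's mix across clusters. The number of clusters is fixed in advance here; how the paper arrives at fourteen is not covered by this sketch.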