"Data monkeys": A procedural model of extrapolation from partial statistics

Research output: Contribution to journal › Article › peer-review

Abstract

I present a behavioural model of a "data analyst" who extrapolates a fully specified probability distribution over observable variables from a collection of statistical data sets that cover partially overlapping sets of variables. The analyst employs an iterative extrapolation procedure, whose individual rounds are akin to the stochastic regression method of imputing missing data. Users of the procedure's output fail to distinguish between raw and imputed data, and it functions as their practical belief. I characterize the ways in which this belief distorts the correlation structure of the underlying data-generating process, focusing on cases in which the distortion can be described as the imposition of a causal model (represented by a directed acyclic graph over observable variables) on the true distribution.
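
For readers unfamiliar with the stochastic regression method mentioned above, the following is a minimal generic sketch of one imputation round, not the paper's procedure. It assumes a single predictor, a linear fit, and Gaussian noise; the names `fit_line` and `stochastic_regression_impute` are hypothetical and chosen for illustration only. How the paper iterates such rounds across partially overlapping data sets is not reproduced here.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sd = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def stochastic_regression_impute(rows, rng):
    """Fill missing y values (None) with the fitted value plus random noise.

    rows: list of (x, y) pairs, where y may be None.
    rng:  a random.Random instance (assumption: Gaussian noise).
    """
    observed = [(x, y) for x, y in rows if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [
        (x, y if y is not None else a + b * x + rng.gauss(0.0, sd))
        for x, y in rows
    ]


# Illustrative data only: three observed pairs and two missing y values.
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, None), (5.0, None)]
completed = stochastic_regression_impute(rows, random.Random(0))
```

After this step the completed rows carry no marker of which values were observed and which were drawn, which mirrors the abstract's point that users of the output do not distinguish raw from imputed data.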

Original language: English
Pages (from-to): 1818-1841
Number of pages: 24
Journal: Review of Economic Studies
Volume: 84
Issue number: 4
State: Published - 1 Oct 2017

Keywords

  • Bayesian networks
  • Belief extrapolation
  • Bounded rationality
  • Data analysts
  • Imputation
  • Maximum entropy
  • Missing data
  • Non-rational expectations
  • Running intersection property
