"Data monkeys": A procedural model of extrapolation from partial statistics

Ran Spiegler*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

13 Scopus citations

Abstract

I present a behavioural model of a "data analyst" who extrapolates a fully specified probability distribution over observable variables from a collection of statistical data sets that cover partially overlapping sets of variables. The analyst employs an iterative extrapolation procedure, whose individual rounds are akin to the stochastic regression method of imputing missing data. Users of the procedure's output fail to distinguish between raw and imputed data, and it functions as their practical belief. I characterize the ways in which this belief distorts the correlation structure of the underlying data-generating process, focusing on cases in which the distortion can be described as the imposition of a causal model (represented by a directed acyclic graph over observable variables) on the true distribution.
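As a rough illustration of the kind of distortion the abstract describes, the sketch below works through a single imputation round in the large-sample limit. It is not the paper's procedure: the three binary variables `x`, `y`, `z`, the example distribution, and the helper names `marginal`, `extrapolate` and `prob_x_equals_z` are all assumptions made for this example. One data set covers `(x, y)` and the other covers `(y, z)`; the missing `z` is filled in by drawing from its conditional distribution given `y`, so the resulting belief is p(x, y) p(z | y).

```python
from collections import defaultdict
from itertools import product


def marginal(joint, idx):
    """Marginalize a joint distribution {outcome tuple: prob} onto coordinates idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return dict(out)


def extrapolate(p_xy, p_yz):
    """Large-sample limit of one imputation round (illustrative assumption):
    keep (x, y) as observed and draw the missing z from the conditional
    distribution of z given y estimated on the other data set."""
    p_y = marginal(p_yz, (0,))
    belief = {}
    for (x, y), pxy in p_xy.items():
        for (y2, z), pyz in p_yz.items():
            if y2 == y:
                belief[(x, y, z)] = pxy * pyz / p_y[(y,)]
    return belief


def prob_x_equals_z(joint):
    """Probability that x and z coincide under a joint over (x, y, z)."""
    return sum(p for (x, _, z), p in joint.items() if x == z)


# Hypothetical true process: x is a fair coin, z always equals x,
# and y is an independent fair coin.
true_joint = {(x, y, x): 0.25 for x, y in product((0, 1), repeat=2)}

p_xy = marginal(true_joint, (0, 1))  # data set covering (x, y)
p_yz = marginal(true_joint, (1, 2))  # data set covering (y, z)
belief = extrapolate(p_xy, p_yz)

print(prob_x_equals_z(true_joint))  # 1.0
print(prob_x_equals_z(belief))      # 0.5
```

In this toy case the extrapolated belief reproduces both observed marginals but treats `x` and `z` as independent given `y`, so the perfect correlation between them in the true process disappears.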

Original language: English
Pages (from-to): 1818-1841
Number of pages: 24
Journal: Review of Economic Studies
Volume: 84
Issue number: 4
State: Published - 1 Oct 2017

Funding

Funders (funder number, where given):
Limited Feedback Foundation of Boundedly Rational Expectations
Horizon 2020 Framework Programme: 692995
Economic and Social Research Council: ES/L003031/1
European Research Council

Keywords

• Bayesian networks
• Belief extrapolation
• Bounded rationality
• Data analysts
• Imputation
• Maximum entropy
• Missing data
• Non-rational expectations
• Running intersection property
