## Abstract

I present a behavioural model of a "data analyst" who extrapolates a fully specified probability distribution over observable variables from a collection of statistical data sets that cover partially overlapping sets of variables. The analyst employs an iterative extrapolation procedure, whose individual rounds are akin to the stochastic regression method of imputing missing data. Users of the procedure's output fail to distinguish between raw and imputed data, and it functions as their practical belief. I characterize the ways in which this belief distorts the correlation structure of the underlying data generating process, focusing on cases in which the distortion can be described as the imposition of a causal model (represented by a directed acyclic graph over observable variables) on the true distribution.
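For readers unfamiliar with the stochastic regression method mentioned above, the following is a minimal textbook-style sketch of it for a single pair of variables. It is not the paper's extrapolation procedure: the function name `stochastic_regression_impute`, the toy data, and the choice of normally distributed noise are all assumptions made for illustration.

```python
import random
import statistics


def stochastic_regression_impute(xs, ys, rng):
    """Fill None entries of ys from a linear regression on xs plus random noise.

    Hypothetical helper for illustration; assumes at least three complete
    (x, y) pairs and some variation in x among them.
    """
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    mean_x = statistics.fmean(x for x, _ in complete)
    mean_y = statistics.fmean(y for _, y in complete)

    # Ordinary least squares fit on the complete cases only.
    sxx = sum((x - mean_x) ** 2 for x, _ in complete)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in complete)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual standard deviation (n - 2 degrees of freedom).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in complete)
    resid_sd = (sse / (len(complete) - 2)) ** 0.5

    # Missing values get the fitted value plus a random draw (assumed normal),
    # so imputed points keep the scatter of the data instead of sitting
    # exactly on the regression line.
    return [
        y if y is not None else intercept + slope * x + rng.gauss(0.0, resid_sd)
        for x, y in zip(xs, ys)
    ]


# Toy data: y is unobserved for two of the six records.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, None, 8.2, None, 12.1]
filled = stochastic_regression_impute(xs, ys, random.Random(0))
```

After this step the imputed entries of `filled` are indistinguishable in form from the observed ones, which is the feature the abstract highlights: a user who treats `filled` as raw data inherits whatever structure the regression imposed.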

Original language | English |
---|---|
Pages (from-to) | 1818-1841 |
Number of pages | 24 |
Journal | Review of Economic Studies |
Volume | 84 |
Issue number | 4 |
State | Published - 1 Oct 2017 |

## Keywords

- Bayesian networks
- Belief extrapolation
- Bounded rationality
- Data analysts
- Imputation
- Maximum entropy
- Missing data
- Non-rational expectations
- Running intersection property