## Abstract

Assume we are asked to predict a real-valued variable $y_t$ based on certain characteristics $x_t = (x_t^1, \ldots, x_t^d)$, and on a database consisting of $(x_i^1, \ldots, x_i^d, y_i)$ for $i = 1, \ldots, n$. Analogical reasoning suggests combining past observations of $x$ and $y$ with the current values of $x$ to generate an assessment of $y$ by similarity-weighted averaging. Specifically, the predicted value of $y$, $y_t^s$, is the weighted average of all previously observed values $y_i$, where the weight of $y_i$, for every $i = 1, \ldots, n$, is the similarity between the vector $(x_t^1, \ldots, x_t^d)$, associated with $y_t$, and the previously observed vector $(x_i^1, \ldots, x_i^d)$. The "empirical similarity" approach suggests estimating the similarity function from past data. We discuss this approach as a statistical method of prediction, study its relationship to the statistical literature, and extend it to the estimation of probabilities and of density functions.
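As an illustration only, the similarity-weighted average described above can be sketched in a few lines of Python. The abstract does not specify the form of the similarity function or the estimation criterion, so both are assumptions here: `similarity` uses an exponential of a negative weighted squared distance, and `fit_weight_grid` picks a single common weight from a grid by minimizing the squared error of predicting each observation from the ones before it. All function names are hypothetical.

```python
import math


def similarity(x, z, weights):
    """Assumed similarity: exp(-sum_j w_j * (x_j - z_j)^2). Not taken from the abstract."""
    dist2 = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z))
    return math.exp(-dist2)


def predict(x_t, X, y, weights):
    """Similarity-weighted average of past y_i, weighted by s(x_i, x_t)."""
    sims = [similarity(x_i, x_t, weights) for x_i in X]
    total = sum(sims)
    if total == 0.0:
        raise ValueError("all similarities underflowed to zero")
    return sum(s * y_i for s, y_i in zip(sims, y)) / total


def sequential_sse(X, y, weights):
    """Squared error when each y[t] is predicted from observations before t (assumed criterion)."""
    sse = 0.0
    for t in range(1, len(y)):
        sse += (y[t] - predict(X[t], X[:t], y[:t], weights)) ** 2
    return sse


def fit_weight_grid(X, y, grid):
    """Pick one common weight from `grid` that minimizes the sequential squared error."""
    d = len(X[0])
    return min(grid, key=lambda w: sequential_sse(X, y, [w] * d))


if __name__ == "__main__":
    # Toy database: d = 1 characteristic, n = 4 past observations (made-up numbers).
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0.1, 1.2, 1.9, 3.1]
    w = fit_weight_grid(X, y, grid=[0.1, 0.5, 1.0, 2.0, 5.0])
    print("chosen weight:", w)
    print("prediction at x = 1.5:", predict([1.5], X, y, [w]))
```

Because the weights are normalized by their sum, the prediction always lies between the smallest and largest observed $y_i$, which is the sense in which it is an average rather than an extrapolation.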

Original language | English |
---|---|
Pages (from-to) | 124-131 |
Number of pages | 8 |
Journal | Journal of Econometrics |
Volume | 162 |
Issue number | 1 |
State | Published - May 2011 |

## Keywords

- Density estimation
- Empirical similarity
- Kernel
- Spatial models