The median hypothesis

Ran Gilad-Bachrach, Chris J.C. Burges

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review


The classification task uses observations and prior knowledge to select a hypothesis that will predict class assignments well. In this work we ask the question: what is the best hypothesis to select from a given hypothesis class? To address this question we adopt a PAC-Bayesian approach. According to this viewpoint, the observations and prior knowledge are combined to form a belief probability over the hypothesis class. Therefore, we focus on the next part of the learning process, in which one has to choose the hypothesis to be used given the belief. We call this problem the hypothesis selection problem. Based on recent findings in PAC-Bayesian analysis, we suggest that a good hypothesis has to be close to the Bayesian optimal hypothesis. We define a measure of “depth” for hypotheses to measure their proximity to the Bayesian optimal hypothesis and we show that deeper hypotheses have stronger generalization bounds. Therefore, we propose algorithms to find the deepest hypothesis. Following the definitions of depth in multivariate statistics, we refer to the deepest hypothesis as the median hypothesis. We show that similarly to the univariate and multivariate medians, the median hypothesis has good stability properties in terms of the breakdown point. Moreover, we show that the Tukey median is a special case of the median hypothesis. Therefore, the algorithms proposed here also provide a polynomial time approximation for the Tukey median. This algorithm makes the mildest assumptions compared to other efficient approximation algorithms for the Tukey median.
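The notion of depth described in the abstract can be illustrated with a small sketch. The abstract gives no formula, so the code below assumes one plausible empirical reading: the depth of a hypothesis is the minimum, over a sample of points, of the fraction of hypotheses drawn from the belief that agree with it on that point, and the approximate median is the sampled hypothesis of greatest depth. The names `predict`, `empirical_depth`, and `approximate_median`, the linear classifiers, and the Gaussian belief are all hypothetical choices made for illustration and are not taken from the chapter.

```python
import random


def predict(w, x):
    """Label of point x under the linear classifier w (sign of the dot product)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


def empirical_depth(f, hypotheses, points):
    """Assumed empirical depth: the minimum, over the sampled points, of the
    fraction of sampled hypotheses that agree with f on that point."""
    depth = 1.0
    for x in points:
        label = predict(f, x)
        agree = sum(predict(g, x) == label for g in hypotheses) / len(hypotheses)
        depth = min(depth, agree)
    return depth


def approximate_median(hypotheses, points):
    """Return the sampled hypothesis with the largest empirical depth."""
    return max(hypotheses, key=lambda f: empirical_depth(f, hypotheses, points))


# Illustrative data only: a Gaussian "belief" over 2-D linear classifiers
# and uniformly sampled points. Both distributions are assumptions.
rng = random.Random(0)
hypotheses = [(rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)) for _ in range(50)]
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]

median = approximate_median(hypotheses, points)
median_depth = empirical_depth(median, hypotheses, points)
```

A deep hypothesis in this sense rarely disagrees with the majority vote of the belief on any sampled point, which is the intuition behind calling it close to the Bayesian optimal hypothesis. The exhaustive search over sampled hypotheses is used here only for clarity; it costs time proportional to the number of points times the square of the number of hypotheses.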

Original language: English
Title of host publication: Empirical Inference
Subtitle of host publication: Festschrift in Honor of Vladimir N. Vapnik
Publisher: Springer Berlin Heidelberg
Number of pages: 15
ISBN (Electronic): 9783642411366
ISBN (Print): 9783642411359
State: Published - 1 Jan 2013
Externally published: Yes

