The Yahoo! Music Dataset and KDD-Cup’11

Gideon Dror, Noam Koenigstein, Yehuda Koren, Markus Weimer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


KDD-Cup 2011 challenged the community to identify user tastes in music by leveraging Yahoo! Music user ratings. The competition hosted two tracks, which were based on two datasets sampled from the raw data, including hundreds of millions of ratings. The underlying ratings were given to four types of musical items: tracks, albums, artists, and genres, forming a four-level hierarchical taxonomy. The challenge started on March 15, 2011 and ended on June 30, 2011, attracting 2389 participants, 2100 of whom were active by the end of the competition. The popularity of the challenge is related to the fact that learning large-scale recommender systems is a generic problem, highly relevant to the industry. In addition, the contest drew interest by introducing a number of scientific and technical challenges, including dataset size, hierarchical structure of items, high-resolution timestamps of ratings, and a non-conventional ranking-based task. This paper provides the organizers’ account of the contest, including a detailed analysis of the datasets, a discussion of the contest goals and actual conduct, and lessons learned throughout the contest.
Original language: English
Title of host publication: KDDCUP'11
Subtitle of host publication: Proceedings of the 2011 International Conference on KDD Cup 2011
Editors: Gideon Dror, Yehuda Koren, Markus Weimer
Number of pages: 16
State: Published - 1 Sep 2012

Publication series

Name: Proceedings of Machine Learning Research

