The 2021 RecSys Challenge Dataset: Fairness is not optional

Luca Belli, Alykhan Tejani, Frank Portman, Alexandre Lung-Yut-Fong, Ben Chamberlain, Yuanpu Xie, Jonathan Hunt, Michael Bronstein, Vito Walter Anelli, Saikishore Kalloori, Bruce Ferwerda, Wenzhe Shi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Following the success of the RecSys 2020 Challenge, we describe a new, larger dataset released in conjunction with the ACM RecSys Challenge 2021. This year's dataset is not only bigger (1B data points, a 5-fold increase), but for the first time it also takes fairness aspects of the challenge into consideration. Unlike many static datasets, considerable effort went into keeping the dataset in sync with the Twitter platform: if a user deleted their content, the same content was promptly removed from the dataset too. In this paper, we introduce the dataset and challenge, highlighting some of the issues that arise when creating recommender systems at Twitter scale.

Original language: English
Title of host publication: Proceedings of Workshop on the RecSys Challenge 2021, RecSysChallenge 2021
Publisher: Association for Computing Machinery
Number of pages: 6
ISBN (Electronic): 9781450386937
State: Published - 1 Oct 2021
Externally published: Yes
Event: 15th ACM Recommender Systems Challenge Workshop, RecSysChallenge 2021 - Amsterdam, Netherlands
Duration: 1 Oct 2021 → …

Publication series

Name: ACM International Conference Proceeding Series




  • engagement prediction
  • fairness challenge
  • large-scale dataset
  • personalization
  • recommender system
  • twitter

