Tuning Word2vec for Large Scale Recommendation Systems

Benjamin P. Chamberlain, Emanuele Rossi, Dan Shiebler, Suvash Sedhain, Michael M. Bronstein

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP) and is now applied in multiple domains, including recommender systems, forecasting, and network analysis. As Word2vec is often used off the shelf, we address the question of whether the default hyperparameters are suitable for recommender systems. The answer is emphatically no. In this paper, we first elucidate the importance of hyperparameter optimization and show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters. However, unconstrained optimization leads to hyperparameter settings that are very expensive and not feasible for large scale recommendation tasks. To address this, we demonstrate a 138% average improvement in hit rate with a runtime budget-constrained hyperparameter optimization. Furthermore, to make hyperparameter optimization applicable to large scale recommendation problems where the target dataset is too large to search over, we investigate generalizing hyperparameter settings from samples. We show that applying constrained hyperparameter optimization using only a 10% sample of the data still yields a 91% average improvement in hit rate over the default parameters when applied to the full datasets. Finally, we apply hyperparameters learned using our method of constrained optimization on a sample to the Who To Follow recommendation service at Twitter and are able to increase follow rates by 15%.
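
The workflow the abstract describes (score candidate settings by hit rate, discard any that exceed a runtime budget) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which optimizer was used, so plain random search is swapped in here; the search-space values are hypothetical; and the hyperparameter names simply mirror the arguments of gensim's `Word2Vec` class. The `train_and_score` callable is a hypothetical placeholder for "train embeddings on the 10% sample and return hit rate".

```python
import random
import time
from typing import Callable, Dict, Optional, Sequence, Tuple


def hit_rate_at_k(recommendations: Sequence[Sequence[str]],
                  held_out: Sequence[str], k: int = 10) -> float:
    """Fraction of cases whose held-out item appears in the top-k list."""
    if not held_out:
        return 0.0
    hits = sum(1 for recs, target in zip(recommendations, held_out)
               if target in recs[:k])
    return hits / len(held_out)


def relative_improvement(tuned: float, default: float) -> float:
    """Percentage improvement of the tuned score over the default score."""
    return 100.0 * (tuned - default) / default


# Hypothetical search space; names follow gensim's Word2Vec arguments,
# values are illustrative and not taken from the paper.
SEARCH_SPACE = {
    "vector_size": [32, 64, 128, 256],
    "window": [1, 3, 5, 10, 20],
    "negative": [1, 5, 10, 20],
    "epochs": [1, 5, 10, 30],
    "ns_exponent": [-1.0, -0.5, 0.0, 0.5, 0.75, 1.0],
}


def constrained_random_search(
    train_and_score: Callable[[Dict[str, float]], float],
    n_trials: int,
    budget_seconds: float,
    seed: int = 0,
) -> Tuple[Optional[Dict[str, float]], float]:
    """Random search that ignores any trial slower than budget_seconds."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values)
                  for name, values in SEARCH_SPACE.items()}
        start = time.perf_counter()
        score = train_and_score(params)  # assumed: train on sample, return hit rate
        elapsed = time.perf_counter() - start
        if elapsed > budget_seconds:
            continue  # over budget: not a feasible setting, discard it
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In use, `train_and_score` would train on the sampled data and call `hit_rate_at_k` on a validation split; the best feasible setting would then be retrained on the full dataset, and `relative_improvement` compares its hit rate against the default configuration.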

Original language: English
Title of host publication: RecSys 2020 - 14th ACM Conference on Recommender Systems
Publisher: Association for Computing Machinery, Inc
Pages: 732-737
Number of pages: 6
ISBN (Electronic): 9781450375832
State: Published - 22 Sep 2020
Externally published: Yes
Event: 14th ACM Conference on Recommender Systems, RecSys 2020 - Virtual, Online, Brazil
Duration: 22 Sep 2020 to 26 Sep 2020

Publication series

Name: RecSys 2020 - 14th ACM Conference on Recommender Systems

Conference

Conference: 14th ACM Conference on Recommender Systems, RecSys 2020
Country/Territory: Brazil
City: Virtual, Online
Period: 22/09/20 to 26/09/20

Keywords

  • Embeddings
  • Hyperparameter Optimization
  • Neural Networks
  • Recommender System Evaluation
