Is It Out Yet? Automatic Future Product Releases Extraction from Web Data

Gilad Fuchs, Ido Ben-Shaul, Matan Mandelbrod

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Identifying the release of new products and their predicted demand in advance is highly valuable for E-Commerce marketplaces and retailers. Information about an upcoming product release is used for inventory management, marketing campaigns and pre-order suggestions. The announcement of an upcoming product release is often widely available on multiple web pages such as blogs, chats or news articles. However, to the best of our knowledge, an automatic system to extract future product releases from web data has not been presented. In this work we describe an ML-powered multistage pipeline that automatically identifies future product releases and ranks their predicted demand from unstructured pages across the whole web. Our pipeline includes a novel Longformer-based model which uses a global attention mechanism guided by pre-calculated Named Entity Recognition predictions related to product releases. The model training data is based on a new corpus of 30K web pages manually annotated to identify future product releases. We made the dataset openly available at https://doi.org/10.5281/zenodo.6894770.
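To make the idea of NER-guided global attention more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that NER predictions arrive as token-index spans and that the model expects a 0/1 mask in which 1 marks a token that attends globally (the convention used by common Longformer implementations). The function name `build_global_attention_mask`, the span format and the choice to always mark position 0 are all hypothetical and chosen for illustration only; the abstract does not specify these details.

```python
from typing import List, Tuple


def build_global_attention_mask(
    seq_len: int,
    entity_spans: List[Tuple[int, int]],
    always_global: Tuple[int, ...] = (0,),
) -> List[int]:
    """Return a 0/1 mask: 1 = global attention, 0 = local (windowed) attention.

    entity_spans holds hypothetical NER predictions as half-open
    (start, end) token-index ranges. Position 0 (typically the
    classification token) is marked global by default; this is an
    assumption, not something stated in the abstract.
    """
    mask = [0] * seq_len

    # Tokens that should always attend globally, e.g. the first token.
    for pos in always_global:
        if 0 <= pos < seq_len:
            mask[pos] = 1

    # Tokens covered by an NER prediction get global attention.
    # Spans are clipped to the sequence boundaries.
    for start, end in entity_spans:
        for i in range(max(start, 0), min(end, seq_len)):
            mask[i] = 1

    return mask


# Example: a 10-token page with two predicted entity spans,
# the second of which runs past the end of the sequence.
example_mask = build_global_attention_mask(10, [(3, 5), (8, 12)])
print(example_mask)
```

In a real pipeline a mask of this shape would be converted to a tensor and passed to the model alongside the token IDs; how the paper selects and weights entity types is not described in the abstract.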

Original language: English
Title of host publication: EMNLP 2022 - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Subtitle of host publication: Industry Track
Publisher: Association for Computational Linguistics (ACL)
Pages: 273-281
Number of pages: 9
ISBN (Electronic): 9781952148255
State: Published - 2022
Externally published: Yes
Event: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 to 11 Dec 2022

Publication series

Name: EMNLP 2022 - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

Conference

Conference: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 7/12/22 to 11/12/22
