Provenance for natural language queries

Daniel Deutch*, Nave Frost, Amir Gilad

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

22 Scopus citations

Abstract

Multiple lines of research have developed Natural Language (NL) interfaces for formulating database queries. We build upon this work, but focus on presenting a highly detailed form of the answers in NL. Importantly, the answers that we present are based on the provenance of tuples in the query result, detailing not only the results but also their explanations. We develop a novel method for transforming provenance information to NL by leveraging the structure of the original NL query. Furthermore, since provenance information is typically large and complex, we present two solutions for its effective presentation as NL text: one based on provenance factorization, with novel desiderata relevant to the NL case, and one based on summarization. We have implemented our solution in an end-to-end system supporting questions, answers and provenance, all expressed in NL. Our experiments, including a user study, indicate the quality of our solution and its scalability.

Original language: English
Pages (from-to): 577-588
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 10
Issue number: 5
State: Published - 2016
Event: 43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, Germany
Duration: 28 Aug 2017 to 1 Sep 2017
