Dealing with the deep Web and all its quirks

Meghyn Bienvenu, Daniel Deutch, Davide Martinenghi, Pierre Senellart, Fabian Suchanek

Research output: Contribution to journal › Conference article › peer-review


Several approaches harvest, query, or combine Deep Web sources. Yet, in addition to well-studied aspects of the problem such as query answering using views, access limitations, or top-k querying, the Deep Web exhibits a number of peculiarities that are often neglected. First, the services usually deliver not all results, but only the top-n results according to some ranking function. This function may not be compatible with the ordering specified in a user's query. Subsequent results have to be obtained by paging, or may not even be accessible. Second, the services may deliver results in a granularity that is incompatible with the query or joinable services (e.g., months vs. exact dates). Moreover, the services may perform selections or ranking over attributes that are not exposed in the results: this poses an incompleteness problem. Additional challenges come from uncertainty, recency constraints, and inter-service dependencies. In this article, we shed light on these peculiarities, and compile a list of desiderata of a query answering system for the Deep Web.

Original language: English
Pages (from-to): 21-24
Number of pages: 4
Journal: CEUR Workshop Proceedings
State: Published - 2012
Externally published: Yes
Event: 2nd International Workshop on Searching and Integrating New Web Data Sources: Very Large Data Search, VLDS 2012 - Istanbul, Turkey
Duration: 31 Aug 2012 → 31 Aug 2012

