-----------------------------------------------------------------------------
                      Query Optimization in the Deep Web
-----------------------------------------------------------------------------
                Andrea Calì                Davide Martinenghi
             Brunel University           Politecnico di Milano
-----------------------------------------------------------------------------
                                   Abstract

The term Deep Web refers to the data content that is created dynamically as
the result of a specific search on the Web. In this respect, such content
resides outside web pages, and is only accessible through interaction with the
web site – typically via HTML forms. It is believed that the size of the Deep
Web is several orders of magnitude larger than that of the so-called Surface
Web, i.e., the web that is accessible and indexable by search engines.
Usually, data sources accessible through web forms are modeled by relations
that require certain fields to be selected – i.e., some fields in the form
need to be filled in. These requirements are commonly referred to as access
limitations in that access to data can only take place according to given
patterns. Besides data accessible through web forms, access limitations may
also occur i) in legacy systems where data scattered over several files are
wrapped as relational tables, and ii) in the context of Web services, where
similar restrictions arise from the distinction between input parameters and
output parameters. In such contexts, computing the answer to a user query
cannot be done as in a traditional database; instead, a query plan is needed
that provides the best answer possible while complying with the access
limitations. In these talks, we illustrate the semantics of answers to queries
over data sources under access limitations and present techniques for query
answering in this context. We show different techniques to optimize query
answering both at the time of the query plan generation and at the time of the
execution of the query plan. We analyze how integrity constraints on the
sources, of the kind usually found in database schemata, influence query
answering. We present prototype systems aimed at querying the Deep Web, and
show their achievements.
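
As a rough illustration of what answering a query under access limitations
involves, the sketch below simulates two form-like sources and a naive plan
that starts from the constants in the query and keeps submitting every known
value until nothing new is returned. The source names (r1, r2), their
contents, the single shared value domain, and the helpers access and extract
are all hypothetical and chosen only for this example; this is not the
speakers' algorithm, and it ignores the optimizations the talks discuss.

```python
from itertools import product

# Hypothetical sources. In "pattern", 'i' marks an input field that must be
# filled in (like a mandatory form field) and 'o' marks an output field.
# Assumption: all fields share one abstract domain of values.
SOURCES = {
    "r1": {"pattern": "io", "tuples": {("a", "b"), ("b", "c"), ("x", "y")}},
    "r2": {"pattern": "io", "tuples": {("c", "d"), ("y", "z")}},
}


def access(name, binding):
    """Simulate one form submission: binding gives a value per input field."""
    src = SOURCES[name]
    inputs = [k for k, mode in enumerate(src["pattern"]) if mode == "i"]
    return {
        t for t in src["tuples"]
        if all(t[k] == v for k, v in zip(inputs, binding))
    }


def extract(constants):
    """Naive fixpoint: access every source with every known value."""
    known = set(constants)                     # values usable as inputs
    cache = {name: set() for name in SOURCES}  # tuples retrieved so far
    tried = set()                              # accesses already made
    progress = True
    while progress:
        progress = False
        for name, src in SOURCES.items():
            n_inputs = src["pattern"].count("i")
            # sorted() copies the set, so updating it in the loop is safe
            for binding in product(sorted(known), repeat=n_inputs):
                if (name, binding) in tried:
                    continue
                tried.add((name, binding))
                progress = True
                for t in access(name, binding):
                    cache[name].add(t)
                    known.update(t)            # outputs become new inputs
    return cache


reachable = extract({"a"})
```

Starting from the constant a, this plan retrieves (a, b) and (b, c) from r1
and (c, d) from r2, while (x, y) and (y, z) are never obtained because no
known value leads to them. The query is then evaluated over the retrieved
tuples only, which is the sense in which the answer is the best one possible
under the access limitations.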
-----------------------------------------------------------------------------
Thursday, June 10, 2010, 11:30, Room N7
Faculty of Engineering - Università Roma Tre
Via Vasca Navale, 79 00146 Roma
How to get there:
http://atzeni.dia.uniroma3.it/accesso/index.html
-----------------------------------------------------------------------------