Database Group

Dipartimento di Informatica e Automazione
Università Roma Tre
Via della Vasca Navale 79 
00146 Roma, Italy

Seminars in June 2006

Lucian Popa (IBM Almaden Research Center)

Schema Mappings: From Flat to Nested
Tuesday, June 13, 10:30, room N3 (ground floor)

Abstract

Many problems in information integration rely on specifications, called schema mappings, that model the relationships between schemas. Schema mappings are convenient abstractions for the runtime transformation of data and for rewriting of queries from one schema to another. Being able to automatically generate such mappings and then to automatically generate the necessary runtime artifacts based on mappings is a major step towards making heterogeneous information more readily accessible to human users and applications.

In this talk I will give an overview of the Clio project at IBM Almaden. I will discuss how schema mappings arise and are used in Clio, and I will highlight the research problems that we are addressing. I will then describe the two major formalisms for schema mappings that we have considered at the core of Clio: first, the flat mappings, which are source-to-target constraints (or GLAV assertions, commonly used in data integration and exchange), and more recently, the nested mappings, which are an extension that allows (sub)mappings to be nested in the context of other mappings. I will survey our mapping generation algorithms under both formalisms and show significant advantages of nested mappings in terms of specification power and flexibility.

Wang-Chiew Tan (University of California, Santa Cruz)

Debugging Schema Mappings
Tuesday, June 13, 11:15, room N3 (ground floor)

Abstract

A schema mapping describes how data structured under a source schema, called the source instance, is to be exchanged into data structured under a target schema, called the target instance. Schema mappings can be manually specified or (semi-)automatically generated with the help of schema matching tools. In either case, a schema mapping often needs to be further refined or debugged by a user before it accurately reflects the user's intention.

In this talk, I will present a facility for debugging schema mappings with routes. A route describes the relationship between data in the source and target instances according to the schema mapping. I will present two algorithms; the first algorithm computes all routes for selected target data and the second algorithm computes one route for selected target data. In computing all routes, our algorithm produces a concise representation that factors common steps in the routes. Furthermore, the representation is complete in that every minimal route for the selected data is, essentially, embedded in the representation. Our second algorithm is able to produce one route fast, if there is one, and alternative routes, if needed. This algorithm is complete in that it will produce a route for the selected data, if there is one. We demonstrate the feasibility of our route algorithms through a set of experimental results on both real and synthetic datasets. I will also describe my vision of a prototype tool for debugging schema mappings.

(Joint work with Laura Chiticariu)

For further information please contact: Paolo Atzeni

Previous seminars hosted by the database group