Technical Report CSRG-609 (MSc), University of Toronto, 2008.
This thesis develops the Overlap and Difference operators for the relational model in a data exchange setting. A data exchange setting consists of a source schema, a target schema, and a mapping that transforms a source instance into a target instance. We define the overlap and diff settings and provide algorithms that implement the Overlap and Difference operations for relational schemas. An enterprise system that operates on multiple heterogeneous database systems is often required to manage independently created relational data sources in one of three ways: by creating a unified schema of the data (a merged schema); by finding a schema of the elements common to two schemas (an overlap schema); or by generating a schema of the elements in one schema that are not represented in another (a difference schema). These tasks are labour-intensive. Our goal is to simplify the creation of overlap and difference schemas through the Overlap and Difference operators, and thereby to ease the building and maintenance of enterprise solutions over large data sources.
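To make the notions of overlap schema and difference schema concrete, the following minimal sketch treats a relational schema as a set of (relation, attribute) elements and assumes that correspondences between two schemas are already given as pairs of elements. It is an illustration only and not the algorithms of the thesis, which are defined over mappings in a data exchange setting; the names `Schema`, `elements`, `overlap`, `difference`, and the example schemas `hr` and `crm` are all hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: relation name -> set of attribute names.
Schema = Dict[str, Set[str]]
# A schema element is a (relation, attribute) pair.
Element = Tuple[str, str]
# A correspondence pairs an element of schema A with an element of schema B.
Correspondence = Tuple[Element, Element]


def elements(schema: Schema) -> Set[Element]:
    """Flatten a schema into its set of (relation, attribute) elements."""
    return {(rel, attr) for rel, attrs in schema.items() for attr in attrs}


def overlap(a: Schema, correspondences: Set[Correspondence]) -> Set[Element]:
    """Elements of `a` that are matched to some element of the other schema."""
    matched_in_a = {src for src, _ in correspondences}
    return elements(a) & matched_in_a


def difference(a: Schema, correspondences: Set[Correspondence]) -> Set[Element]:
    """Elements of `a` that have no counterpart in the other schema."""
    return elements(a) - overlap(a, correspondences)


# Two independently created schemas (assumed for illustration).
hr: Schema = {"Employee": {"id", "name", "salary"}}
crm: Schema = {"Staff": {"staff_id", "full_name"}}

# Assumed correspondences between the two schemas.
corr: Set[Correspondence] = {
    (("Employee", "id"), ("Staff", "staff_id")),
    (("Employee", "name"), ("Staff", "full_name")),
}

common = overlap(hr, corr)        # elements of hr represented in crm
only_hr = difference(hr, corr)    # elements of hr not represented in crm
```

Under these assumptions, `common` holds the `id` and `name` attributes of `Employee`, and `only_hr` holds `salary`. A set-based view like this ignores constraints and instance-level semantics, which is where a mapping-based treatment becomes necessary.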