SPIDER: Data Quality & Data Cleaning Project

Mohammad Sadoghi, D. Stantic, and N. Koudas.

In IBM CASCON, 2005.

Abstract

Data quality is a serious concern for every organization that relies on data. Data is often of poor quality for many reasons, including spelling mistakes, abbreviations, a lack of standards, and inconsistent notation. SPIDER is a declarative data cleaning tool that incorporates a set of algorithms for improving data quality on any relational data source.
