Mohammad Sadoghi, D. Stantic, and N. Koudas.
In IBM CASCON, 2005.
Data quality is a serious concern in every organization that relies on data. Data quality is often poor for many reasons, including spelling mistakes, abbreviations, lack of standards, and inconsistent notations. SPIDER is a declarative data cleaning tool. It incorporates a set of algorithms that help improve data quality on any relational data source.