"Data mining" consists of a family of techniques for extracting valuable information from an organization's stored or warehoused data. Data mining methods search for patterns and can be compromised if the data contain corrupted values that obscure these patterns. As the saying goes, "Garbage in, garbage out."

GritBot is an automatic tool that tries to find anomalies in data as a precursor to data mining. It can be thought of as an autonomous data quality auditor that hunts for records having "surprising" values of nominal (discrete) and/or numeric (continuous) attributes.

Values need not stand out in the complete dataset -- GritBot searches for subsets of records in which the anomaly is apparent. In one of the sample applications referenced below, GritBot identifies the ages of two women in their seventies as anomalous. Such ages are not surprising in the whole population, but they certainly are in this case because the women are recorded as being pregnant.
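
The idea of a value that is surprising only within a subset of records can be illustrated with a small Python sketch. This is not GritBot's own method, which is not described here. The function name `contextual_outliers`, the median-based score, the threshold of 5, and the sample records are all assumptions made up for illustration.

```python
from statistics import median


def contextual_outliers(records, value_key, group_key=None, threshold=5.0):
    """Flag records whose numeric value lies far from the median of their own subgroup.

    Illustrative only: a simple median/MAD score, not GritBot's algorithm.
    With group_key=None, all records are treated as a single group.
    """
    groups = {}
    for rec in records:
        label = rec[group_key] if group_key is not None else None
        groups.setdefault(label, []).append(rec)

    flagged = []
    for members in groups.values():
        values = [rec[value_key] for rec in members]
        centre = median(values)
        # Median absolute deviation: a spread estimate that a few extreme
        # values barely influence.
        mad = median(abs(v - centre) for v in values)
        if mad == 0:
            continue  # no spread to measure against in this subgroup
        for rec in members:
            if abs(rec[value_key] - centre) / mad > threshold:
                flagged.append(rec)
    return flagged


# Hypothetical sample data, invented for this example.
pregnant_ages = [22, 25, 27, 29, 30, 31, 33, 35, 38, 72, 74]
other_ages = [19, 24, 35, 41, 48, 56, 63, 70, 73, 78, 81]
records = (
    [{"age": a, "pregnant": True} for a in pregnant_ages]
    + [{"age": a, "pregnant": False} for a in other_ages]
)

# Over the whole dataset, ages in the seventies do not stand out.
flagged_overall = contextual_outliers(records, "age")

# Within the subset of pregnant women, the same ages are flagged.
flagged_in_subset = contextual_outliers(records, "age", group_key="pregnant")

print([r["age"] for r in flagged_overall])    # []
print([r["age"] for r in flagged_in_subset])  # [72, 74]
```

In this sketch the subset is fixed in advance by the caller. A tool such as GritBot has the harder task of finding for itself the subsets in which an anomaly becomes apparent.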

