Passionate About Precision In Data And Information

I believe that data quality impacts business performance, customer service and financial results. Errors in data disrupt processes and, in particular, errors in master data and reference data propagate to cause multiple disruptions.

Over time there is a divergence between the content of information systems as originally designed and the way those systems are used in practice. Data elements in the system are stretched and misused to meet practical needs of operations. The resulting lack of clarity about the data in the system causes increasing inefficiency and disruption of business processes.

There are three steps to restoring data quality:
  • make good data definitions - these form the standard to which the data should conform
  • cleanse the data that does not conform
  • make sure data stays clean

Defining The Data

What constitutes "clean data"? Before trying to cleanse the data there has to be a target: a set of criteria that states what the data should be like. This takes the form of a definition for each data element in the system. Such definitions must be precise enough to be applied rigorously when cleansing the data, yet plain enough to be understood by the people who work with the system in day-to-day operations.

Making good data definitions requires input from the people who work with the data and who understand the business process. How do they name and describe the things that they work with? Knowledge of the business and its processes is needed to create the definitions.
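As an illustration only, a data element definition can be written down in a form that people can read and a program can check. The sketch below is a minimal example under stated assumptions: the `DataDefinition` class, the element names and the specific patterns and value lists are hypothetical, not taken from any particular system.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataDefinition:
    """Hypothetical record of one data element definition."""
    name: str
    description: str                      # wording agreed with the people who use the data
    required: bool = True
    pattern: Optional[str] = None         # regular expression the whole value must match
    allowed_values: Optional[frozenset] = None

    def conforms(self, value) -> bool:
        """Return True when the value meets this definition."""
        if value is None or value == "":
            return not self.required
        text = str(value)
        if self.allowed_values is not None and text not in self.allowed_values:
            return False
        if self.pattern is not None and re.fullmatch(self.pattern, text) is None:
            return False
        return True


# Example definitions; names and rules are assumptions for illustration.
DEFINITIONS = {
    "country_code": DataDefinition(
        name="country_code",
        description="Two-letter country code in upper case",
        pattern=r"[A-Z]{2}",
    ),
    "unit_of_measure": DataDefinition(
        name="unit_of_measure",
        description="Unit in which the item is stocked",
        allowed_values=frozenset({"EA", "KG", "L"}),
    ),
    "secondary_phone": DataDefinition(
        name="secondary_phone",
        description="Optional second phone number, digits and spaces",
        required=False,
        pattern=r"\+?[0-9 ]{6,20}",
    ),
}
```

The description field carries the business wording, while the pattern and allowed values make the same definition testable against the data.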

Data Cleansing

Once a set of data element definitions is available as a standard, the operational data is compared with the standard and differences are identified. The changes needed to correct the operational data are then specified.

Most of this work takes place outside the operational information system. Only when proposed changes have been adequately reviewed are they applied as updates to the operational system.
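The cleansing workflow described here can be sketched roughly as follows, working on a copy of the data and applying only reviewed changes. This is a simplified sketch, not a definitive implementation: the function names, the record layout, the validation rule and the correction table are all assumptions made for the example.

```python
def find_nonconforming(records, rules):
    """Compare each record with the standard and list the differences."""
    issues = []
    for index, record in enumerate(records):
        for field, is_valid in rules.items():
            value = record.get(field)
            if not is_valid(value):
                issues.append({"row": index, "field": field, "value": value})
    return issues


def propose_changes(issues, corrections):
    """Attach a proposed correction to each issue; nothing is approved yet."""
    proposals = []
    for issue in issues:
        key = (issue["field"], issue["value"])
        proposals.append({**issue, "proposed": corrections.get(key), "approved": False})
    return proposals


def apply_approved(records, proposals):
    """Return an updated copy containing only the reviewed, approved changes."""
    updated = [dict(record) for record in records]
    for proposal in proposals:
        if proposal["approved"] and proposal["proposed"] is not None:
            updated[proposal["row"]][proposal["field"]] = proposal["proposed"]
    return updated


# Hypothetical extract from an operational system.
records = [
    {"id": 1, "country_code": "NL"},
    {"id": 2, "country_code": "nl"},
    {"id": 3, "country_code": "Holland"},
]

# Hypothetical rule: a two-letter, upper-case code.
rules = {
    "country_code": lambda v: isinstance(v, str) and len(v) == 2 and v.isalpha() and v.isupper(),
}

# Hypothetical corrections agreed with the business.
corrections = {
    ("country_code", "nl"): "NL",
    ("country_code", "Holland"): "NL",
}

issues = find_nonconforming(records, rules)
proposals = propose_changes(issues, corrections)

# Review step: here only the first proposal is approved.
proposals[0]["approved"] = True
cleaned = apply_approved(records, proposals)
```

Keeping proposals separate from the records means the operational data is untouched until a reviewer approves each change.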

Keeping Data Clean

A one-off effort to clean up a data set is worthwhile and should improve the operational business process. Keeping the data clean on an ongoing basis is a separate challenge.

The definitions created in the first step and the results found in the second step form the basis for redesigning and reorganising the data management processes. They also provide the material for a detailed and specific set of rules that ensure the operational data continues to conform to the standards.
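One possible way to apply such rules is to check every record at the point of entry, so that non-conforming data is rejected before it reaches the operational system. The sketch below assumes a hypothetical in-memory `GuardedStore` and two example rules; in practice the same checks might sit in an input form, an interface or a database constraint.

```python
class RuleViolation(ValueError):
    """Raised when a record does not conform to the data standard."""


def check_record(record, rules):
    """Return the names of the fields that break a rule."""
    return [field for field, is_valid in rules.items() if not is_valid(record.get(field))]


class GuardedStore:
    """Hypothetical store that accepts only conforming records."""

    def __init__(self, rules):
        self._rules = rules
        self._rows = []

    def insert(self, record):
        failed = check_record(record, self._rules)
        if failed:
            raise RuleViolation("non-conforming fields: " + ", ".join(failed))
        self._rows.append(dict(record))

    def rows(self):
        return list(self._rows)


# Example rules; both are assumptions for illustration.
rules = {
    "country_code": lambda v: isinstance(v, str) and len(v) == 2 and v.isalpha() and v.isupper(),
    "unit_of_measure": lambda v: v in {"EA", "KG", "L"},
}

store = GuardedStore(rules)
store.insert({"country_code": "NL", "unit_of_measure": "KG"})

try:
    store.insert({"country_code": "Holland", "unit_of_measure": "KG"})
    rejected = False
except RuleViolation:
    rejected = True
```

Rejecting bad input at entry is only one design choice; logging violations for periodic review is an alternative where blocking an operation is not acceptable.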