A datum is of high quality if it correctly represents the real-world construct to which it refers. Data quality can be concisely described as the degree to which data is ‘fit for purpose’. High-quality data should produce useful information, whereas poor-quality data will produce useless information (‘garbage in, garbage out’). The concept of data quality is not restricted to computers: the current system of paper notes is often an easy place to identify poor-quality data and observe the consequences. The benefits of any healthcare computer system ultimately rely on the quality of its data, and NHS Connecting for Health is keen to point out that any improvements in patient care and safety resulting from information and communication technology depend on the quality of the information held therein.
Data is the plural of datum. A datum is merely an observation that is recorded in its raw form. A datum is made up of a label and a value: the label defines the datum and the value, which may be a number, date, time or text, records the measurement. For example, heart rate = 70. This datum can then be analysed within context to produce information. For example, a heart rate of 70 beats per minute is not normal in a neonate.
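As a minimal sketch of the label and value idea, the Python below models a datum and reads it against a context to produce information. The `Datum` class, the `interpret` function and the reference ranges are hypothetical and chosen only for illustration; the ranges are assumptions, not clinical guidance.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Datum:
    """A raw observation: a label that defines it and a value that records it."""
    label: str
    value: Union[int, float, str]


# Illustrative reference ranges keyed by (context, label), in beats per minute.
# These figures are assumptions for the example, not clinical guidance.
REFERENCE_RANGES = {
    ("neonate", "heart rate"): (100, 160),
    ("adult", "heart rate"): (60, 100),
}


def interpret(datum: Datum, context: str) -> str:
    """Turn a datum into information by reading it against its context."""
    low, high = REFERENCE_RANGES[(context, datum.label)]
    if datum.value < low:
        return "below range"
    if datum.value > high:
        return "above range"
    return "within range"


heart_rate = Datum(label="heart rate", value=70)
print(interpret(heart_rate, "adult"))    # within range
print(interpret(heart_rate, "neonate"))  # below range
```

The same datum yields different information depending on the context in which it is analysed.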
How data is handled can be described using a data processing model:
Application → Collection → Warehousing → Analysis
Application – purpose for which the data is being collected.
Collection – processes used to collect the data.
Warehousing – processes used to store and transfer the data.
Analysis – processes used to transform the data into information.
Each stage of the model presents its own challenges to maintaining data quality. Poor design at the application stage may lead to over-collection and inefficiency. Input errors at the collection stage can lead to incomplete or inaccurate data. Power, media or security failures at the warehousing stage may become catastrophic without robust standby and backup procedures. Finally, any information derived from the data will be useless if the analysis stage is flawed.
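The collection-stage problems described above (incomplete or inaccurate entry) can be caught by checking each value as it is entered. The following Python is a sketch under stated assumptions: `check_entry` and the `PLAUSIBLE_LIMITS` table are hypothetical names, and the limits are placeholders rather than clinical thresholds.

```python
from typing import List, Optional

# Hypothetical plausibility limits for the example; not clinical guidance.
PLAUSIBLE_LIMITS = {"heart rate": (20, 300)}


def check_entry(label: str, raw_value: Optional[str]) -> List[str]:
    """Return a list of data quality problems found in one entered value."""
    # Incomplete: nothing was entered for this label.
    if raw_value is None or raw_value.strip() == "":
        return ["incomplete: no value entered"]
    # Inaccurate: the entry cannot be read as a number (e.g. a keyboard slip).
    try:
        value = float(raw_value)
    except ValueError:
        return [f"inaccurate: {raw_value!r} is not a number"]
    # Inaccurate: the number is outside the plausible limits for this label.
    low, high = PLAUSIBLE_LIMITS[label]
    if not low <= value <= high:
        return [f"inaccurate: {value:g} is outside {low}-{high}"]
    return []


print(check_entry("heart rate", "70"))   # []
print(check_entry("heart rate", ""))     # ['incomplete: no value entered']
print(check_entry("heart rate", "7o"))   # ["inaccurate: '7o' is not a number"]
print(check_entry("heart rate", "700"))  # ['inaccurate: 700 is outside 20-300']
```

Checks of this kind only detect implausible entries; a plausible but wrong value (for example 80 typed instead of 70) would still pass.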
Healthcare coding can be thought of as a thesaurus of medical terms whose purpose is to define a diagnosis, procedure, drug or device accurately and to group together the numerous descriptions that share the same meaning. In doing so, coding aims to improve the quality of healthcare data.
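The thesaurus idea can be sketched as a lookup that maps several descriptions onto one code. In the Python below the codes (`X001`, `X002`), the term lists and the `encode` function are invented for illustration; they are not taken from any real coding system.

```python
from typing import Optional

# Hypothetical code table: the codes and terms are invented for illustration
# and do not come from Read, OPCS, ICD or SNOMED.
SYNONYMS = {
    "X001": ["myocardial infarction", "heart attack", "mi"],
    "X002": ["hypertension", "high blood pressure"],
}

# Invert the table so that every description points at its single code.
TERM_TO_CODE = {term: code for code, terms in SYNONYMS.items() for term in terms}


def encode(description: str) -> Optional[str]:
    """Return the code for a free-text description, or None if unrecognised."""
    # Normalise case and whitespace so trivially different entries still match.
    normalised = " ".join(description.lower().split())
    return TERM_TO_CODE.get(normalised)


print(encode("Heart attack"))            # X001
print(encode("Myocardial  Infarction"))  # X001
print(encode("chest pain"))              # None
```

Two differently worded descriptions resolve to the same code, while an unrecognised description is returned as `None` rather than being guessed.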
The Read coding system was invented by Dr James Read, a GP in the United Kingdom, who sold the system to the government. The Read coding system includes codes for history, symptoms, examination signs, diagnostic procedures, diagnoses, preventative procedures, operations and therapeutic procedures, occupations and more.
OPCS (Office of Population Censuses and Surveys) is a classification of surgical operations and procedures. Its 4th revision is widely used by the Office for National Statistics to monitor activity and outcome in the UK.
ICD (International Classification of Diseases) contains over 12,000 diagnostic codes and is maintained by the World Health Organization. Until recently there was no coding system available for clinical terms related to anaesthesia. SNOMED (Systematized Nomenclature of Medicine) is a coding system that amalgamates all of the terms from the Read, OPCS and ICD coding systems along with new terms not found elsewhere, and with its development anaesthesia can finally be ‘codified’.
Coding does not occur without error and these errors may be described as:
1. Descriptive (missing data, missing or incorrect code)
2. Translative (wrong link between descriptor and meaning)
3. Transcriptive (human errors arising from fatigue, poor training or keyboard slips)
Quality assurance is the process by which data quality is monitored and maintained. This occurs within an audit cycle in which standards are set, practice is observed, practice is compared to standards, change is implemented and the cycle then repeats. These processes may be implemented either in ‘batch’ (all at once) or in ‘real time’ (ongoing).
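The "compare practice to standards" step of the audit cycle can be illustrated with a simple completeness check. This Python is a sketch only: the `completeness` and `audit` functions, the sample records and the target values are all assumptions made for the example.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def completeness(records: List[Record], field: str) -> float:
    """Fraction of records in which the field has been filled in."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if record.get(field))
    return filled / len(records)


def audit(records: List[Record], standards: Dict[str, float]) -> Dict[str, bool]:
    """Compare observed completeness with the standard set for each field."""
    return {
        field: completeness(records, field) >= target
        for field, target in standards.items()
    }


# Invented sample records; the codes are placeholders, not real codes.
records = [
    {"heart rate": "70", "diagnosis code": "X001"},
    {"heart rate": "", "diagnosis code": "X002"},
    {"heart rate": "82", "diagnosis code": None},
    {"heart rate": "64", "diagnosis code": "X001"},
]
standards = {"heart rate": 0.75, "diagnosis code": 0.90}  # assumed targets
print(audit(records, standards))
# {'heart rate': True, 'diagnosis code': False}
```

Running `audit` over all stored records at once corresponds to a batch review; calling it on a rolling window as each record arrives would approximate a real-time review.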
The processes used within quality assurance include:
1. Profiling (where data processing is analysed for real and possible challenges to data quality)
2. Standardisation (application of a ‘rules’ engine so that data conforms to a standard, e.g. a coding system)
3. Matching and linking (processes that match, link or merge similar, possibly replicated or fragmented, data using techniques such as ‘fuzzy logic’, ‘house-holding’ or ‘best-of-breed’ models)
4. Monitoring (ongoing data quality review)
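As an illustration of the matching and linking step, the Python below flags records that are probably replicated by comparing names approximately. It uses the string similarity ratio from the standard library's `difflib` as a simple stand-in for the fuzzy techniques named above, not as an implementation of them; the function names, the sample names and the 0.85 threshold are assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def _clean(text: str) -> str:
    """Lower-case the text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0 and 1, ignoring case and extra spaces."""
    return SequenceMatcher(None, _clean(a), _clean(b)).ratio()


def likely_duplicates(names: List[str], threshold: float = 0.85) -> List[Tuple[str, str]]:
    """Return pairs of names similar enough to be reviewed as possible duplicates."""
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if similarity(first, second) >= threshold:
                pairs.append((first, second))
    return pairs


names = ["John Smith", "Jon Smith", "Jane Doe", "john  smith"]
for pair in likely_duplicates(names):
    print(pair)
```

With these sample names the three spellings of the first name are paired with one another and "Jane Doe" is left unmatched. In practice a flagged pair would be reviewed, or compared on further fields such as date of birth, before any records are merged.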