BACKGROUND: Efficient and sustainable public health surveillance systems that collect timely, robust, and accurate data are critical to supporting the functions of the US Public Health System. Data quality is a critical surveillance asset that must be managed in a comprehensive, organized, and continuous manner. We propose a simple, easily implemented framework for beginning to evaluate and improve data quality in any public health surveillance system.
METHODS: We conducted a systematic review of data quality management models and frameworks from a variety of fields and professional data quality organizations, and identified the most important indicators of successful data quality programs.
RESULTS: Our framework identifies three essential functions that a data quality program for a public health surveillance system should have: documentation, monitoring, and automation. Documentation is critical to knowledge management, systems analysis, and user training. Data quality requires a well-developed Data Dictionary and a Standard Operating Procedure (SOP). The Data Dictionary should function as a centralized repository for all information regarding the values stored in the surveillance system, while the SOP should function as a set of instructions to manage all process-related issues. Monitoring is critical to refining data collection, entry, and analysis processes, as well as the actual quality of the data maintained in the surveillance system. Monitoring occurs through two distinct, equally important components: quality assurance (QA) and quality control (QC). QA is the proactive component that prevents errors from being introduced into the database, while QC is the reactive component that identifies and remedies errors that already exist in the data. Frequent QC issues should lead to the development of new quality expectations through the QA process. Automation, which involves creating controls and algorithms that allow processes to work without direct human interaction, reduces data entry errors, increases timeliness, improves consistency, and simplifies processes. Aspects of a surveillance system that should be prioritized for automation include data collection, entry, or analysis processes that are routine, common, manual, time-consuming, or low in quality.
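As an illustration only, the relationship between the Data Dictionary, QA, and QC described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular surveillance system: the field names (`case_id`, `age_years`, `sex`), the validation rules, and the helper names (`FieldSpec`, `find_errors`, `qa_insert`, `qc_audit`) are all hypothetical. The same dictionary-driven rules are applied proactively at data entry (QA) and reactively to records already stored (QC).

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class FieldSpec:
    """One Data Dictionary entry (hypothetical structure)."""
    description: str
    required: bool
    is_valid: Callable[[Any], bool]


# Hypothetical Data Dictionary: a single, central definition of every field.
DATA_DICTIONARY = {
    "case_id": FieldSpec(
        "Unique case identifier", True,
        lambda v: isinstance(v, str) and v != ""),
    "age_years": FieldSpec(
        "Age at report, in years", True,
        lambda v: isinstance(v, int) and not isinstance(v, bool) and 0 <= v <= 120),
    "sex": FieldSpec(
        "Reported sex", False,
        lambda v: v in {"F", "M", "U"}),
}


def find_errors(record):
    """Return a list of rule violations for one record."""
    errors = []
    for name, spec in DATA_DICTIONARY.items():
        value = record.get(name)
        if value is None:
            if spec.required:
                errors.append(f"{name}: missing required value")
        elif not spec.is_valid(value):
            errors.append(f"{name}: invalid value {value!r}")
    return errors


def qa_insert(database, record):
    """QA (proactive): store the record only if it passes every rule."""
    errors = find_errors(record)
    if not errors:
        database.append(record)
    return errors


def qc_audit(database):
    """QC (reactive): scan stored records and report existing errors by index."""
    report = {}
    for index, record in enumerate(database):
        errors = find_errors(record)
        if errors:
            report[index] = errors
    return report


database = []
qa_insert(database, {"case_id": "A1", "age_years": 34, "sex": "F"})  # accepted
rejected = qa_insert(database, {"case_id": "A2", "age_years": 340})  # rejected by QA
database.append({"case_id": "", "age_years": 20})  # legacy record that bypassed QA
audit = qc_audit(database)  # QC flags the legacy record at index 1
```

In this sketch, a recurring entry in the `qc_audit` report would be the signal to add or tighten a rule in `DATA_DICTIONARY`, which mirrors the feedback from QC into QA.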
CONCLUSIONS: Managing data quality involves establishing a comprehensive program that focuses on both the integrity of the data itself (its accuracy and its representativeness of the population under surveillance) and the processes that manage the data, from collection through processing and storage to analysis and reporting. Our framework will allow public health departments with limited resources to begin to systematically identify data quality issues, even in the most complex public health surveillance systems; rapidly identify solutions; and quantify system improvements.