BACKGROUND: Typing methods capable of distinguishing different bacteria of the same species are essential epidemiological tools for outbreak investigation. Traditional typing methods based on phenotypes have been used for many years, and current techniques, which rely on gel-based nucleic acid analysis, have suboptimal sensitivity and specificity. Whole genome sequencing could greatly augment our ability to differentiate among bacterial types and subtypes. In particular, whole genome sequence analysis of backbone bacterial DNA offers the opportunity to type pathogens using easily conveyed data (nucleotide polymorphisms) and provides greater discrimination than pulsed-field gel electrophoresis (PFGE) and other non-sequencing methodologies.
METHODS: A case-control study of the 2011 Escherichia coli O157:H7 outbreak in St. Louis, Missouri, attributed cases of infection to salad bar exposure. However, a subset of cases denied eating food from the incriminated source, even though PFGE and multiple-locus variable-number tandem-repeat analysis (MLVA) suggested that all isolates had the same origin. To reconcile this discrepancy, we performed whole bacterial genome sequencing to find differentiating single nucleotide polymorphisms (SNPs) in open reading frames (ORFs) in the highly stable backbone chromosome of the pathogen.
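The comparison described above can be illustrated with a minimal Python sketch. It assumes each isolate's backbone ORF sequence has already been aligned to the outbreak strain and simply lists the positions at which the two differ. The function names, isolate labels, and toy sequences are hypothetical and are not taken from the study, and skipping gaps and ambiguous base calls is likewise an assumption rather than the authors' pipeline.

```python
# Illustrative sketch only: toy sequences and names are invented, not study data.

VALID_BASES = set("ACGT")


def snp_positions(reference, isolate):
    """Return 0-based positions where two aligned sequences differ.

    Positions with a gap or ambiguous base in either sequence are skipped
    (an assumption), so only unambiguous A/C/G/T calls are compared.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (r, s) in enumerate(zip(reference.upper(), isolate.upper()))
        if r in VALID_BASES and s in VALID_BASES and r != s
    ]


def compare_to_reference(reference, isolates):
    """Map each isolate name to its SNP positions relative to the reference."""
    return {name: snp_positions(reference, seq) for name, seq in isolates.items()}


# Hypothetical aligned backbone ORF fragments (15 nucleotides each).
outbreak_reference = "ATGGCTAACGTTGCA"
toy_isolates = {
    "isolate_A": "ATGGCTAACGTTGCA",  # identical to the reference
    "isolate_B": "ATGGCTAACGTTGCG",  # differs at the last position
    "isolate_C": "ATGACTAACNTTGCA",  # one difference; the N call is skipped
}

differences = compare_to_reference(outbreak_reference, toy_isolates)

# Isolates with no differences are isogenic with the reference at these sites.
isogenic = sorted(name for name, snps in differences.items() if not snps)

for name, snps in differences.items():
    print(name, len(snps), snps)
```

In this sketch an isolate counts as isogenic with the reference only when its list of differing positions is empty. A real analysis would also need base-quality filtering and a defined set of backbone ORFs.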
RESULTS: We sequenced isolates with identical PFGE and MLVA patterns from 23 outbreak cases. Backbone ORF SNP analysis of 3,442,673 nucleotides per strain demonstrated that the 7 cases who reported exposure only to the salad bars of interest, the 2 cases who reported exposure to the salad bars of interest and also to other salad bars, and the single case who neither confirmed nor denied exposure to the salad bars of interest were all infected with E. coli O157:H7 that was isogenic with the outbreak strain in its backbone ORFs and possessed a unique set of SNPs. In contrast, 9 of 13 isolates from cases who denied exposure to an incriminated vehicle differed from the outbreak strain by one or two nucleotides (P = 0.002 compared with isolates from cases with definite and exclusive, definite and non-exclusive, or unsure exposure to the salad bars of interest).
CONCLUSIONS: Backbone ORF SNP set analysis reconciled discrepancies that arose during investigation of an outbreak of E. coli O157:H7 infections and provided greater pathogen-differentiating power than PFGE and MLVA combined. Whole genome sequencing and analysis of selected regions of the bacterial chromosome offer highly precise pathogen differentiation. Pathogen sequencing during an outbreak investigation could be a major advance in public health practice.