from the bench
CDC Launches Bioinformatics App to Determine
Sequence Type from Legionella pneumophila
NGS Data
By Shatavia S. Morrison, PhD, US Centers for Disease Control and Prevention, Respiratory Diseases Branch; Brian H. Raphael, PhD, US Centers for
Disease Control and Prevention, Respiratory Diseases Branch; and Jonas M. Winchell, PhD, US Centers for Disease Control and Prevention,
Respiratory Diseases Branch
One feature was developed specifically
for this app to address the challenge of
analyzing data when one of the seven loci
used in SBT has a paralog. A paralog is
defined as a duplicate sequence located
in a different region of the genome.
Traditional methodologies such as PCR
were designed to handle this issue, but
the correct copy is difficult to extract
from shotgun sequencing data. The feature
tries to mitigate this issue by anchoring
reads to the paralog location in the
genome and retrieving the actual allele
information for the locus.
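The anchoring idea can be illustrated with a minimal sketch. This is not the app's code: the function names, the toy reads and the assumption that a short flanking "anchor" sequence is unique to the target copy are all hypothetical, and a real pipeline would work from aligned reads rather than exact string matches.

```python
from collections import Counter

def allele_segment(read, anchor, length):
    """Return the `length` bases that follow the anchor in a read, or None
    if the anchor is absent or the read ends before the segment is complete."""
    start = read.find(anchor)
    if start == -1:
        return None
    begin = start + len(anchor)
    segment = read[begin:begin + length]
    return segment if len(segment) == length else None

def call_allele(reads, anchor, length):
    """Return the most common segment among reads carrying the anchor.
    Assumption: the anchor occurs only beside the target copy, so reads
    from the paralog are excluded because they lack it."""
    segments = [allele_segment(r, anchor, length) for r in reads]
    segments = [s for s in segments if s is not None]
    if not segments:
        return None
    return Counter(segments).most_common(1)[0][0]

# Toy data: two reads span the target copy (anchor GATTACA), two come from
# a paralog with a different flank, and one is too short to cover the locus.
toy_reads = [
    "AAGATTACACCGGTT",
    "GATTACACCGGAA",
    "CTTTTTTTCCGATT",
    "TTTTTTTCCGAAA",
    "GGGATTACACC",
]
target_allele = call_allele(toy_reads, "GATTACA", 4)
```

In this toy case only the reads carrying the anchor contribute to the call, so the paralog's differing sequence never enters the result.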
At the 2017 Advanced Molecular
Detection Day hosted at the US Centers
for Disease Control and Prevention
(CDC) in Atlanta, GA, the Pneumonia
Response and Surveillance Laboratory
presented a software app that allows
users to submit their Legionella
pneumophila whole genome sequencing
data to the Office of Advanced Molecular
Detection (OAMD) Bioinformatics portal
to extract in silico Sequence-Based
Typing (SBT) information. The app is
the first of its kind hosted on the OAMD
Bioinformatics portal, which leverages
OAMD scientific computing resources
and a user-friendly graphical interface
to support public health laboratories
(PHLs) in their research and outbreak
investigations of L. pneumophila.
No Computer Programming
Required
Addressing L. pneumophila
Genomics in a Snapshot
SBT for L. pneumophila is a technique
used during outbreak investigations
to cluster environmental and clinical
isolates. A curated international database
of sequence types (STs) is available,
allowing investigators to identify where
other strains with similar STs may have
been isolated. 1,2 With the increased use of
whole genome sequencing (WGS) during
L. pneumophila outbreak investigations, the
extraction of SBT information is useful
in providing a preliminary clustering
analysis to determine whether isolates
may be associated with an outbreak.
Other methods, such as whole genome
multi-locus sequence typing (wgMLST),
can provide a higher level of resolution,
but they are typically time-consuming
and computationally expensive. The SBT
analysis provides an initial assessment
of genome relatedness and requires only
a small fraction of the genome.
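As a rough illustration of how an ST groups isolates, the sketch below looks up a seven-locus allele profile in a small table and clusters isolates that share an ST. The locus names follow the conventional L. pneumophila SBT scheme; the table entries, ST labels and isolate names are made up for the example and do not come from the curated database.

```python
# Loci in the conventional SBT profile order.
SBT_LOCI = ("flaA", "pilE", "asd", "mip", "mompS", "proA", "neuA")

# Toy profile table: allele numbers and ST labels are invented.
TOY_ST_TABLE = {
    (1, 2, 3, 4, 5, 6, 7): "ST-A",
    (2, 2, 3, 4, 5, 6, 7): "ST-B",
}

def sequence_type(alleles, table):
    """Map a dict of locus -> allele number to an ST label.
    Returns None when the profile is not in the table."""
    profile = tuple(alleles[locus] for locus in SBT_LOCI)
    return table.get(profile)

def group_by_st(isolates, table):
    """Group isolate names by ST as a preliminary clustering step."""
    groups = {}
    for name, alleles in isolates.items():
        groups.setdefault(sequence_type(alleles, table), []).append(name)
    return groups

# Hypothetical isolates: two share a profile, one differs at flaA.
toy_isolates = {
    "clinical_1": dict(zip(SBT_LOCI, (1, 2, 3, 4, 5, 6, 7))),
    "environmental_1": dict(zip(SBT_LOCI, (1, 2, 3, 4, 5, 6, 7))),
    "environmental_2": dict(zip(SBT_LOCI, (2, 2, 3, 4, 5, 6, 7))),
}
toy_groups = group_by_st(toy_isolates, TOY_ST_TABLE)
```

Isolates that land in the same group are candidates for closer comparison with a higher-resolution method; a shared ST alone does not establish that they belong to the same outbreak.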
One app requirement was to
minimize the number of steps the
user had to perform in order to
generate data to assist with their
L. pneumophila research study or
outbreak investiga