NGS Guidelines
·
We recommend that the likelihood,
detectability, and severity of harm of potential errors be determined at
each step. Anticipated potential errors specific to the detection of somatic
variants in tumor tissue by NGS should be addressed. Potential errors should be
addressed through assay design, method validation, and/or quality controls.
·
Sample preparation:
·
Assess tumour
cellularity
·
Another concurrent test (eg, flow cytometry or CBC
for peripheral blood and bone marrow)
·
Pathologist review (appropriately trained
and certified)
·
Macrodissection or microdissection
·
Mutant allele fractions (including silent
mutations) allow for more precise estimates of tumor cellularity
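
As a rough illustration of how a mutant allele fraction can be turned into a cellularity estimate, the sketch below assumes a clonal, heterozygous somatic variant in a copy-neutral diploid region, in which case the tumor fraction is approximately twice the variant allele fraction. The function name and these assumptions are illustrative and are not taken from the guideline; copy number changes or subclonal variants break the simple relationship.

```python
def estimate_tumor_fraction(vaf):
    """Estimate tumor cell fraction from the VAF of one somatic variant.

    Assumes (not from the guideline) a clonal, heterozygous variant in a
    diploid, copy-neutral region, so that VAF = tumor_fraction / 2.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    # Cap at 1.0: a VAF above 0.5 violates the heterozygous-diploid assumption.
    return min(1.0, 2.0 * vaf)


# Hypothetical example: a variant observed at 15% VAF.
print(estimate_tumor_fraction(0.15))
```

In practice several variants would be compared, since a single variant may be subclonal or sit in a region with altered copy number.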
·
Stochastic bias is also a concern when
working with small samples, as the number of genome equivalents present in the
sample may be insufficient to consistently detect variants with low allele
burden.
·
Increase input, multiple displacement
amplification, single-molecule barcoding
·
Sensitivity control
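
The stochastic sampling concern can be made concrete with a small calculation. The sketch below converts DNA input into approximate haploid genome equivalents and uses a binomial model to estimate the chance that at least k mutant copies are present in the sample at all. The figure of about 3.3 pg per haploid human genome is an assumed approximation, the function names are hypothetical, and the model ignores losses during extraction and library preparation.

```python
import math

HAPLOID_GENOME_PG = 3.3  # assumed approximate mass of one haploid human genome (pg)


def genome_equivalents(input_ng):
    """Approximate number of haploid genome copies in input_ng of human DNA."""
    return input_ng * 1000.0 / HAPLOID_GENOME_PG


def prob_at_least_k_mutant_copies(input_ng, vaf, k=1):
    """Binomial probability that at least k mutant copies are sampled.

    Each genome copy is treated as an independent draw that is mutant with
    probability vaf (requires 0 < vaf < 1).
    """
    n = int(genome_equivalents(input_ng))
    p_fewer = 0.0
    for i in range(min(k, n + 1)):
        # Binomial probability of exactly i mutant copies, computed in log space.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(vaf) + (n - i) * math.log1p(-vaf))
        p_fewer += math.exp(log_pmf)
    return max(0.0, 1.0 - p_fewer)


for ng in (1, 10, 50):
    p = prob_at_least_k_mutant_copies(ng, vaf=0.01, k=5)
    print(f"{ng} ng: ~{genome_equivalents(ng):.0f} copies, P(>=5 mutant copies) = {p:.3f}")
```

Under these assumptions, a 1% variant is frequently represented by fewer than five molecules in 1 ng of DNA, which is why increasing input or using single-molecule barcoding is listed as a mitigation.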
·
Measure DNA yield - DNA yield is a
potential source of error
·
Optimization of the entire extraction
procedure is often necessary to minimize transfers and loss of material through
multiple steps
·
DNA purity and integrity are potential
sources of error
·
Deamination or depurination
is a potential source of error
·
DNA obtained from older FFPE blocks (eg, >3 years) often shows evidence of deamination, which
can significantly increase background noise in the final NGS reads, depending
on the sequencing method used
·
Treatment with uracil N-glycosylase can be helpful with such samples (citation 37 in Jennings et al.), but this may require increasing input DNA into the
library step and should be validated thoroughly before being adopted routinely.
·
Consider UNG (uracil N-glycosylase) treatment, duplex reads
·
Confirm all positives with orthogonal
method during validation
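
Cytosine deamination produces uracil, which is read as C>T (or G>A on the opposite strand), so one simple way to monitor the background noise described above is to track what fraction of low-level SNV calls carry that signature. The sketch below is a minimal illustration; the function name, the input tuple layout, and the 5% VAF cutoff are assumptions, not values from the guideline.

```python
DEAMINATION_CHANGES = {("C", "T"), ("G", "A")}


def deamination_fraction(variants, max_vaf=0.05):
    """Fraction of low-VAF SNV calls that are C>T or G>A.

    variants: iterable of (ref, alt, vaf) tuples.
    max_vaf: hypothetical cutoff defining "low-level" calls.
    Returns None when there are no low-VAF SNVs to assess.
    """
    low = [(ref.upper(), alt.upper()) for ref, alt, vaf in variants
           if len(ref) == 1 and len(alt) == 1 and vaf <= max_vaf]
    if not low:
        return None
    hits = sum(1 for change in low if change in DEAMINATION_CHANGES)
    return hits / len(low)


# Made-up calls: three low-level SNVs and one high-VAF variant.
calls = [("C", "T", 0.02), ("G", "A", 0.03), ("A", "G", 0.01), ("C", "T", 0.40)]
print(deamination_fraction(calls))
```

A fraction well above what is seen in fresh or recently fixed material would suggest that the block is affected and that UNG treatment or duplex reads may be worth evaluating.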
·
Contamination is a potential source of
error
·
It is critical to avoid
cross-contamination between samples
·
change scalpel blades between tissue dissections
·
wipe work surfaces frequently with bleach
·
ensure that samples are handled only one at a
time.
·
Use a no template control during
validation to detect contamination
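
A no template control is only useful if it is checked against an explicit acceptance limit. The sketch below shows one possible check, comparing the NTC read count with the median read count of the samples in the same run. The function name and the 1% limit are hypothetical; an appropriate limit would need to be established during validation.

```python
from statistics import median


def ntc_passes(ntc_reads, sample_reads, max_fraction=0.01):
    """Return True if the NTC read count is small relative to the samples.

    ntc_reads: mapped reads in the no template control.
    sample_reads: mapped read counts for the samples in the same run.
    max_fraction: hypothetical acceptance limit (fraction of the sample median).
    """
    if not sample_reads:
        raise ValueError("at least one sample read count is required")
    return ntc_reads <= max_fraction * median(sample_reads)


print(ntc_passes(50, [100000, 120000, 90000]))
print(ntc_passes(5000, [100000, 120000, 90000]))
```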
·
Library preparation:
·
Optimize and monitor DNA library preparation
to assess DNA purity and integrity
·
Hybrid capture NGS
·
Amplification-based NGS
·
Stochastic bias is a potential source of
error
·
Amplification errors are a potential
source of error
·
It is important to keep in mind the
possible impact of amplification errors and content bias related to the library
method used
·
Because potential sources of error can be
addressed through assay design (in addition to method validation and quality
controls), these should be considered early in the design phase of test
development.
·
High-fidelity polymerase, duplex reads
·
Confirm all positives with orthogonal method
during validation
·
Capture bias is a potential source of
error
·
Optimize enrichment, long-range PCR
·
Define minimum coverage, back-fill with
orthogonal method during validation
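
Once a minimum coverage is defined, the positions that fall below it have to be identified so they can be back-filled. The sketch below scans per-base depths for one target region and reports the under-covered intervals. The function name, the half-open coordinate convention, and the example threshold of 250 are assumptions for illustration.

```python
def low_coverage_intervals(depths, min_depth, start=0):
    """Return half-open (start, end) intervals where depth < min_depth.

    depths: per-base read depths for one contiguous target region.
    start: coordinate of the first base in depths.
    """
    intervals = []
    run_start = None
    for offset, depth in enumerate(depths):
        if depth < min_depth:
            if run_start is None:
                run_start = start + offset  # open a new low-coverage run
        elif run_start is not None:
            intervals.append((run_start, start + offset))  # close the run
            run_start = None
    if run_start is not None:
        intervals.append((run_start, start + len(depths)))
    return intervals


# Made-up depths for a six-base region starting at coordinate 100.
print(low_coverage_intervals([300, 280, 90, 80, 260, 40], min_depth=250, start=100))
```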
·
Primer bias and allele dropout are
potential sources of error
·
Assess causes of false negatives, design
overlapping regions
·
Bioinformatically flag homozygosity of rare variants
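
Flagging apparent homozygosity of rare variants can be done with a simple filter, because a rare variant that looks homozygous in an amplicon assay may reflect dropout of the other allele. The sketch below is one possible form of such a flag; the dictionary keys and both thresholds are hypothetical.

```python
def flag_possible_allele_dropout(variants, hom_vaf=0.9, rare_af=0.01):
    """Return IDs of variants that look homozygous yet are rare in the population.

    variants: iterable of dicts with keys "id", "vaf" and "population_af".
    hom_vaf, rare_af: hypothetical thresholds to be set during validation.
    """
    return [v["id"] for v in variants
            if v["vaf"] >= hom_vaf and v["population_af"] < rare_af]


# Made-up variants: only the first is both near-homozygous and rare.
example = [
    {"id": "var1", "vaf": 0.98, "population_af": 0.001},
    {"id": "var2", "vaf": 0.97, "population_af": 0.35},
    {"id": "var3", "vaf": 0.48, "population_af": 0.002},
]
print(flag_possible_allele_dropout(example))
```

In tumor samples a high VAF can also reflect loss of heterozygosity, so a flag of this kind prompts review rather than proving dropout.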
·
Sequencing Platform:
·
It is recommended that laboratory directors consider
the following during clinical NGS platform selection:
·
size of the panel (number of genes and
the extent of gene coverage);
·
expected testing volume;
·
required test turnaround time;
·
availability of bioinformatics support;
·
provider’s degree of technological
innovation, platform flexibility, and scalability;
·
laboratory resources and technical expertise;
·
manufacturer’s level of technical support
·
Illumina and Ion Torrent platforms showed equal performance
in detecting somatic variants in DNA derived from FFPE tumour
samples using amplicon-based commercial panels (with the caveat that
Ion Torrent sequencers are less accurate in homopolymer
tracts)
·
Illumina:
·
Pros:
·
High
versatility and scalability to perform a wide spectrum of assays, from small
targeted panels to highly comprehensive ones
·
Cons:
·
Higher DNA and RNA input requirements
(except Ion Torrent series)
·
Longer sequencing time (except Ion
Torrent series)
·
Require more comprehensive bioinformatics
support
·
Higher cost of instruments (except Ion
Torrent)
·
Ion Torrent series may be the platform of
choice for many institutions to run small gene panels (<50 genes) and on
samples with limited amount of DNA or RNA (ie, biopsy
specimens).
·
However, the Ion Torrent series has an
increased error rate in homopolymer regions and
low scalability
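
Because homopolymer regions are a known weak point, it can help to list the homopolymer runs in each target so that indel calls falling inside them receive extra scrutiny. The sketch below finds such runs in a sequence string; the function name and the minimum run length of 4 are assumptions.

```python
from itertools import groupby


def homopolymer_runs(sequence, min_length=4):
    """Return (start, base, length) for each run of at least min_length.

    start is the 0-based position of the first base of the run.
    """
    runs = []
    position = 0
    for base, group in groupby(sequence.upper()):
        length = sum(1 for _ in group)
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs


print(homopolymer_runs("ACGTTTTTGAAAAC"))
```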
·
Panel Design:
·
It is recommended to include only those genes
that have sufficient scientific evidence for the disease diagnosis,
prognostication, or treatment [eg, professional practice
guidelines, published scientific literature, test registries (eg, National Center for Biotechnology Information Genetic
Testing Registry, http://www.ncbi.nlm.nih.gov/gtr
and EuroGentest, http://www.eurogentest.org/index.php?id=160,
both last accessed January 8, 2016)].
·
The scientific evidence used to support
NGS panel design should be documented in the validation protocol.
·
Panels designed for diagnosis and patient
prognostication are usually tumor specific, tend to be smaller in size, and
include only those genes that are directly implicated in the oncobiology of the tumor.
·
The size of the panel may affect sequencing
reagent cost, depth of sequencing, laboratory productivity, and complexity of
analytical and clinical interpretation.
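
The link between panel size and depth of sequencing can be sketched as simple arithmetic: for a fixed number of reads per sample, mean depth falls in proportion to the targeted footprint. The function below is a back-of-the-envelope estimate; its name, its parameters, and the example numbers are illustrative assumptions, and it ignores duplicate reads and uneven coverage.

```python
def expected_mean_depth(total_reads, read_length_bp, on_target_fraction, panel_size_bp):
    """Rough mean depth: on-target sequenced bases divided by panel footprint."""
    if panel_size_bp <= 0:
        raise ValueError("panel_size_bp must be positive")
    return total_reads * read_length_bp * on_target_fraction / panel_size_bp


# Made-up run: 2 million reads of 150 bp per sample, 70% on target.
for panel_bp in (100_000, 500_000, 2_000_000):
    depth = expected_mean_depth(2_000_000, 150, 0.7, panel_bp)
    print(f"panel {panel_bp:>9,} bp: ~{depth:,.0f}x mean depth")
```

Holding depth constant on a larger panel therefore means either more reads per sample or fewer samples per run, which is where reagent cost and laboratory productivity come in.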
·
Data Analysis:
·
The range of software tools and type of
validation required depends on the assay design
·
Base calling:
·
Read alignment:
·
Variant identification:
·
Each of the 4 main classes of sequence
variants (SNVs, indels, CNAs, and SVs) requires a
different computational approach for sensitive and specific identification
·
Published comparisons of various
bioinformatics tools for SNV detection may be helpful
·
Indels:
·
Alignment of indel-containing sequence
reads is technically challenging, and algorithms specifically designed for the
task are required.
·
One such specialized approach is called “local realignment”
·
Probabilistic modeling based on mapped sequence reads can be used
to identify indels that are up to 20 bp, but these methods do not provide an acceptable
sensitivity for detection of larger indels, such as
FLT3 internal tandem duplications that may exceed 300 bp
in length
·
Split-read analysis approaches to indel
detection use algorithms that can appropriately map the two ends of a read that
is interrupted (or split) by an insertion or deletion. These algorithms can also
manage reads that have been trimmed (soft-clipped) because of misalignments
caused by indels
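
In aligned reads, insertions, deletions, and soft clipping are recorded in the SAM CIGAR string (operations I, D, and S), so a first look at indel-bearing or soft-clipped reads can be taken by parsing that field. The sketch below summarizes one CIGAR string; the function name and the returned dictionary layout are hypothetical, and this is not a split-read caller.

```python
import re

# SAM CIGAR operations: M, I, D, N, S, H, P, = and X.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def summarize_cigar(cigar):
    """Summarize insertions, deletions and soft clips in a SAM CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips sit outside soft clips, so drop them before checking the ends.
    ops = [(n, op) for n, op in ops if op != "H"]
    return {
        "inserted_bases": sum(n for n, op in ops if op == "I"),
        "deleted_bases": sum(n for n, op in ops if op == "D"),
        "left_soft_clip": ops[0][0] if ops and ops[0][1] == "S" else 0,
        "right_soft_clip": ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0,
    }


print(summarize_cigar("10S60M3I27M"))
print(summarize_cigar("50M12D38M12S"))
```

Reads with long soft clips clustered at the same position are the ones a split-read method would try to re-map on the other side of a candidate indel.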
·
CNAs:
·
Assuming deep enough sequencing coverage, the relative change in
DNA content will be reflected in the number of reads mapping within the region
of the CNA after normalization to the average read depth across the same sample
·
Analysis of allele frequency at commonly occurring SNVs can be a
useful indicator of CNAs or loss of heterozygosity in NGS data
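
The read-depth idea can be sketched as follows: each region's depth is normalized to the mean depth across the same sample, then compared with the same quantity in a copy-neutral comparator and expressed as a log2 ratio. The use of a separate reference sample, the function names, and the example depths are assumptions for illustration; the guideline text only describes the within-sample normalization.

```python
import math


def normalize_to_sample_mean(region_depths):
    """Divide each region's mean depth by the mean depth across all regions."""
    mean_depth = sum(region_depths.values()) / len(region_depths)
    return {region: depth / mean_depth for region, depth in region_depths.items()}


def log2_copy_ratios(tumor_depths, reference_depths):
    """log2 of normalized tumor depth over normalized reference depth, per region.

    reference_depths is an assumed copy-neutral comparator (for example a
    normal sample run on the same panel). Regions with zero depth in either
    sample are reported as None rather than guessed.
    """
    tumor = normalize_to_sample_mean(tumor_depths)
    reference = normalize_to_sample_mean(reference_depths)
    ratios = {}
    for region, tumor_value in tumor.items():
        reference_value = reference.get(region, 0.0)
        if tumor_value > 0 and reference_value > 0:
            ratios[region] = math.log2(tumor_value / reference_value)
        else:
            ratios[region] = None
    return ratios


# Made-up mean depths for four regions; region_1 has twice the tumor depth.
tumor = {"region_1": 800, "region_2": 400, "region_3": 400, "region_4": 400}
normal = {"region_1": 400, "region_2": 400, "region_3": 400, "region_4": 400}
for region, ratio in log2_copy_ratios(tumor, normal).items():
    print(region, round(ratio, 3))
```

Note that the gained region raises the sample mean, so the unaffected regions come out slightly below zero; this is one reason dedicated CNA tools use more robust normalization than a plain mean.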
·
SVs:
·
The breakpoints for interchromosomal
and intrachromosomal rearrangements are usually
located in noncoding DNA sequences, introns of genes, often in highly repetitive
regions, and are therefore difficult both to capture and to map to the
reference genome.
·
In addition, SV breakpoints often contain superimposed sequence
variation ranging from small indels to fragments from
several chromosomes.
·
Discordant mate-pair methods (with analysis of associated
soft-clipped reads) and split-read methods can be used to identify SVs, and
often provide single base accuracy for the localization of the breakpoint,
which is a significant advantage in that such precise localization of the breakpoint
facilitates orthogonal validation by PCR
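
The core test behind discordant-pair methods is whether the two reads of a pair map where the library design says they should. The sketch below classifies a pair using only chromosome and mapping distance; the function name, the expected insert size, and the tolerance are hypothetical, and read orientation (which real tools also use) is deliberately ignored.

```python
def classify_read_pair(chrom1, pos1, chrom2, pos2, expected_insert=300, tolerance=150):
    """Classify a read pair by where its two mates map.

    Returns "interchromosomal", "discordant_distance" or "concordant".
    expected_insert and tolerance are hypothetical library parameters (bp).
    """
    if chrom1 != chrom2:
        return "interchromosomal"
    if abs(pos2 - pos1) > expected_insert + tolerance:
        return "discordant_distance"
    return "concordant"


# Made-up coordinates for three read pairs.
print(classify_read_pair("chr2", 1_000_000, "chr2", 1_000_280))
print(classify_read_pair("chr2", 1_000_000, "chr2", 1_250_000))
print(classify_read_pair("chr2", 1_000_000, "chr5", 2_000_000))
```

Clusters of discordant pairs point to a candidate rearrangement, and the associated soft-clipped or split reads are what narrow the breakpoint down to a single base.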
·
Multiple tools should be evaluated to determine which has optimal
performance characteristics for the particular assay under consideration,
because, depending on the design of capture probes and specific sequence of the
target regions, different SV detection tools have large differences in sensitivity
or specificity.
·
Detection of SVs using RNA (cDNA) as starting material uses
different bioinformatics approaches, especially when it is performed using
amplification-based sequencing.
·
In this approach, fused transcripts are aligned to a gene reference of targeted chimeric
fusion transcripts.
·
Be aware that many popular NGS analysis programs
are designed for constitutional genome analysis, with algorithms that may ignore
SNVs with variant allele frequencies (VAFs) falling
outside the expected range for homozygous and heterozygous variants.
·
Variant annotation:
References:
·
Jennings et al. Guidelines for validation of next-generation
sequencing-based oncology panels: a joint consensus recommendation of the
Association for Molecular Pathology and College of American Pathologists. J Mol Diagn 2017;19(3):341-365. (currently
at p. 348 heading “Optimization and Familiarization Process”)