Software Tools for mtDNA Analysis

NIJ 2014-DN-BX-K022 and 2015-DN-BX-K025

Existing software has not supported effective alignment of mitochondrial (mt) DNA sequence data generated by massively parallel sequencing (MPS) together with detailed assessment of those data.  The regions of sequence that are typically difficult to align contain homopolymeric stretches, isolated patterns of single nucleotide polymorphisms (SNPs), and insertions/deletions (INDELs).  A custom software solution, GeneMarker® HTS, was developed in collaboration with SoftGenetics, Inc. (http://softgenetics.com/GeneMarkerHTS.php) and evaluated to address these limitations and to provide a user-friendly interface for forensic practitioners and others interested in mtDNA analysis of MPS data.  GeneMarker® HTS uses a customizable motif-based alignment algorithm to generate an exportable consensus mtDNA sequence with phylogenetically correct SNP and INDEL calls.

Sequence data from 500 individuals, with various alignment asymmetries and levels of heteroplasmy, were used to assess the software.  Accuracy in producing mtDNA haplotypes, the ability to correctly identify low-level heteroplasmic sequence variants, and the user-based features of the software were all evaluated.  Analyzed sequences yielded correct mtDNA haplotypes, and heteroplasmic variants were properly identified with minimal manual interpretation.  The software is easy to use, offers numerous user-defined parameters for filtering the data that address the interests of researchers and practitioners, and provides multiple options for viewing and navigating through the data.
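To illustrate what identifying a low-level heteroplasmic variant involves, the following is a minimal Python sketch that separates the major base from candidate minor bases at a single aligned position.  It is not the GeneMarker® HTS algorithm; the function name, the 2% minor-variant threshold, and the minimum read depth of 100 are hypothetical values chosen only for the example.

```python
from collections import Counter


def call_position(bases, minor_threshold=0.02, min_depth=100):
    """Classify the bases observed at one aligned position.

    `bases` is a string of base calls, one per read covering the position.
    `minor_threshold` and `min_depth` are illustrative assumptions, not
    settings taken from any particular software package.

    Returns (major_base, minors), where minors is a list of
    (base, fraction) pairs at or above the minor-variant threshold.
    """
    depth = len(bases)
    if depth < min_depth:
        # Too few reads to make a call at this position.
        return None, []

    ranked = Counter(bases).most_common()
    major_base = ranked[0][0]

    # Any other base seen at or above the threshold is a candidate
    # site of heteroplasmy that still needs review by an analyst.
    minors = [
        (base, count / depth)
        for base, count in ranked[1:]
        if count / depth >= minor_threshold
    ]
    return major_base, minors


if __name__ == "__main__":
    # 950 reads support C and 50 reads support T at this position.
    print(call_position("C" * 950 + "T" * 50))
```

In this sketch a position with 5% of reads supporting a second base is reported as a candidate, while a base seen in only 1% of reads falls below the assumed threshold and is ignored.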

The accompanying image shows mtDNA MPS data aligned to the reference sequence for the control region of the mtgenome, which spans nucleotide positions 16024-16569 and 1-576.  The top panel lists the major SNP and INDEL profile, and the bottom panel lists the minor profile of possible sites of mtDNA heteroplasmy.  Our findings were published in Forensic Science International: Genetics (Holland et al. 2017).
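Because the mtgenome is circular, the control region wraps around the end of the reference numbering, which is why it is given as two ranges.  The short Python sketch below shows one way to handle that wrap-around, assuming a reference length of 16,569 positions (the end of the first range quoted above); the constant and function names are hypothetical and are not part of any software described here.

```python
MTGENOME_LENGTH = 16569        # assumed reference length (last numbered position)
CR_START, CR_END = 16024, 576  # control region boundaries quoted in the text


def in_control_region(pos):
    """Return True if a 1-based reference position lies in the control region."""
    if not 1 <= pos <= MTGENOME_LENGTH:
        raise ValueError(f"position {pos} is outside 1-{MTGENOME_LENGTH}")
    # The region runs from 16024 to the end of the numbering, then 1 to 576.
    return pos >= CR_START or pos <= CR_END


def control_region_offset(pos):
    """Return the 0-based offset of a position from the start of the region."""
    if not in_control_region(pos):
        raise ValueError(f"position {pos} is not in the control region")
    # Modular arithmetic carries positions 1-576 past position 16569.
    return (pos - CR_START) % MTGENOME_LENGTH


def control_region_length():
    """Return the total number of positions in the control region."""
    return control_region_offset(CR_END) + 1


if __name__ == "__main__":
    print(control_region_length())
    print(control_region_offset(16569), control_region_offset(1))
```

Under these assumptions the two ranges contribute 546 and 576 positions, so position 1 immediately follows position 16569 in the linearized region.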