This pipeline was developed to investigate the effect that nuclear mitochondrial DNA segments (numts) have on mitochondrial DNA NGS reads. It is implemented on the HAMSTR webpage, which is available through Imperial College London's network. We investigated three main aspects to understand how they can confound the mapping of mitochondrial reads.
Different aligners produce different results, so we compared several commonly used programs. The pipeline implements six aligners: BWA, SOAP3-dp, NovoAlign, NextGenMap, Bowtie 2, and SMALT.
To investigate how discarding reads that map to numts affects variant calls, we built the pipeline as a two-step mapping process: reads that map to numt regions in the nuclear genome are left out in the first step, then remapped to the mitochondrial genome in the second.
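The separation of reads between the two steps can be pictured with the sketch below. It is only an illustration under stated assumptions, not the pipeline's actual code: the `Alignment` and `Region` types, the `split_by_numt` function, the half-open coordinate convention, and the "any overlap" rule are all hypothetical, and the coordinates in the usage example are made up.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Region(NamedTuple):
    """A hypothetical numt interval on a nuclear chromosome (0-based, half-open)."""
    chrom: str
    start: int
    end: int


class Alignment(NamedTuple):
    """A hypothetical, simplified alignment record (0-based, half-open)."""
    read_name: str
    chrom: str
    start: int
    end: int


def overlaps(aln: Alignment, region: Region) -> bool:
    """Return True if the alignment shares at least one base with the region."""
    return (
        aln.chrom == region.chrom
        and aln.start < region.end
        and region.start < aln.end
    )


def split_by_numt(
    alignments: Iterable[Alignment], numt_regions: Iterable[Region]
) -> Tuple[List[Alignment], List[Alignment]]:
    """Split alignments into (kept, to_remap).

    Assumption: a read is set aside for remapping as soon as its alignment
    overlaps any numt region; the real pipeline may use a different rule.
    """
    regions = list(numt_regions)
    kept: List[Alignment] = []
    to_remap: List[Alignment] = []
    for aln in alignments:
        if any(overlaps(aln, region) for region in regions):
            to_remap.append(aln)
        else:
            kept.append(aln)
    return kept, to_remap


# Usage with made-up coordinates.
numts = [Region("chr1", 1000, 2000)]
reads = [
    Alignment("read_a", "chr1", 1500, 1600),  # inside the numt region
    Alignment("read_b", "chr1", 5000, 5100),  # outside any numt region
    Alignment("read_c", "chrM", 1500, 1600),  # same coordinates, other contig
]
kept, to_remap = split_by_numt(reads, numts)
```

In this toy example only `read_a` would be set aside for the second mapping step. A linear scan over the regions is enough for a sketch; an interval tree would be the usual choice for a genome-wide numt list.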
Since the mitochondrial genome is a circular molecule, mapping mitochondrial reads requires special handling: reads that span the point where the linear reference sequence starts and ends would otherwise align poorly. In this pipeline, we map mitochondrial reads to two mitochondrial genomes: one is the unaltered reference, taken from the Revised Cambridge Reference Sequence of the human mitochondrial DNA (rCRS); the other is the same reference shifted by 8000 bases. Variants are then called on the two resulting files using mitoCaller, and the outputs are combined.
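Before the two outputs can be combined, positions reported on the shifted reference have to be expressed in the coordinates of the unaltered reference. The sketch below shows one way to do that conversion. It assumes the shift moves the first 8000 bases to the end of the sequence (the direction is not stated above), uses 1-based positions, and takes the rCRS length as 16569 bases; the function names are hypothetical and do not come from the pipeline or from mitoCaller.

```python
RCRS_LENGTH = 16569  # length of the rCRS in bases
SHIFT = 8000         # shift used for the second reference


def shift_reference(seq: str, shift: int) -> str:
    """Rotate a circular sequence so that original base shift+1 comes first.

    Assumption: the first `shift` bases are moved to the end of the sequence.
    """
    shift %= len(seq)
    return seq[shift:] + seq[:shift]


def to_original_position(pos_shifted: int, shift: int, length: int) -> int:
    """Convert a 1-based position on the shifted reference to the
    corresponding 1-based position on the unaltered reference."""
    if not 1 <= pos_shifted <= length:
        raise ValueError("position outside the reference")
    return (pos_shifted - 1 + shift) % length + 1


# Toy check on a short sequence: every base of the rotated sequence
# should match the base at the converted position of the original.
toy = "ACGTACGTAA"
rotated = shift_reference(toy, 3)
matches = all(
    rotated[p - 1] == toy[to_original_position(p, 3, len(toy)) - 1]
    for p in range(1, len(toy) + 1)
)

# With the rCRS length and an 8000-base shift:
first = to_original_position(1, SHIFT, RCRS_LENGTH)         # 8001
last_base = to_original_position(8569, SHIFT, RCRS_LENGTH)  # 16569
wrapped = to_original_position(8570, SHIFT, RCRS_LENGTH)    # 1
```

Under these assumptions, the original start and end of the rCRS sit next to each other in the middle of the shifted reference (shifted positions 8569 and 8570), so reads that span that junction can align contiguously there.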