Primer Trimming for Targeted Resequencing

The E92K mutation, identified in 9 patients, coincided with a primer hybridization site for one of a pair of overlapping amplicons covering exon 4 of CFTR. Because reads from that amplicon carry the primer sequence rather than the patient's own sequence at this position, the mutation site showed an excess of reference nucleotides. This substantially increased the estimated probability of the wild-type homozygous genotype and decreased that of the heterozygous genotype, lowering both the call quality and the observed mutant allele frequency. As a consequence, the mutation was detected by neither Samtools nor GATK, although these are two of the most commonly used variant callers for germline mutation detection.
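
The dilution effect can be illustrated with a small calculation. This is a simplified sketch, not the model used by Samtools or GATK: it assumes that every read from the amplicon whose primer covers the site shows the reference base, and that reads from the other amplicon sample both alleles of a heterozygote equally. The function name and the read counts are hypothetical.

```python
def observed_mutant_fraction(insert_reads, primer_reads, true_fraction=0.5):
    """Expected mutant allele fraction at a site covered by two amplicons.

    insert_reads  -- reads in which the site lies inside the insert
                     (these reflect the patient's real genotype)
    primer_reads  -- reads in which the site lies inside a primer
                     (assumed to always show the reference base)
    true_fraction -- real mutant allele fraction (0.5 for a heterozygote)
    """
    total = insert_reads + primer_reads
    if total == 0:
        raise ValueError("no reads cover the site")
    # Only reads that cover the site with insert sequence can carry the mutation.
    return true_fraction * insert_reads / total


# Hypothetical depths: equal coverage from both amplicons halves the signal.
print(observed_mutant_fraction(100, 100))  # 0.25
# A three-fold excess of primer-derived reads reduces it to one eighth.
print(observed_mutant_fraction(100, 300))  # 0.125
```

Under these assumptions a true heterozygote appears at well below the expected 50% mutant fraction, which is consistent with a caller favoring the wild-type homozygous genotype.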

The mutation could be identified only after a primer trimming step. Although trimming can be done with Trimmomatic or cutadapt, which remove technical sequences from FASTQ files, we believe that primer trimming and adapter trimming are substantially different tasks that should not be performed by the same tools. In particular, adapters are truly technical sequences, whereas primers match the genome. Primers can therefore be aligned together with the insert, which helps to align variants near the insert ends correctly (mainly small deletions and insertions). Primer trimming is thus better performed after alignment, and adapter trimming before it. We therefore created software that removes primers from SAM or FASTQ files after alignment to the amplicon sequences. Following primer trimming, the E92K mutation was successfully identified by both variant callers in all 9 samples.
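
Post-alignment primer trimming can be sketched as soft-clipping the aligned bases that fall inside a primer. The code below is a minimal illustration and not the software described above: the function `clip_left_primer` is hypothetical, it handles only a primer at the left end of an alignment, it uses 0-based reference coordinates (the SAM POS field is 1-based), and it assumes CIGAR strings without hard clips or padding.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNS=X])")


def clip_left_primer(pos, cigar, primer_end):
    """Soft-clip read bases aligned to the left of primer_end.

    pos        -- 0-based reference position of the first aligned base
    cigar      -- CIGAR string using only the operations M, I, D, N, S, =, X
    primer_end -- 0-based, exclusive end of the primer on the reference
    Returns (new_pos, new_cigar). A result consisting of a single soft clip
    means the read lay entirely inside the primer and should be discarded.
    """
    if "".join(n + op for n, op in CIGAR_RE.findall(cigar)) != cigar:
        raise ValueError("unsupported CIGAR: " + cigar)
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]

    ref = pos      # current reference position
    clipped = 0    # number of read bases to soft-clip
    i = 0
    while i < len(ops) and ref < primer_end:
        n, op = ops[i]
        if op in "M=X":                   # consumes reference and read
            take = min(n, primer_end - ref)
            ref += take
            clipped += take
            if take < n:                  # primer ends inside this operation
                ops[i] = (n - take, op)
                break
        elif op in "DN":                  # consumes reference only
            ref += n
        else:                             # I or S: consumes read only
            clipped += n
        i += 1

    rest = ops[i:]
    # An alignment must not start with an insertion or a deletion.
    while rest and rest[0][1] in "IDNS":
        n, op = rest.pop(0)
        if op in "IS":
            clipped += n
        else:
            ref += n

    new_cigar = (f"{clipped}S" if clipped else "") + "".join(
        f"{n}{op}" for n, op in rest
    )
    return ref, new_cigar


# A 50-base read starting at 100, with a primer covering [100, 120).
print(clip_left_primer(100, "50M", 120))       # (120, '20S30M')
# A deletion inside the primer region is dropped with the clipped bases.
print(clip_left_primer(100, "10M2D40M", 115))  # (115, '13S37M')
```

Because the primer bases are still present during alignment, they can anchor the read and help place indels near the insert ends; clipping them afterwards keeps them out of the allele counts at positions covered by the primer.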

Student:
   Тимофей Проданов
Supervisor:
   Максим Иванов
Project duration: Feb 2016 to May 2016
Files:
   prodanov_28052016.pdf