Characteristics and filtering of low-frequency artificial short deletion variations based on nanopore sequencing

Document type: Foreign-language journal article

First author: Ye, Fuqiang

Authors: Ye, Fuqiang; Han, Yifang; Yang, Xiaohong; Zhu, Juanjuan; Zhang, Xiaomin; Zhang, Jiarong; Xie, Zihan; Yang, Tingting; Ni, Ming; Ren, Zilin

Author affiliations:

Keywords: nanopore sequencing; low-frequency deletions; filtering

Journal: GIGASCIENCE (Impact factor: 3.9; 5-year impact factor: 11.1)

ISSN: 2047-217X

Year/Volume/Issue: 2025, Vol. 14

Pages:

Indexed in: SCI

Abstract: Background: Nanopore sequencing is characterized by high portability and long reads, albeit accompanied by systematic errors causing short deletions. Few tools can filter low-frequency artificial deletions, especially in single samples. Results: To solve this problem, we first synthesized or purchased 17 DNA/RNA standards for nanopore sequencing with R9 and R10 flowcells to obtain benchmarking datasets. False-positive (FP) deletions were prevalent (75.86%-96.26%), while the majority (62.07%-79.68%) were located in homopolymeric regions. The 10-mer base-quality scores (Q scores) and sequencing speeds flanking the FP homopolymeric deletions marginally differed from the true-positive (TP) deletions. We thus investigated the raw current signals after normalizing them by length. We found more significant differences in current signals between the reads with and without FP deletions. Indexes including the MRPP A (Multiple Response Permutation Procedure, statistic A), the accumulative difference of normalized current signals, and the Q score were tested for the power of distinguishing between FP and TP deletions. MRPP A outperformed the other indexes in homopolymeric regions and achieved the highest accuracy of 76.73% for challenging 1-base homopolymeric deletions. When sequencing depth was low, the Q score performed better than MRPP A. We developed Delter (Deletion filter) to filter low-frequency FP deletions of nanopore sequencing in single samples, which removed 60.98% to 100% of artificial homopolymeric deletions in real samples. Conclusions: Low-frequency artificial short deletion variations, especially the most challenging homopolymeric deletions, could be effectively filtered by Delter using normalized current signals or Q scores according to the employed sequencing strategies.
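
Note: The abstract names the MRPP A statistic without defining it. As an illustration only, the sketch below computes the standard chance-corrected within-group agreement, A = 1 - (observed delta) / (expected delta), for two groups of length-normalized current signals. It is not Delter's implementation. Several choices here are assumptions not stated in the record: signals are length-normalized by linear resampling to a fixed number of points, distances are Euclidean, groups are weighted by size, and the function names `resample` and `mrpp_a` are hypothetical. The signal values are made up.

```python
from itertools import combinations
from math import sqrt


def resample(signal, n_points):
    """Linearly interpolate a signal onto n_points (>= 2) evenly spaced positions.

    Assumption: this stands in for "normalizing by length"; the record does
    not say how the authors normalize.
    """
    if len(signal) == 1:
        return [float(signal[0])] * n_points
    out = []
    for k in range(n_points):
        pos = k * (len(signal) - 1) / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_pairwise(vectors):
    """Mean distance over all unordered pairs (needs at least 2 vectors)."""
    pairs = list(combinations(vectors, 2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)


def mrpp_a(group_a, group_b):
    """Chance-corrected within-group agreement A for two groups.

    Observed delta is the size-weighted mean of within-group mean distances.
    With weights summing to 1, the expected delta under random relabeling
    equals the mean distance over all pairs, so no permutations are needed
    to obtain A itself (a p-value would still require them).
    """
    n_a, n_b = len(group_a), len(group_b)
    delta_obs = (n_a * mean_pairwise(group_a)
                 + n_b * mean_pairwise(group_b)) / (n_a + n_b)
    delta_exp = mean_pairwise(group_a + group_b)
    return 1 - delta_obs / delta_exp


# Made-up raw signals of different lengths, resampled to 5 points each.
reads_with_del = [resample(s, 5) for s in ([80, 82, 90, 91],
                                           [81, 83, 89, 92, 90])]
reads_without_del = [resample(s, 5) for s in ([80, 95, 100, 96, 90, 85],
                                              [79, 96, 101, 95, 91])]
a_stat = mrpp_a(reads_with_del, reads_without_del)
```

A value of A near 1 means reads within each group are much more alike than reads across groups. A value near or below 0 means the grouping explains no more than chance.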

Classification number:
