Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome

Document type: Foreign-language journal article

First author: Wenger, Aaron M.

Authors: Wenger, Aaron M.; Peluso, Paul; Rowell, William J.; Hall, Richard J.; Concepcion, Gregory T.; Töpfer, Armin; Qian, Yufeng; Rank, David R.; Hunkapiller, Michael W.; Chang, Pi-Chuan; Kolesnikov, Alexey; DePristo, Mark A.; Carroll, Andrew; Ebler, Jana; Marschall, Tobias; Fungtammasan, Arkarachai; Chin, Chen-Shan; Olson, Nathan D.; Zook, Justin M.; Alonge, Michael; Schatz, Michael C.; Mahmoud, Medhat; Sedlazeck, Fritz J.; Phillippy, Adam M.; Koren, Sergey; Myers, Gene; Ruan, Jue; Li, Heng

Author affiliations:

Journal: NATURE BIOTECHNOLOGY (Impact factor: 54.908; 5-year impact factor: 50.516)

ISSN: 1087-0156

Year, volume, issue: 2019, vol. 37, no. 10

Pages:

Indexed in: SCI

Abstract: The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the 'genome in a bottle' (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.

Classification number:
