BSAlign: A Library for Nucleotide Sequence Alignment

文献类型: 外文期刊

第一作者: Shao, Haojing

作者: Shao, Haojing;Ruan, Jue

作者机构:

关键词: Pairwise alignment; Edit distance; Striped vectorization; Banded dynamic programming; F evaluation

期刊名称:GENOMICS PROTEOMICS & BIOINFORMATICS ( 影响因子:11.5; 五年影响因子:10.3 )

ISSN: 1672-0229

年卷期: 2024 年 22 卷 2 期

页码:

收录情况: SCI

摘要: Increasing the accuracy of the nucleotide sequence alignment is an essential issue in genomics research. Although classic dynamic programming (DP) algorithms (e.g., Smith-Waterman and Needleman-Wunsch) guarantee to produce the optimal result, their time complexity hinders the application of large-scale sequence alignment. Many optimization efforts that aim to accelerate the alignment process generally come from three perspectives: redesigning data structures [e.g., diagonal or striped Single Instruction Multiple Data (SIMD) implementations], increasing the number of parallelisms in SIMD operations (e.g., difference recurrence relation), or reducing search space (e.g., banded DP). However, no methods combine all these three aspects to build an ultra-fast algorithm. In this study, we developed a Banded Striped Aligner (BSAlign) library that delivers accurate alignment results at an ultra-fast speed by knitting a series of novel methods together to take advantage of all of the aforementioned three perspectives with highlights such as active F-loop in striped vectorization and striped move in banded DP. We applied our new acceleration design on both regular and edit distance pairwise alignment. BSAlign achieved 2-fold speed-up than other SIMD-based implementations for regular pairwise alignment, and 1.5-fold to 4-fold speed-up in edit distance-based implementations for long reads. BSAlign is implemented in C programing language and is available at https://github.com/ruanjue/bsalign.

分类号:

  • 相关文献
作者其他论文 更多>>