Near-complete assembly and comprehensive annotation of the wheat Chinese Spring genome

Document type: Foreign-language journal article

First author: Wang, Zijian

Authors: Wang, Zijian; Tan, Kaiwen; Xin, Beibei; Lai, Jinsheng; Chen, Jian; Miao, Lingfeng; Guo, Weilong; Ni, Zhongfu; Sun, Qixin; Appels, Rudi; Jia, Jizeng; Lu, Fei; Fu, Xiangdong

Author affiliations:

Keywords: wheat genome; Chinese Spring; near-complete assembly; seed storage proteins; tandem repeats; centromeres

Journal: MOLECULAR PLANT (Impact factor: 24.1; 5-year impact factor: 25.8)

ISSN: 1674-2052

Year, volume, issue: 2025, Vol. 18, No. 5

Pages:

Indexed in: SCI

Abstract: A complete reference genome assembly is crucial for biological research and genetic improvement. Owing to its large size and highly repetitive nature, there are numerous gaps in the globally used wheat Chinese Spring (CS) genome assembly. In this study, we generated a 14.46 Gb near-complete assembly of the CS genome, with a contig N50 of over 266 Mb and an overall base accuracy of 99.9963%. Among the 290 gaps that remained (26, 257, and 7 gaps from the A, B, and D subgenomes, respectively), 278 were extremely high-copy tandem repeats, whereas the remaining 12 were transposable-element-associated gaps. Four chromosome assemblies were completely gap-free, including chr1D, chr3D, chr4D, and chr5D. Extensive annotation of the near-complete genome revealed 151,405 high-confidence genes, of which 59,180 were newly annotated, including 7,602 newly assembled genes. Except for the centromere of chr1B, which has a gap associated with superlong GAA repeat arrays, the centromeric sequences of all of the remaining 20 chromosomes were completely assembled. Our near-complete assembly revealed that the extent of tandem repeats, such as simple-sequence repeats, was highly uneven among different subgenomes. Similarly, the repeat compositions of the centromeres also varied among the three subgenomes. With the genome sequences of all six types of seed storage proteins (SSPs) fully assembled, the expression of ω-gliadin was found to be contributed entirely by the B subgenome, whereas the expression of the other five types of SSPs was most abundant from the D subgenome. The near-complete CS genome will serve as a valuable resource for genomic and functional genomic research and breeding of wheat as well as its related species.

Classification number:
