
Spatial-Spectral 1DSwin Transformer With Groupwise Feature Tokenization for Hyperspectral Image Classification

Document type: Foreign-language journal article

Authors: Xu, Yifei 1; Xie, Yixuan 1; Li, Bicheng 1; Xie, Chuanqi 2; Zhang, Yongchuan 3; Wang, Aichen 4; Zhu, Li 1

Author affiliations: 1. Xi An Jiao Tong Univ, Sch Software, Xian 710054, Shaanxi, Peoples R China

2. Zhejiang Acad Agr Sci, State Key Lab Managing Biot & Chem Threats Qual &, Hangzhou 310021, Peoples R China

3. Chongqing Jiaotong Univ, Chongqing Smart City Inst, Chongqing 400074, Peoples R China

4. Jiangsu Univ, Key Lab Modern Agr Equipment & Technol, Zhenjiang 212013, Peoples R China

Keywords: 1-D shifted window-based multihead self-attention (1DSW-MSA); 1-D window-based MSA (1DW-MSA); cross-block normalized connection (CNC); hyperspectral image (HSI) classification; Swin Transformer

Journal: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING (2022 impact factor: 8.2; five-year impact factor: 8.8)

ISSN: 0196-2892

Year/volume/issue: 2023, vol. 61

Pages:

Indexed in: SCI

Abstract: Hyperspectral image (HSI) classification aims to assign each pixel to a land-cover category, and it is receiving increasing attention from both industry and academia. The main challenge lies in capturing reliable and informative spatial and spectral dependencies concealed in the HSI for each class. To address this challenge, we propose a spatial-spectral 1DSwin (SS1DSwin) Transformer with groupwise feature tokenization for HSI classification. Specifically, we reveal local and hierarchical spatial-spectral relationships from two different perspectives. SS1DSwin mainly consists of a groupwise feature tokenization module (GFTM) and a 1DSwin Transformer with cross-block normalized connection module (TCNCM). For GFTM, we reorganize an image patch into overlapping cubes and further generate groupwise token embeddings with multihead self-attention (MSA) to learn the local spatial-spectral relationship along the spatial dimension. For TCNCM, we adopt the shifted windowing strategy to acquire the hierarchical spatial-spectral relationship along the spectral dimension with 1-D window-based MSA (1DW-MSA) and 1-D shifted window-based MSA (1DSW-MSA), and we leverage cross-block normalized connection (CNC) to adaptively fuse the feature maps from different blocks. In SS1DSwin, we apply these two modules in order and predict the class label for each pixel. To test the effectiveness of the proposed method, extensive experiments are conducted on four HSI datasets, and the results indicate that SS1DSwin outperforms several current state-of-the-art methods. The source code of the proposed method is available at https://github.com/Minato252/SS1DSwin.
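
The 1-D windowed and shifted-window attention mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a token sequence of shape (length, channels), single-head attention with no learned projections, and a cyclic shift; it also omits the attention mask that the original Swin Transformer applies to wrapped-around tokens. The function names `window_partition_1d`, `self_attention`, and `windowed_attention_1d` are hypothetical.

```python
import numpy as np


def window_partition_1d(tokens, window_size):
    """Split (L, C) tokens into (L // window_size, window_size, C) windows."""
    length, channels = tokens.shape
    assert length % window_size == 0, "length must be divisible by window_size"
    return tokens.reshape(length // window_size, window_size, channels)


def self_attention(windows):
    """Single-head scaled dot-product attention inside each window.

    Illustrative assumption: queries, keys, and values are the tokens
    themselves (no learned projection matrices).
    """
    scale = windows.shape[-1] ** -0.5
    scores = windows @ windows.transpose(0, 2, 1) * scale  # (nW, w, w)
    scores -= scores.max(axis=-1, keepdims=True)           # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax per row
    return weights @ windows                               # (nW, w, C)


def windowed_attention_1d(tokens, window_size, shift=0):
    """Windowed attention along one axis, optionally with a cyclic shift.

    shift=0 mimics a plain windowed block; shift>0 mimics a shifted-window
    block, so tokens near window borders get mixed with their neighbours.
    """
    shifted = np.roll(tokens, -shift, axis=0)
    out = self_attention(window_partition_1d(shifted, window_size))
    out = out.reshape(tokens.shape)
    return np.roll(out, shift, axis=0)  # undo the shift


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 4))                         # 8 tokens, 4 channels
    y = windowed_attention_1d(x, window_size=4)           # windows {0..3}, {4..7}
    z = windowed_attention_1d(y, window_size=4, shift=2)  # windows {2..5}, {6,7,0,1}
    print(y.shape, z.shape)
```

Alternating an unshifted block with a shifted one, as in the usage lines, lets information cross window boundaries while each attention step stays local to a window.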
