融合字典学习与视觉转换器的高分遥感影像场景分类方法

  • Xiaojun He
  • Xuan Liu*
  • Xian Wei
  • *Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Most classification methods for remote sensing scene images are based on traditional machine learning or convolutional neural networks. The feature extraction capability of such methods is extremely limited, particularly for optical remote sensing images with large interclass similarity, complex spatial information, and varied geometric structures, where they suffer from loss of feature information and low classification accuracy. To overcome these problems, we propose a high-resolution remote sensing scene image classification method that combines dictionary learning with the Vision Transformer (ViT). The method not only mines the long-distance dependencies within images but also uses dictionary learning to capture their deep nonlinear structural information, which improves classification accuracy. Extensive experiments on the public RSSCN7, NWPU-RESISC45, and Aerial Image Data Set (AID) remote sensing image datasets, with models trained from scratch in the PyTorch deep learning framework, verify the effectiveness of the proposed method: its classification accuracy on these datasets is 1.763, 1.321, and 3.704 percentage points higher, respectively, than that of the original ViT model. Moreover, the proposed method outperforms other advanced scene classification methods.

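Translated title of the contribution: Classification Method of High-Resolution Remote Sensing Scene Image Based on Dictionary Learning and Vision Transformer
Original language: Traditional Chinese
Article number: 1410019
Journal: Laser and Optoelectronics Progress
Volume: 60
Issue number: 14
DOI
Publication status: Published - 2023
Externally published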

Keywords

  • Vision Transformer
  • dictionary learning
  • high-resolution remote sensing image
  • remote sensing image scene classification
