结合倒置特征金字塔和U-Net的高光谱图像分类

Translated title of the contribution: Hyperspectral image classification using an inverted feature pyramid network with U-Net

Songyang Cheng, Yujie Xiong, Yao Yao, Qingli Li

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Objective: Terrain classification is an important task in earth observation based on remote sensing. Hyperspectral images carry rich spectral information and are therefore well suited to remote sensing image classification. With the rapid development of hyperspectral technology, hyperspectral image processing and analysis have attracted wide attention in academia. Compared with traditional panchromatic and multispectral remote sensing images, hyperspectral images contain dozens or even hundreds of contiguous narrow spectral bands, which provide detailed spectral and spatial information. Accordingly, they have been widely used in areas such as precision agriculture, city planning, and military defense. However, hyperspectral data are high dimensional and contain redundancy and noise, so the data must be transformed before further processing. In hyperspectral image classification, effectively representing the features of the image is the most critical step. In this work, we propose a hyperspectral image classification approach that combines an inverted feature pyramid network with U-Net.

Method: Because hyperspectral data are high dimensional, principal component analysis (PCA) is first used to project the useful information in the image onto the k most important components, which reduces the amount of data and enhances the data features. After PCA, the data are divided into patches with a sliding window: the neighborhood surrounding each pixel is defined as a patch and serves as the input to the proposed network, and the category of that pixel is the ground-truth label. In the first stage, U-Net extracts spatial features of the hyperspectral image at the pixel level. The left side of the network is the contracting path, which corresponds to the encoder of the classic encoder-decoder structure, and the right side is the expanding path, which acts as the decoder. Each feature map in the expanding path combines two parts along two dimensions, which makes the extracted features more salient. The first part is produced by an attention mechanism that simultaneously receives the feature maps from the same level of the contracting path and the feature maps from the previous layer of the expanding path; the feature regions in this part receive higher weights. The second part is obtained by deconvolving the feature map from the previous layer of the expanding path. Layer by layer, these feature maps, which are rich in spatial information, are fused with feature maps rich in semantic information obtained from the layers of the inverted feature pyramid network. The resulting feature maps therefore carry both reliable spatial information and strong semantic information. Finally, owing to the attention mechanism, the weights of effective features in the image are increased and irrelevant background regions are suppressed, and the classification result of the hyperspectral image is obtained.

Result: We conduct experiments to evaluate the effectiveness of the proposed method and to investigate how the number of principal components retained by PCA and the size of the input data affect classification performance. Comparative experiments are carried out on four publicly available hyperspectral image datasets: Indian Pines, Pavia University, Salinas, and Urban. The results show that the proposed method is effective. The best numbers of retained principal components for the four datasets are 3, 20, 10, and 3, respectively, and the best input sizes are 64, 32, 32, and 64. The method achieves overall classification accuracies of 98.91%, 99.85%, 99.99%, and 87.43%, average classification accuracies of 98.07%, 99.39%, 99.09%, and 78.30%, and Kappa values of 0.987, 0.998, 0.999, and 0.831 on the four datasets, respectively, all higher than those of the other classification algorithms compared.

Conclusion: With hundreds of contiguous, finely divided spectral bands, hyperspectral images can accurately present the rich terrain information of a given region; however, every spectral band also contains useless information. Effectively extracting the key terrain information from hyperspectral data and using it for classification is the most important and most difficult problem. We propose to combine U-Net and the inverted feature pyramid network for hyperspectral image classification. First, we reduce the dimension of the hyperspectral data with PCA. We then build patches with a sliding window and feed them into the model. U-Net serves as the backbone of the proposed network and extracts the features of the hyperspectral image. The features rich in spatial information are then fused with the features from the inverted feature pyramid network, which yields abundant spectral and spatial information. The attention mechanism allows the model to focus on the relevant spectral and spatial information and reduces the influence of noise on classification performance. Experimental results show that the proposed method can be applied to hyperspectral image classification with limited training samples and achieves good classification results. Classification accuracy can also be improved by properly preprocessing the input data. In future work, we will investigate how to simplify the model structure while maintaining high classification performance with fewer training samples.
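The preprocessing described in the Method paragraph (PCA reduction followed by sliding-window patch extraction) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function names `pca_reduce` and `extract_patches`, the use of label 0 for unlabelled pixels, the reflect padding at image borders, and the placement of the labelled pixel at offset `size // 2` are all assumptions that the abstract does not specify.

```python
import numpy as np


def pca_reduce(cube, k):
    """Project an (H, W, B) cube onto its first k principal components."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)  # astype copies, input is untouched
    flat -= flat.mean(axis=0)                      # centre each spectral band
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ vt[:k].T).reshape(h, w, k)


def extract_patches(cube, labels, size):
    """Return one size x size patch per labelled pixel, plus that pixel's label.

    Assumption: label 0 marks unlabelled pixels, which are skipped.
    """
    half = size // 2
    # Pad so that border pixels also get a full neighbourhood (assumed: reflect).
    pad = (half, size - half - 1)
    padded = np.pad(cube, (pad, pad, (0, 0)), mode="reflect")
    rows, cols = np.nonzero(labels)
    # In padded coordinates pixel (r, c) sits at offset (half, half) of its patch.
    patches = np.stack([padded[r:r + size, c:c + size] for r, c in zip(rows, cols)])
    return patches, labels[rows, cols]


# Synthetic stand-in for a hyperspectral scene: 12 x 10 pixels, 30 bands.
rng = np.random.default_rng(0)
cube = rng.random((12, 10, 30))
labels = rng.integers(0, 4, size=(12, 10))

reduced = pca_reduce(cube, k=3)
patches, y = extract_patches(reduced, labels, size=5)
print(reduced.shape, patches.shape, y.shape)
```

With the settings reported in the Result paragraph, `k` and `size` would be set per dataset (for example 3 and 64 for Indian Pines); the small values above only keep the demonstration fast.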
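The three scores reported in the Result paragraph (overall accuracy, average accuracy, and Kappa) are not defined in the abstract. Assuming the standard definitions (overall accuracy as the fraction of correctly classified samples, average accuracy as the mean of per-class recalls, and Cohen's kappa for agreement beyond chance), they could be computed from predicted and true labels as in the sketch below. The function name `classification_scores` and the toy labels are hypothetical.

```python
def classification_scores(y_true, y_pred):
    """Return (overall accuracy, average accuracy, Cohen's kappa)."""
    classes = sorted(set(y_true) | set(y_pred))
    index = {c: i for i, c in enumerate(classes)}
    n = len(classes)

    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        cm[index[t]][index[p]] += 1

    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(row[j] for row in cm) for j in range(n)]

    # Overall accuracy: correctly classified samples over all samples.
    oa = sum(cm[i][i] for i in range(n)) / total

    # Average accuracy: mean recall over classes present in the ground truth.
    recalls = [cm[i][i] / row_sums[i] for i in range(n) if row_sums[i] > 0]
    aa = sum(recalls) / len(recalls)

    # Kappa: observed agreement corrected for chance agreement pe.
    # Undefined (division by zero) when pe == 1, e.g. a single class only.
    pe = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa


# Toy example with three classes and ten samples.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 1, 2, 2, 0]
oa, aa, kappa = classification_scores(y_true, y_pred)
print(round(oa, 4), round(aa, 4), round(kappa, 4))
```

Average accuracy weights every class equally, so it drops faster than overall accuracy when small classes are misclassified, which is consistent with the larger gap between the two scores on the Urban dataset.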

Original language: Chinese (Traditional)
Pages (from-to): 1994-2008
Number of pages: 15
Journal: Journal of Image and Graphics
Volume: 26
Issue number: 8
State: Published - 16 Aug 2021