TY - JOUR
T1 - Learning to Align via Wasserstein for Person Re-Identification
AU - Zhang, Zhizhong
AU - Xie, Yuan
AU - Li, Ding
AU - Zhang, Wensheng
AU - Tian, Qi
N1 - Publisher Copyright: © 1992-2012 IEEE.
PY - 2020
Y1 - 2020
AB - Existing successful person re-identification (Re-ID) models often employ part-level representations to extract fine-grained information, but commonly use a loss designed for global features, ignoring the relationships between semantic parts. In this paper, we present a novel triplet loss that emphasizes the salient parts and also takes alignment into account. This loss is based on the cross-bin matching metric also known as the Wasserstein distance. It measures how much effort it would take to move the embeddings of local features to align two distributions, and in doing so finds an optimal transport matrix that re-weights the distances between different local parts. The distribution over the local parts is produced by a new attention mechanism, computed as the inner product between the high-level global feature and the local features, which represents the importance of each semantic part for identification. We show that the resulting optimal transport matrix can not only distinguish relevant parts from misleading ones, and hence assign different weights to them, but also rectify the original distance according to the learned distributions, yielding an elegant solution to the misalignment issue. In addition, the proposed method is easy to integrate into most Re-ID learning systems in an end-to-end training style, and can noticeably improve their performance. Extensive experiments and comparisons with recent Re-ID methods demonstrate the competitive performance of our method.
KW - Person re-identification
KW - Wasserstein distance
KW - convolutional neural network
KW - deep metric learning
UR - https://www.scopus.com/pages/publications/85088111048
U2 - 10.1109/TIP.2020.2998931
DO - 10.1109/TIP.2020.2998931
M3 - Article
AN - SCOPUS:85088111048
SN - 1057-7149
VL - 29
SP - 7104
EP - 7116
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9110779
ER -