TY - JOUR
T1 - GEN
T2 - Generative Equivariant Networks for Diverse Image-to-Image Translation
AU - Shamsolmoali, Pourya
AU - Zareapoor, Masoumeh
AU - Das, Swagatam
AU - Garcia, Salvador
AU - Granger, Eric
AU - Yang, Jie
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2023/2/1
Y1 - 2023/2/1
AB - Image-to-image (I2I) translation has become a key asset for generative adversarial networks (GANs). Convolutional neural networks (CNNs), despite their strong performance, are not able to capture the spatial relationships among different parts of an object and, thus, do not qualify as an ideal representative model for image translation tasks. As a remedy to this problem, capsule networks have been proposed to represent patterns for a visual object in a way that preserves hierarchical spatial relationships. However, training capsules requires learning all pairwise relationships between capsules of consecutive layers, a design that is prohibitively expensive in both time and memory. In this article, we present a new framework for capsule networks that provides a full description of the input components at various levels of semantics and can be applied to generator-discriminator architectures without incurring computational overhead compared to CNNs. To apply the proposed capsules in GANs, we put forth a novel Gromov-Wasserstein (GW) distance as a differentiable loss function that measures the dissimilarity between two distributions and then guides the learned distribution toward target properties, using optimal transport (OT) discrepancy. The proposed method, called the generative equivariant network (GEN), is an alternative architecture for GANs with equivariance capsule layers. The proposed model is evaluated through a comprehensive set of experiments on I2I translation and image generation tasks and compared with several state-of-the-art models. Results indicate that there is a principled connection between generative and capsule models that allows the extraction of discriminant and invariant information from image data.
KW - Capsule networks
KW - disentangle representation
KW - generative model
KW - image-to-image (I2I) translation
UR - https://www.scopus.com/pages/publications/85132521411
U2 - 10.1109/TCYB.2022.3166761
DO - 10.1109/TCYB.2022.3166761
M3 - Article
C2 - 35522633
AN - SCOPUS:85132521411
SN - 2168-2267
VL - 53
SP - 874
EP - 886
JO - IEEE Transactions on Cybernetics
JF - IEEE Transactions on Cybernetics
IS - 2
ER -