Abstract
Fine-grained facial expression manipulation aims at changing the expression of an image without altering facial identity. Most current expression manipulation methods are based on a discrete expression label, which mainly manipulates holistic expression while neglecting details. To address these problems, we propose an end-to-end expression-guided generative adversarial network (EGGAN), which synthesizes an image with the expected expression given a continuous expression label and a structured latent code. In particular, an adversarial autoencoder is used to translate a source image into a structured latent space. The encoded latent code and the target expression label are then input to a conditional GAN to synthesize an image with the target expression. Moreover, a perceptual loss and a multi-scale structural similarity loss are introduced to preserve facial identity and global shape during expression manipulation. Extensive experiments demonstrate that our approach can edit fine-grained expressions and synthesize continuous intermediate expressions between the source and target expressions.
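The data flow the abstract describes (encoder to structured latent code, then a conditional generator driven by a continuous expression label, trained with a weighted sum of adversarial, perceptual, and MS-SSIM terms) can be sketched conceptually as below. All dimensions, layer choices, and loss weights here are illustrative assumptions, not the paper's actual architecture or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumed for illustration only).
D_IMG, D_LAT, D_EXPR = 64, 16, 4
W_enc = rng.standard_normal((D_LAT, D_IMG)) * 0.1
W_gen = rng.standard_normal((D_IMG, D_LAT + D_EXPR)) * 0.1

def encode(image):
    """Stand-in for the adversarial-autoencoder encoder:
    maps a source image to a structured latent code."""
    return np.tanh(W_enc @ image)

def generate(latent, expr_label):
    """Stand-in for the conditional generator:
    synthesizes an image from (latent code, continuous expression label)."""
    return np.tanh(W_gen @ np.concatenate([latent, expr_label]))

def generator_loss(adv, perc, msssim, lam_perc=10.0, lam_ssim=1.0):
    """Weighted sum of adversarial, perceptual, and MS-SSIM terms.
    The weights are illustrative, not the paper's values."""
    return adv + lam_perc * perc + lam_ssim * msssim

source = rng.standard_normal(D_IMG)
target_expr = np.array([0.8, 0.1, 0.0, 0.1])  # continuous expression label
fake = generate(encode(source), target_expr)
assert fake.shape == (D_IMG,)
```

The key structural point is the conditioning: the continuous label is concatenated with the latent code before generation, which is what allows interpolating intermediate expressions between source and target.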
| Original language | English |
|---|---|
| Pages (from-to) | 42-51 |
| Number of pages | 10 |
| Journal | IEEE Multimedia |
| Volume | 28 |
| Issue number | 3 |
| DOIs | |
| State | Published - 2021 |
| Externally published | Yes |
Keywords
- Generative models
- Fine-grained expression labels
- Structured latent space