TEAdapter: Supply Vivid Guidance for Controllable Text-to-Music Generation

Jialing Zou, Jiahao Mei, Xu Dong Nan, Jinghua Li, Daoguo Dong, Liang He

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Although current text-guided music generation technology can cope with simple creative scenarios, achieving fine-grained control over individual text-modality conditions remains challenging as user demands become more intricate. Accordingly, we introduce the TEAcher Adapter (TEAdapter), a compact plugin designed to guide the generation process with diverse control information provided by users. In addition, we explore the controllable generation of extended music by leveraging TEAdapter control groups trained on data of distinct structural functionalities. Overall, we consider controls at the global, elemental, and structural levels. Experimental results demonstrate that the proposed TEAdapter enables multiple precise controls and ensures high-quality music generation. Our module is also lightweight and transferable to any diffusion model architecture. Code and demos will be available soon at https://github.com/Ashley1101/TEAdapter.

Original language: English
Title of host publication: 2024 IEEE International Conference on Multimedia and Expo, ICME 2024
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350390155
State: Published - 2024
Event: 2024 IEEE International Conference on Multimedia and Expo, ICME 2024 - Niagara Falls, Canada
Duration: 15 Jul 2024 to 19 Jul 2024

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2024 IEEE International Conference on Multimedia and Expo, ICME 2024
Country/Territory: Canada
City: Niagara Falls
Period: 15/07/24 to 19/07/24

Keywords

  • Additional plugins
  • Controllability enhancement
  • Music generation
