Hybrid Expert Knowledge and Self-Supervised Learning for Diagnostic Modeling of Adductor Spasmodic and Primary Myotonic Dysphonia

Zhou Du, Hang Chen, Huijun Ding*, Jun Du, Zhen Chen

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

Dysphonia encompasses a broad spectrum of vocal disorders with diverse etiologies, among which adductor spasmodic dysphonia (ADSD) and primary muscle tension dysphonia (pMTD) are particularly challenging to diagnose. Diagnosis currently relies mainly on subjective auditory-perceptual assessment by highly experienced clinicians. To alleviate the scarcity of diagnostic resources, this study develops a deep learning-based approach that automatically diagnoses ADSD and pMTD from patients' speech data. Our contributions are: (1) a convolutional neural network (CNN)-based diagnostic model that leverages handcrafted features derived from expert knowledge, and (2) the use of self-supervised learning (SSL) to adaptively extract more discriminative input representations from raw waveforms. This work marks the first application of deep learning techniques to ADSD and pMTD diagnostic modeling and achieves a classification accuracy of 83.3% on our newly constructed dataset.

Original language: English
Pages (from-to): 3543-3547
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 2025
Event: 26th Interspeech Conference 2025 - Rotterdam, Netherlands
Duration: 17 Aug 2025 to 21 Aug 2025

Keywords

  • diagnostic speech processing
  • self-supervised learning
  • speech disorder classification
