TY - JOUR
T1 - PPGSpeech
T2 - A Wearable Silent Speech Interface Leveraging Neck-worn Photoplethysmography
AU - Hu, Lingde
AU - Zhang, Wenbo
AU - Zhang, Wenkang
AU - He, Yu
AU - Choi, Seokmin
AU - Gao, Yang
AU - Chauhan, Jagmohan
AU - Jin, Zhanpeng
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2025
Y1 - 2025
N2 - Silent speech interfaces (SSIs) promise private and noise-immune communication, but current solutions often sacrifice user comfort, mobility, or privacy. This paper introduces PPGSpeech, a novel SSI that overcomes these limitations by pioneering the use of photoplethysmography (PPG) acquired from a comfortable, necklace-style wearable device. Our core discovery is that subtle neck muscle movements during silent articulation induce distinct, measurable modulations in the underlying PPG signal. To harness this phenomenon, we developed a complete end-to-end system featuring (1) a custom neck-worn sensor for multi-wavelength PPG acquisition, (2) a deep learning pipeline that converts 1D PPG signals into 2D time-frequency images via Continuous Wavelet Transform (CWT) and classifies them using a lightweight CNN, and (3) a Pix2Pix GAN model to reconstruct audible speech from the captured signals. In a 16-participant study covering a vocabulary of 15 commands and four confounding actions, our user-dependent model achieved a recognition accuracy of 81.41% ± 9.74%. Furthermore, our speech reconstruction achieved a Mean Opinion Score (MOS) of 3.48 and a Word Correct Rate (WCR) of 60.67%, demonstrating that the PPG signal is sufficiently rich to recover intelligible speech. By establishing the viability of neck-based PPG for silent speech, PPGSpeech offers a discreet, privacy-preserving, and continuously wearable paradigm for next-generation human-computer interaction.
AB - Silent speech interfaces (SSIs) promise private and noise-immune communication, but current solutions often sacrifice user comfort, mobility, or privacy. This paper introduces PPGSpeech, a novel SSI that overcomes these limitations by pioneering the use of photoplethysmography (PPG) acquired from a comfortable, necklace-style wearable device. Our core discovery is that subtle neck muscle movements during silent articulation induce distinct, measurable modulations in the underlying PPG signal. To harness this phenomenon, we developed a complete end-to-end system featuring (1) a custom neck-worn sensor for multi-wavelength PPG acquisition, (2) a deep learning pipeline that converts 1D PPG signals into 2D time-frequency images via Continuous Wavelet Transform (CWT) and classifies them using a lightweight CNN, and (3) a Pix2Pix GAN model to reconstruct audible speech from the captured signals. In a 16-participant study covering a vocabulary of 15 commands and four confounding actions, our user-dependent model achieved a recognition accuracy of 81.41% ± 9.74%. Furthermore, our speech reconstruction achieved a Mean Opinion Score (MOS) of 3.48 and a Word Correct Rate (WCR) of 60.67%, demonstrating that the PPG signal is sufficiently rich to recover intelligible speech. By establishing the viability of neck-based PPG for silent speech, PPGSpeech offers a discreet, privacy-preserving, and continuously wearable paradigm for next-generation human-computer interaction.
KW - Neck-worn Sensor
KW - PPG
KW - Silent Speech Recognition
KW - Wearable
UR - https://www.scopus.com/pages/publications/105024423157
U2 - 10.1109/JIOT.2025.3639152
DO - 10.1109/JIOT.2025.3639152
M3 - Article
AN - SCOPUS:105024423157
SN - 2327-4662
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
ER -