TY - GEN
T1 - A 28-nW Noise-Robust Voice Activity Detector with Background Aware Feature Extraction
AU - Yang, Jingsen
AU - Lyu, Liangjian
AU - Dong, Zirui
AU - Ren, Heyu
AU - Shi, C. J. Richard
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - With the growing number of Internet of Things (IoT) devices, such as intelligent vehicles and smart assistants, it has become imperative to develop low-power Voice Activity Detection (VAD) devices. An always-on VAD detects voice to wake up the target system and therefore dominates the standby power consumption of the IoT device. A typical VAD consists of a feature extractor and a neural-network-based classifier. Algorithms that use frequency-domain features, which can be obtained by modulation frequency [1], fast Fourier transform (FFT) [2], or analog filter banks, can achieve high detection accuracy. However, the feature extractor consumes considerable power because of its complex operations. Alternatively, time-domain analog VADs [5]-[6] achieve low power consumption because they lack a frequency extractor, but they suffer from reduced accuracy because the audio amplitude is easily corrupted by noise, especially in noisy environments where the signal-to-noise ratio (SNR) is lower than 0 dB. In summary, achieving high accuracy and low power consumption simultaneously in VAD devices is a critical challenge.
AB - With the growing number of Internet of Things (IoT) devices, such as intelligent vehicles and smart assistants, it has become imperative to develop low-power Voice Activity Detection (VAD) devices. An always-on VAD detects voice to wake up the target system and therefore dominates the standby power consumption of the IoT device. A typical VAD consists of a feature extractor and a neural-network-based classifier. Algorithms that use frequency-domain features, which can be obtained by modulation frequency [1], fast Fourier transform (FFT) [2], or analog filter banks, can achieve high detection accuracy. However, the feature extractor consumes considerable power because of its complex operations. Alternatively, time-domain analog VADs [5]-[6] achieve low power consumption because they lack a frequency extractor, but they suffer from reduced accuracy because the audio amplitude is easily corrupted by noise, especially in noisy environments where the signal-to-noise ratio (SNR) is lower than 0 dB. In summary, achieving high accuracy and low power consumption simultaneously in VAD devices is a critical challenge.
UR - https://www.scopus.com/pages/publications/85182279321
U2 - 10.1109/A-SSCC58667.2023.10347926
DO - 10.1109/A-SSCC58667.2023.10347926
M3 - Conference contribution
AN - SCOPUS:85182279321
T3 - 2023 IEEE Asian Solid-State Circuits Conference, A-SSCC 2023
BT - 2023 IEEE Asian Solid-State Circuits Conference, A-SSCC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th IEEE Asian Solid-State Circuits Conference, A-SSCC 2023
Y2 - 5 November 2023 through 8 November 2023
ER -