Abstract
Head-related transfer functions (HRTFs) are important for spatial audio reproduction in immersive systems. Most existing data-driven methods focus on estimating the monaural spectral components of personalized HRTFs, neglecting binaural cues that are essential for binaural reproduction and perception. Moreover, substantial differences among HRTF datasets, for example in measurement setup, limit the potential of data-driven methods. This paper proposes a binaural cue generation method (BiCG) that uses an implicit neural network (INN) to estimate interaural level differences (ILDs) and interaural time differences (ITDs). Experimental results show that our method outperforms existing neural-field methods in binaural cue generation quality across datasets. We also evaluate several data preprocessing methods and find that extreme smoothing improves binaural cue generation performance across datasets. This work provides new insights into enhancing HRTF modeling.
| Original language | English |
|---|---|
| Journal | Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
| DOIs | |
| State | Published - 2025 |
| Event | ICASSP 2025, Hyderabad, India, 6–11 Apr 2025 |
Keywords
- binaural cue
- data preprocessing
- head-related transfer function
- implicit neural network
- spatial audio
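The abstract centers on the two binaural cues the model generates, ILD and ITD. As background, the sketch below shows one standard way these cues are computed from a pair of head-related impulse responses (HRIRs): the ILD as a broadband energy ratio in dB, and the ITD as the lag of the interaural cross-correlation peak. This is an illustrative convention, not the paper's BiCG method, and the toy HRIR pair is synthetic.

```python
# Standard binaural-cue estimates from an HRIR pair (illustration only;
# this is NOT the BiCG method from the paper).
import numpy as np

def binaural_cues(hrir_left, hrir_right, fs):
    """Return (ILD in dB, ITD in seconds) for one HRIR pair.

    ITD convention here: t_left - t_right, so it is positive
    when the sound reaches the right ear first.
    """
    eps = 1e-12  # guard against log(0) for silent channels
    # ILD: broadband level difference between the two ears.
    ild_db = 10.0 * np.log10(
        (np.sum(hrir_left ** 2) + eps) / (np.sum(hrir_right ** 2) + eps)
    )
    # ITD: lag that maximizes the interaural cross-correlation.
    xcorr = np.correlate(hrir_left, hrir_right, mode="full")
    lag = int(np.argmax(np.abs(xcorr))) - (len(hrir_right) - 1)
    itd_s = lag / fs
    return ild_db, itd_s

# Toy example: the right-ear HRIR is a delayed, attenuated copy of the left.
fs = 48000
pulse = np.zeros(256)
pulse[10] = 1.0
left = pulse
right = 0.5 * np.roll(pulse, 20)  # 20-sample delay, half amplitude
ild, itd = binaural_cues(left, right, fs)
```

For the toy pair above, the ILD is 10·log10(1/0.25) ≈ 6 dB and the ITD is -20/48000 s (negative under this convention because the left ear leads). Real pipelines typically low-pass or onset-threshold the HRIRs before the ITD estimate, but the cross-correlation form shown is a common baseline.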