TY - JOUR
T1 - Automatic identification of curve shapes with applications to ultrasonic vocalization
AU - Gao, Zhikun
AU - Tang, Yanlin
AU - Wang, Huixia Judy
AU - Wu, Guangying K.
AU - Lin, Jeff
N1 - Publisher Copyright: © 2020 Elsevier B.V.
PY - 2020/8
Y1 - 2020/8
N2 - Like human beings, many animals produce sounds for communication and social interactions. The vocalizations of mice have the characteristics of songs, consisting of syllables of different types determined by frequency modulations and structural variations. To characterize the impact of social environments and genotypes on vocalizations, it is important to identify the patterns of syllables based on the shapes of frequency contours. Using existing hypothesis testing methods to determine the shape classes would require testing various null and alternative hypotheses for each curve, which is impractical for vocalization studies where interest lies in a large number of frequency contours. A new penalization-based method is proposed, which provides function estimation and automatic shape identification simultaneously. The method estimates the functional curve through quadratic B-spline approximation, and captures the shape feature by penalizing the positive and negative parts of the first two derivatives of the spline function in a group manner. It is shown that under some regularity conditions, the proposed method can identify the correct shape with probability approaching one, and the resulting nonparametric estimator can achieve the optimal convergence rate. Simulation shows that the proposed method gives more stable curve estimation and more accurate curve classification than the unconstrained B-spline estimator, and that it is competitive with the shape-constrained estimator that assumes prior knowledge of the curve shape. The proposed method is applied to the motivating vocalization study to examine the effect of the Methyl-CpG binding protein 2 gene on the vocalizations of mice during courtship.
KW - Curve classification
KW - Nonparametric regression
KW - Penalization
KW - Shape identification
KW - Ultrasonic vocalization
UR - https://www.scopus.com/pages/publications/85082523229
U2 - 10.1016/j.csda.2020.106956
DO - 10.1016/j.csda.2020.106956
M3 - Article
AN - SCOPUS:85082523229
SN - 0167-9473
VL - 148
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
M1 - 106956
ER -