TY - JOUR
T1 - An improved stochastic EM algorithm for large-scale full-information item factor analysis
AU - Zhang, Siliang
AU - Chen, Yunxiao
AU - Liu, Yang
N1 - Publisher Copyright: © 2018 The British Psychological Society
PY - 2020/2/1
Y1 - 2020/2/1
N2 - In this paper, we explore the use of the stochastic EM algorithm (Celeux & Diebolt, 1985, Computational Statistics Quarterly, 2, 73) for large-scale full-information item factor analysis. Innovations have been made on its implementation, including an adaptive-rejection-based Gibbs sampler for the stochastic E step, a proximal gradient descent algorithm for the optimization in the M step, and diagnostic procedures for determining the burn-in size and the stopping of the algorithm. These developments are based on the theoretical results of Nielsen (2000, Bernoulli, 6, 457), as well as advanced sampling and optimization techniques. The proposed algorithm is computationally efficient and virtually tuning-free, making it scalable to large-scale data with many latent traits (e.g. more than five latent traits) and easy to use for practitioners. Standard errors of parameter estimation are also obtained based on the missing-information identity (Louis, 1982, Journal of the Royal Statistical Society, Series B, 44, 226). The performance of the algorithm is evaluated through simulation studies and an application to the analysis of the IPIP-NEO personality inventory. Extensions of the proposed algorithm to other latent variable models are discussed.
KW - Gibbs sampler
KW - full-information item factor analysis
KW - multidimensional item response theory
KW - proximal gradient descent
KW - rejection sampling
KW - stochastic EM algorithm
UR - https://www.scopus.com/pages/publications/85057747452
U2 - 10.1111/bmsp.12153
DO - 10.1111/bmsp.12153
M3 - Article
C2 - 30511445
AN - SCOPUS:85057747452
SN - 0007-1102
VL - 73
SP - 44
EP - 71
JO - British Journal of Mathematical and Statistical Psychology
JF - British Journal of Mathematical and Statistical Psychology
IS - 1
ER -