
Parallel randomized block coordinate descent for neural probabilistic language model with high-dimensional output targets

  • Xin Liu
  • , Junchi Yan*
  • , Xiangfeng Wang
  • , Hongyuan Zha
  • * Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Training a large probabilistic neural network language model with a typically high-dimensional output is excessively time-consuming, which is one of the main reasons that simpler models such as n-grams are often more popular despite their inferior performance. In this paper a Chinese neural probabilistic language model is trained on the Fudan Chinese Language Corpus. As hundreds of thousands of distinct words have been tokenized from the raw corpus, the model contains tens of millions of parameters. To address this challenge, the popular cluster-based parallel computing platform MPI (Message Passing Interface) is employed to implement the parallel neural network language model. Specifically, we propose a new method, termed Parallel Randomized Block Coordinate Descent (PRBCD), to train this model cost-effectively. Unlike the traditional coordinate descent method, our method can be employed in networks with multiple layers, scaling up the gradients with respect to hidden units proportionally based on the sampled parameters. We empirically show that PRBCD is stable and well suited for language models, which contain only a few layers yet often have a large number of parameters and extremely high-dimensional output targets.
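To make the core idea concrete, below is a minimal sketch (not the paper's implementation) of a randomized block coordinate descent step on a softmax output layer: only a sampled block of output coordinates is updated, and the gradient flowing back to the hidden units is rescaled by the inverse of the sampled fraction, as the abstract describes. All dimensions, the learning rate, and the rule of always including the target word in the block are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions; the paper's vocabulary has hundreds of
# thousands of words and tens of millions of parameters).
hidden_dim, vocab_size, block_size, lr = 32, 1000, 100, 0.1

W = rng.normal(scale=0.01, size=(vocab_size, hidden_dim))  # output-layer weights
h = rng.normal(size=hidden_dim)                            # hidden activation
target = 7                                                 # index of the true next word

def prbcd_step(W, h, target):
    # Sample a random block of output coordinates; always include the target
    # so its gradient is never missed (an assumption, not the paper's exact rule).
    block = rng.choice(vocab_size, size=block_size, replace=False)
    block = np.union1d(block, [target])

    # Softmax restricted to the sampled block.
    logits = W[block] @ h
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Cross-entropy gradient for the sampled rows of W only.
    diff = probs - (block == target)
    grad_W = np.outer(diff, h)

    # Scale the gradient w.r.t. the hidden units proportionally to the
    # fraction of output parameters sampled, so it approximates the
    # full-vocabulary gradient in expectation.
    scale = vocab_size / block.size
    grad_h = scale * (W[block].T @ diff)

    W[block] -= lr * grad_W          # update only the sampled block
    return grad_h

grad_h = prbcd_step(W, h, target)
print(grad_h.shape)  # (32,)
```

Because each step touches only `block_size + 1` of the `vocab_size` output rows, the per-step cost is independent of the full vocabulary size, which is what makes the approach attractive for high-dimensional output targets.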

Original language: English
Title of host publication: Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings
Editors: Tieniu Tan, Xilin Chen, Xuelong Li, Jian Yang, Hong Cheng, Jie Zhou
Publisher: Springer Verlag
Pages: 334-348
Number of pages: 15
ISBN (Print): 9789811030048
DOI
Publication status: Published - 2016

Publication series

Name: Communications in Computer and Information Science
Volume: 663
ISSN (Print): 1865-0929
