A protein classification method based on latent semantic analysis

Yuan Yongsheng, Lin Lei, Dong Qiwen, Wang Xiaolong, Li Minghui

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

5 Scopus citations

Abstract

This paper proposes a new method for protein classification that uses Latent Semantic Analysis (LSA) to represent protein sequences. A protein is vectorized according to its content of biological words, namely patterns and motifs, which are generated by the TEIRESIAS algorithm and the MEME/MAST system, respectively. LSA is then applied to obtain more precise description vectors of the proteins. These vectors are used with a Support Vector Machine (SVM) to classify proteins. Experiments on family-level protein classification using the Structural Classification of Proteins (SCOP) database show that this method performs better than other state-of-the-art methods.
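The vectorization and LSA steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sequences, the word list, and the function names (`count_occurrences`, `word_document_matrix`, `lsa_project`, `cosine`) are hypothetical, the "biological words" are hand-picked substrings rather than TEIRESIAS patterns or MEME/MAST motifs, and raw counts are used without any weighting scheme. LSA is approximated here by a truncated singular value decomposition of the word-by-protein count matrix, and the SVM stage is not shown.

```python
import numpy as np

# Hypothetical toy data: these sequences and "words" are invented for illustration.
proteins = {
    "p1": "MKVLAAGIVKVLAA",
    "p2": "GIVMKVLAAGIV",
    "p3": "HHEELLKHHEEL",
    "p4": "LLKHHEELLLK",
}
words = ["KVLAA", "GIV", "HHEEL", "LLK"]


def count_occurrences(seq, word):
    """Count (possibly overlapping) occurrences of word in seq."""
    last_start = len(seq) - len(word)
    return sum(1 for i in range(last_start + 1) if seq.startswith(word, i))


def word_document_matrix(proteins, words):
    """Build a words-by-proteins count matrix (rows: words, columns: proteins)."""
    names = list(proteins)
    counts = [[count_occurrences(proteins[n], w) for n in names] for w in words]
    return names, np.array(counts, dtype=float)


def lsa_project(matrix, k):
    """Map each protein (column) to a k-dimensional latent vector via truncated SVD."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Each row of the result holds one protein's coordinates, scaled by the singular values.
    return (s[:k, None] * vt[:k]).T


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


names, W = word_document_matrix(proteins, words)
vectors = lsa_project(W, k=2)

for i, name in enumerate(names):
    sims = [round(cosine(vectors[i], v), 3) for v in vectors]
    print(name, sims)
```

In this toy setup, proteins that share words end up close together in the latent space and proteins with no words in common end up far apart. In a pipeline like the one the abstract describes, vectors of this kind would then be passed to an SVM classifier.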

Original language: English
Title of host publication: Proceedings of the 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005
Pages: 7738-7741
Number of pages: 4
State: Published - 2005
Externally published: Yes
Event: 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005 - Shanghai, China
Duration: 1 Sep 2005 → 4 Sep 2005

Publication series

Name: Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings
Volume: 7 VOLS
ISSN (Print): 0589-1019

Conference

Conference: 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005
Country/Territory: China
City: Shanghai
Period: 1/09/05 → 4/09/05
