Biclustering via structured regularized matrix decomposition

Yan Zhong*, Jianhua Z. Huang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Biclustering is a machine learning problem that deals with the simultaneous clustering of the rows and columns of a data matrix. Complex structures in the data matrix, such as overlapping biclusters, have challenged existing methods. In this paper, we first provide a unified formulation of biclustering that uses structured regularized matrix decomposition and synthesizes various existing methods, and then develop a new biclustering method called BCEL based on this formulation. The biclustering problem is formulated as a penalized least-squares problem that approximates the data matrix X by a multiplicative matrix decomposition UV^T with sparse columns in both U and V. The squared ℓ_{1,2}-norm penalty, also called the exclusive Lasso penalty, is applied to both U and V to help identify the rows and columns included in the biclusters. The penalized least-squares problem is solved by a novel computational algorithm that combines alternating minimization and the proximal gradient method. A subsampling-based procedure called stability selection is developed to select the tuning parameters and determine the bicluster membership. BCEL is shown to be competitive with existing methods in simulation studies and in an application to a real-world single-cell RNA sequencing dataset.
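
To make the penalized least-squares formulation in the abstract more concrete, the following is a minimal Python sketch of how such an objective could be evaluated. It is an illustration under stated assumptions, not the authors' implementation: the function names `exclusive_lasso` and `penalized_objective`, the tuning parameters `lam_u` and `lam_v`, the 1/2 scaling of the fit term, and the choice of treating each column as a penalty group are all hypothetical and may differ from the paper.

```python
import numpy as np


def exclusive_lasso(M, axis=0):
    """Squared l_{1,2} (exclusive Lasso) penalty of a matrix.

    Entries are grouped along `axis` (axis=0 treats each column as one
    group; this grouping is an assumption of the sketch). Absolute values
    are summed within each group, then the group sums are squared and added.
    """
    group_l1 = np.sum(np.abs(M), axis=axis)
    return float(np.sum(group_l1 ** 2))


def penalized_objective(X, U, V, lam_u, lam_v, axis=0):
    """Least-squares fit of X by U V^T plus exclusive Lasso penalties."""
    residual = X - U @ V.T
    fit = 0.5 * np.sum(residual ** 2)  # the 1/2 scaling is an assumption
    penalty = lam_u * exclusive_lasso(U, axis) + lam_v * exclusive_lasso(V, axis)
    return float(fit + penalty)


# Toy usage: X is reproduced exactly by U V^T, so only the penalties remain.
U = np.array([[1.0, 0.0], [-2.0, 0.0], [0.0, 3.0]])
V = np.eye(2)
X = U @ V.T
value = penalized_objective(X, U, V, lam_u=0.5, lam_v=0.25)
```

The penalty sums absolute values inside each group before squaring, which is what distinguishes the exclusive Lasso from the ordinary Lasso: it encourages sparsity within a group rather than removing whole groups. An alternating scheme of the kind the abstract describes would repeatedly hold one factor fixed and decrease this objective in the other.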

Original language: English
Article number: 37
Journal: Statistics and Computing
Volume: 32
Issue number: 3
State: Published - Jun 2022

Keywords

  • Biclustering
  • Squared ℓ_{1,2}-norm
  • Stability selection
  • Structured sparsity
