Redundancy Elimination in Distributed Matrix Computation

Zihao Chen, Baokun Han, Chen Xu, Weining Qian, Aoying Zhou

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

3 Scopus citations

Abstract

As matrix computation becomes increasingly prevalent in large-scale data analysis, distributed matrix computation solutions have emerged. These solutions support query interfaces of linear algebra expressions, which often contain redundant subexpressions, i.e., common and loop-constant subexpressions. Hence, existing compilers rewrite queries to eliminate such redundancy. However, due to the large search space, they fail to find all redundant subexpressions, especially for matrix multiplication chains. Furthermore, redundancy elimination may change the original execution order of operators, and have negative impacts. To reduce the large search space and avoid the negative impacts, we propose automatic elimination and adaptive elimination, respectively. In particular, automatic elimination adopts a block-wise search that exploits the properties of matrix computation for speed-up. Adaptive elimination employs a cost model and a dynamic programming-based method to generate efficient plans for redundancy elimination. Finally, we implement ReMac atop SystemDS, eliminating redundancy in distributed matrix computation. In our experiments, ReMac is able to generate efficient execution plans at affordable overhead costs, and outperforms state-of-the-art solutions by an order of magnitude.

Original languageEnglish
Title of host publicationSIGMOD 2022 - Proceedings of the 2022 International Conference on Management of Data
PublisherAssociation for Computing Machinery
Pages573-586
Number of pages14
ISBN (Electronic)9781450392495
DOIs
StatePublished - Jun 2022
Event2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022 - Hybrid, Philadelphia, United States
Duration: 12 Jun 202217 Jun 2022

Publication series

NameProceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print)0730-8078

Conference

Conference2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022
Country/TerritoryUnited States
CityHybrid, Philadelphia
Period12/06/2217/06/22

Keywords

  • distributed system
  • matrix computation
  • redundancy elimination

Fingerprint

Dive into the research topics of 'Redundancy Elimination in Distributed Matrix Computation'. Together they form a unique fingerprint.

Cite this