DSA: Scalable distributed sequence alignment system using SIMD Instructions

  • Bo Xu
  • Changlong Li
  • Hang Zhuang
  • Jiali Wang
  • Qingfeng Wang
  • Jinhong Zhou
  • Xuehai Zhou
  • University of Science and Technology of China

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Sequence alignment algorithms are a basic and critical component of many bioinformatics fields. With the rapid development of sequencing technology, fast-growing reference database volumes and longer query sequences pose new challenges for sequence alignment. However, these algorithms have prohibitively high time and space complexity. In this paper, we present DSA, a scalable distributed sequence alignment system that employs Apache Spark to process sequence data in a horizontally scalable distributed environment and leverages a data-parallel strategy based on Single Instruction Multiple Data (SIMD) instructions to parallelize the algorithms on each core of a worker node. The experimental results demonstrate that (1) DSA has outstanding performance, achieving up to 201x speedup over SparkSW, and (2) DSA has excellent scalability, achieving near-linear speedup as the number of nodes in the cluster increases.
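
The two levels of parallelism described in the abstract can be illustrated with a minimal Python sketch. It is not the DSA implementation: the abstract does not name the alignment algorithm, so Smith-Waterman local alignment with a linear gap penalty is assumed here, and the scoring values and the names `sw_score`, `partition`, and `best_hit` are hypothetical. The inner loop walks one anti-diagonal at a time, which is the independence property a SIMD kernel could exploit, and the partitioning helper stands in for the way a framework such as Apache Spark might spread database sequences across workers.

```python
# Illustrative sketch only: the algorithm choice (Smith-Waterman), the scoring
# values, and all function names are assumptions, not taken from the DSA paper.

def sw_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Local alignment score (linear gap penalty), filled by anti-diagonals."""
    m, n = len(query), len(subject)
    # H[i][j] = best score of a local alignment ending at query[i-1], subject[j-1]
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for d in range(2, m + n + 1):
        # Cells with i + j == d depend only on anti-diagonals d-1 and d-2,
        # so all cells visited by this inner loop are mutually independent.
        # A SIMD kernel could compute them together in vector lanes.
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            sub = match if query[i - 1] == subject[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # align the two residues
                          H[i - 1][j] + gap,      # gap in subject
                          H[i][j - 1] + gap)      # gap in query
            best = max(best, H[i][j])
    return best


def partition(items, parts):
    """Split items into `parts` chunks (a stand-in for distributed partitions)."""
    return [items[k::parts] for k in range(parts)]


def best_hit(query, database, parts=2):
    """Score the query against every database sequence and return (score, subject)."""
    # Map step: each chunk is scored independently, as a worker node would do.
    partial = [
        max(((sw_score(query, s), s) for s in chunk), default=(0, None))
        for chunk in partition(database, parts)
    ]
    # Reduce step: keep the highest-scoring subject across all chunks.
    return max(partial, key=lambda hit: hit[0])
```

For example, `best_hit("ACGT", ["TTTT", "GGACGTGG", "ACG"])` scores each database sequence in its own chunk and then keeps the best hit. In a real deployment the map step would run on separate worker nodes and the inner loop would be replaced by vector instructions; this sketch only shows where each kind of parallelism applies.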

Original language: English
Title of host publication: Proceedings - 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 758-761
Number of pages: 4
ISBN (Electronic): 9781509066100
DOIs
State: Published - 10 Jul 2017
Externally published: Yes
Event: 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017 - Madrid, Spain
Duration: 14 May 2017 to 17 May 2017

Publication series

Name: Proceedings - 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017

Conference

Conference: 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017
Country/Territory: Spain
City: Madrid
Period: 14/05/17 to 17/05/17

Keywords

  • Alluxio
  • Apache Spark
  • Distributed sequence alignment
  • SIMD instruction
  • Scalability
