Nebula: A Scalable Privacy-Preserving Machine Learning System in Ant Financial

  • Cen Chen
  • Bingzhe Wu
  • Li Wang
  • Chaochao Chen
  • Jin Tan
  • Lei Wang
  • Jun Zhou
  • Benyu Zhang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

4 Scopus citations

Abstract

With the rapid growth of data volume, data-driven machine learning models have become a necessary part of many industrial applications. Intuitively, training on more high-quality data leads to better model performance. In reality, however, data are usually scattered and isolated across different organizations or companies. This "data isolation" problem has prompted both academia and industry to explore the collaborative learning paradigm, in which better models are built jointly from multiple data sources. Despite the potential performance gains, this learning paradigm inevitably faces privacy issues, especially in the Fintech domain, where data are sensitive by nature. In this paper, we present Nebula, a privacy-preserving collaborative learning system in Ant Financial. Our system aims to facilitate privacy-preserving collaborative model training for industrial-scale applications. It is built upon a ring-allreduce, MPI-based distributed framework. On top of that, with several optimization strategies and a novel sharing scheme, our system is able to scale to tens of millions of data samples with hundreds of thousands of features and achieves more than 100x speedup compared with existing state-of-the-art implementations.

Original language: English
Title of host publication: CIKM 2020 - Proceedings of the 29th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 3369-3372
Number of pages: 4
ISBN (Electronic): 9781450368599
DOIs
State: Published - 19 Oct 2020
Externally published: Yes
Event: 29th ACM International Conference on Information and Knowledge Management, CIKM 2020 - Virtual, Online, Ireland
Duration: 19 Oct 2020 to 23 Oct 2020

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: 29th ACM International Conference on Information and Knowledge Management, CIKM 2020
Country/Territory: Ireland
City: Virtual, Online
Period: 19/10/20 to 23/10/20

Keywords

  • collaborative learning
  • privacy-preserving machine learning
