Domain-specific modelware: To make the machine learning model reusable and reproducible

Hui Zhao, Jimin Liang, Xuezhen Yin, Lingfeng Yang, Peili Yang, Yuhang Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

A machine learning task is a routine process that includes data collection, feature engineering, model training, hyper-parameter tuning, model evaluation, and model deployment. The process is usually complex, iterative, and time-consuming. Researchers seldom build a machine learning model from scratch. Instead, they select well-known, well-trained models from similar task domains as reference models, then tune the hyper-parameters to accelerate the iteration. As a result, some models are often reused and need to be reproduced with a new training dataset. Moreover, understanding the model and the iteration process becomes more important. This scenario is very similar to that of software reuse. In this poster, we propose Modelware and argue that it is needed to make machine learning models reusable and reproducible. We define Modelware as the reused object and develop a model repository that provides model lineage management and a model visit tool. The big data used for building the model is managed collaboratively so that the model can be reproduced. The iteration process that yields the final optimized model is abstracted and implemented using a lightweight workflow. Finally, we demonstrate the approach on two different classification tasks.

Original language: English
Title of host publication: Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2018
Publisher: IEEE Computer Society
ISBN (Electronic): 9781450358231
State: Published - 11 Oct 2018
Event: 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2018 - Oulu, Finland
Duration: 11 Oct 2018 to 12 Oct 2018

Publication series

Name: International Symposium on Empirical Software Engineering and Measurement
ISSN (Print): 1949-3770
ISSN (Electronic): 1949-3789

Conference

Conference: 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2018
Country/Territory: Finland
City: Oulu
Period: 11/10/18 to 12/10/18

Keywords

  • Machine learning
  • Model repository
  • Model specification
  • Modelware
  • Reproduce
  • Reuse
