CatBoost-enhanced EWMA chart: monitoring high-dimensional categorical data streams

  • Zhiwen Fang
  • Yan Li
  • Fugee Tsung
  • Dongdong Xiang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

With the rapid development of modern sensor technology, data streams characterised by high dimensionality and categorical variables appear frequently, which poses a great challenge to traditional statistical process control (SPC) tools. In this study, by making full use of the information provided by historical out-of-control (OC) data, we construct a Phase II EWMA control scheme based on the in-control (IC) state probabilities estimated by gradient boosting with categorical feature support (CatBoost). Comprehensive simulation analyses are performed to examine the characteristics of the proposed control chart under various scenarios relative to some existing multivariate control charts. The simulation findings indicate that the proposed control chart is more efficient than its competitors across numerous categorical data situations. In addition, we illustrate the practicality and efficacy of the proposed control chart through a case study involving gene sequences.
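
The monitoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a classifier (for example CatBoost's `CatBoostClassifier.predict_proba`, trained on historical IC and OC samples) has already produced an IC probability for each incoming observation, and it applies a standard one-sided EWMA recursion to those probabilities. The function name `ewma_monitor`, the starting value `z0`, the smoothing weight `lam`, and the control limit `lower_limit` are all hypothetical; in practice the limit would be calibrated, typically by simulation, to reach a target in-control run length.

```python
def ewma_monitor(ic_probs, lam=0.1, z0=0.5, lower_limit=0.3):
    """Run a one-sided EWMA over a stream of IC probabilities.

    ic_probs    : iterable of floats in [0, 1], the estimated probability
                  that each observation is in control (assumed to come from
                  a fitted classifier; not computed here).
    lam         : EWMA smoothing weight in (0, 1] (hypothetical default).
    z0          : starting value of the EWMA statistic (hypothetical).
    lower_limit : signal when the statistic falls below this value
                  (hypothetical; would be calibrated in practice).

    Returns (path, signal), where path is the list of EWMA values and
    signal is the index of the first alarm, or None if no alarm occurs.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")

    z = z0
    path = []
    signal = None
    for t, p in enumerate(ic_probs):
        # Standard EWMA recursion: weight the new score by lam,
        # the running statistic by (1 - lam).
        z = lam * p + (1.0 - lam) * z
        path.append(z)
        # A persistently low IC probability pulls z down; flag the
        # first time it crosses the lower control limit.
        if signal is None and z < lower_limit:
            signal = t
    return path, signal


# Illustrative stream: IC probabilities drop after the second observation.
probs = [0.9, 0.8, 0.2, 0.1, 0.1]
path, signal = ewma_monitor(probs, lam=0.5, z0=0.9, lower_limit=0.3)
```

Smoothing the probabilities rather than thresholding each one individually lets small but sustained drops accumulate, which is the usual motivation for an EWMA scheme; a smaller `lam` gives more memory and more sensitivity to small shifts.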

Original language: English
Journal: International Journal of Production Research
State: Accepted/In press - 2025

Keywords

  • CatBoost algorithm
  • EWMA
  • Multivariate statistical process control
  • OC information
  • categorical data
