Uses of selection strategies in both spectral and sample spaces for classifying hard and soft blueberry using near infrared data

  • Menghan Hu
  • Guangtao Zhai*
  • Yu Zhao
  • Zhaodi Wang

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

19 Scopus citations

Abstract

In this work, we attempt to use fewer wavelengths and fewer samples to develop a model for classifying hard and soft blueberries from near infrared (NIR) data. To do this, random frog selection and active learning are applied in the spectral space and the sample queue, respectively. To reduce the number of spectral variables, a random frog spectral selection approach was applied to collect wavelengths informative of hardness. The prediction model based on the 22 selected wavelengths gave slightly better results than the model based on the full spectra. For selection in the sample space, query by committee proved suitable for blueberry hardness classification, reaching accuracy, precision and recall of 78%, 74% and 98%, respectively, with only 25 sample queries. The standard deviation curves of its performance metrics also stayed at low values (around 0.05) and fluctuated steadily, outperforming those of the other four active learning strategies and the random method. In summary, the respective uses of random frog in the NIR spectral vector and query by committee in the sample queue showed considerable potential for establishing a simple but robust classifier for hard and soft blueberries with very low labeling cost.
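
The query-by-committee idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which disagreement measure or committee the authors used, so the vote-entropy score, the function names `vote_entropy` and `select_query`, and the example votes below are all assumptions made for illustration, not the paper's implementation. Each committee member predicts a label for every unlabeled berry, and the sample with the most divided vote is the next one sent for labeling.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's label votes for one sample (assumed measure)."""
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())


def select_query(committee_votes):
    """Index of the unlabeled sample with the highest vote entropy.

    Ties are resolved in favor of the earliest sample.
    """
    scores = [vote_entropy(v) for v in committee_votes]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical votes from a 4-member committee on three unlabeled berries.
votes = [
    ["hard", "hard", "hard", "hard"],  # full agreement
    ["hard", "soft", "hard", "soft"],  # evenly split vote
    ["soft", "soft", "soft", "hard"],  # mild disagreement
]
query_index = select_query(votes)  # picks the evenly split sample
```

In an active learning loop, the selected sample would be labeled, added to the training set, and the committee retrained before the next query; repeating this a limited number of times is what keeps the labeling cost low.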

Original language: English
Article number: 6671
Journal: Scientific Reports
Volume: 8
Issue number: 1
State: Published - 1 Dec 2018
Externally published: Yes

