TY - GEN
T1 - Chinese Herbal Recognition Databases Using Human-In-The-Loop Feedback
AU - Wu, Nan
AU - Zhou, Yujun
AU - Xu, Hao
AU - Wu, Xinjiao
N1 - Publisher Copyright:
© 2021 Association for Computing Machinery. All rights reserved.
PY - 2021/10/19
Y1 - 2021/10/19
AB - The identification of traditional Chinese medicine plays an important role in the development of traditional Chinese medicine. Identification still relies mostly on researchers' experience, so it remains challenging. Computer-based identification seems an effective approach, but no dataset is available for training models, and this lack of a dataset is the challenge for computer-based identification. This paper proposes a human-in-the-loop method for constructing a Chinese medicine dataset. The method uses manual intervention during labeling to save labour resources. First, a web crawler collects data from the Internet; then a pre-model removes some irrelevant data; next, the data are annotated iteratively based on classification confidence; finally, the result is a dataset named CH42, annotated through human-computer collaboration. In addition, we designed a backbone network that explicitly models interdependencies between channels. CH42 contains 42 types of Chinese medicine and a total of 6,112 pictures, of which the model automatically labeled about 64%. We sampled 6 sets of data and found 6 mislabeled items among 1,458 pictures. The model labeling accuracy rate is about 98.6%.
KW - Dataset
KW - Deep learning
KW - Human-in-the-loop
KW - Traditional Chinese medicine identification
UR - https://www.scopus.com/pages/publications/85121520787
U2 - 10.1145/3487075.3487114
DO - 10.1145/3487075.3487114
M3 - Conference contribution
AN - SCOPUS:85121520787
T3 - ACM International Conference Proceeding Series
BT - CSAE 2021 - Proceedings of the 5th International Conference on Computer Science and Application Engineering
A2 - Emrouznejad, Ali
PB - Association for Computing Machinery
T2 - 5th International Conference on Computer Science and Application Engineering, CSAE 2021
Y2 - 19 October 2021 through 21 October 2021
ER -