TY - JOUR
T1 - MT-YOLOv5
T2 - 4th International Conference on Physics, Mathematics and Statistics, ICPMS 2021
AU - Ning, Zixin
AU - Wu, Xinjiao
AU - Yang, Jing
AU - Yang, Yanqin
N1 - Publisher Copyright: © Published under licence by IOP Publishing Ltd.
PY - 2021/7/27
Y1 - 2021/7/27
AB - Table detection is an important task in optical character recognition (OCR). Table detection for desktop applications has largely reached commercial requirements. As informatization advances, personal demand for table detection has gradually increased, and there is an urgent need for a table detection method that can be deployed on handheld devices. This paper proposes a mobile terminal table detection model based on YOLOv5. We use YOLOv5 as the main framework of the model. To address connection redundancy in the YOLOv5 backbone, we retain the YOLOv5 multi-scale detection head and replace the backbone with the equally capable MobileNetV2. In addition, we use deformable convolution to compensate for the non-linear deficiencies of the lightweight model. The model was evaluated on the ICDAR 2019 dataset. The results show that, compared with the baseline model, it halves the number of parameters and increases detection speed by 47%. It also reaches 35.25 FPS on ordinary Android phones.
UR - https://www.scopus.com/pages/publications/85112439692
U2 - 10.1088/1742-6596/1978/1/012010
DO - 10.1088/1742-6596/1978/1/012010
M3 - Conference article
AN - SCOPUS:85112439692
SN - 1742-6588
VL - 1978
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
IS - 1
M1 - 012010
Y2 - 19 May 2021 through 21 May 2021
ER -