2022, 14(2): 110-115. doi: 10.16670/j.cnki.cn11-5823/tu.2022.02.16
Text Detection and Recognition of Drawing Title Bar of Substation Steel Structure Based on Deep Learning
State Grid Shanghai Municipal Electric Power Company, Shanghai 200120, China
Citation: Cihai Qin, Wanli Gu. Text Detection and Recognition of Drawing Title Bar of Substation Steel Structure Based on Deep Learning[J]. Journal of Information Technology in Civil Engineering and Architecture, 2022, 14(2): 110-115. doi: 10.16670/j.cnki.cn11-5823/tu.2022.02.16
Abstract: To coordinate the control and management of steel structures and power equipment during substation construction, relevant information must be extracted from the title bars of a large number of steel structure drawings and matched against the physical components. To address blurred characters, varied table layouts, and mixed information in title bars, this paper proposes a deep learning method that combines a CNN+RNN model for text detection with a CRNN model for character recognition. In detection and recognition experiments on an existing data set of steel structures from substation construction sites, the method achieves a detection precision above 80% and a recognition accuracy above 90%, both higher than those of the other text detection and recognition methods compared. Engineering application results show that the method effectively overcomes the feature extraction difficulties caused by differences in text size, font, color, and arrangement, and improves the accuracy of text recognition in the title bars of substation steel structure drawings.
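The two headline figures in the abstract, detection precision and recognition accuracy, can be illustrated with a minimal Python sketch. The abstract does not state the evaluation protocol, so everything here is an assumption for illustration: axis-aligned boxes, one-to-one matching of each predicted box to an unmatched ground-truth box at an IoU threshold of 0.5, and exact string match for recognition. The function names `iou`, `detection_precision`, and `recognition_accuracy` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_precision(pred: List[Box], truth: List[Box],
                        thresh: float = 0.5) -> float:
    """Fraction of predicted boxes that match a ground-truth box.

    Each ground-truth box can be matched at most once; the 0.5
    threshold is an assumed convention, not taken from the paper.
    """
    matched = set()
    true_positives = 0
    for p in pred:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truth):
            if j in matched:
                continue
            overlap = iou(p, t)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= thresh:
            matched.add(best_j)
            true_positives += 1
    return true_positives / len(pred) if pred else 0.0


def recognition_accuracy(pred_texts: List[str],
                         true_texts: List[str]) -> float:
    """Fraction of text fields recognized exactly (assumed metric)."""
    if len(pred_texts) != len(true_texts):
        raise ValueError("prediction and ground-truth lists differ in length")
    if not true_texts:
        return 0.0
    correct = sum(p == t for p, t in zip(pred_texts, true_texts))
    return correct / len(true_texts)
```

Under these assumptions, a detector that outputs three boxes of which two overlap ground-truth text regions has a precision of 2/3, and a recognizer that reads three of four title bar fields exactly has an accuracy of 0.75.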