2022, 14(4): 1-8. doi: 10.16670/j.cnki.cn11-5823/tu.2022.04.01
Classification of Structural Design Specification Based on Natural Language Processing
1. School of Civil Engineering, Dalian Jiaotong University, Dalian 116028, China
2. Dashiqiao Architectural Design Institute Co., Ltd., Yingkou 115100, China
3. Cardiff School of Engineering, Cardiff University, Cardiff CF24 3AA, UK
Citation: Jisong Zhang, Qingsen Zhang, Lihua Zhao, Xin Liu, Guoqian Ren. Classification of Structural Design Specification Based on Natural Language Processing[J]. Journal of Information Technology in Civil Engineering and Architecture, 2022, 14(4): 1-8. doi: 10.16670/j.cnki.cn11-5823/tu.2022.04.01
Abstract: Specification translation is an important step in checking BIM models for code compliance, and it is the technical basis and prerequisite for automated, intelligent design review. Its first step is to classify the clauses of design specifications automatically into predefined categories, in preparation for the subsequent text analysis and rule extraction. Because a corpus for the structural design domain has been lacking, automatic classification of design specifications remains underdeveloped. This paper therefore builds a structural design corpus from the "Code for Design of Concrete Structures" (GB 50010-2010) and the "Code for Seismic Design of Buildings" (GB 50011-2010), and proposes an automatic classification method for structural design specifications that takes its categories from the IFC entity name catalog and is implemented in Python with a machine-learning text classification algorithm. The process has three steps: data preparation and text preprocessing; feature extraction and selection; and training, testing, and evaluation of the classifier. The results show that the method classifies structural design specifications effectively: on the test specifications the classifier reaches a precision of 75% and a recall of 83%.
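The three steps named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which features or classifier were used, so TF-IDF weighting and a multinomial Naive Bayes classifier are assumptions here. The training clauses, the stop-word list, and all function names are hypothetical, and whitespace tokenization of English text stands in for the word segmentation that the Chinese-language codes would need.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy clauses (not from the paper's corpus), labeled with IFC entity names.
TRAIN = [
    ("beam longitudinal reinforcement ratio shall not be less than the minimum", "IfcBeam"),
    ("beam stirrup spacing shall not exceed the limit near supports", "IfcBeam"),
    ("column axial compression ratio shall not exceed the limit", "IfcColumn"),
    ("column longitudinal reinforcement shall be arranged symmetrically", "IfcColumn"),
    ("slab thickness shall not be less than the minimum value", "IfcSlab"),
    ("slab distribution reinforcement spacing shall not exceed the limit", "IfcSlab"),
]
STOPWORDS = {"shall", "not", "be", "the", "than", "near"}  # assumed list


def preprocess(text):
    """Step 1: lowercase, split on whitespace, drop stop words."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def fit_idf(token_lists):
    """Step 2: smoothed inverse document frequency for every training term."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(tokens, idf):
    """TF-IDF weights for one clause; terms outside the vocabulary are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Step 3: multinomial Naive Bayes on TF-IDF weights with Laplace smoothing."""
    sums = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            sums[y][t] += w
    model = {}
    for y, n_y in Counter(labels).items():
        total = sum(sums[y].values()) + alpha * len(vocab)
        loglik = {t: math.log((sums[y].get(t, 0.0) + alpha) / total) for t in vocab}
        model[y] = (math.log(n_y / len(labels)), loglik)
    return model


def fit(train):
    """Run preprocessing, feature extraction, and training on labeled clauses."""
    token_lists = [preprocess(text) for text, _ in train]
    labels = [y for _, y in train]
    idf = fit_idf(token_lists)
    model = train_nb([tfidf(t, idf) for t in token_lists], labels, idf.keys())
    return idf, model


def classify(text, idf, model):
    """Return the category with the highest log-posterior score."""
    vec = tfidf(preprocess(text), idf)

    def score(y):
        prior, loglik = model[y]
        return prior + sum(w * loglik[t] for t, w in vec.items())

    return max(model, key=score)


def precision_recall(y_true, y_pred, label):
    """Precision and recall for one category."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    idf, model = fit(TRAIN)
    print(classify("column axial compression ratio", idf, model))
```

Precision is the share of clauses assigned to a category that truly belong to it, and recall is the share of clauses in a category that the classifier finds. The 75% and 83% reported in the abstract are measures of this kind, computed on the paper's own test specifications rather than on toy data like the above.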