Classification of Structural Design Specification Based on Natural Language Processing

张吉松; 张庆森; 赵丽华; 刘鑫; 任国乾

doi:10.16670/j.cnki.cn11-5823/tu.2022.04.01


Abstract

Specification translation is an important step in BIM model compliance checking, and it is the technical basis and prerequisite for automated, intelligent design review. The first step of specification translation is to classify design specifications automatically into predefined categories, in preparation for subsequent text analysis and rule extraction. However, because the structural design field lacks a corpus, automatic classification of design specifications remains to be developed. This paper therefore creates a structural design corpus based on the "Code for Concrete Structure Design" and the "Code for Seismic Design of Buildings" and, following the IFC entity name catalog, proposes an automatic classification method for structural design specifications, implemented in Python with machine-learning text classification algorithms. The process consists of three steps: data preparation and text preprocessing; feature extraction and selection; and training, testing, and evaluation of the classifier. The results show that the method classifies structural design specifications effectively, with the classifier reaching a precision of 75% and a recall of 83% on the test specifications.
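The three steps named in the abstract can be sketched in a few lines of standard-library Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the classifier, so a multinomial Naive Bayes model with Laplace smoothing is assumed here; the clauses in `TRAIN` and `TEST` are invented English examples (the paper's corpus is built from Chinese codes and would need word segmentation first); `IfcBeam` and `IfcColumn` stand in for whatever IFC entity categories the paper actually uses; and the function names and stop-word list are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Step 1 (data preparation): invented toy clauses labelled with IFC entity names.
TRAIN = [
    ("the beam section width shall not be less than 200 mm", "IfcBeam"),
    ("longitudinal reinforcement of the beam shall be anchored at the support", "IfcBeam"),
    ("stirrup spacing in the beam shall not exceed 200 mm", "IfcBeam"),
    ("the column axial compression ratio shall not exceed the limit", "IfcColumn"),
    ("the column section dimension shall not be less than 300 mm", "IfcColumn"),
    ("column ties shall be closely spaced near the joint", "IfcColumn"),
]
TEST = [
    ("the beam width shall not be less than 250 mm", "IfcBeam"),
    ("the axial compression ratio of the column shall be checked", "IfcColumn"),
]

# Assumed stop-word list, for illustration only.
STOPWORDS = {"the", "of", "shall", "be", "not", "in", "at", "than", "a"}


def preprocess(text):
    """Step 1 (text preprocessing): lowercase, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def train(samples):
    """Step 2 (feature extraction): bag-of-words counts per class, plus class priors."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        tokens = preprocess(text)
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Step 3 (classification): pick the class with the highest log-posterior."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in preprocess(text):
            if tok in vocab:  # tokens never seen in training are ignored
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall(y_true, y_pred, positive):
    """Step 3 (evaluation): per-class precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


model = train(TRAIN)
y_true = [label for _, label in TEST]
y_pred = [predict(model, text) for text, _ in TEST]
for label in sorted(set(y_true)):
    p, r = precision_recall(y_true, y_pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Scores on a toy set this small say nothing about real performance and should not be compared with the 75% precision and 83% recall reported in the abstract.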
