
Construction material classification on imbalanced datasets for construction monitoring automation using Vision Transformer architecture

2021-08-21 15:29:56
Maryam Soleymani, Mahdi Bonyani, Hadi Mahami, Farnad Nasirzadeh

Abstract

Automation is a critical topic in construction because of its significant impact on project productivity. It brings remarkable improvements in the efficiency, quality, and safety of construction activities. Automation spans a wide range of construction stages, and project monitoring is no exception. Monitoring is of great importance in project management, since an accurate and timely assessment of progress enables managers to quickly identify deviations from the schedule and take the required actions at the right time. One of the most important monitoring tasks is tracking project progress daily, which is very time-consuming and labor-intensive; automation has facilitated and accelerated this task and has eliminated, or at least reduced, the risk of many dangerous tasks. The first step of construction automation is therefore to automatically detect the materials used on a project site. In this paper, a novel deep learning architecture called Vision Transformer (ViT) is used to detect and classify construction materials. To evaluate the applicability and performance of the proposed method, it is trained and tested on three large imbalanced datasets: the Construction Material Library (CML) and the Building Material Dataset (BMD), both used in previous papers, and a new dataset created by combining them. The reported results show an accuracy of 100 percent in all evaluated parameters and in each material category. It is believed that the proposed method provides a novel and robust tool for detecting and classifying different material types.
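
The abstract reports accuracy both overall and for each material category, which matters on imbalanced datasets because a single overall figure can hide poor performance on rare classes. The sketch below illustrates that point only; it is not the authors' evaluation code, and the labels, class names, and helper functions (`overall_accuracy`, `per_class_recall`) are hypothetical examples rather than data or code from the paper.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical toy labels (NOT from CML or BMD): 8 "concrete", 2 "brick".
y_true = ["concrete"] * 8 + ["brick"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["concrete"] * 10

acc = overall_accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)

print(f"overall accuracy: {acc:.2f}")
for label, recall in recalls.items():
    print(f"recall[{label}]: {recall:.2f}")
```

In this toy case the overall accuracy looks acceptable while the minority class is never recognized, which is why a per-category breakdown like the one the abstract mentions is informative on imbalanced data.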

URL

https://arxiv.org/abs/2108.09527

PDF

https://arxiv.org/pdf/2108.09527.pdf

