Structural Cracks Don’t Stand a Chance Against This Dual-Module AI

A new dual-module system combining YOLOv8 and the Swin Transformer shows that AI can detect structural cracks more quickly and accurately than humans. This offers a significant upgrade in how building safety inspections are carried out.

Bridging the Gaps in Current Inspection Methods

Cracks are among the earliest and most critical signs of structural deterioration. Identifying them accurately is essential for preventive maintenance and structural health monitoring.

Traditional inspection methods, often manual, are time-consuming, risky, and increasingly impractical, especially with the rise of high-density urban construction and towering buildings.

To address these challenges, computer vision and digital image processing technologies are stepping in. These methods offer greater speed, consistency, and scalability than manual inspection. By analyzing visual data, they can quickly locate cracks and deliver the metrics needed to assess the integrity of building structures.

Deep learning, in particular, has become central to efforts aimed at automating crack detection. However, there remains a gap between academic models and real-world engineering applications.

This study aims to narrow that gap with an AI-driven system designed to be both efficient and precise.

Inside the Dual-Module Crack Detection System

In a study published in Buildings, researchers proposed a dual-module crack detection system powered by YOLOv8 and the Swin Transformer. The system was trained on a diverse dataset of crack images sourced from online platforms, on-site photography, and open-access image repositories.

To enhance the detection of fine cracks without significantly increasing computational load, the team introduced a Swin Transformer-based windowed multi-head self-attention mechanism. This helped the system focus more effectively on small and subtle crack features.
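
To see why windowing keeps the computational load in check, it helps to compare the approximate cost of global self-attention with that of window-restricted attention. The sketch below uses the complexity estimates given in the original Swin Transformer paper; the feature-map size, channel count, and window size are illustrative values taken from that paper's first stage, not from this study.

```python
def global_attention_cost(h, w, c):
    """Approximate FLOPs for global multi-head self-attention on an h x w map."""
    tokens = h * w
    # Linear projections + attention over every pair of tokens.
    return 4 * tokens * c**2 + 2 * tokens**2 * c


def windowed_attention_cost(h, w, c, window):
    """Approximate FLOPs when attention is limited to window x window patches."""
    tokens = h * w
    # Same projections, but each token only attends within its own window.
    return 4 * tokens * c**2 + 2 * window**2 * tokens * c


# Illustrative values only (assumed, not reported in the study).
h, w, c, window = 56, 56, 96, 7
g = global_attention_cost(h, w, c)
wd = windowed_attention_cost(h, w, c, window)
print(f"global: {g:,}  windowed: {wd:,}  ratio: {g / wd:.1f}x")
```

The global term grows with the square of the number of tokens, while the windowed term grows linearly, which is why attention can be added to a real-time detector without a large increase in cost.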

For segmentation, an improved U-Net architecture was used. With a rotating-split method, it extracted detailed crack shapes and accurately measured crack widths, a key parameter in structural assessment.
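
The rotating-split method is not described in detail here, so the following is not that method. As a much simpler stand-in, it shows how a width estimate can be read off a binary segmentation mask, assuming a roughly vertical crack and a known pixel-to-millimeter scale. The function names and the `mm_per_pixel` value are hypothetical.

```python
def row_widths(mask):
    """Count crack pixels in each row of a binary mask (1 = crack, 0 = background)."""
    return [sum(row) for row in mask]


def mean_width_mm(mask, mm_per_pixel):
    """Average width over rows that contain crack pixels, converted to millimeters."""
    widths = [w for w in row_widths(mask) if w > 0]
    if not widths:
        return 0.0
    return mm_per_pixel * sum(widths) / len(widths)


# Toy mask of a near-vertical crack; the scale factor is an assumed calibration.
mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(row_widths(mask))
print(mean_width_mm(mask, mm_per_pixel=0.5))
```

A row-wise count overestimates the width of inclined cracks, which is one reason an orientation-aware approach is needed in practice.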

Training was conducted using the Ultralytics YOLOv8 framework on PyTorch 1.12.0, running on an NVIDIA RTX 3090 GPU and an Intel i5-13500H CPU. The YOLOv8n model served as the base and was trained using stochastic gradient descent with a momentum of 0.937 and a cosine-decay learning rate schedule.
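
For readers unfamiliar with these optimizer settings, here is a minimal sketch of a cosine-decay schedule and a single momentum update. Only the 0.937 momentum comes from the study; the initial and final learning rates are assumed values, and the update follows the convention used by PyTorch's SGD rather than any code from the authors.

```python
import math


def cosine_lr(step, total_steps, lr_max=0.01, lr_min=0.0001):
    """Learning rate that decays from lr_max to lr_min along half a cosine wave.

    lr_max and lr_min are assumed values for illustration.
    """
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))


def sgd_momentum_step(weight, grad, velocity, lr, momentum=0.937):
    """One SGD update: the velocity accumulates gradients, then scales the step."""
    velocity = momentum * velocity + grad
    return weight - lr * velocity, velocity


# Print the schedule at a few points of a hypothetical 100-step run.
for step in (0, 25, 50, 75, 100):
    print(step, round(cosine_lr(step, 100), 6))
```

The schedule takes large steps early in training and progressively smaller ones toward the end, while momentum smooths out noisy gradient directions.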

Input images were standardized to 640×640 pixels, and the dataset was augmented using techniques like random flips, Mosaic augmentation, and scaling to enhance model generalization.
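
One detail that is easy to overlook with augmentation for detection is that the labels must be transformed along with the image. The sketch below illustrates this for a random horizontal flip only, assuming boxes in normalized (x_center, y_center, width, height) form; the function names and the 0.5 probability are hypothetical, and Mosaic and scaling are not covered.

```python
import random


def hflip_image(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def hflip_box(box):
    """Mirror a normalized (x_center, y_center, width, height) box."""
    x, y, w, h = box
    return (1.0 - x, y, w, h)


def random_hflip(image, boxes, p=0.5, rng=random):
    """Flip the image and its boxes together with probability p."""
    if rng.random() < p:
        return hflip_image(image), [hflip_box(b) for b in boxes]
    return image, boxes


image = [[1, 2, 3], [4, 5, 6]]
boxes = [(0.25, 0.5, 0.1, 0.2)]
print(random_hflip(image, boxes, p=1.0))
```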

Performance Evaluation and Key Metrics

To assess the model’s performance, the study looked at accuracy, precision, recall, F1 score, IoU (Intersection over Union), and mean Average Precision (mAP), with mAP serving as the primary benchmark.
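
As a quick reference, the simpler metrics in this list can be computed as follows. This is a generic sketch using the standard definitions, not the study's evaluation code; boxes are assumed to be (x1, y1, x2, y2) corner coordinates, and mAP, which averages precision over recall levels and classes, is left out for brevity.

```python
def box_iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(precision_recall_f1(tp=8, fp=2, fn=2))
```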

Detected cracks were also categorized by orientation: horizontal, vertical, diagonal, and complex patterns. This classification provided a clearer picture of the nature and directionality of the structural issues.
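
How the orientation labels are assigned is not spelled out here; they may simply be classes predicted by the detector. Purely as an illustration of the four categories, the sketch below labels a crack from the angles of its line segments. The 15-degree tolerance and both function names are assumptions.

```python
import math


def classify_segment(x1, y1, x2, y2, tolerance_deg=15.0):
    """Label one line segment by its angle to the horizontal axis."""
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    if angle <= tolerance_deg or angle >= 180.0 - tolerance_deg:
        return "horizontal"
    if abs(angle - 90.0) <= tolerance_deg:
        return "vertical"
    return "diagonal"


def classify_crack(segments, tolerance_deg=15.0):
    """Label a crack made of several segments; mixed directions count as complex."""
    if not segments:
        raise ValueError("at least one segment is required")
    labels = {classify_segment(*s, tolerance_deg) for s in segments}
    return labels.pop() if len(labels) == 1 else "complex"


print(classify_crack([(0, 0, 10, 1)]))
print(classify_crack([(0, 0, 10, 10), (10, 10, 11, 30)]))
```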

The dual-module system achieved an mAP of 97.14 %, with over 99 % mAP for horizontal and vertical cracks. Accuracy reached 98.17 %, recall hit 99.02 %, and the F1 score landed at 98.34 %.

In segmentation, the enhanced U-Net yielded a Dice coefficient of 91.95 %, an average symmetric surface distance of 0.5618, and an IoU of 86.87 %. These results outperformed baseline models like standard U-Net and Attention U-Net, thanks to the use of pixel-level and spatial attention modules.
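
The Dice coefficient and IoU both measure overlap between a predicted mask and the ground truth, and differ only in how the overlap is normalized. A minimal sketch of the standard definitions, assuming binary masks stored as lists of rows and treating two empty masks as a perfect match (an assumed convention):

```python
def dice_and_iou(pred, target):
    """Dice coefficient and IoU for two binary masks of the same shape."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    inter = sum(p * t for p, t in zip(pred_flat, target_flat))
    pred_sum, target_sum = sum(pred_flat), sum(target_flat)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Both masks are empty: treated here as perfect agreement.
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou


pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
print(dice_and_iou(pred, target))
```

For a single pair of masks the two scores are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Scores averaged over a whole dataset need not satisfy that identity exactly.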

Why This Approach Stands Out

Traditional CNNs have limitations in capturing long-range dependencies and global context, which are key for detecting fine, elongated cracks in complex images. By integrating the Swin Transformer into the YOLOv8 framework, the researchers expanded the model's receptive field and boosted its resilience to background noise.

Segmentation was treated as a post-detection refinement step rather than a separate task. This approach preserved real-time detection speeds while adding critical geometric information, such as crack width and orientation.
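
The control flow implied by this design can be sketched as follows: the detector runs on the full image, and the segmenter only sees the cropped regions it returns. The `detector` and `segmenter` callables are hypothetical placeholders for the two modules, and the toy versions below exist only to make the example runnable.

```python
def crop(image, box):
    """Cut an (x1, y1, x2, y2) pixel box out of an image stored as rows."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def detect_then_segment(image, detector, segmenter):
    """Run detection first, then segment only inside each detected box."""
    results = []
    for box in detector(image):
        mask = segmenter(crop(image, box))
        results.append({"box": box, "mask": mask})
    return results


# Toy stand-ins for the two modules (assumed, for illustration only).
def toy_detector(image):
    return [(1, 0, 3, 2)]


def toy_segmenter(patch):
    return [[1 if value > 0 else 0 for value in row] for row in patch]


image = [
    [0, 5, 0, 0],
    [0, 0, 7, 0],
    [0, 0, 0, 0],
]
print(detect_then_segment(image, toy_detector, toy_segmenter))
```

Because the segmenter works on small crops rather than the whole frame, images with no detections incur no segmentation cost at all.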

Compared to previous models like YOLOv3, YOLOv5, YOLOv7, Vision Transformer (ViT), and Faster R-CNN, the proposed system demonstrated superior detection accuracy and faster inference, confirming its technical and practical advantages.

Practical Implications and Future Directions

The study confirms that combining YOLOv8 with Swin Transformer modules can significantly boost the performance of automated crack detection systems. The segmentation outputs are visually interpretable, offering engineers clear, actionable insights during structural assessments.

With its high accuracy and efficiency, the model provides a practical solution for real-time crack detection, which is crucial for maintenance planning and safety evaluations. It also lays the groundwork for future innovations in smart infrastructure monitoring, where automated systems continuously track and assess structural health.

Journal Reference

Zuo, X., Almutairi, A. D., Saeed, M. K., & Dai, Y. (2026). Improved Dual-Module YOLOv8 Algorithm for Building Crack Detection. Buildings. DOI: 10.3390/buildings16020461, https://www.mdpi.com/2075-5309/16/2/461

Written by Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citations

Dam, Samudrapom. (2026, January 27). Structural Cracks Don’t Stand a Chance Against This Dual-Module AI. AZoBuild. Retrieved on January 27, 2026 from https://www.azobuild.com/news.aspx?newsID=23977.
