By Nidhi Dhull. Reviewed by Susha Cheriyedath, M.Sc. Nov 26 2024
A recent study published in Developments in the Built Environment introduced a machine learning (ML) approach to assessing common cyber risks in construction projects.
The proposed method featured three key components: a Monte Carlo-simulated dataset designed to predict cyber risks, an ML-driven analysis to identify and evaluate critical risk factors, and a greedy optimization algorithm aimed at efficiently addressing high-risk factors.
Background
Digital advancements have transformed the construction sector, boosting efficiency and productivity. However, these innovations have also introduced new cybersecurity vulnerabilities, leading to delays, financial losses, and reputational damage in construction projects. Compared to other industries, construction has been slower in adopting robust cybersecurity measures and has seen a rise in cyber incidents over the past decade.
The industry faces five key types of cyber risks. Ransomware attacks target critical assets like blueprints and financial records, while phishing schemes trick individuals into sharing sensitive information. Insider attacks involve malicious actions from within the organization, data breaches expose confidential digital data, and supply chain attacks exploit weaknesses in interconnected systems.
To tackle these risks, project managers need predictive tools that can forecast cyber threats at every stage of a project. With these tools, managers can take proactive steps to mitigate or even prevent risks. A dynamic risk assessment tool that uses real-time project data and predictive models to evaluate cyber risk levels offers an effective way to stay ahead of these challenges, ensuring projects are protected and can run smoothly.
Methods
Development of the ML-centric approach to construction cyber risk analysis followed a structured, multi-step process. The first step identified feature sources for the ML models, drawn from construction-specific risk factors. Next, datasets were generated using Monte Carlo simulations and an ensemble labeling method that combined fault tree analysis with criteria-based labeling. Combining two labeling methods was intended to make the training datasets more robust and reliable.
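As a rough illustration of the dataset generation step, the sketch below draws simulated projects at random and labels each one with a weighted blend of a fault-tree score and a criteria-based score. The factor names, the triangular distribution, the gate structure, the threshold, and the 0.6 weight are all hypothetical; the article does not report the study's actual values.

```python
import random

# Hypothetical risk factors, each scored in [0, 1]; the study's actual
# factors and probability distributions are not given in this article.
FACTORS = ["weak_passwords", "untrained_staff", "unpatched_software"]

def sample_project(rng):
    """Draw one simulated project: a score in [0, 1] for each factor."""
    # Triangular distribution (low, high, mode) is an assumed choice.
    return {name: rng.triangular(0.0, 1.0, 0.3) for name in FACTORS}

def fault_tree_label(p):
    """Fault-tree style label: the top event occurs if either branch fires.

    Branch A is an AND gate over two human factors, branch B is a single
    basic event; an OR gate combines them, treating the scores as
    independent probabilities.
    """
    branch_a = p["weak_passwords"] * p["untrained_staff"]  # AND gate
    branch_b = p["unpatched_software"]
    return 1.0 - (1.0 - branch_a) * (1.0 - branch_b)       # OR gate

def criteria_label(p, threshold=0.5):
    """Criteria-based label: fraction of factors above a threshold."""
    return sum(v > threshold for v in p.values()) / len(p)

def ensemble_label(p, w_fault_tree=0.6):
    """Weighted blend of the two labels; the weight is an assumption."""
    return w_fault_tree * fault_tree_label(p) + (1 - w_fault_tree) * criteria_label(p)

def simulate_dataset(n, seed=0):
    """Return n (project, label) pairs from a seeded random generator."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        project = sample_project(rng)
        rows.append((project, ensemble_label(project)))
    return rows

dataset = simulate_dataset(1000)
```

Each row pairs a dictionary of factor scores with a risk degree between 0 and 1, which is the kind of table a regression model could then be trained on.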
A two-phase strategy guided the model development process. The first phase identified the best-performing ML model for each specific risk, while the second phase optimized the weight combinations for the different labeling methods. Following this, an ML-based feature analysis was conducted to determine the most significant risk factors influencing outcomes. Finally, a greedy optimization algorithm was developed to create effective risk reduction strategies.
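The first phase, choosing the best-performing model for each risk, can be sketched as a comparison of candidate models by R² on held-out data. The example below is a minimal standard-library illustration, not the study's pipeline: the two candidates (a one-feature least-squares line and a k-nearest-neighbour regressor standing in for a more flexible model), the synthetic data, and all function names are assumptions.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_linear(xs, ys):
    """Ordinary least squares for one feature; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_knn(xs, ys, k=5):
    """k-nearest-neighbour regression, standing in for a flexible model."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict

def select_model(xs, ys, candidates, test_fraction=0.25, seed=0):
    """Return (best_name, scores) using R² on a held-out test split."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    scores = {}
    for name, fit in candidates.items():
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in test]
        scores[name] = r2_score([ys[i] for i in test], preds)
    return max(scores, key=scores.get), scores

# Synthetic single-factor data, made up for illustration only.
rng = random.Random(1)
xs = [rng.random() for _ in range(400)]
linear_risk = [0.8 * x + rng.gauss(0, 0.02) for x in xs]
curved_risk = [(2 * x - 1) ** 2 + rng.gauss(0, 0.02) for x in xs]

candidates = {"linear": fit_linear, "knn": fit_knn}
for label, ys in [("linear_risk", linear_risk), ("curved_risk", curved_risk)]:
    best, scores = select_model(xs, ys, candidates)
    print(label, best, {k: round(v, 3) for k, v in scores.items()})
```

Scoring on a held-out split rather than on the training data matters here, because a flexible model that overfits can look strong in training and still lose to a simpler one on unseen projects.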
The integration of these steps resulted in a dynamic cyber risk assessment tool. This tool featured three key modules: trained ML models for risk degree prediction, a risk factor analysis module based on feature importance insights, and a risk reduction strategy module driven by the optimization algorithm. Together, these modules provided a comprehensive framework for analyzing and mitigating construction risks.
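The article does not describe the objective or constraints of the study's greedy algorithm, so the following is only one plausible reading: rank mitigation actions by predicted risk reduction per unit cost and select them in that order until a budget is exhausted. The action names, costs, and reduction values are hypothetical.

```python
def greedy_risk_reduction(actions, budget):
    """Pick actions by risk reduction per unit cost until the budget runs out.

    actions: list of (name, cost, risk_reduction) tuples, all hypothetical.
    Returns the chosen action names and the total predicted reduction.
    """
    # Rank by benefit-to-cost ratio, a common greedy heuristic for
    # knapsack-style allocation; it is fast but not guaranteed optimal.
    ranked = sorted(actions, key=lambda a: a[2] / a[1], reverse=True)
    chosen, remaining, total = [], budget, 0.0
    for name, cost, reduction in ranked:
        if cost <= remaining:
            chosen.append(name)
            remaining -= cost
            total += reduction
    return chosen, total

# Illustrative inputs only: (action, cost units, predicted risk reduction).
actions = [
    ("staff_phishing_training", 4, 0.20),
    ("multi_factor_authentication", 3, 0.18),
    ("offline_backups", 5, 0.15),
    ("vendor_security_audit", 6, 0.12),
]
chosen, total = greedy_risk_reduction(actions, budget=10)
```

With a budget of 10, this sketch selects multi-factor authentication and then phishing training, and skips the two remaining actions because neither fits in the 3 units left. In a tool like the one described, the reduction values would presumably come from the trained models and the feature importance analysis rather than being fixed by hand.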
To ensure the identified risk factors were comprehensive, relevant, and accurate, a rigorous methodology was applied. This included conducting a systematic literature review, employing the Delphi method to gather expert evaluations, and administering a detailed questionnaire survey. These steps provided a solid foundation for the ML model's development and application.
The approach was validated through a case study conducted on a real construction project in the United Arab Emirates, executed by a leading engineering and contracting firm.
Results and Discussion
Among the five cyber risks analyzed, insider attack risk consistently achieved near-perfect coefficient of determination (R²) values across all ML models, indicating an almost linear relationship with its risk factors and making basic models suitable for prediction. Similarly, supply chain attack risk showed high R² values with simpler models, consistent with a largely linear relationship. Complex models, however, overfitted where the relationship was clearly linear and underperformed on test data. Simpler models were therefore chosen for these two risks.
For ransomware attack risk, all ML models achieved high R2 values, but complex models, such as neural networks, outperformed simpler ones, suggesting more non-linearity. A similar pattern was observed for phishing and data breach risks. Data breaches exhibited the lowest R² values across all models, indicating significant non-linearity and the limited effectiveness of even sophisticated models. Thus, complex models were more suitable for these risks.
The results highlighted that almost all identified risk factors exhibited some degree of non-linearity, reflecting their complex relationships with cyber risks. This emphasizes the need for construction project managers to thoroughly understand these risk factors to analyze cyber risks effectively. Different cyber risks demonstrated unique non-linear relationships with their corresponding factors, suggesting that effectively addressing each type of cyber risk may require specifically tailored strategies.
The ML models successfully predicted the incidence of cyber risks in two expert-labeled projects and a real construction project, which further supported their efficacy and validity. Notably, the models were able to predict cyber risk status at any stage of a construction project, enabling project managers to apply risk reduction strategies immediately. These strategies were guided by the tool's greedy optimization algorithm, which aims to allocate mitigation resources as efficiently as possible.
Conclusion
This study successfully developed an ML-based approach to assess five common cyber risks in construction projects: ransomware, insider attacks, data breaches, phishing, and supply chain attacks. The approach showed promise in predicting and addressing these risks by tailoring models to suit the unique characteristics of each type.
However, the lack of an existing dataset posed a significant challenge. To overcome this, the researchers created a simulated dataset using defined probability distributions. While these simulations were carefully reviewed and validated by experts, they may not fully reflect the complexity of real-world scenarios, which could introduce variability in the results.
Looking ahead, the team plans to refine the probability distributions through sensitivity analyses and expand their pool of expert reviewers to improve the accuracy of the simulations. They are also working closely with local companies to gather real-world data, which will help validate the models and make them even more reliable for practical use. This ongoing effort aims to bridge the gap between theoretical modeling and real-world application, ensuring the approach remains relevant and effective.
Journal Reference
Yao, D., & de Soto, G. (2024). Assessing Cyber Risks in Construction Projects: A Machine Learning-Centric Approach. Developments in the Built Environment, 100570. DOI: 10.1016/j.dibe.2024.100570, https://www.sciencedirect.com/science/article/pii/S2666165924002515
Article Revisions
- Nov 27 2024 - Title changed from "Machine Learning Assesses Cyber Risks in Construction" to "Assessing Cyber Risks in Construction with Machine Learning"