Researchers have developed a hybrid machine learning model combining Gradient Boosting Regression Trees with Bayesian Optimization to accurately predict the compressive strength of self-compacting concrete made with recycled aggregates. This model offers a faster, cost-effective alternative to traditional lab testing.
The recent study, published in Scientific Reports, outlines how this model improves both accuracy and efficiency in estimating compressive strength (CS), addressing a long-standing challenge in sustainable construction.
Background
Self-compacting concrete (SCC) is commonly used in projects with complex reinforcement or intricate formwork, thanks to its high workability. However, determining its 28-day CS typically relies on costly, time-consuming lab tests. This challenge is further complicated when recycled aggregates (RA) are incorporated to support eco-friendly construction, introducing variability that traditional methods struggle to account for.
While machine learning (ML) has shown promise in concrete strength prediction, most prior research has been limited to standard regression or selected ML models, often using small datasets and excluding recycled content. These approaches fall short in capturing the complex interactions among SCC components—especially when RA is involved.
Predicting the CS of SCC with recycled aggregates (RASCC) is particularly difficult due to material inconsistencies and the shortcomings of traditional mix design techniques. Additionally, many ML models lack integration with optimization methods like Bayesian Optimization (BO), which can significantly enhance predictive performance.
Methods
To develop the model, researchers compiled a dataset of 603 RASCC samples, each with eight input variables. To check for overfitting, 10% of the data was held out for testing.
Irrelevant or redundant data points were removed using feature selection techniques, and the remaining data was split into training and testing sets. The hybrid BO-GBRT model was then trained on this refined dataset. A five-fold cross-validation process helped fine-tune the model's hyperparameters for optimal performance.
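The training workflow described above can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic stand-ins for the 603-sample, eight-input dataset, the three tuned hyperparameters and their ranges are assumptions, and the Bayesian Optimization loop is a simple Gaussian-process surrogate with an expected-improvement rule built from scikit-learn and SciPy. Only the 10% hold-out and the five-fold cross-validation are taken from the study description.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the dataset (603 samples, 8 inputs); NOT the study's data.
X, y = make_regression(n_samples=603, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=0)  # 10% held out for testing

# Assumed search space: one (low, high) row per hyperparameter.
BOUNDS = np.array([
    [0.01, 0.30],    # learning_rate
    [50.0, 200.0],   # n_estimators
    [2.0, 5.0],      # max_depth
])

def make_model(params):
    learning_rate, n_estimators, max_depth = params
    return GradientBoostingRegressor(
        learning_rate=float(learning_rate),
        n_estimators=int(round(n_estimators)),
        max_depth=int(round(max_depth)),
        random_state=0,
    )

def cv_rmse(params):
    """Mean five-fold cross-validated RMSE on the training split (lower is better)."""
    scores = cross_val_score(make_model(params), X_train, y_train, cv=5,
                             scoring="neg_root_mean_squared_error")
    return -scores.mean()

def to_unit_cube(points):
    """Rescale hyperparameters to [0, 1] so the surrogate sees comparable axes."""
    return (points - BOUNDS[:, 0]) / (BOUNDS[:, 1] - BOUNDS[:, 0])

def expected_improvement(points, surrogate, best_loss):
    """Expected improvement over the best loss seen so far (minimisation)."""
    mu, sigma = surrogate.predict(points, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (best_loss - mu) / sigma
    return (best_loss - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)

def sample(n):
    return rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(n, len(BOUNDS)))

# Start with a few random configurations, then let the surrogate pick the rest.
tried = sample(4)
losses = np.array([cv_rmse(p) for p in tried])

for _ in range(6):
    surrogate = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                         normalize_y=True, random_state=0)
    surrogate.fit(to_unit_cube(tried), losses)
    candidates = sample(500)
    scores = expected_improvement(to_unit_cube(candidates), surrogate, losses.min())
    next_params = candidates[np.argmax(scores)]
    tried = np.vstack([tried, next_params])
    losses = np.append(losses, cv_rmse(next_params))

# Refit the best configuration on the full training split and check unseen data.
best_params = tried[np.argmin(losses)]
final_model = make_model(best_params).fit(X_train, y_train)
test_rmse = float(np.sqrt(np.mean((final_model.predict(X_test) - y_test) ** 2)))
print("best CV RMSE:", round(float(losses.min()), 3))
print("hold-out RMSE:", round(test_rmse, 3))
```

In this sketch the test split is touched only once, after tuning, so the hold-out RMSE is an independent check on the configuration chosen by cross-validation.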
Once trained, the model was evaluated on unseen data to verify its reliability. To understand how input features influenced the results, the researchers used Shapley Additive Explanations (SHAP). Model accuracy was assessed using key performance metrics: Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²).
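For readers unfamiliar with these metrics, the short sketch below computes all five from their standard definitions. The measured and predicted strengths are made-up values for illustration only, not results from the study, and the function name is hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (%), MSE, RMSE and R² using their standard definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the measured value, so it is undefined if any y_true is 0.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical 28-day strengths in MPa (illustrative only).
measured = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 51.0, 59.0]

metrics = regression_metrics(measured, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MAE and RMSE are in the same units as the strength itself, while MAPE and R² are unitless, which is why studies of this kind usually report them together.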
Results and Discussion
The BO-GBRT model consistently achieved low RMSE values during cross-validation, indicating strong generalizability and minimal risk of overfitting. Predicted CS values closely matched actual results in both training and testing datasets, confirming the model’s accuracy and reliability.
BO-GBRT outperformed the other ML models tested, such as Support Vector Regression (SVR) and K-nearest neighbors (KNN), delivering lower MAE and RMSE values and a higher R². These results highlight the advantages of incorporating Bayesian Optimization into the modeling process.
The SHAP analysis revealed which inputs had the greatest impact on CS predictions. Cement, coarse aggregates, mineral admixtures, and chemical admixtures contributed positively, while water and recycled fine aggregates tended to reduce compressive strength. These findings emphasize the importance of carefully optimizing material proportions to improve structural performance.
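The idea behind these positive and negative contributions can be shown with a small sketch that computes exact Shapley values by enumerating feature coalitions. Everything in it is an assumption made for illustration: the three-feature linear "strength model", its coefficients, and the mix quantities are invented, and a single baseline mix stands in for the background data that SHAP tooling would normally average over.

```python
from itertools import combinations
from math import factorial

FEATURES = ["cement", "water", "recycled_fine_aggregate"]

def toy_strength(mix):
    """Hypothetical strength model in MPa; the coefficients are invented."""
    return (10.0
            + 0.10 * mix["cement"]
            - 0.08 * mix["water"]
            - 0.02 * mix["recycled_fine_aggregate"])

def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to a single baseline mix."""
    n = len(FEATURES)

    def value(coalition):
        # Features in the coalition take the instance's values; the rest stay at baseline.
        mix = {f: (instance[f] if f in coalition else baseline[f]) for f in FEATURES}
        return model(mix)

    contributions = {}
    for feature in FEATURES:
        others = [f for f in FEATURES if f != feature]
        total = 0.0
        for size in range(len(others) + 1):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {feature})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        contributions[feature] = total
    return contributions

# Hypothetical mixes (quantities in kg/m³).
baseline = {"cement": 400.0, "water": 180.0, "recycled_fine_aggregate": 200.0}
instance = {"cement": 450.0, "water": 200.0, "recycled_fine_aggregate": 300.0}

phi = shapley_values(toy_strength, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f} MPa")
```

The contributions sum to the difference between the prediction for the mix and the prediction for the baseline, which is the property that lets a SHAP plot split one prediction into per-ingredient effects. Exact enumeration grows exponentially with the number of features, so practical tools rely on faster, model-specific algorithms.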
To make the model more accessible, the team developed a user-friendly web application using Streamlit. This tool enables civil engineers, researchers, and construction professionals to input material properties and receive instant CS predictions. Accessible via any web browser, the app helps bridge the gap between advanced ML techniques and real-world applications.
Conclusion and Future Outlook
This study successfully demonstrates how combining GBRT with Bayesian Optimization can accurately predict the compressive strength of RASCC—offering a faster, less expensive, and more consistent alternative to traditional testing. The model stands as a practical tool for advancing sustainable construction practices while highlighting the value of machine learning in civil engineering.
Looking ahead, there’s potential to enhance the model further. Exploring advanced metaheuristic algorithms—such as the Lightning Search Algorithm or Nuclear Reaction Optimization—could improve robustness and accuracy. Including additional variables like concrete density, slump, curing conditions, or temperature might also refine predictions. Lastly, applying other interpretability tools alongside SHAP could provide deeper insights into model behavior.
Journal Reference
Abood, E. A. et al. (2025). Self compacting concrete with recycled aggregate compressive strength prediction based on gradient boosting regression tree with Bayesian optimization hybrid model. Scientific Reports, 15(1). DOI: 10.1038/s41598-025-11161-0. https://www.nature.com/articles/s41598-025-11161-