*Important notice: This news item reports on an unedited version of a paper that has been accepted and is awaiting final editing. Scientific Reports sometimes publishes preliminary scientific reports that are not fully edited and therefore should not be regarded as conclusive or treated as established information.
AI-driven framework designs geopolymer concrete mixes with high accuracy, reducing reliance on trial-and-error. This approach enables low-carbon construction materials while maintaining strength and performance.
Study: Two stage AI framework for strength prediction and generative LLM for geopolymer concrete. Image Credit: helloRuby/Shutterstock
The global construction sector faces a critical challenge due to the high carbon footprint of Portland cement. In a recent study published in Scientific Reports, researchers introduced a two-stage artificial intelligence (AI) framework for the design of geopolymer concrete (GPC).
This innovative system achieved high predictive accuracy, with a coefficient of determination (R²) of 0.9648 for compressive strength. By combining machine learning (ML) with generative large language models (LLMs), this approach shifts the focus from analysis to active material design, significantly reducing reliance on time-consuming trial and error.
GPC: A Sustainable Alternative
Traditional Portland cement concrete is a major source of global greenhouse gas emissions. This has driven interest in low-carbon alternatives such as GPC. GPC utilizes industrial byproducts such as fly ash and ground-granulated blast-furnace slag (GGBFS) as binders, which are activated with alkaline solutions to form a durable inorganic polymer matrix.
This material can match or exceed the strength of conventional concrete while substantially reducing environmental impact. However, GPC mix design is complex, depending on multiple interacting factors, including precursor ratios, alkaline concentration, curing temperature, and age. Elevated curing temperatures (around 65 °C) can accelerate strength development, allowing up to 70% of the final strength to be reached within 24 hours.
Two-Stage AI Framework for Mix Design
Researchers developed a two-stage AI framework to address the inverse design problem for geopolymer concrete, in which a target property is specified and the mix that achieves it must be found. They used a dataset of 820 mix designs collected from the scientific literature. Data preprocessing included removing outliers using the Interquartile Range (IQR) method and augmenting the data with Gaussian noise to improve model generalization.
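The two preprocessing steps can be illustrated with a minimal Python sketch. It is not the authors' code: the 1.5 × IQR fence, the 1% relative noise level, the function names, and the sample values are all assumptions chosen for illustration, since the article does not report the settings used.

```python
import random
import statistics

def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed, conventional fence)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def augment_with_noise(values, rel_sigma=0.01, copies=1, seed=0):
    """Append noisy copies of each value; rel_sigma is an assumed relative noise level."""
    rng = random.Random(seed)
    augmented = list(values)
    for _ in range(copies):
        augmented.extend(v + rng.gauss(0.0, rel_sigma * abs(v)) for v in values)
    return augmented

# Made-up compressive strengths in MPa; 120 is an obvious outlier.
strengths = [30, 32, 35, 38, 40, 41, 45, 120]
cleaned = iqr_filter(strengths)
expanded = augment_with_noise(cleaned)
print(cleaned)
print(len(expanded))
```

In a real tabular dataset the filter would typically be applied per column and whole rows dropped, and noise would usually be added only to the training split so that test data stay untouched.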
In the first phase, three predictive models were trained to estimate compressive strength from key input variables, including precursor composition, alkaline molarity, curing temperature, and mix age. These models were an extreme gradient boosting (XGBoost) model optimized with a Genetic Algorithm (GA), a TabTransformer for tabular deep learning, and an Artificial Neural Network (ANN) trained with the Levenberg-Marquardt algorithm. The genetic algorithm tuned XGBoost hyperparameters such as learning rate and tree depth to enhance performance.
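To show how a genetic algorithm can tune hyperparameters such as learning rate and tree depth, here is a small, self-contained sketch. It is an illustration under stated assumptions rather than the study's implementation: the search ranges, population settings, and function names are hypothetical, and `toy_loss` is a stand-in for what would in practice be a cross-validated error of the boosted model.

```python
import random

# Hypothetical search ranges; the article does not report the ones used.
BOUNDS = {"learning_rate": (0.01, 0.3), "max_depth": (2, 10)}

def random_individual(rng):
    return {
        "learning_rate": rng.uniform(*BOUNDS["learning_rate"]),
        "max_depth": rng.randint(*BOUNDS["max_depth"]),
    }

def toy_loss(ind):
    # Stand-in for a cross-validated RMSE; its minimum sits at
    # learning_rate=0.1, max_depth=6 purely by construction.
    return 100 * (ind["learning_rate"] - 0.1) ** 2 + 0.1 * (ind["max_depth"] - 6) ** 2

def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one of the two parents.
    return {key: rng.choice((a[key], b[key])) for key in BOUNDS}

def mutate(ind, rng, rate=0.3):
    # Perturb each gene with probability `rate`, then clamp to its range.
    child = dict(ind)
    if rng.random() < rate:
        low, high = BOUNDS["learning_rate"]
        step = rng.gauss(0.0, 0.02)
        child["learning_rate"] = min(high, max(low, child["learning_rate"] + step))
    if rng.random() < rate:
        low, high = BOUNDS["max_depth"]
        child["max_depth"] = min(high, max(low, child["max_depth"] + rng.choice((-1, 1))))
    return child

def genetic_search(loss, pop_size=20, generations=30, elite=4, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=loss)
        parents = population[:elite]  # elitism: the best individuals survive unchanged
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return min(population, key=loss)

best = genetic_search(toy_loss)
print(best, toy_loss(best))
```

Because the elite individuals are carried over unchanged, the best loss never gets worse from one generation to the next. Swapping `toy_loss` for a function that trains a model and returns its validation error turns the same loop into a real tuner, at a much higher cost per evaluation.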
In the second phase, a fine-tuned generative model based on Facebook’s OPT-350M (350 million parameters) was developed to interpret and generate structured mix designs. The system operates through a dual-iteration process. In the first iteration, the LLM generates a coherent, human-readable mix design based on the target compressive strength. In the second, a specialized XGBoost pipeline replaces the LLM's initial numerical estimates with high-precision predictions, so that the outputs are both numerically valid and linguistically interpretable.
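The draft-then-refine idea can be sketched in a few lines of Python. Everything here is hypothetical: the text template, the field names, the quantities, and the coefficients of `stub_strength_model` are invented placeholders, not values or formats from the study. The two stub functions stand in for the fine-tuned language model and the trained regressor, neither of which is reproduced here.

```python
import re

def draft_mix_text(target_mpa):
    # Placeholder for the fine-tuned language model: returns a fixed template.
    # All quantities are invented for illustration.
    return (
        "Fly ash: 300 kg/m3; GGBFS: 100 kg/m3; NaOH molarity: 12 M; "
        "Curing temperature: 65 C; Age: 28 days; "
        f"Estimated compressive strength: {target_mpa:.1f} MPa"
    )

FIELD = re.compile(r"([A-Za-z ]+):\s*([\d.]+)")

def parse_fields(text):
    # Turn "Name: number" pairs in the draft into a feature dictionary.
    return {name.strip(): float(value) for name, value in FIELD.findall(text)}

def stub_strength_model(features):
    # Placeholder for the trained regressor; the coefficients are invented.
    return (
        0.05 * features["Fly ash"]
        + 0.2 * features["GGBFS"]
        + 0.5 * features["NaOH molarity"]
    )

def refine(text, model):
    # Second iteration: overwrite the drafted strength with the model's prediction.
    predicted = model(parse_fields(text))
    return re.sub(
        r"(Estimated compressive strength:\s*)[\d.]+",
        lambda m: f"{m.group(1)}{predicted:.1f}",
        text,
    )

print(refine(draft_mix_text(40.0), stub_strength_model))
```

The design point is the division of labor: the language model supplies readable structure, while a numerical model that is easier to validate supplies the figures.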
Performance Evaluation and Key Findings
The outcomes demonstrated significant advancements in material design. In the predictive phase, the GA-optimized XGBoost model achieved a root-mean-square error (RMSE) of 2.8823 MPa and a mean absolute error (MAE) of 1.9053 MPa, explaining approximately 96.48% of the variance in compressive strength. Residual plots and Taylor diagrams confirmed stable, homoscedastic performance across normal and high-strength ranges.
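For readers less familiar with these metrics, the following Python sketch computes RMSE, MAE, and R² from their standard definitions. The sample values are made up for illustration and are unrelated to the study's data; an R² of 0.9648 corresponds to the reported 96.48% of variance explained.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R² for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }

# Made-up strengths in MPa.
measured = [30, 40, 50, 60]
predicted = [32, 38, 51, 59]
print(regression_metrics(measured, predicted))
```

RMSE penalizes large misses more heavily than MAE, which is why the two are usually reported together.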
The TabTransformer reached an R² of 0.9453, highlighting the effectiveness of self-attention mechanisms for tabular material data. Feature importance analysis identified sodium hydroxide molarity and slag (GGBFS) dosage as the most influential variables.
In the generative phase, the hybrid LLM produced text that aligned closely with the reference designs. Text quality metrics showed a BERTScore (a similarity measure based on Bidirectional Encoder Representations from Transformers) of 0.9754 and a Recall-Oriented Understudy for Gisting Evaluation, Longest Common Subsequence (ROUGE-L) score of 0.8794.
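ROUGE-L rests on the longest common subsequence (LCS) of words shared, in order, between a generated text and its reference. The sketch below is a simplified version that assumes whitespace tokenization and a balanced F1 score; the article does not say which ROUGE-L variant or tokenizer the authors used, and the example strings are invented.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(reference, candidate):
    """Balanced F1 over LCS-based precision and recall (an assumed variant)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

# Invented example strings: one token differs between them.
reference = "fly ash 300 slag 100 naoh 12"
candidate = "fly ash 300 slag 120 naoh 12"
print(round(rouge_l_f1(reference, candidate), 4))
```

A single wrong number lowers the score only slightly, which is one reason text-overlap metrics are reported alongside per-component numerical accuracy rather than in place of it.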
Numerical predictions for components such as fly ash (R² = 0.9858), GGBFS (R² = 0.9867), and coarse aggregates (R² = 0.9896) demonstrated near-perfect agreement, confirming that the system can generate mix designs that are both chemically valid and ready for laboratory validation.
Practical Applications in Modern Construction
This AI framework has considerable practical value for modern construction. Engineers can input target performance metrics, such as the required 28-day compressive strength, and receive a complete mix design instantly, significantly reducing reliance on lengthy trial-and-error testing. The approach shortens development cycles for sustainable materials, enabling faster adoption of geopolymer concrete in large-scale projects and supporting the use of local industrial by-products to reduce material costs and transport-related emissions.
Future Directions for Sustainable Material Design
In summary, this study represents a shift from merely predicting material properties to actively designing new materials. By combining generative AI with numerical modeling, the model evolves into a design partner capable of discovering high-performance formulations. The approach enables faster and more reliable development of sustainable materials.
Future work should expand the model to include durability factors such as fire resistance and long-term shrinkage, and should validate it more broadly. Improving dataset quality will further strengthen reliability. As the construction sector moves toward decarbonization, such AI-driven systems can accelerate the development of low-carbon infrastructure worldwide.
Journal Reference
Aravind Unni, M.S., Akhil, V.M. & Philip, S. (2026). Two stage AI framework for strength prediction and generative LLM for geopolymer concrete. Sci Rep. DOI: 10.1038/s41598-026-49329-x, https://www.nature.com/articles/s41598-026-49329-x
Disclaimer: The views expressed here are those of the author expressed in their private capacity and do not necessarily represent the views of AZoM.com Limited T/A AZoNetwork the owner and operator of this website. This disclaimer forms part of the Terms and conditions of use of this website.