Neural Networks and LLMs Accelerate Discovery of Low-Carbon Cement Alternatives

A new study has used large language models and neural networks to identify and predict the reactivity of over 14,000 potential cementitious materials, offering a scalable approach to discovering low-carbon clinker substitutes.

Study: Data-driven material screening of secondary and natural cementitious precursors. Image Credit: Anggalih Prasetya/Shutterstock.com

Published in Communications Materials, the study demonstrates how neural networks and large language models (LLMs) can be used to systematically map reactivity variations and expand the library of potential cementitious materials. By analyzing the chemical makeup of 14,000 materials extracted from 88,000 academic papers, researchers identified promising secondary and natural cementitious precursors and evaluated their reactivity and pozzolanicity through machine learning.

Background

Reducing the greenhouse gas (GHG) footprint of cement and concrete production is a pressing priority, and clinker substitution is one of the most effective strategies to achieve this. Cementitious precursors—materials that react with water to form cement-like hydration products—can replace a significant portion of clinker, with substitutes like fly ash and slag capable of reducing GHG intensity by up to 50 %.

The cement industry has set an ambitious goal to reduce the global clinker-to-cement mass ratio from 76 % to 52 % by 2050. Realizing this target requires a broader range of substitutes, which in turn demands reliable ways to evaluate the reactivity of diverse and often heterogeneous materials. Current methods, however, rely on time-consuming and resource-heavy experiments.

To address this, the study proposes a data-driven approach powered by machine learning to assess reactivity and pozzolanic potential at scale, dramatically accelerating the screening process for viable materials.

Methods

The research team began by filtering 5.7 million academic papers to isolate around 88,000 focused on cement and concrete. From this corpus, they built two vector databases—one for 3 million sentences and another for 104,000 structured tables—using high-dimensional embeddings to capture semantic meaning and enable efficient, targeted retrieval.
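The retrieval idea described above can be illustrated with a minimal sketch. It assumes a toy bag-of-words embedding in place of the learned high-dimensional embeddings used in the study, and the sentences, function names, and index layout are hypothetical stand-ins rather than the authors' implementation.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase token counts (a stand-in for a learned sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, index, k=2):
    """Return the k indexed sentences most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, vec), sentence) for sentence, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:k]]


# Hypothetical sentences, not taken from the study's corpus.
SENTENCES = [
    "The fly ash contained 52 wt% SiO2 and 24 wt% Al2O3.",
    "Compressive strength was measured after 28 days of curing.",
    "Rice husk ash is rich in amorphous silica.",
]
# The "vector database" here is just a list of (sentence, vector) pairs.
INDEX = [(s, embed(s)) for s in SENTENCES]

if __name__ == "__main__":
    print(top_k("SiO2 content of fly ash", INDEX, k=1))
```

In a real pipeline the count vectors would be replaced by dense embeddings and the linear scan by an approximate nearest-neighbor index, but the ranking-by-similarity step is the same.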

LLM agents, using a retrieval-augmented generation method, were then deployed to extract key data points such as chemical compositions and material names. The models employed included general-purpose sentence-embedding transformers like all-mpnet-base-v2 and all-MiniLM-L6-v2, as well as the materials science-specific MatSciBERT.
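As a rough illustration of retrieval-augmented generation, the sketch below shows only the prompt-assembly step: retrieved passages are numbered and placed in front of the extraction request so the model answers from that context. The function name and prompt wording are hypothetical, and no specific LLM API is assumed.

```python
def build_extraction_prompt(material, retrieved_chunks):
    """Assemble a prompt that asks a model to answer only from the retrieved context."""
    # Number each retrieved passage so an answer can be traced back to its source.
    context = "\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks))
    return (
        "Using only the context below, report the oxide composition of "
        f"'{material}' as JSON. Answer 'not found' if the context lacks it.\n\n"
        f"Context:\n{context}"
    )


if __name__ == "__main__":
    # Hypothetical retrieved passages, for illustration only.
    chunks = ["Fly ash: SiO2 52 wt%.", "Slag: CaO 40 wt%."]
    print(build_extraction_prompt("fly ash", chunks))
```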

Data extraction from retrieved tables was handled by fine-tuned versions of GPT-3.5 and Mistral, with GPT-3.5 ultimately chosen based on accuracy against a hand-labeled dataset. Challenges like vague or abbreviated material names were addressed using sentence-level semantic matching and metadata filters.
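Choosing an extraction model by accuracy against a hand-labeled set can be sketched as follows. The exact-match, field-level metric, the placeholder model names, and the records are all assumptions made for illustration; the study's actual scoring procedure and results are not reproduced here.

```python
def field_accuracy(predictions, labels):
    """Fraction of hand-labeled fields that a model extracted exactly."""
    total = correct = 0
    for pred, gold in zip(predictions, labels):
        for field, value in gold.items():
            total += 1
            correct += pred.get(field) == value  # True counts as 1
    return correct / total if total else 0.0


def pick_best(candidates, labels):
    """Score every candidate model and return the most accurate one with all scores."""
    scores = {name: field_accuracy(preds, labels) for name, preds in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical hand-labeled records and model outputs.
LABELS = [
    {"material": "fly ash", "SiO2": 52.0},
    {"material": "slag", "CaO": 40.1},
]
CANDIDATES = {
    "model_a": [{"material": "fly ash", "SiO2": 52.0}, {"material": "slag", "CaO": 40.1}],
    "model_b": [{"material": "FA", "SiO2": 52.0}, {"material": "slag"}],
}

if __name__ == "__main__":
    print(pick_best(CANDIDATES, LABELS))
```

Note that "FA" scores as a miss under exact matching, which is the kind of abbreviation problem that semantic matching is meant to handle.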

The machine learning pipeline itself was built in Python 3.8.8 and incorporated widely used libraries including TensorFlow, Scikit-learn, LightGBM, and NumPy. The LLM workflows were supported by PyTorch, Ludwig, LangChain, and SentenceTransformers.

Results and Discussion

The mapped reactivity profiles revealed a diverse range of promising clinker substitutes, including both industrial by-products and naturally occurring materials.

Agricultural wastes such as sugarcane bagasse ash and rice husk ash demonstrated strong pozzolanic behavior, while materials like tree bark ash appeared to function more as hydraulic precursors. Other identified materials included waste glass, municipal solid waste ashes, mine tailings, and construction and demolition debris like recycled ceramics and concrete.

The study also spotlighted 25 types of natural rocks, eight of which showed significant reactivity when mechanically activated. These included pumice, ignimbrites, rhyolite, opaline shales, trachyte, and tuffs—many of which are abundant in rift zones and seismic regions, offering geographically diverse sourcing opportunities.

Additionally, the research found that some crystalline rocks, such as anorthosite, could become reactive through amorphization techniques like vitrification. Clastic rocks rich in silica or volcanic content, often with higher amorphous fractions, also exhibited favorable reactivity.

These findings suggest that the availability of viable clinker substitutes is far broader than previously thought. However, integrating these materials into industrial-scale use will require not only reactivity validation but also supply chain analysis, durability testing, and cost evaluations.

Conclusion and Future Directions

By combining the strengths of LLMs and machine learning, the study introduced a powerful method for rapidly screening and evaluating cementitious materials. A multi-headed neural network model predicted three key reactivity metrics, namely heat release, bound water, and Ca(OH)₂ consumption, based on features like chemical composition, particle size, and phase content.
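To make the multi-headed idea concrete, here is a minimal forward-pass sketch in plain Python. It assumes one shared hidden layer feeding three single-output heads; the layer sizes, random untrained weights, and class name are hypothetical, and the article does not describe the study's actual architecture at this level of detail.

```python
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def init_layer(n_out, n_in, rng):
    """Random weights and zero biases; a trained model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiHeadNet:
    """Shared trunk with one output head per reactivity metric (untrained)."""

    HEADS = ("heat_release", "bound_water", "caoh2_consumption")

    def __init__(self, n_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(n_hidden, n_features, rng)
        self.heads = {name: init_layer(1, n_hidden, rng) for name in self.HEADS}

    def predict(self, features):
        # All heads read the same hidden representation of the input features.
        hidden = relu(dense(features, *self.trunk))
        return {name: dense(hidden, *params)[0] for name, params in self.heads.items()}


if __name__ == "__main__":
    net = MultiHeadNet(n_features=3)
    # Hypothetical normalized features, e.g. SiO2 fraction, CaO fraction, particle size.
    print(net.predict([0.5, 0.2, 0.1]))
```

Because the weights are random, the printed numbers carry no physical meaning; the point is only that one set of shared features yields three separate predictions.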

Among the materials tested, roughly 5 %–25 % of rock samples, such as silicic tuff, pumice, and shale, released more than 200 J/g of heat, suggesting substantial cementitious potential. Many of these materials are globally available in volcanic or tectonic regions, offering a scalable path toward reducing the carbon footprint of cement production.
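The 200 J/g screening criterion amounts to a simple threshold count per rock type, sketched below. The heat values are invented for illustration and are not data from the study; only the threshold comes from the text above.

```python
def fraction_above(heat_values, threshold=200.0):
    """Share of samples whose heat release (J/g) exceeds the threshold."""
    if not heat_values:
        return 0.0
    return sum(h > threshold for h in heat_values) / len(heat_values)


# Hypothetical predicted heat release values in J/g, keyed by rock type.
PREDICTED_HEAT = {
    "pumice": [150.0, 210.0, 260.0, 120.0],
    "granite": [40.0, 55.0],
}

if __name__ == "__main__":
    for rock, values in PREDICTED_HEAT.items():
        print(rock, fraction_above(values))
```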

While the study makes a compelling case for the use of AI-driven material screening, further experimental work is essential to validate predictions and ensure performance consistency. Future research could extend these models to include activation pathways such as vitrification or calcination, creating a more comprehensive framework for optimizing sustainable binder materials.

Journal Reference

Mahjoubi, S., Venugopal, V., Manav, I. B., AzariJafari, H., Kirchain, R. E., & Olivetti, E. A. (2025). Data-driven material screening of secondary and natural cementitious precursors. Communications Materials, 6(1). DOI: 10.1038/s43246-025-00820-4, https://www.nature.com/articles/s43246-025-00820-4


Written by Nidhi Dhull

Nidhi Dhull is a freelance scientific writer, editor, and reviewer with a PhD in Physics. Nidhi has extensive research experience in materials science. Her research has focused mainly on biosensing applications of thin films. During her PhD, she developed a noninvasive immunosensor for the hormone cortisol and a paper-based biosensor for E. coli bacteria. Her work has been published in reputed journals from publishers such as Elsevier and Taylor & Francis. She has also contributed significantly to several pending patents.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Dhull, Nidhi. (2025, June 10). Neural Networks and LLMs Accelerate Discovery of Low-Carbon Cement Alternatives. AZoBuild. Retrieved on June 10, 2025 from https://www.azobuild.com/news.aspx?newsID=23818.

  • MLA

    Dhull, Nidhi. "Neural Networks and LLMs Accelerate Discovery of Low-Carbon Cement Alternatives". AZoBuild. 10 June 2025. <https://www.azobuild.com/news.aspx?newsID=23818>.

  • Chicago

    Dhull, Nidhi. "Neural Networks and LLMs Accelerate Discovery of Low-Carbon Cement Alternatives". AZoBuild. https://www.azobuild.com/news.aspx?newsID=23818. (accessed June 10, 2025).

  • Harvard

    Dhull, Nidhi. 2025. Neural Networks and LLMs Accelerate Discovery of Low-Carbon Cement Alternatives. AZoBuild, viewed 10 June 2025, https://www.azobuild.com/news.aspx?newsID=23818.
