Science Cast

Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations

librarian, May 28, 2025 12:43pm

This paper is a preprint and has not been certified by peer review.

arXiv, May 27, 2025 12:00am

Authors

Hao Li, He Cao, Bin Feng, Yanjun Shao, Xiangru Tang, Zhiyuan Yan, Li Yuan, Yonghong Tian, Yu Li

Abstract

While large language models (LLMs) with Chain-of-Thought (CoT) reasoning excel in mathematics and coding, their potential for systematic reasoning in chemistry, a domain demanding rigorous structural analysis for real-world tasks like drug design and reaction engineering, remains untapped. Current benchmarks focus on simple knowledge retrieval, neglecting the step-by-step reasoning required for complex tasks such as molecular optimization and reaction prediction. To address this, we introduce ChemCoTBench, a reasoning framework that bridges molecular structure understanding with arithmetic-inspired operations, including addition, deletion, and substitution, to formalize chemical problem-solving into transparent, step-by-step workflows. By treating molecular transformations as modular "chemical operations", the framework enables slow-thinking reasoning, mirroring the logic of mathematical proofs while grounding solutions in real-world chemical constraints. We evaluate models on two high-impact tasks: Molecular Property Optimization and Chemical Reaction Prediction. These tasks mirror real-world challenges while remaining amenable to structured evaluation. By providing annotated datasets, a reasoning taxonomy, and baseline evaluations, ChemCoTBench bridges the gap between abstract reasoning methods and practical chemical discovery, establishing a foundation for advancing LLMs as tools for AI-driven scientific innovation.
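To make the idea of modular "chemical operations" concrete, here is a minimal toy sketch in Python. It is not ChemCoTBench's data format or evaluation code: the `Molecule` class, the `add`/`delete`/`substitute` helpers, and `run_workflow` are all hypothetical names introduced for illustration, and a molecule is reduced to a scaffold name plus a list of functional-group labels rather than a real chemical structure. The sketch only shows how addition, deletion, and substitution can be chained into a step-by-step workflow in which every intermediate state stays inspectable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """Toy stand-in for a molecule (assumption: scaffold name + group labels)."""
    scaffold: str
    groups: tuple = ()


def add(mol, group):
    """Addition: attach one functional group to the scaffold."""
    return Molecule(mol.scaffold, tuple(sorted(mol.groups + (group,))))


def delete(mol, group):
    """Deletion: remove one occurrence of a functional group."""
    if group not in mol.groups:
        raise ValueError(f"{group!r} is not present on {mol.scaffold}")
    remaining = list(mol.groups)
    remaining.remove(group)
    return Molecule(mol.scaffold, tuple(remaining))


def substitute(mol, old, new):
    """Substitution: a deletion followed by an addition."""
    return add(delete(mol, old), new)


# Hypothetical registry mapping operation names to functions.
OPS = {"add": add, "delete": delete, "substitute": substitute}


def run_workflow(mol, steps):
    """Apply each (operation, *args) step and keep every intermediate state."""
    trace = [mol]
    for name, *args in steps:
        mol = OPS[name](mol, *args)
        trace.append(mol)
    return trace


trace = run_workflow(
    Molecule("benzene", ("OH",)),
    [("add", "CH3"), ("substitute", "OH", "NH2"), ("delete", "CH3")],
)
final = trace[-1]
for state in trace:
    print(state)
```

Because each step returns a new immutable `Molecule`, the trace doubles as a record of the reasoning chain, which is the property the abstract describes as a transparent, step-by-step workflow. A realistic version would operate on actual molecular graphs and check chemical validity at each step.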
