Leveraging the convergence of large language models (LLMs) and test-time scaling techniques, we introduce a novel approach to computational materials design based on data distillation and budget forcing. Inspired by the “s1: Simple Test-Time Scaling” paradigm, our methodology extracts high-quality training data from pivotal research on rate-dependent plasticity and nickel alloys. Specifically, two dedicated models (“gemini-2.0-flash-thinking-exp-01-21” and “gemini-2.5-pro-preview-03-25”) are used with carefully designed prompts to distill research questions, methodological reasoning traces, and concise analytical solutions from the literature. The resulting dataset, albeit limited to 43 samples at this initial stage, serves as the foundation for supervised fine-tuning (SFT) of a Qwen-2.5-7B-Instruct model. By incorporating budget forcing, a mechanism that enforces extended reasoning through forced output generation, our preliminary experiments reveal significant improvements in the rigor of mathematical formulations, the soundness of logical reasoning, and the coherence of derivations compared to the baseline. While the current results do not yet rival state-of-the-art models, our findings demonstrate promising potential. Future work will focus on expanding the training corpus and scaling the model to 32B parameters, ultimately aiming to develop an interactive LLM framework capable of synthesizing methodologies from the literature, critiquing methodological discrepancies, and integrating numerical implementation with real-time user feedback.
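To make the budget-forcing mechanism concrete, the sketch below shows one way it can be applied at inference time in the spirit of the s1 recipe: when the model stops reasoning before a minimum token budget is reached, a continuation cue is appended and generation is resumed. This is a minimal illustration, assuming a Hugging Face transformers checkpoint; the model name, the "Wait" cue, the token budget, and the helper function are illustrative assumptions, not the exact configuration used in this work.

```python
# Minimal sketch of budget forcing at inference time (illustrative only).
# Assumes the Qwen/Qwen2.5-7B-Instruct checkpoint from Hugging Face; the
# continuation cue and token budgets are placeholder choices.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-7B-Instruct"  # assumed base checkpoint
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, device_map="auto")

def generate_with_budget(prompt: str,
                         min_new_tokens: int = 512,
                         num_extensions: int = 2) -> str:
    """Force extended reasoning: if the model terminates before the token
    budget is spent, append a continuation cue and resume generation."""
    text = tok.apply_chat_template(
        [{"role": "user", "content": prompt}],
        tokenize=False,
        add_generation_prompt=True,
    )
    for _ in range(num_extensions + 1):
        inputs = tok(text, return_tensors="pt").to(model.device)
        out = model.generate(**inputs,
                             max_new_tokens=min_new_tokens,
                             do_sample=False)
        new_tokens = out[0][inputs["input_ids"].shape[1]:]
        text += tok.decode(new_tokens, skip_special_tokens=True)
        # If generation ended early (fewer tokens than the budget), the model
        # likely emitted an end-of-sequence token; nudge it to keep reasoning.
        if len(new_tokens) < min_new_tokens:
            text += "\nWait"
        else:
            break
    return text

# Example usage (hypothetical prompt from the distilled corpus):
# answer = generate_with_budget("Derive the rate-dependent flow rule for a "
#                               "viscoplastic nickel alloy under uniaxial load.")
```

The same idea carries over to other continuation cues or budgets; the key design choice is that termination is overridden until a minimum amount of reasoning has been produced, which is what drives the longer, more rigorous derivations reported above.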