Quantum many-body physics calculations with large language models

Abstract Large language models (LLMs) have demonstrated abilities to perform complex tasks in multiple domains, including mathematical and scientific reasoning. We demonstrate that with carefully designed prompts, LLMs can accurately carry out key calculations in research papers in theoretical physi...

Full description

Saved in:
Bibliographic Details
Main Authors: Haining Pan, Nayantara Mudur, William Taranto, Maria Tikhanovskaya, Subhashini Venugopalan, Yasaman Bahri, Michael P. Brenner, Eun-Ah Kim
Format: Article
Language:English
Published: Nature Portfolio 2025-01-01
Series:Communications Physics
Online Access:https://doi.org/10.1038/s42005-025-01956-y
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Abstract Large language models (LLMs) have demonstrated abilities to perform complex tasks in multiple domains, including mathematical and scientific reasoning. We demonstrate that with carefully designed prompts, LLMs can accurately carry out key calculations in research papers in theoretical physics. We focus on a broadly-used approximation method in quantum physics: the Hartree-Fock method, requiring an analytic multi-step calculation deriving approximate Hamiltonian and corresponding self-consistency equations. To carry out the calculations using LLMs, we design multi-step prompt templates that break down the analytic calculation into standardized steps with placeholders for problem-specific information. We evaluate GPT-4’s performance in executing the calculation for 15 papers from the past decade, demonstrating that, with the correction of intermediate steps, it can correctly derive the final Hartree-Fock Hamiltonian in 13 cases. Aggregating across all research papers, we find an average score of 87.5 (out of 100) on the execution of individual calculation steps. We further use LLMs to mitigate the two primary bottlenecks in this evaluation process: (i) extracting information from papers to fill in templates and (ii) automatic scoring of the calculation steps, demonstrating good results in both cases.
ISSN:2399-3650