GRACE: Generative Reasoning Agentic Control Environment for Autonomous High-Energy Physics Analysis

Hong Joo Ryoo, Johns Hopkins University

0000-0002-4032-0215

ACCESS Allocation Request PHY260006

Abstract: GRACE is an autonomous agent system designed to carry out the simulation and design responsibilities of an experimental physicist in High-Energy Physics (HEP): hypothesis formation, experimental design, detector simulation, data analysis, statistical inference, result interpretation, and proposal of experimental configurations based on the best simulation results. The system uses Large Language Models (LLMs) to reason through complex physics problems from first principles, eliminating the need for hardcoded domain-specific logic while maintaining scientific rigor. The goals and use cases are outlined below. We are currently developing the base architecture and require compute resources for benchmarking, storage, testing, and related purposes.

Research Goals:
1. Validate autonomous physics analysis capabilities on publicly available CERN Open Data, benchmarking against known published results.
2. Demonstrate generalization across different HEP experiments (ATLAS Higgs analysis, GlueX photoproduction) without experiment-specific code changes.
3. Develop fidelity-tiered simulation pipelines (T0-T3) that autonomously escalate from fast approximations to full detector simulation, moving to full Geant4 workflows only when physics sensitivity requires it.
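The escalation logic in Research Goal 3 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not GRACE's actual implementation: the `Tier` class, the tier descriptions, the `escalate` function, and the use of a single estimated-uncertainty number as the stopping criterion are all hypothetical. Only the tier labels T0-T3 and the idea of escalating to Geant4 when sensitivity requires it come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Tier:
    """One simulation fidelity level (hypothetical structure)."""
    name: str
    description: str


# Tier labels come from the proposal; the descriptions are illustrative guesses.
TIERS: Tuple[Tier, ...] = (
    Tier("T0", "fastest approximation"),
    Tier("T1", "intermediate fidelity (assumed)"),
    Tier("T2", "higher fidelity (assumed)"),
    Tier("T3", "full detector simulation, e.g. Geant4"),
)


def escalate(
    run_tier: Callable[[Tier], float],
    required_uncertainty: float,
    tiers: Sequence[Tier] = TIERS,
) -> Tuple[Tier, List[Tuple[str, float]]]:
    """Run tiers from cheapest to most expensive.

    `run_tier` is a caller-supplied function that runs the simulation at the
    given tier and returns an estimated uncertainty on the quantity of
    interest. Escalation stops at the first tier whose uncertainty meets the
    requirement; if none does, the highest tier is returned.
    """
    history: List[Tuple[str, float]] = []
    for tier in tiers:
        uncertainty = run_tier(tier)
        history.append((tier.name, uncertainty))
        if uncertainty <= required_uncertainty:
            return tier, history
    return tiers[-1], history
```

With a toy `run_tier` that returns uncertainties of 0.20, 0.10, 0.04, and 0.01 for T0 through T3, a requirement of 0.05 stops at T2 and never runs T3, which is the cost-saving behavior the goal describes.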
Planned Use of ACCESS Resources:
- GPU Computing: Local LLM inference using Ollama with open-weight models (Mistral, Qwen, LLaMA) for autonomous reasoning and decision-making during physics analysis workflows
- CPU Computing: Monte Carlo event generation (Pythia8), detector simulation (Delphes/Geant4), and statistical analysis pipelines
- Storage: CERN Open Data datasets (~10-50 GB), simulation outputs, and model checkpoints
- Benchmarking: Evaluate multi-agent reasoning performance (hypothesis arena, adversarial debate) across different compute configurations

Software Requirements:
- Containers: Singularity/Apptainer for HPC compatibility
- HEP Simulation Stack: Pythia8 (8.312), Delphes (3.5.0), Geant4 (11.2), ROOT
- ML/LLM Stack: Python 3.12+, PyTorch, Ollama, scikit-learn
- Analysis: NumPy, pandas, matplotlib

Credit Estimate: We request 400,000 credits.
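As a rough illustration of the local LLM inference listed under GPU Computing, the sketch below sends a single prompt to an Ollama server over its REST `/api/generate` endpoint using only the Python standard library. The helper names `build_request` and `generate`, the default timeout, and the example prompt are hypothetical; the sketch assumes an Ollama server is already running on its default local port with the named model pulled, and it is not taken from GRACE's code.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text.

    Requires a reachable Ollama server; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

A call such as `generate("mistral", "Propose a selection cut for H to gamma gamma candidates.")` would return the model's text reply. Setting `"stream": False` keeps the example simple by returning one JSON object instead of a stream of partial responses.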

Allocations:

2026 NCSA Delta GPU 2,999.0 GPU Hours
The estimated value of these awarded resources is $1,581.04. The allocation of these resources represents a considerable investment by the NSF in advanced computing infrastructure for the U.S. The dollar value of the allocation is estimated from the NSF awards supporting the allocated resources.
There are no other allocations for this project.

Other Titles:

There are no prior titles for this project.