ChemGP¶
Overview¶
Gaussian Process accelerated optimization for computational chemistry.
ChemGP provides GP-surrogate methods that reduce the number of expensive
electronic structure evaluations (oracle calls) needed for geometry
optimization, saddle point search, and minimum energy path finding.
The core library (chemgp-core) is written in Rust for performance and
reproducibility.
Why GP acceleration?¶
In computational chemistry, evaluating a potential energy surface (PES) through density functional theory or coupled cluster methods is the dominant cost. A single DFT gradient evaluation on a 50-atom system takes minutes; a NEB calculation with 7 images and 100 iterations needs 700 such evaluations.
A Gaussian Process learns a surrogate of the PES from a handful of true evaluations. The surrogate is cheap to query (microseconds vs minutes), so the optimizer runs mostly on the surrogate and calls the true oracle only when the GP uncertainty is high. The result: the same geometry optimization that took 200 oracle calls with gradient descent takes 9 with a GP surrogate.
This works because potential energy surfaces are smooth (a consequence of the Born-Oppenheimer approximation, away from level crossings) and because each oracle call returns both the energy and its gradient. For a system with D Cartesian degrees of freedom, each call provides 1 + D observations (one scalar energy plus D gradient components), giving the GP an information-dense training signal.
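The 1 + D training signal can be made concrete with a minimal 1D sketch (D = 1, so each oracle call yields two observations: energy and gradient). This is illustrative plain Python, not the chemgp-core implementation; the derivative blocks are the standard squared-exponential formulas, and the oracle here is a toy quadratic.

```python
import math

def se(a, b, sig=1.0, ell=1.0):
    """SE kernel blocks between (energy, gradient) observations at a and b."""
    e = sig**2 * math.exp(-(a - b)**2 / (2 * ell**2))
    k_ff = e                                        # cov(f(a), f(b))
    k_fg = e * (a - b) / ell**2                     # cov(f(a), f'(b))
    k_gf = -e * (a - b) / ell**2                    # cov(f'(a), f(b))
    k_gg = e * (1/ell**2 - (a - b)**2 / ell**4)     # cov(f'(a), f'(b))
    return k_ff, k_fg, k_gf, k_gg

def solve(A, rhs):
    """Gauss-Jordan elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Toy oracle: f(x) = x^2. Two calls -> 2 * (1 + D) = 4 observations.
train = [-1.0, 1.0]
y = []
for x in train:
    y += [x * x, 2 * x]                    # energy, then gradient

# Covariance over [f(x1), f'(x1), f(x2), f'(x2)]
K = [[0.0] * 4 for _ in range(4)]
for i, a in enumerate(train):
    for j, b in enumerate(train):
        kff, kfg, kgf, kgg = se(a, b)
        K[2*i][2*j],   K[2*i][2*j+1]   = kff, kfg
        K[2*i+1][2*j], K[2*i+1][2*j+1] = kgf, kgg
for i in range(4):
    K[i][i] += 1e-8                        # jitter for numerical stability

alpha = solve(K, y)

def predict(xs):
    """Posterior mean of the energy at xs."""
    ks = []
    for b in train:
        kff, kfg, _, _ = se(xs, b)
        ks += [kff, kfg]
    return sum(k * a for k, a in zip(ks, alpha))

print(predict(-1.0))  # ~1.0: the GP interpolates the training energy
```

With gradients included in the covariance, the surrogate reproduces both the observed energies and the observed slopes, which is why two oracle calls already pin down the local shape of the surface.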
When GP acceleration helps (and when it does not)¶
GP surrogate methods are most effective when:
The oracle is expensive (DFT, coupled cluster, ML potentials with large models)
The PES is smooth (typical of ground-state electronic structure)
Gradients are available (the 1 + D information density is essential)
GP methods are not helpful when:
The oracle is cheap (empirical force fields, simple pair potentials) since GP training and prediction overhead may exceed the oracle cost
The PES has discontinuities or sharp features (phase transitions, level crossings)
Gradients are unavailable (energy-only GPs converge much more slowly)
Kernels¶
Two kernel types cover different use cases, unified under the Kernel enum:
MolInvDistSE :: Molecular kernel operating on inverse interatomic distances. Provides rotational and translational invariance by construction, with pair-type-specific length scales. Use for molecular systems where invariance under rigid-body motion matters.
CartesianSE :: Squared exponential operating directly on coordinates. Use for analytical test surfaces (Muller-Brown, model potentials) or when rotational invariance is not needed.
Choosing: for real molecules, always use MolInvDistSE. The CartesianSE
kernel is for 2D/3D test surfaces where coordinates have direct physical meaning
and there is no concept of interatomic distances.
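Why inverse-distance features give rigid-body invariance can be seen in a few lines of plain Python. This is a toy illustration, not the chemgp-core API, and it uses a single shared length scale where MolInvDistSE uses pair-type-specific ones:

```python
import math

def inv_dists(coords):
    """Inverse interatomic distances 1/r_ij over all pairs i < j."""
    out = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            out.append(1.0 / math.dist(coords[i], coords[j]))
    return out

def se_kernel(u, v, ell=1.0):
    """Squared exponential on feature vectors (single length scale)."""
    d2 = sum((a - b)**2 for a, b in zip(u, v))
    return math.exp(-d2 / (2 * ell**2))

# A bent 3-atom geometry (arbitrary units)
mol = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.3, 0.9, 0.0)]

# Rigid-body motion: rotate 40 degrees about z, then translate
t = math.radians(40)
moved = [(x * math.cos(t) - y * math.sin(t) + 5.0,
          x * math.sin(t) + y * math.cos(t) - 2.0,
          z + 1.0) for x, y, z in mol]

f1, f2 = inv_dists(mol), inv_dists(moved)
print(se_kernel(f1, f2))  # 1.0 (up to rounding): features unchanged
```

Because the features depend only on interatomic distances, rotating or translating the whole molecule leaves them untouched, so the kernel sees the two geometries as identical.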
Methods¶
ChemGP provides four GP-accelerated optimization methods. Each targets a different problem in computational chemistry:
- Minimization
Find a local minimum of the PES (equilibrium geometry). Use when you know the system is near a minimum and want to relax it. Converges in 7 oracle calls on Muller-Brown (vs 34 direct GD) and 9 on LEPS (vs 200).
- Dimer
Find a first-order saddle point (transition state) by following the lowest curvature mode uphill. Use when you have an approximate transition state geometry and want to refine it. Converges in ~13 oracle calls vs ~45 standard.
- NEB
Find the minimum energy path between two known states (reactant and product). Use when you know both endpoints and want the reaction pathway and barrier height. OIE variant converges in ~49 oracle calls vs ~127 standard.
- OTGPD
Adaptive variant of the GP-Dimer that automatically adjusts the GP trust threshold. Matches GP-Dimer efficiency with less manual tuning.
Which method should I use?¶
| I want to… | Use | Tutorial |
|---|---|---|
| Relax a geometry to a minimum | Minimization | |
| Find a transition state | Dimer or OTGPD | |
| Find a reaction pathway | NEB | |
| Refine a saddle from NEB | Dimer (with NEB orient) | |
Unified architecture¶
All methods share four mechanisms that together make GP acceleration practical. Each solves a specific problem:
FPS subset selection :: The GP covariance matrix is N(1+D) x N(1+D) for N training points in D dimensions. Cholesky factorization costs O(N^3 (1+D)^3). FPS selects the K most informative points, keeping K small enough for fast training while covering the region of interest.
Trust region clipping :: A GP extrapolating far from its training data can predict arbitrary nonsense. Trust regions (EMD for molecules, Euclidean for Cartesian surfaces) clip proposed steps to regions where the GP has data coverage.
RFF approximation :: Random Fourier Features replace the O(N^3) exact GP with an O(D_rff^3) linear model for inner-loop predictions. Hyperparameters are still trained exactly on the FPS subset; only the prediction model uses the approximation.
LCB exploration :: Lower Confidence Bound adds a variance penalty to prevent the optimizer from getting stuck in regions where the GP is confident but wrong. Adapted per method: standard for minimization, perpendicular-force variance for NEB OIE, not applicable for dimer (always evaluates midpoint).
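Farthest point sampling itself is short enough to sketch. This is a generic plain-Python version operating on Euclidean distances; the selection criterion and API in chemgp-core may differ:

```python
import math

def fps(points, k, dist=math.dist):
    """Greedy farthest point sampling: repeatedly add the point farthest
    from the current subset, so K points cover the set well."""
    chosen = [0]                                  # seed with the first point
    d = [dist(p, points[0]) for p in points]      # distance to subset
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: d[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            d[i] = min(d[i], dist(p, points[nxt]))
    return chosen

# Three clusters of training points; FPS picks one from each
pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.1), (0.0, 5.0)]
print(fps(pts, 3))  # [0, 3, 4]: one representative per cluster
```

Near-duplicate points contribute almost no new information to the GP, and the greedy farthest-first rule skips them automatically.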
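The RFF idea in miniature: cosine features with frequencies drawn from the kernel's spectral density, whose dot product approximates the exact SE kernel. Plain Python, single shared length scale, illustrative only:

```python
import math
import random

def rff_features(x, ws, bs):
    """phi(x) = sqrt(2/D_rff) * cos(w.x + b); E[phi(x).phi(y)] = k(x, y)."""
    s = math.sqrt(2.0 / len(ws))
    return [s * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(ws, bs)]

random.seed(0)
dim, d_rff, ell = 2, 2000, 1.0
# SE kernel spectral density: w ~ N(0, 1/ell^2) per component, b ~ U(0, 2pi)
ws = [[random.gauss(0.0, 1.0 / ell) for _ in range(dim)] for _ in range(d_rff)]
bs = [random.uniform(0.0, 2 * math.pi) for _ in range(d_rff)]

x, y = (0.3, -0.2), (1.0, 0.5)
approx = sum(a * b for a, b in zip(rff_features(x, ws, bs),
                                   rff_features(y, ws, bs)))
exact = math.exp(-sum((a - b)**2 for a, b in zip(x, y)) / (2 * ell**2))
print(abs(approx - exact))  # small: error shrinks as O(1/sqrt(D_rff))
```

Once the features are drawn, prediction is a linear model in phi(x), which is why the inner-loop cost depends on D_rff rather than on the number of training points.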
Learning path¶
The tutorials progress from fundamentals to production use:
GP Basics – GP regression, kernels, posterior, covariance blocks
Minimization – GP-guided minimization, LCB, trust regions
Molecular Kernels – Invariant features, pair-type length scales
Hyperparameter Training – MAP-NLL, SCG, log-space, NLL landscape
Scalability – FPS subset selection, RFF approximation
Dimer Method – Saddle point search, GP-Dimer, OTGPD
NEB – Minimum energy paths, AIE vs OIE, LCB scoring
Constant Kernel – the constant term sigma_c^2 for molecular systems
Production Minimization – GP minimize on real PES via RPC
Production Saddle Search – GP-Dimer + OTGPD on real PES
Production NEB – Full pipeline on real PES
New to GPs? Start with tutorial 1. Want production use? Jump to tutorial 9. Building on ChemGP? See Architecture and Kernel Design.
Getting started¶
Tutorials¶
Reference¶