Geometry optimizations in Python#
The tblite Python package allows to run extended tight binding (xTB) calculations directly in Python. This tutorial demonstrates how to set up and run a geometry optimizations using GFN2-xTB.
Installing the package#
To start create a new Python environment using the mamba package manager. We specify the packages we want to install in our environment file:
name: xtb
channels:
- conda-forge
dependencies:
- altair
- chemiscope
- ipykernel
- pyberny
- tblite-python
Save the file as environment.yml and create the environment by running
mamba env create -n xtb -f environment.yml
mamba activate xtb
This will create a new environment called xtb and install all the necessary packages. Make sure that tblite is available in your Python environment. You can check this by opening a Python interpreter and importing the package
import tblite.interface as tb
tb.library.get_version()
(0, 6, 0)
First steps#
In the geometry optimization one needs to compute the potential energy and the derivatives like the forces and have a procedure to move on the potential energy surface to find the minima. The xTB calculator can provide the energy and derivatives and can be combined with different geometry optimization procedure. One example would be the pyberny package as a general geometry optimization procedure which we will apply it in this tutorial.
In this tutorial we are using caffeine molecule as a representative input.
%%writefile caffeine.xyz
24
C 1.07317 0.04885 -0.07573
N 2.51365 0.01256 -0.07580
C 3.35199 1.09592 -0.07533
N 4.61898 0.73028 -0.07549
C 4.57907 -0.63144 -0.07531
C 3.30131 -1.10256 -0.07524
C 2.98068 -2.48687 -0.07377
O 1.82530 -2.90038 -0.07577
N 4.11440 -3.30433 -0.06936
C 5.45174 -2.85618 -0.07235
O 6.38934 -3.65965 -0.07232
N 5.66240 -1.47682 -0.07487
C 7.00947 -0.93648 -0.07524
C 3.92063 -4.74093 -0.06158
H 0.73398 1.08786 -0.07503
H 0.71239 -0.45698 0.82335
H 0.71240 -0.45580 -0.97549
H 2.99301 2.11762 -0.07478
H 7.76531 -1.72634 -0.07591
H 7.14864 -0.32182 0.81969
H 7.14802 -0.32076 -0.96953
H 2.86501 -5.02316 -0.05833
H 4.40233 -5.15920 0.82837
H 4.40017 -5.16929 -0.94780
Writing caffeine.xyz
In the full version of our optimization loop, we will have the optimizer make steps, compute the energy and forces with xTB. The results for each step will be saved for further visualization and analysis.
Here, we start with looking to the optimizer and geometry setup. Pyberny optimizer can read the xyz file and Berny optimizer acts as an iterator to provide new geometry steps. However to use the geometry in the xTB calculator we need to convert it from the pyberny format to the tblite format. For xTB we need to separate the element symbols and the cartesian coordinates.
import numpy as np
from berny import Berny, geomlib, angstrom
optimizer = Berny(geomlib.readfile("caffeine.xyz"))
geom = next(optimizer)
elements = [symbol for symbol, _ in geom]
initial_coordinates = np.asarray([coordinate for _, coordinate in geom])
/home/docs/checkouts/readthedocs.org/user_builds/tblite/conda/stable/lib/python3.14/site-packages/berny/species_data.py:6: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
from pkg_resources import resource_string
Note
Remember that coordinates in tblite might use a different unit than our optimizer, in this case pyberny uses Angstrom and tblite Bohr. With the provided conversion factor we ensure that the coordinates are in the right unit. While the energy unit Hartree is compatible for us, we need to account for the gradient unit, which is Hartree/Angstrom and convert the gradient accordingly.
xtb = tb.Calculator("GFN2-xTB", tb.symbols_to_numbers(elements), initial_coordinates * angstrom)
results = xtb.singlepoint()
initial_energy = results["energy"]
initial_gradient = results["gradient"]
optimizer.send((initial_energy, initial_gradient / angstrom))
------------------------------------------------------------
cycle total energy energy error density error
------------------------------------------------------------
1 -41.75162462815 -4.2243950E+01 1.9479957E-01
2 -42.11867876399 -3.6705414E-01 7.5972202E-02
3 -42.14180557575 -2.3126812E-02 4.6343403E-02
4 -42.14537345302 -3.5678773E-03 1.2550676E-02
5 -42.14691416500 -1.5407120E-03 6.1305239E-03
6 -42.14742063310 -5.0646810E-04 2.4358092E-03
7 -42.14744770814 -2.7075046E-05 1.0726515E-03
8 -42.14746243612 -1.4727977E-05 3.6117557E-04
9 -42.14746301473 -5.7861259E-07 1.6698189E-04
10 -42.14746312030 -1.0556375E-07 5.4777956E-05
11 -42.14746315750 -3.7209695E-08 2.5307046E-05
12 -42.14746315860 -1.0997638E-09 8.2333730E-06
------------------------------------------------------------
total: 0.053 sec
Full optimization loop#
All the steps up to now can be run in a loop to retrieve the updated coordinates. From there we can update our xTB calculator with the new positions, evaluate the energy and gradient to pass them back to the optimizer.
xtb.set("verbosity", 0)
trajectory = [(initial_energy, initial_gradient, initial_coordinates)]
for geom in optimizer:
coordinates = np.asarray([coordinate for _, coordinate in geom])
xtb.update(positions=coordinates * angstrom)
results = xtb.singlepoint(results)
energy = results["energy"]
gradient = results["gradient"]
optimizer.send((energy, gradient / angstrom))
trajectory.append((energy, gradient, coordinates))
This loop is completed by optimizer if the geometry is converged and the local minimum is reached.
Visualization#
In this step one can visualize the energy, gradient as well as coordinates. Thus, the energy change during the optimization for our example looks like:
import altair as alt
import polars as pl
df = pl.DataFrame(
{
"step": np.arange(len(trajectory)),
"energy": np.asarray([energy for energy, _, _ in trajectory]),
"gradient": np.asarray([np.abs(gradient).mean() for _, gradient, _ in trajectory]),
}
)
base = alt.Chart(df).encode(
x=alt.X("step", axis=alt.Axis(tickCount=len(df)), title="Step"),
)
gradient_line = base.mark_line(color="orange").encode(
y=alt.Y("gradient", title="Mean Gradient (Hartree/Bohr)", scale=alt.Scale(zero=False)),
)
energy_line = base.mark_line(color="blue").encode(
y=alt.Y("energy", title="Energy (Hartree)", scale=alt.Scale(zero=False)),
)
alt.layer(energy_line, gradient_line).resolve_scale(y="independent").properties(
title="Caffeine Optimization with xTB",
width=600,
height=400,
)
Next we want to visualize the geometry change during the optimization, for this we will use the chemiscope package. In this way we can have an interactive visualization of the geometry next to the energy and gradient plot.
import json
import chemiscope
def write_chemiscope_input(
name: str,
elements: list[str],
trajectory: list[tuple[np.ndarray, np.ndarray, np.ndarray]],
) -> dict:
"""Format geometry optimization data for Chemiscope."""
return {
"meta": {"name": name},
"structures": [
{
"size": len(elements),
"names": elements,
"x": coordinates[:, 0].tolist(),
"y": coordinates[:, 1].tolist(),
"z": coordinates[:, 2].tolist(),
}
for _, _, coordinates in trajectory
],
"properties": {
"step": {
"units": "",
"target": "structure",
"values": np.arange(len(trajectory)).tolist(),
},
"energy": {
"units": "Hartree",
"target": "structure",
"values": [energy.item() for energy, _, _ in trajectory],
},
"gradient norm": {
"units": "Hartree/Bohr",
"target": "structure",
"values": [np.abs(gradient).mean().item() for _, gradient, _ in trajectory],
},
},
}
with open("geometry.json", "w") as fd:
json.dump(
write_chemiscope_input("Caffeine Optimization with xTB", elements, trajectory),
fd,
)
# chemiscope.show_input("geometry.json")
Summary#
In this tutorial, we demonstrated how to set up and run geometry optimization calculations using the tblite Python package.
We explored geometry optimization with the pyberny optimizer. Furthermore, we plot the gradient norm (gradient mean absolute value) and energy along with visualizing the geometry.
With this introduction, you can perform extended tight binding (xTB) calculations and visualize the results.