Running Calculations from Python#
The tblite Python package allows running extended tight binding (xTB) calculations directly in Python. This tutorial demonstrates how to set up and run a single-point calculation using GFN2-xTB.
Installing the Package#
To start, create a new Python environment using the mamba package manager. We specify the packages we want to install in our environment file:
name: xtb
channels:
- conda-forge
dependencies:
- ipykernel
- tblite-python
- qcelemental
- polars
Save the file as environment.yml and create the environment by running:
mamba env create -n xtb -f environment.yml
mamba activate xtb
This will create a new environment called xtb and install all the necessary packages. Make sure that tblite is available in your Python environment. You can check this by opening a Python interpreter and importing the package.
import tblite.interface as tb
tb.library.get_version()
(0, 6, 0)
First calculation#
We will run our first calculation directly from Python. To specify the input, we will directly initialize the calculation without reading any external files. For this, we specify our input as Cartesian coordinates in Bohr and atomic numbers for the elements.
import numpy as np
coordinates = np.array([
[ 2.02799738646442, 0.09231312124713, -0.14310895950963],
[ 4.75011007621000, 0.02373496014051, -0.14324124033844],
[ 6.33434307654413, 2.07098865582721, -0.14235306905930],
[ 8.72860718071825, 1.38002919517619, -0.14265542523943],
[ 8.65318821103610, -1.19324866489847, -0.14231527453678],
[ 6.23857175648671, -2.08353643730276, -0.14218299370797],
[ 5.63266886875962, -4.69950321056008, -0.13940509630299],
[ 3.44931709749015, -5.48092386085491, -0.14318454855466],
[ 7.77508917214346, -6.24427872938674, -0.13107140408805],
[10.30229550927022, -5.39739796609292, -0.13672168520430],
[12.07410272485492, -6.91573621641911, -0.13666499342053],
[10.70038521493902, -2.79078533715849, -0.14148379504141],
[13.24597858727017, -1.76969072232377, -0.14218299370797],
[ 7.40891694074004, -8.95905928176407, -0.11636933482904],
[ 1.38702118184179, 2.05575746325296, -0.14178615122154],
[ 1.34622199478497, -0.86356704498496, 1.55590600570783],
[ 1.34624089204623, -0.86133716815647, -1.84340893849267],
[ 5.65596919189118, 4.00172183859480, -0.14131371969009],
[14.67430918222276, -3.26230980007732, -0.14344911021228],
[13.50897177220290, -0.60815166181684, 1.54898960808727],
[13.50780014200488, -0.60614855212345, -1.83214617078268],
[ 5.41408424778406, -9.49239668625902, -0.11022772492007],
[ 8.31919801555568, -9.74947502841788, 1.56539243085954],
[ 8.31511620712388, -9.76854236502758, -1.79108242206824],
])
elements = np.array([6,7,6,7,6,6,6,8,7,6,8,7,6,6,1,1,1,1,1,1,1,1,1,1])
xtb = tb.Calculator("GFN2-xTB", elements, coordinates)
This input allows us to perform an xTB calculation using the singlepoint method, which returns the calculation results to obtain the total energy.
results = xtb.singlepoint()
results["energy"]
------------------------------------------------------------
cycle total energy energy error density error
------------------------------------------------------------
1 -41.75162462696 -4.2243950E+01 1.9479957E-01
2 -42.11867876340 -3.6705414E-01 7.5972202E-02
3 -42.14180557544 -2.3126812E-02 4.6343403E-02
4 -42.14537345276 -3.5678773E-03 1.2550676E-02
5 -42.14691416477 -1.5407120E-03 6.1305240E-03
6 -42.14742063287 -5.0646811E-04 2.4358092E-03
7 -42.14744770792 -2.7075047E-05 1.0726515E-03
8 -42.14746243589 -1.4727977E-05 3.6117558E-04
9 -42.14746301451 -5.7861260E-07 1.6698189E-04
10 -42.14746312007 -1.0556378E-07 5.4777956E-05
11 -42.14746315728 -3.7209681E-08 2.5307046E-05
12 -42.14746315838 -1.0997638E-09 8.2333731E-06
------------------------------------------------------------
total: 0.160 sec
array(-42.14746316)
While it is possible to run xTB calculations this way directly in Python, it will become quickly cumbersome if we want to run many calculations at once. Instead, we want to read our geometry from an input file, for example, an xyz geometry file.
%%writefile caffeine.xyz
24
C 1.07317 0.04885 -0.07573
N 2.51365 0.01256 -0.07580
C 3.35199 1.09592 -0.07533
N 4.61898 0.73028 -0.07549
C 4.57907 -0.63144 -0.07531
C 3.30131 -1.10256 -0.07524
C 2.98068 -2.48687 -0.07377
O 1.82530 -2.90038 -0.07577
N 4.11440 -3.30433 -0.06936
C 5.45174 -2.85618 -0.07235
O 6.38934 -3.65965 -0.07232
N 5.66240 -1.47682 -0.07487
C 7.00947 -0.93648 -0.07524
C 3.92063 -4.74093 -0.06158
H 0.73398 1.08786 -0.07503
H 0.71239 -0.45698 0.82335
H 0.71240 -0.45580 -0.97549
H 2.99301 2.11762 -0.07478
H 7.76531 -1.72634 -0.07591
H 7.14864 -0.32182 0.81969
H 7.14802 -0.32076 -0.96953
H 2.86501 -5.02316 -0.05833
H 4.40233 -5.15920 0.82837
H 4.40017 -5.16929 -0.94780
Overwriting caffeine.xyz
Instead of implementing our own xyz file reader, we will use the qcelemental package, which already provides this functionality for us. Fortunately, the qcelemental library stores the geometry already in Bohr, so we do not need to convert the coordinates to input them in our xTB calculation.
import qcelemental as qcel
molecule = qcel.models.Molecule.from_file("caffeine.xyz")
xtb = tb.Calculator("GFN2-xTB", molecule.atomic_numbers, molecule.geometry)
results = xtb.singlepoint()
results["energy"]
------------------------------------------------------------
cycle total energy energy error density error
------------------------------------------------------------
1 -41.75162462903 -4.2243950E+01 1.9479957E-01
2 -42.11867876410 -3.6705414E-01 7.5972201E-02
3 -42.14180557564 -2.3126812E-02 4.6343403E-02
4 -42.14537345289 -3.5678772E-03 1.2550676E-02
5 -42.14691416487 -1.5407120E-03 6.1305240E-03
6 -42.14742063296 -5.0646809E-04 2.4358092E-03
7 -42.14744770801 -2.7075047E-05 1.0726515E-03
8 -42.14746243599 -1.4727977E-05 3.6117557E-04
9 -42.14746301460 -5.7861254E-07 1.6698189E-04
10 -42.14746312016 -1.0556377E-07 5.4777956E-05
11 -42.14746315737 -3.7209681E-08 2.5307046E-05
12 -42.14746315847 -1.0997709E-09 8.2333729E-06
------------------------------------------------------------
total: 0.049 sec
array(-42.14746316)
Defining the geometry and elements of the system explicitly or by reading from the xyz file gives the same results.
HOMO-LUMO gap#
The HOMO-LUMO gap refers to the energy difference between the Highest Occupied Molecular Orbital (HOMO) and the Lowest Unoccupied Molecular Orbital (LUMO) in a molecule. We continue from our previous session and obtain the orbital energies and occupation numbers to find the highest occupied and the lowest unoccupied orbitals.
orbital_energies = results["orbital-energies"]
orbital_occupations = results["orbital-occupations"]
lumo_index = np.argmax(orbital_occupations)
homo_index = lumo_index - 1
gap = (orbital_energies[lumo_index] - orbital_energies[homo_index]) * qcel.constants.conversion_factor("hartree", "eV")
gap
np.float64(-37.534758861167035)
Note
The gap energy usually reported in eV unit. While xTB energies stored in the atomic units (hartree). Thus, we apply the conversion factor obtained from the qcelemental to change the units accordingly.
Computing ionization potential#
Now that we can evaluate energies we want to extend the evaluation to other properties with xTB. Let’s compute the vertical ionization potential for caffeine with xTB. For computing this property we have two options, first we can get the ionization potential directly from our xTB wavefunction by using the energy of the highest occupied orbital. Second we can compute it as the relative energy of the ionization reaction.
In the first option, the HOMO energy approximates the negative vertical ionization potential.
homo_energy = -orbital_energies[homo_index] * qcel.constants.conversion_factor("hartree", "eV")
homo_energy
np.float64(-17.939194054667027)
In the second option, instead of approximating the ionization potential we can also compute it. The energy of removing an electron can be expressed by the reaction
This reaction energy is the negative ionization potential. To compute this energy with xTB we update our calculator by setting the total charge to +1:
xtb.update(charge=1)
results_ion = xtb.singlepoint()
vertical_ip = (results_ion["energy"] - results["energy"]) * qcel.constants.conversion_factor("hartree", "eV")
vertical_ip
------------------------------------------------------------
cycle total energy energy error density error
------------------------------------------------------------
1 -41.26491540861 -4.1757240E+01 1.9757517E-01
2 -41.55259511871 -2.8767971E-01 8.4795450E-02
3 -41.62346372004 -7.0868601E-02 6.4944962E-02
4 -41.66084952696 -3.7385807E-02 2.1386399E-02
5 -41.65752590773 3.3236192E-03 1.3708261E-02
6 -41.66153978885 -4.0138811E-03 8.1833416E-03
7 -41.66258949461 -1.0497058E-03 4.0653887E-03
8 -41.66278148251 -1.9198790E-04 2.3414709E-03
9 -41.66288309089 -1.0160838E-04 9.2578295E-04
10 -41.66289562609 -1.2535193E-05 4.9559574E-04
11 -41.66289374220 1.8838859E-06 4.9598733E-04
12 -41.66290207288 -8.3306787E-06 1.4008517E-04
13 -41.66290260604 -5.3315708E-07 4.0745840E-05
14 -41.66290259068 1.5358651E-08 4.3706505E-05
15 -41.66290264486 -5.4182102E-08 5.4270828E-06
------------------------------------------------------------
total: 0.064 sec
np.float64(13.185563185757385)
Note
Since we only need to change the total charge in the input, we can update our xTB calculator instead of creating a new one by the update method.
We do find quite a difference in the calculated value and the approximated one. Before we can use the ionization potential computed by xTB we should however correct for the self-interaction error using an empirical determined shift of 4.846 V. This shift should be applied for all ionization potentials computed with xTB. Note the unit of the shift is Volt (V) meaning that the correction should be applied for each added or removed electron.
Note
Since xTB is a semiempirical method it makes some approximations which result in a strong self-interaction for a free electron. This value can be computed exactly from the xTB parameters or determined empirically.[1]
Fukui indices from partial charges#
While the molecular ionization potential is a great descriptor for the whole molecule, xTB also provides many properties, which are atom-resolved. Computing the Fukui index[2] provides a simple descriptor for chemical reactivity we can compute from the partial charges according to the following equations:
where we have the three Fukui indices computed from the partial charges of the neutral (0), cationic (+) and anionic (-) system. To perform this calculation with xTB we go back to our computation environment and update our molecule to a negative total charge:
import polars as pl
xtb.update(charge=-1)
results_neg = xtb.singlepoint()
charges = pl.DataFrame({
"elements": molecule.symbols,
"q(0)": results.get("charges"),
"q(+)": results_ion.get("charges"),
"q(-)": results_neg.get("charges"),
})
fukui = charges.select(
pl.col("elements"),
(pl.col("q(0)") - pl.col("q(-)")).alias("f(+)"),
(pl.col("q(+)") - pl.col("q(0)")).alias("f(-)"),
((pl.col("q(+)") - pl.col("q(-)")) / 2).alias("f(0)"),
)
fukui
------------------------------------------------------------
cycle total energy energy error density error
------------------------------------------------------------
1 -41.90273954542 -4.2395065E+01 1.8739068E-01
2 -42.14847705021 -2.4573750E-01 9.6117214E-02
3 -42.22375508011 -7.5278030E-02 7.0987678E-02
4 -42.30265756007 -7.8902480E-02 3.4694875E-02
5 -42.30270404785 -4.6487774E-05 2.2615887E-02
6 -42.31014040751 -7.4363597E-03 1.1765251E-02
7 -42.31193645193 -1.7960444E-03 3.5463976E-03
8 -42.31192828361 8.1683186E-06 1.3330927E-03
9 -42.31195331613 -2.5032518E-05 5.6195324E-04
10 -42.31195526964 -1.9535106E-06 2.4623256E-04
11 -42.31195551955 -2.4991267E-07 1.1604165E-04
12 -42.31195568233 -1.6278440E-07 3.6907938E-05
13 -42.31195568774 -5.4046438E-09 1.4924097E-05
------------------------------------------------------------
total: 0.057 sec
| elements | f(+) | f(-) | f(0) |
|---|---|---|---|
| str | f64 | f64 | f64 |
| "C" | -0.029669 | -0.025591 | -0.02763 |
| "N" | 0.070509 | 0.060968 | 0.065738 |
| "C" | 0.054677 | 0.025012 | 0.039845 |
| "N" | 0.0796 | 0.068182 | 0.073891 |
| "C" | 0.036333 | 0.020819 | 0.028576 |
| … | … | … | … |
| "H" | 0.03337 | 0.067995 | 0.050683 |
| "H" | 0.033354 | 0.067899 | 0.050627 |
| "H" | 0.02604 | 0.031893 | 0.028967 |
| "H" | 0.038735 | 0.040089 | 0.039412 |
| "H" | 0.038748 | 0.040102 | 0.039425 |
Localizing frontier orbitals#
The xTB methods are using a minimal basis set, which allows to localize many orbital derived properties by Mulliken population analysis. The partial charges we used previously in the Fukui index calculation are one example and are already computed in xTB and therefore readily available. Here we will localize the frontier orbitals to the atomic centers using integrals and matrices we can compute in xTB.
For this we define the population of the occupied and virtual orbitals and compute their Mulliken population:
here C are the orbital coefficients, S the overlap matrix in the atomic orbital basis, A is the atom index, κ/λ are the atomic orbitals, \(N_\text{ao}\) is the number of the atomic orbitals, i is the occupied orbital index and a is the virtual orbital index.
To compute the local response function Λ we compute the difference between the occupied and virtual orbitals and localize them on each of the atoms using the Mulliken population matrices defined before
where ε are the orbital energies, η is a parameter for regularization here set to 0.5eV.
The Fermi level μ is computed by taking the average of the occupied and virtual orbital energies divided by the local response function.
The localized gap δ finally is computed from the localized difference of the orbital energies
We can implement these equations and compute the frontier orbitals based on the orbital eigenvalues, occupation numbers, orbital coefficients and overlap matrix.
def localize_frontier_orbitals(
eigenvalues: np.ndarray, # Orbital energies in eV (nao,)
occupations: np.ndarray, # Orbital occupations (nao,)
coeff: np.ndarray, # Orbital coefficients (nao, nao)
overlap: np.ndarray, # Overlap matrix (nao, nao)
orbital_to_atom: np.ndarray, # Orbital to atom mapping (nao,)
eta: float = 0.5, # eV
):
eps = 1e-12
nao = orbital_to_atom.size
atom_projection = np.zeros((np.max(orbital_to_atom) + 1, nao))
atom_projection[orbital_to_atom, np.arange(nao)] = 1
occ_indices = np.where(occupations > eps)[0] # (n_occ,)
vir_indices = np.where((1.0 - occupations) > eps)[0] # (n_vir)
pop = 2 * (atom_projection @ (coeff * (overlap @ coeff))) # (nat, nao)
pop_vir = pop[:, vir_indices] # (nat, n_vir)
pop_occ = pop[:, occ_indices] # (nat, n_occ)
eps_diff = eigenvalues[vir_indices, None] - eigenvalues[None, occ_indices] # (n_vir, n_occ)
div_eps2 = 1.0 / (eps_diff**2 + eta**2) # (n_vir, n_occ)
div_eps3 = 1.0 / (eps_diff**3 + eta**3) # (n_vir, n_occ)
eps_sum = (eigenvalues[vir_indices, None] + eigenvalues[None, occ_indices]) * div_eps2 # (n_vir, n_occ)
response = np.einsum("xi,xa,ai->x", pop_occ, pop_vir, div_eps2) # (nat,)
mask = response > eps
chemical_potential = np.einsum("xi,xa,ai->x", pop_occ, pop_vir, eps_sum) / 2 # (nat,)
chemical_potential = np.where(mask, chemical_potential / response, 0.0)
local_gap = np.einsum("xi,xa,ai->x", pop_occ, pop_vir, div_eps3) # (nat,)
local_gap = np.where(mask, 1.0 / (local_gap / response + eps) - eta, 0.0)
return chemical_potential, local_gap
Note
The frontier orbital computation is also implemented in the xTB calculator and can be used without any integral calculation in Python. Here we reimplement a simplified version of this computation as an example for localization and population analysis possible in xTB.
Now we can compute these frontier orbitals with xTB. First, we need to explicitly ask to save the integrals during our calculation using the set method of the xTB calculator to change the save-integrals property. Another important point to consider is the charge of the system. Since our example is the caffeine molecule, the charge is updated to zero.
xtb.set("save-integrals", True)
xtb.update(charge=0)
results = xtb.singlepoint(results)
------------------------------------------------------------
cycle total energy energy error density error
------------------------------------------------------------
1 -42.14746315398 -4.2639788E+01 1.4674545E-05
2 -42.14746315797 -3.9875658E-09 6.8163647E-06
------------------------------------------------------------
total: 0.017 sec
Note
Notice only two iterations are needed for the single point since the calculations is restarted from the previous results.
To obtain the required properties we can now query the results we got from our calculation with the updated xTB calculator. Some properties like the orbital to shell and the shell to atom mapping we can directly obtain from the calculator instead of the calculation results.
orbital_map = xtb.get("orbital-map")
shell_map = xtb.get("shell-map")
orbital_to_atom = shell_map[orbital_map]
chemical_potential, local_gap = localize_frontier_orbitals(
eigenvalues=results["orbital-energies"] * qcel.constants.conversion_factor("hartree", "eV"),
occupations=results["orbital-occupations"],
coeff=results["orbital-coefficients"],
overlap=results["overlap-matrix"],
orbital_to_atom=orbital_to_atom,
)
frontier_orbitals = pl.DataFrame({
"element": molecule.symbols,
"Fermi level [eV]": chemical_potential,
"local gap [eV]": local_gap,
"highest occ. AO [eV]": chemical_potential - local_gap / 2,
"lowest unocc. AO [eV]": chemical_potential + local_gap / 2,
})
frontier_orbitals
| element | Fermi level [eV] | local gap [eV] | highest occ. AO [eV] | lowest unocc. AO [eV] |
|---|---|---|---|---|
| str | f64 | f64 | f64 | f64 |
| "C" | -5.951507 | 16.906979 | -14.404997 | 2.501982 |
| "N" | -8.223086 | 9.112507 | -12.77934 | -3.666833 |
| "C" | -8.484906 | 7.311352 | -12.140582 | -4.82923 |
| "N" | -7.680513 | 6.815615 | -11.088321 | -4.272706 |
| "C" | -7.977049 | 8.538737 | -12.246418 | -3.70768 |
| … | … | … | … | … |
| "H" | -5.082449 | 15.394504 | -12.779701 | 2.614803 |
| "H" | -5.080239 | 15.409929 | -12.785204 | 2.624726 |
| "H" | -4.615225 | 17.701538 | -13.465993 | 4.235544 |
| "H" | -4.963728 | 15.193746 | -12.560601 | 2.633145 |
| "H" | -4.960283 | 15.195584 | -12.558075 | 2.637509 |
Summary#
In this tutorial, we demonstrated how to set up and run singlepoint calculations using the tblite Python package. We explored two methods for defining molecular geometry: explicitly in Python and by reading from an xyz file using the qcelemental package. Furthermore, we computed properties based on our xTB calculation results, including
the HOMO-LUMO gap
the vertical ionization potential
Fukui indices using partial charges from calculations with different total charge
localized atomic HOMO-LUMO gaps using integrals computed by xTB
With this introduction, you can perform extended tight binding (xTB) calculations and derive properties from the calculation results.