This tutorial demonstrates how to implement a simple program to solve the Hartree-Fock equations using self-consistent field (SCF) iterations.
Begin by cloning the repository:
git clone https://github.com/yangjunjie0320/hf-tutorial.git
cd hf-tutorialInstall the required dependencies using pip:
pip install -r requirements.txtThe core dependencies are:
numpyscipyh5pyline_profiler(for profiling)
Note: If you wish to use
./data/gen_data.pyto generate the integrals,pyscfis also required.
You can verify the setup by running main.py:
python main.pyYou should see:
Great! We are ready to solve the Hartree-Fock equations...
...
RuntimeError: SCF is not running, fill in the code in the main loop.This confirms that the framework is set up correctly and ready for you to implement the SCF procedure.
Find the function that raises the RuntimeError and implement the SCF procedure in it.
There is a loose style check with ruff, you can run it with:
ruff check .Follow the suggestions to fix the style issues if you want.
We will follow the algorithm described in Szabo and Ostlund, Section 3.4.6 (p. 146) to implement the Hartree-Fock method. The SCF procedure consists of the following steps.
Note: github could not render
\mathbfcorrectly, check the raw markdown file.
Obtain an initial guess of the Fock matrix using the core Hamiltonian (this is called the "1e guess" in some software; you can use a smarter guess if desired). Set
Solve the generalized eigenvalue problem with scipy.linalg.eigh, then select the occupied orbitals and calculate the new density matrix:
Calculate the Coulomb
Add the Coulomb and exchange matrices to the core Hamiltonian to obtain the Fock matrix:
Compute the energy from the density matrix and Fock matrix:
Compute the errors and check for convergence:
- If converged: Finish the calculation and return the energy and density matrix.
- If not converged: Return to Step 2 with the new density matrix.
There might be some cases that SCF is hard to converge, you can try to use DIIS to accelerate the convergence.
This project provides a simple framework for implementing the SCF procedure. The code is organized as follows:
-
main.py: The main entry point of the program. Contains the framework for implementing the SCF procedure:-
solve_rhf(): Template function for implementing the restricted Hartree-Fock (RHF) method -
solve_uhf(): Template function for implementing the unrestricted Hartree-Fock (UHF) method -
main(): Loads molecular integrals and runs the SCF calculation
-
-
sol.py: Reference implementation of the RHF method. Check it if you are stuck. -
data/gen_data.py: Script to generate the one-electron and two-electron integrals for molecules (e.g.,$\text{H}_2$ ,$\text{HeH}^+$ ,$\text{H}_2\text{O}$ ) using PySCF. -
data/plot.ipynb: Jupyter notebook to visualize the potential energy surface by plotting energy vs. internuclear distance. Use as a reference to plot your own results. -
test/test_rhf.py: Test script to verify the correctness of your SCF implementation by comparing against reference energies.
After implementing the self-consistent field (SCF) procedure, you can use it to compute the potential energy surface of molecules. For example, to compute the potential energy surface of the
for r in numpy.linspace(0.5, 3.0, 61):
inp = f"h2-{r:.4f}"
e = main(inp)
print(f"H2: r={r: 6.4f}, e={e: 12.8f}")You can also try other molecules, such as
To visualize the potential energy surface, you can use the matplotlib package. An example of how to plot the data can be found in ./data/plot.ipynb.
The unrestricted Hartree-Fock (UHF) method can be used to study molecules with broken spin-symmetry, such as those with a non-zero magnetic moment or unpaired electrons. In this case, the potential energy surface can be different from that obtained using the restricted Hartree-Fock method.
To implement the UHF method, create a new function solve_uhf that follows the same steps as the restricted Hartree-Fock method, but with the added flexibility of allowing different spin-orbitals to have different occupation numbers.
Important: Even if you have implemented the UHF method, the results may not be consistent with the reference. To obtain the correct dissociation potential energy surface, you may need to carefully break the spin symmetry with a smart initial guess.
Calculating the Coulomb and exchange matrices is the most time-consuming part of the SCF procedure (not only in this tutorial, but also in real-world applications).
You can start with numpy.einsum (setting optimize=True might help). It offers an intuitive way to express the contraction of two tensors, though it may not always be the most efficient approach in terms of performance.
# einsum notation will contract the repeated indices automatically
# J_{\mu \nu} = \sum_{\lambda \sigma} P_{\lambda \sigma} (\mu \nu | \lambda \sigma)
# p ~ mu, q ~ nu, r ~ lambda, s ~ sigma
coul = numpy.einsum("pqrs,rs->pq", eri, dm, optimize=True)
exch = -numpy.einsum("prqs,rs->pq", eri, dm, optimize=True) / 2.0The equivalent Python loop is:
nao = dm.shape[0]
coul = numpy.zeros((nao, nao))
for mu in range(nao):
for nu in range(nao):
for lm in range(nao):
for sg in range(nao):
coul[mu, nu] += eri[mu, nu, lm, sg] * dm[lm, sg]Begin by moving the matrix computation to standalone functions:
def compute_coulomb_matrix_v0(eri, dm):
nao = dm.shape[0]
coul = numpy.zeros((nao, nao))
for mu in range(nao):
for nu in range(nao):
for lm in range(nao):
for sg in range(nao):
coul[mu, nu] += eri[mu, nu, lm, sg] * dm[lm, sg]
return coul
def compute_coulomb_matrix_v1(eri, dm):
nao = dm.shape[0]
coul = numpy.einsum("pqrs,rs->pq", eri, dm, optimize=True)
return coulThese functions can be tested and benchmarked separately, making optimization easier.
You will see that the nested loop is much slower than the einsum version. If you are ambitious, try implementing it with a Python loop first to understand the algorithm (for both Coulomb and exchange matrices).
Then use a profiler like line_profiler to identify the bottlenecks:
kernprof -l -v main.pyTry converting the einsum to more efficient matrix operations using numpy.dot or numpy.tensordot:
def compute_coulomb_matrix_v2(eri, dm):
"""Alternative implementation using reshape and dot"""
nao = dm.shape[0]
eri_2d = eri.reshape(nao * nao, nao * nao)
dm_1d = dm.reshape(nao * nao)
coul_1d = numpy.dot(eri_2d, dm_1d)
return coul_1d.reshape(nao, nao)Note: The reshape approach may not always be faster than
einsumwithoptimize=True, aseinsumuses optimized BLAS routines internally. Benchmark to compare!
For production code or when dealing with large systems, consider using powerful optimization tools:
-
numba: Just-in-time (JIT) compilation for Python code. Simply add decorators to your functions for significant speedups with minimal code changes. -
cython: Compile Python extensions to C for better performance. Requires writing Cython code but offers fine-grained control over optimization. -
C/C++: Native kernel functions for maximum control and performance. Use OpenMP (e.g.,#pragma omp parallel for) for easy parallelization across CPU cores. -
CUDA: GPU kernel functions for massive parallelization.
- Szabo and Ostlund, Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory, Dover Publications, New York, 1996
- Helgaker, Jørgensen, and Olsen, Molecular Electronic-Structure Theory, Wiley, New York, 2000