Basis Sets

Introduction

Basis sets are collections of ‘basis functions’ which are used to represent the electronic wavefunction. Basis functions are single-particle functions that are combined to build molecular orbitals. Generally, the quantum chemistry community uses localized basis sets, while the solid state community uses plane waves for their basis sets.


Localized Basis Sets

Localized basis sets are one-electron “orbitals” that are centered on atomic positions. These atomic orbitals are combined to form molecular orbitals through linear combinations.

Slater type orbitals (STO)

These localized orbitals were the original type used for atomic orbitals. They were constructed to decay exponentially at long range and to have a cusp at the nucleus. They resemble the shape of the analytically solved hydrogen wavefunction. However, integrals over STOs are not as computationally efficient to evaluate as integrals over Gaussian type orbitals.

Gaussian type orbitals (GTO)

Gaussian type orbitals allow four-center two-electron integrals to be reduced to two-center integrals thanks to the ‘Gaussian Product Theorem’: the product of two gaussians centered at two different points can be written as a finite sum of gaussians (a single gaussian for s-type functions) centered at one point on the line connecting the original centers. This greatly reduces the computational cost compared to STOs, even though multiple Gaussian orbitals must be combined to approximate the cusp near the nuclei. For this reason, most modern quantum chemistry programs use GTOs as their orbital type.
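The product rule can be checked numerically. The sketch below is a minimal one-dimensional illustration (the three-dimensional case factorizes into three such products); the exponents and centers are arbitrary values chosen for the example, and the function names are our own.

```python
import math

def gaussian(alpha, center, x):
    """Unnormalized 1D gaussian exp(-alpha * (x - center)^2)."""
    return math.exp(-alpha * (x - center) ** 2)

def product_gaussian(alpha, a, beta, b):
    """Return (prefactor, exponent, center) of the product of two 1D gaussians."""
    p = alpha + beta                           # exponent of the product gaussian
    center = (alpha * a + beta * b) / p        # lies between a and b
    prefactor = math.exp(-alpha * beta / p * (a - b) ** 2)
    return prefactor, p, center

# Illustrative exponents and centers (assumed values, not from any basis set)
alpha, a = 0.8, -0.5
beta, b = 1.3, 1.0
K, p, P = product_gaussian(alpha, a, beta, b)

# Compare the explicit product with the single gaussian on a grid of points
grid = [i * 0.1 - 5.0 for i in range(101)]
max_error = max(
    abs(gaussian(alpha, a, x) * gaussian(beta, b, x) - K * gaussian(p, P, x))
    for x in grid
)
print(f"exponent {p:.3f}, center {P:.4f}, max error {max_error:.2e}")
```

The new center is the exponent-weighted average of the two original centers, so a product of functions on two atoms behaves like one function sitting between them.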

Minimal basis sets

The most minimal basis sets of GTO type are the STO-XG types, where X is an integer (generally in the range of 3-6). These are basis sets where a Slater-type orbital is approximated by a combination of X gaussians. They are minimal because only a single STO is approximated by the gaussians for each atomic orbital. Usually, the smallest variant a program will have is STO-3G. These basis sets do not give very accurate results because so few basis functions are used and because key features such as polarization and diffuse functions are missing. Personally, I find these basis sets great when testing new code because the small number of basis functions greatly reduces the time the calculation takes.
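To get a feeling for how well three gaussians mimic a Slater function, here is a small sketch for a 1s orbital with exponent 1. The exponents and contraction coefficients are the commonly tabulated STO-3G values for this case; treat them, and the helper names, as assumptions of the example rather than output of any particular program.

```python
import math

# Commonly tabulated STO-3G fit to a 1s Slater function with zeta = 1.0
# (assumed values for illustration)
EXPONENTS = (0.109818, 0.405771, 2.227660)
COEFFS = (0.444635, 0.535328, 0.154329)

def slater_1s(r, zeta=1.0):
    """Normalized 1s Slater-type orbital."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)

def gaussian_1s(r, alpha):
    """Normalized s-type primitive gaussian."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r * r)

def sto3g_1s(r):
    """Contraction of three primitive gaussians."""
    return sum(d * gaussian_1s(r, a) for d, a in zip(COEFFS, EXPONENTS))

def radial_overlap(f, g, r_max=30.0, n=30000):
    """Integrate 4*pi*r^2 f(r) g(r) dr with the midpoint rule."""
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += r * r * f(r) * g(r)
    return 4.0 * math.pi * total * dr

overlap = radial_overlap(slater_1s, sto3g_1s)
norm = radial_overlap(sto3g_1s, sto3g_1s)
print(f"<STO|STO-3G> = {overlap:.4f}, <STO-3G|STO-3G> = {norm:.4f}")
print(f"value at nucleus: STO {slater_1s(0.0):.4f}, STO-3G {sto3g_1s(0.0):.4f}")
```

The overlap is close to one, but the contracted function is noticeably too small at the nucleus, which is the missing cusp mentioned above.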

Split valence basis sets

Many basis sets are designed to model the core and valence shells with different numbers of basis functions, with the purpose of devoting more basis functions to correctly describing the valence orbitals. A common example of a split valence basis set would be the Pople basis sets that take the X-YZG form, such as 6-31G. In this scheme, X represents the number of gaussians contracted into each core basis function, while YZ shows that it is a double \zeta basis set for valence orbitals. The integers used for Y and Z represent how many gaussian functions comprise each of the valence functions. For example, with 6-31G, we have a double \zeta basis set, with 6 contracted gaussians making each of our core basis functions, while we have two basis functions for each valence orbital. The first valence function is described with 3 contracted gaussians, while the second function is made of a single gaussian.
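The bookkeeping behind these names can be sketched in a few lines. The shell tables below cover only a handful of light atoms and are our own simplification for illustration; real basis set definitions come from the program's basis set library.

```python
# Shell structure assumed for a few light atoms (illustrative only)
CORE_SHELLS = {"H": [], "C": ["s"], "N": ["s"], "O": ["s"]}
VALENCE_SHELLS = {"H": ["s"], "C": ["s", "p"], "N": ["s", "p"], "O": ["s", "p"]}
FUNCTIONS_PER_SHELL = {"s": 1, "p": 3}

def count_functions(atoms, core_contraction, valence_contractions):
    """Return (contracted basis functions, primitive gaussians) for a molecule.

    core_contraction: primitives per core function, e.g. 6 in 6-31G.
    valence_contractions: primitives in each valence function, e.g. (3, 1).
    """
    n_basis = 0
    n_prim = 0
    for atom in atoms:
        for shell in CORE_SHELLS[atom]:
            m = FUNCTIONS_PER_SHELL[shell]
            n_basis += m
            n_prim += m * core_contraction
        for shell in VALENCE_SHELLS[atom]:
            m = FUNCTIONS_PER_SHELL[shell]
            n_basis += m * len(valence_contractions)
            n_prim += m * sum(valence_contractions)
    return n_basis, n_prim

water = ["O", "H", "H"]
sto3g_counts = count_functions(water, 3, (3,))    # minimal basis
pople_counts = count_functions(water, 6, (3, 1))  # split valence
print("STO-3G:", sto3g_counts)
print("6-31G: ", pople_counts)
```

For water this gives 7 basis functions with STO-3G and 13 with 6-31G, since only the valence shells are doubled.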

Polarization functions

There are times when atomic orbitals benefit from polarization being included in the basis. This often occurs when atomic orbitals are overlapping and bonds are forming. To model this accurately, polarization functions are added to the basis set so that the orbitals can distort in different directions. This is accomplished by mixing in functions of higher angular momentum. For example, to give an s orbital the ability to polarize, it must be mixed with a p orbital.

Diffuse functions

In diffuse functions, the exponent \zeta is very small, which spreads out the orbital. These functions can also be added into basis sets. Diffuse functions are very important when modeling anions, which have larger radii than neutral isoelectronic species.
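How strongly the exponent controls the size of the orbital is easy to quantify for an s-type gaussian: for a normalized function proportional to exp(-\zeta r^2), the mean square radius is 3/(4 \zeta). A minimal sketch, with exponents picked only for illustration:

```python
import math

def rms_radius(zeta):
    """Root-mean-square radius of a normalized s-type gaussian exp(-zeta * r^2)."""
    return math.sqrt(3.0 / (4.0 * zeta))

# Illustrative exponents: tight, ordinary valence, and diffuse
for zeta in (5.0, 0.5, 0.05):
    print(f"zeta = {zeta:5.2f}  ->  rms radius = {rms_radius(zeta):.3f} bohr")
```

Lowering the exponent by a factor of 100 makes the orbital ten times larger, which is why a few functions with small \zeta are enough to describe the loosely bound electrons of an anion.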

Outside sources on localized basis sets

This series of slides is a great introduction to basis sets and contains some more technical data. 

Another set of slides from Vesa Hanninen that covers localized basis sets

Site that has a javascript applet for visualizing different atomic orbitals

Plane Wave Basis Sets

The periodic nature of many physical systems lends itself to the implementation of a plane-wave basis set. One major advantage of using a plane-wave basis set is that it can be improved systematically: as more plane waves are added, the total energy converges monotonically toward the complete-basis result. Furthermore, certain integrals are much easier to carry out with plane waves than with localized functions. The kinetic energy operator is diagonal in reciprocal space, and integrals over real-space operators can be done efficiently using the fast Fourier transform.
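As a small illustration of the kinetic energy being diagonal in reciprocal space, the sketch below expands a periodic one-dimensional function in plane waves and sums |c_G|^2 G^2 / 2 (atomic units). It uses a naive discrete Fourier transform written out by hand so that it needs no external library; a real code would use an FFT. The box length, grid size, and function names are assumptions of the example.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform; c[m] multiplies exp(i * G_m * x)."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * math.pi * m * j / n) for j in range(n)) / n
        for m in range(n)
    ]

def kinetic_energy(samples, length):
    """<T> = sum_G |c_G|^2 G^2 / 2 divided by the norm, for a function periodic on [0, length)."""
    n = len(samples)
    coeffs = dft(samples)
    norm = sum(abs(c) ** 2 for c in coeffs)
    total = 0.0
    for m, c in enumerate(coeffs):
        k = m if m <= n // 2 else m - n       # map index to signed frequency
        g = 2.0 * math.pi * k / length        # reciprocal lattice vector G
        total += abs(c) ** 2 * 0.5 * g * g    # diagonal kinetic term
    return total / norm

L = 10.0                       # box length in bohr (assumed)
N = 32                         # number of real-space grid points (assumed)
G0 = 2.0 * math.pi * 3 / L     # a reciprocal lattice vector of the box
psi = [math.cos(G0 * j * L / N) for j in range(N)]

print(kinetic_energy(psi, L), 0.5 * G0 ** 2)
```

No derivative is ever taken on the grid: once the coefficients are known, the kinetic energy is a simple weighted sum.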

The plane waves in the basis, and thus the size of the basis set, are labeled by the reciprocal lattice vectors \vec{G}. The largest wave vector included is set by the energy cutoff, E_{cut} = \frac{1}{2} \left| \vec{G}_{max} \right|^2 (in atomic units). This is the key convergence parameter for the plane-wave implementation. Larger values always correspond to a more complete basis and thus higher accuracy, but as a trade-off they require more computational time and memory. It is up to the user to decide when suitable accuracy has been reached while also making sure that calculations can be done in a timely manner.
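The cost of raising the cutoff can be estimated by simply counting plane waves. The sketch below does this for a simple cubic cell at the \Gamma point, which is an assumption made to keep the example short; the cell size and cutoffs are illustrative values in atomic units.

```python
import math

def count_plane_waves(a, e_cut):
    """Count G vectors of a simple cubic cell (edge a, bohr) with |G|^2 / 2 <= e_cut (hartree)."""
    b = 2.0 * math.pi / a                          # reciprocal lattice spacing
    n_max = int(math.sqrt(2.0 * e_cut) / b) + 1    # safe bound on each index
    count = 0
    for i in range(-n_max, n_max + 1):
        for j in range(-n_max, n_max + 1):
            for k in range(-n_max, n_max + 1):
                g2 = b * b * (i * i + j * j + k * k)
                if 0.5 * g2 <= e_cut:
                    count += 1
    return count

def sphere_estimate(a, e_cut):
    """Continuum estimate: volume of the cutoff sphere over the volume per G vector."""
    return a ** 3 * (2.0 * e_cut) ** 1.5 / (6.0 * math.pi ** 2)

a = 10.0  # cell edge in bohr (assumed)
counts = {e_cut: count_plane_waves(a, e_cut) for e_cut in (5.0, 10.0, 20.0)}
for e_cut, n in counts.items():
    print(f"E_cut = {e_cut:5.1f} Ha: {n} plane waves (estimate {sphere_estimate(a, e_cut):.0f})")
```

The number of plane waves grows roughly as E_{cut}^{3/2}, so doubling the cutoff nearly triples the basis size.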

When using a plane-wave basis set, we often make use of pseudopotentials rather than using an all-electron treatment of the system. This is because the wavefunctions vary slowly between the atomic sites but much more rapidly near the nuclei. As a result, describing those rapid oscillations requires many plane waves. Using a pseudopotential allows the calculation to converge with less computation time.

As an alternative to pseudopotentials, we can also use a basis set that makes use of plane waves in the interstitial space, where the wavefunctions are slowly varying, but uses other means to treat the system closer to the atoms. The most common method that takes this approach is the Projector Augmented Wave (PAW) method. This method builds on earlier methods that are simpler but not as computationally efficient: the Augmented Plane Wave (APW) method and the Linearized Augmented Plane Wave (LAPW) method.

Augmented Plane Wave (APW) Method

The Augmented Plane Wave method was developed by Slater; see J. C. Slater, Phys. Rev. 92, 603 (1953).

With the assumption that the effective crystal potential is approximately constant between the ionic cores, we can set some cutoff radius for each atom. Outside of this cutoff radius, we treat the system with a plane wave basis, as explained above, while we treat the inside of the sphere separately.

Inside the chosen cutoff radius, we assume the potential of the free ion, which is spherically symmetric. We can then expand the solution inside the spheres as a linear combination of solutions to Schrödinger’s Equation in that potential. These solutions are radial functions multiplied by spherical harmonics. Numerically, the angular integrals will be computationally efficient due to the spherical harmonics being orthogonal. We join the solutions inside the spheres to the plane wave solutions outside of the sphere by requiring that the function be continuous at the sphere boundary; the slope is in general not continuous there. By demanding that the expectation value of the energy be stationary with respect to variations of the coefficients of the expansion, we can obtain the correct coefficients and obtain a single solution.
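Written out, a single augmented plane wave has the following form. This is a sketch in a commonly used notation, which is our own choice and not taken from the text above: \Omega is the cell volume, R the sphere radius, u_l the radial solution at trial energy E, and the sphere is taken to be centered at the origin.

```latex
\phi_{\vec{k}+\vec{G}}(\vec{r}) =
\begin{cases}
\Omega^{-1/2} \, e^{i(\vec{k}+\vec{G})\cdot\vec{r}} , & \vec{r} \text{ in the interstitial region} \\[4pt]
\sum_{lm} A_{lm} \, u_l(r, E) \, Y_{lm}(\hat{r}) , & r < R
\end{cases}
\qquad
A_{lm} = \frac{4\pi i^l}{\sqrt{\Omega} \, u_l(R, E)} \, j_l(|\vec{k}+\vec{G}| R) \, Y_{lm}^{*}(\widehat{\vec{k}+\vec{G}})
```

The coefficients A_{lm} follow from expanding the plane wave in spherical harmonics and matching the value at r = R. They depend on the energy E through u_l, which is the origin of the energy-dependent secular equation.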

In general the APW method is not practical for more than simple solids, but it does work in principle.

Linearized Augmented Plane Wave (LAPW) Method

The LAPW method is a natural extension of the APW method. For the original treatment by Andersen, see O.K. Andersen, Phys. Rev. B 12, 3060 (1975).

In principle, the problem that we are trying to address is that using the APW method requires the solution of an energy-dependent secular equation for each band. This is due to the energy dependence in our augmenting function. We can resolve this issue by adding a degree of variational freedom. Now, inside of our augmentation radius, the basis functions are built from spherical harmonics times radial functions and the energy derivatives of those same radial functions, both evaluated at a fixed reference energy.
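In formula form, the linearized basis function inside the sphere can be sketched as follows. The notation is an assumption of this example: E_l is the fixed linearization energy for angular momentum l, and the dot denotes a derivative with respect to energy.

```latex
\phi_{\vec{k}+\vec{G}}(\vec{r}) = \sum_{lm} \left[ A_{lm} \, u_l(r, E_l) + B_{lm} \, \dot{u}_l(r, E_l) \right] Y_{lm}(\hat{r}) , \quad r < R ,
\qquad
\dot{u}_l(r, E_l) = \left. \frac{\partial u_l(r, E)}{\partial E} \right|_{E = E_l}
```

The two sets of coefficients A_{lm} and B_{lm} are fixed by matching both the value and the slope of the plane wave at r = R, and the radial functions no longer depend on the band energy being sought.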

By introducing the derivative of the radial functions into our augmentation basis, we now require more matching conditions at the boundary between the spheres and the interstitial space. This will sometimes require that we use more plane waves for a given degree of convergence in our parameters of interest, but this is worthwhile because our basis is now flexible enough that we no longer need to solve a separate equation for each energy band. We can now use a single diagonalization for the whole system. This flexibility also allows for full-potential methods, which were the state of the art for many electronic structure calculations in the 1980s and were used for complex materials and surfaces containing elements with d- and f-electrons.

The LAPW method has problems with semi-core states that are extended toward the augmentation radius. This method can be further extended by introducing a term with the second derivative of the radial functions as well, making our basis functions inside the augmentation radius formally local orbitals. This approach is called the LAPW+LO method.

The LAPW method and its extensions are much more flexible than the APW method, but they are also more computationally intensive. It should also be noted that we have introduced a new computational parameter in these methods. We must set our augmentation radius for each of the atoms in our basis. This is, at times, not intuitive to optimize, because the augmentation radius should not necessarily reflect the ionic radius of that element. The primary benefit we receive from these approaches as opposed to pseudopotential methods is that we explicitly include the core electrons.

Projector Augmented Wave (PAW) Method

The PAW method is the most commonly used augmented wave approach at this time, and while its implementation is similar to the APW and LAPW methods, it addresses the augmentation region in a different manner. For a detailed derivation of the formalism, as well as a discussion of the method’s implementation and truncation errors, see P. E. Blöchl, Phys. Rev. B 50, 17953 (1994).

Blöchl originally proposed the PAW method as a means of bridging the gap between existing augmented wave methods and pseudopotential methods. In the PAW method, we set up an augmentation radius in the typical way discussed above, and we use plane waves outside of that radius. Inside the sphere, instead of using a different basis set, as in APW and LAPW, we instead try to rigorously treat the core in an all-electron method by making the plane wave computation more efficient.

Our physical wavefunctions exhibit strong oscillations near the nuclei, which make plane waves a bad choice for our basis. However, if we consider these wavefunctions in the space of all functions orthogonal to the core states, we can transform from this space to another Hilbert space. Transforming from the physical all-electron wavefunctions to the pseudo-space will be done through a linear transformation, and ideally, treating the pseudo-space wavefunctions with plane waves will be computationally efficient. We can then compute expectation values in our efficient pseudo-space rather than using a different basis.

If we take completely general functions for the basis of our physical space and our pseudo-space, we can expand wavefunctions in each space in a partial wave expansion. We use the same coefficients in each expansion, but leave them undetermined until later. For the transformation to be linear, the coefficients must be scalar products between the pseudo-space wavefunctions and some other functions, which we call projector functions. If we also require that each projector function have unit overlap with its own pseudo-space partial wave and zero overlap with all the others, we can uniquely determine our expansion.

So for our transformation, we need: 1) the physical space partial waves, which we can obtain by numerically integrating the Schrödinger Equation, 2) one pseudo-space partial wave for each physical space partial wave, which coincides with it outside of the augmentation region, and 3) one projector function, confined to the augmentation region, for each pseudo-space partial wave, which obeys our orthonormality condition.
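Putting the three ingredients together, the transformation can be written compactly. The symbols here are a notation we assume for the sketch: \psi is the physical wavefunction, \tilde{\psi} its pseudo-space counterpart, \phi_i and \tilde{\phi}_i the physical and pseudo-space partial waves, and \tilde{p}_i the projector functions.

```latex
| \psi \rangle = | \tilde{\psi} \rangle
  + \sum_i \left( | \phi_i \rangle - | \tilde{\phi}_i \rangle \right) \langle \tilde{p}_i | \tilde{\psi} \rangle ,
\qquad
\langle \tilde{p}_i | \tilde{\phi}_j \rangle = \delta_{ij}
```

Because \phi_i and \tilde{\phi}_i coincide outside the augmentation region, the sum vanishes there and the physical and pseudo-space wavefunctions are identical in the interstitial space. Inside the sphere, the sum swaps the smooth partial waves for the oscillating physical ones.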

The core of the PAW method is defining this transformation to a pseudo-space that is more easily treatable with plane waves. We then treat the transformed system inside the augmentation region with plane waves as our basis, enforcing appropriate continuity conditions at the border of the augmentation sphere. In this sense, since we are only using plane waves as our basis set and we are changing the potential close to the nucleus, the PAW method looks like a pseudopotential method. However, we reach this implementation through the augmentation approach and maintain an all-electron treatment of the system, which is why PAW can be thought of as a bridge between the two approaches.

It will also be nice to have some plots showing convergence.