International
Tables for
Crystallography
Volume D
Physical properties of crystals
Edited by A. Authier

International Tables for Crystallography (2013). Vol. D, ch. 2.2, pp. 314-333
doi: 10.1107/97809553602060000912

Chapter 2.2. Electrons

K. Schwarz^a*

^a Institut für Materialchemie, Technische Universität Wien, Getreidemarkt 9/165-TC, A-1060 Vienna, Austria
Correspondence e-mail: kschwarz@theochem.tuwien.ac.at

The electronic structure of a solid, characterized by its energy band structure, is the fundamental quantity that determines the ground state of the solid and a series of excitations involving electronic states. In the first part of this chapter, several basic concepts are summarized in order to establish the notation used and to repeat essential theorems from group theory and solid-state physics that provide the definitions that are needed in this context (Brillouin zones, symmetry operators, Bloch theorem, space-group symmetry). Next the quantum-mechanical treatment, especially density functional theory, is described and the commonly used methods of band theory are outlined (the linear combination of atomic orbitals, tight binding, pseudo-potential schemes, the augmented plane wave method, the linear augmented plane wave method, the Korringa–Kohn–Rostoker method, the linear combination of muffin-tin orbitals, the Car–Parrinello method etc.). The linear augmented plane wave scheme is presented explicitly so that concepts in connection with energy bands can be explained. The electric field gradient is discussed to illustrate a tensorial quantity. In the last section, a few examples illustrate the topics of the chapter.

2.2.1. Introduction

The electronic structure of a solid, characterized by its energy band structure, is the fundamental quantity that determines the ground state of the solid and a series of excitations involving electronic states. In this chapter, we first summarize several basic concepts in order to establish the notation used here and to repeat essential theorems from group theory and solid-state physics that provide definitions which we need in this context. Next the quantum-mechanical treatment, especially density functional theory, is described and the commonly used methods of band theory are outlined. One scheme is presented explicitly so that concepts in connection with energy bands can be explained. The electric field gradient is discussed to illustrate a tensorial quantity and a few examples illustrate the topics of this chapter.

2.2.2. The lattice

2.2.2.1. The direct lattice and the Wigner–Seitz cell

The three unit-cell vectors [{\bf a}_{1}], [{\bf a}_{2}] and [{\bf a}_{3}] define the parallelepiped of the unit cell. We define

  • (i) a translation vector of the lattice (upper case) as a primitive vector (integral linear combination) of all translations [{\bf T}_{n}=n_{1}{\bf a}_{1}+n_{2}{\bf a}_{2}+n_{3}{\bf a}_{3}\,\,\hbox{with }n_{i}\hbox{ integer},\eqno(2.2.2.1)]

  • (ii) but a vector in the lattice (lower case) as [{\bf r}=x_{1}{\bf a}_{1}+x_{2}{\bf a}_{2}+x_{3}{\bf a}_{3}\,\,\hbox{with }x_{i}\hbox{ real}.\eqno(2.2.2.2)]

From the seven possible crystal systems one arrives at the 14 possible space lattices, based on both primitive and non-primitive (body-centred, face-centred and base-centred) cells, called the Bravais lattices [see Chapter 9.1 of International Tables for Crystallography, Volume A (2005)]. Instead of describing these cells as parallelepipeds, we can find several types of polyhedra with which we can fill space by translation. A very important type of space filling is obtained by the Dirichlet construction. Each lattice point is connected to its nearest neighbours and the corresponding bisecting (perpendicular) planes will delimit a region of space which is called the Dirichlet region, the Wigner–Seitz cell or the Voronoi cell. This cell is uniquely defined and has additional symmetry properties.

When we add a basis to the lattice (i.e. the atomic positions in the unit cell) we arrive at the well known 230 space groups [see Part 3 of International Tables for Crystallography, Volume A (2005)].

2.2.2.2. The reciprocal lattice and the Brillouin zone

Owing to the translational symmetry of a crystal, it is convenient to define a reciprocal lattice, which plays a dominant role in describing electrons in a solid. The three unit vectors of the reciprocal lattice [{\bf b}_{i}] are given according to the standard definition by [{\bf a}_{i}\cdot{\bf b}_{j}=2\pi\delta_{ij},\eqno(2.2.2.3)]where the factor [2\pi] is commonly used in solid-state physics in order to simplify many expressions. Strictly speaking (in terms of mathematics) this factor should not be included [see Section 1.1.2.4 of the present volume and Chapter 1.1 of International Tables for Crystallography, Volume B (2001)], since the (complete) reciprocity is lost, i.e. the reciprocal lattice of the reciprocal lattice is no longer the direct lattice. Explicitly, the vectors that satisfy (2.2.2.3) are [{\bf b}_{1}=2\pi{{{\bf a}_{2}\times{\bf a}_{3}}\over{{\bf a}_{1}\cdot{\bf a}_{2}\times{\bf a}_{3}}}\,\,\hbox{and cyclic permutations.}\eqno(2.2.2.4)]

In analogy to the direct lattice we define

  • (i) a vector of the reciprocal lattice (upper case) as [{\bf K}_{m}=m_{1}{\bf b}_{1}+m_{2}{\bf b}_{2}+m_{3}{\bf b}_{3}\,\,\hbox{with }m_{i}\hbox{ integer};\eqno(2.2.2.5)]

  • (ii) a vector in the reciprocal lattice (lower case) as [{\bf k}=k_{1}{\bf b}_{1}+k_{2}{\bf b}_{2}+k_{3}{\bf b}_{3}\,\,\hbox{with }k_{i}\hbox{ real}.\eqno(2.2.2.6)]

From (2.2.2.5) and (2.2.2.1) it follows immediately that [{\bf T}_{n}\cdot{\bf K}_{m}=2\pi N\,\,\hbox{with }N\hbox{ an integer.}\eqno(2.2.2.7)]

A construction identical to the Wigner–Seitz cell delimits in reciprocal space a cell conventionally known as the first Brillouin zone (BZ), which is very important in the band theory of solids. There are 14 first Brillouin zones according to the 14 Bravais lattices.
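The definitions (2.2.2.3), (2.2.2.4) and the relation (2.2.2.7) can be checked numerically. The following is a minimal sketch, assuming primitive f.c.c. vectors with a cubic lattice constant of 1 as an illustrative input; the helper names (`reciprocal_vectors`, `t_dot_k_over_2pi`) are hypothetical and not part of the chapter.

```python
import math

def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def reciprocal_vectors(a1, a2, a3):
    """Return b1, b2, b3 according to (2.2.2.4), including the factor 2*pi."""
    scale = 2.0 * math.pi / dot(a1, cross(a2, a3))
    b1 = tuple(scale * c for c in cross(a2, a3))
    b2 = tuple(scale * c for c in cross(a3, a1))
    b3 = tuple(scale * c for c in cross(a1, a2))
    return b1, b2, b3

# Assumed example: primitive vectors of an f.c.c. lattice (cubic constant 1).
A = ((0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0))
B = reciprocal_vectors(*A)

# a_i . b_j / (2*pi) should reproduce the Kronecker delta of (2.2.2.3).
delta = [[dot(a, b) / (2.0 * math.pi) for b in B] for a in A]

def t_dot_k_over_2pi(n, m):
    """T_n . K_m / (2*pi) for integer triples n and m; an integer by (2.2.2.7)."""
    T = tuple(sum(n[i] * A[i][c] for i in range(3)) for c in range(3))
    K = tuple(sum(m[i] * B[i][c] for i in range(3)) for c in range(3))
    return dot(T, K) / (2.0 * math.pi)
```

For this input the reciprocal vectors come out proportional to (-1, 1, 1) and its cyclic permutations, i.e. a body-centred arrangement, and `t_dot_k_over_2pi(n, m)` reduces to the integer n1*m1 + n2*m2 + n3*m3.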

2.2.3. Symmetry operators

The concepts of symmetry operations in connection with a quantum-mechanical treatment of the electronic states are essential for an understanding of the electronic structure. In this context the reader is referred, for example, to the book by Altmann (1994).

For the definition of symmetry operators we use in the whole of this chapter the active picture, which has become the standard in solid-state physics. This means that the whole configuration space is rotated, reflected or translated, while the coordinate axes are kept fixed.

A translation is given by [\eqalignno{{\bf r}^\prime &={\bf r}+{\bf T}&(2.2.3.1)\cr t{\bf r} &={\bf r}+{\bf T},&(2.2.3.2)}]where t on the left-hand side corresponds to a symmetry (configuration-space) operator.

2.2.3.1. Transformation of functions

Often we are interested in a function (e.g. a wavefunction) [f({\bf r})] and wish to know how it transforms under the configuration operator g which acts on [{\bf r}]. For this purpose it is useful to introduce a function-space operator [\widetilde{g}] which defines how to modify the function in the transformed configuration space so that it agrees with the original function [f({\bf r})] at the original coordinate [{\bf r}]: [\widetilde{g}f(g{\bf r})=f({\bf r}).\eqno(2.2.3.3)]This must be valid for all points [{\bf r}] and thus also for [g^{-1}{\bf r}], leading to the alternative formulation [\widetilde{g}f({\bf r})=f(g^{-1}{\bf r}).\eqno(2.2.3.4)]The symmetry operations form a group G of configuration-space operations [g_{i}] with the related group [\widetilde{G}] of the function-space operators [\widetilde{g}_{i}]. Since the multiplication rules [g_{i}g_{j}=g_{k}\rightarrow\widetilde{g}_{i}\widetilde{g}_{j}=\widetilde{g}_{k}\eqno(2.2.3.5)]are preserved, these two groups are isomorphic.
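The distinction between the configuration-space operator g and the function-space operator [\widetilde{g}] may be easier to see in a small numerical sketch. The example below assumes a one-dimensional translation by 2 and a Gaussian test function, both chosen purely for illustration; the function names are hypothetical.

```python
import math

def translation(a):
    """Configuration-space operator g: x -> x + a, returned together with its inverse."""
    return (lambda x: x + a), (lambda x: x - a)

def function_space_operator(g_inverse):
    """Return g~ acting on functions via (g~ f)(x) = f(g^{-1} x), cf. (2.2.3.4)."""
    def g_tilde(f):
        return lambda x: f(g_inverse(x))
    return g_tilde

def f(x):
    """Illustrative test function: a Gaussian centred at x = 1."""
    return math.exp(-(x - 1.0) ** 2)

g, g_inv = translation(2.0)
g_tilde = function_space_operator(g_inv)

# In the active picture the function is carried along with the points:
# the transformed Gaussian is centred at x = 3.
f_moved = g_tilde(f)
```

Evaluating `f_moved` at the moved point `g(x)` returns the original value `f(x)`, which is the content of (2.2.3.3).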

2.2.3.2. Transformation of operators

In a quantum-mechanical treatment of the electronic states in a solid we have the following different entities: points in configuration space, functions defined at these points and (quantum-mechanical) operators acting on these functions. A symmetry operation transforms the points, the functions and the operators in a clearly defined way.

Consider an eigenvalue equation of operator [{\bb A}] (e.g. the Hamiltonian): [{\bb A}\varphi=a\varphi,\eqno(2.2.3.6)]where [\varphi({\bf r})] is a function of [{\bf r}]. When g acts on [{\bf r}], the function-space operator [\widetilde{g}] acts [according to (2.2.3.4)] on [\varphi] yielding [\psi]: [\psi=\widetilde{g}\varphi\rightarrow\varphi=\widetilde{g}^{-1}\psi.\eqno(2.2.3.7)]By putting [\varphi] from (2.2.3.7) into (2.2.3.6), we obtain [{\bb A}\widetilde{g}^{-1}\psi=a\widetilde{g}^{-1}\psi. \eqno(2.2.3.8)]Multiplication from the left by [\widetilde{g}] yields [\widetilde{g}{\bb A}\widetilde{g}^{-1}\psi=a\widetilde{g}\widetilde{g}^{-1}\psi=a\psi.\eqno(2.2.3.9)]This defines the transformed operator [\widetilde{g}{\bb A}\widetilde{g}^{-1}] which acts on the transformed function [\psi] that is given by the original function [\varphi] but at position [g^{-1}{\bf r}].

2.2.3.3. The Seitz operators

The most general space-group operation is of the form [wp] with the point-group operation p (a rotation, reflection or inversion) followed by a translation w: [wp=\{p|{\bf w}\}.\eqno(2.2.3.10)]With the definition [\{p|{\bf w}\}{\bf r}=wp{\bf r}=w(p{\bf r})=p{\bf r} +{\bf w}\eqno(2.2.3.11)]it is easy to prove the multiplication rule [\{p|{\bf w}\}\{p^{\prime}|{\bf w}^{\prime}\}=\{pp^{\prime} |p{\bf w}^{\prime}+{\bf w}\}\eqno(2.2.3.12)]and define the inverse of a Seitz operator as [\{p|{\bf w}\}^{-1}=\{p^{-1}|-p^{-1}{\bf w}\},\eqno(2.2.3.13)]which satisfies [\{p|{\bf w}\}\{p|{\bf w}\}^{-1}=\{E|{\bf 0}\},\eqno(2.2.3.14)]where [\{E|{\bf 0}\}] does not change anything and thus is the identity of the space group G.
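The multiplication rule (2.2.3.12) and the inverse (2.2.3.13) can be verified with exact arithmetic. The sketch below assumes that the point-group part p is given as an orthogonal matrix in Cartesian coordinates, so that its inverse is its transpose; the fourfold screw operation and the second operation are illustrative choices, and all function names are hypothetical.

```python
from fractions import Fraction as F

def mat_vec(p, v):
    """Apply the 3x3 matrix p to the vector v."""
    return tuple(sum(p[i][j] * v[j] for j in range(3)) for i in range(3))

def mat_mul(p, q):
    """Product of two 3x3 matrices."""
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def transpose(p):
    return tuple(tuple(p[j][i] for j in range(3)) for i in range(3))

def apply(op, r):
    """{p|w} r = p r + w, cf. (2.2.3.11)."""
    p, w = op
    return tuple(x + y for x, y in zip(mat_vec(p, r), w))

def multiply(op1, op2):
    """{p|w}{p'|w'} = {pp'|p w' + w}, cf. (2.2.3.12)."""
    (p1, w1), (p2, w2) = op1, op2
    return (mat_mul(p1, p2), tuple(x + y for x, y in zip(mat_vec(p1, w2), w1)))

def inverse(op):
    """{p|w}^{-1} = {p^{-1}|-p^{-1} w}, cf. (2.2.3.13)."""
    p, w = op
    p_inv = transpose(p)  # assumption: p is orthogonal, so p^{-1} = p^T
    return (p_inv, tuple(-x for x in mat_vec(p_inv, w)))

E = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
IDENTITY = (E, (0, 0, 0))

# Illustrative operations: a fourfold screw axis along z and a mirror with a shift.
screw = (((0, -1, 0), (1, 0, 0), (0, 0, 1)), (F(0), F(0), F(1, 4)))
other = (((1, 0, 0), (0, -1, 0), (0, 0, 1)), (F(1, 2), F(0), F(0)))

# Repeating the screw operation four times gives a pure lattice translation.
screw_squared = multiply(screw, screw)
screw_fourth = multiply(screw_squared, screw_squared)
```

The last two lines illustrate the remark on fractional translations: the vector (0, 0, 1/4) is not a lattice vector, but four applications of the screw operation yield the identity rotation combined with the lattice translation (0, 0, 1).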

2.2.3.4. The important groups and their first classification

Using the Seitz operators, we can classify the most important groups as we need them at the beginning of this chapter:

  • (i) the space group, which consists of all elements [G=\{\{p|{\bf w}\}\}];

  • (ii) the point group (without any translations) [\ P=\{\{p|{\bf 0}\}\}]; and

  • (iii) the lattice translation subgroup [T=\{\{E|{\bf T}\}\}], which is an invariant subgroup of G, i.e. [T\triangleleft G]. Furthermore T is an Abelian group, i.e. any two translations commute ([t_{1}t_{2}=t_{2}t_{1}]) (see also Section 1.2.3.1 of the present volume). A useful consequence of the commutation property is that T can be written as a direct product of the corresponding one-dimensional translations, [T=T_{x}\otimes T_{y}\otimes T_{z}.\eqno(2.2.3.15)]

  • (iv) A symmorphic space group contains no fractional translation vectors and thus P is a subgroup of G, i.e. [P\subset G].

  • (v) In a non-symmorphic space group, however, some p are associated with fractional translation vectors [{\bf v}]. These [{\bf v}] do not belong to the translation lattice but when they are repeated a specific integer number of times they give a vector of the lattice. In this case, [\{p|{\bf 0}\}] cannot belong to G for all p.

  • (vi) The Schrödinger group is the group S of all operations [\widetilde{g}] that leave the Hamiltonian invariant, i.e. [\widetilde{g}{\bb H}\widetilde{g}^{-1}={\bb H}] for all [\widetilde{g}\in S]. This is equivalent to the statement that [\widetilde{g}] and [{\bb H}] commute: [\widetilde{g}{\bb H}={\bb H}\widetilde{g}]. From this commutator relation we find the degenerate states in the Schrödinger equation, namely that [\widetilde{g}\varphi] and [\varphi] are degenerate with the eigenvalue E whenever [\widetilde{g}\in S], as follows from the three equations [\eqalignno{{\bb H}\varphi &=E\varphi &(2.2.3.16)\cr \widetilde{g}{\bb H}\varphi &=E\widetilde{g}\varphi &(2.2.3.17)\cr {\bb H}\widetilde{g}\varphi &=E\widetilde{g}\varphi .&(2.2.3.18)}]

2.2.4. The Bloch theorem

The electronic structure of an infinite solid looks so complicated that it would seem impossible to calculate it. Two important steps make the problem feasible. One is the single-particle approach, in which each electron moves in an average potential [V({\bf r})] according to a Schrödinger equation, [{\bb H}\psi({\bf r})=\left\{-{{\hbar^{2}}\over{2m}}\nabla^{2}+V({\bf r})\right\}\psi({\bf r})=E\psi({\bf r}),\eqno(2.2.4.1)]and has its kinetic energy represented by the first operator. The second important concept is the translational symmetry, which leads to Bloch functions. The single-particle aspect will be discussed later (for details see Sections 2.2.9 and 2.2.10).

2.2.4.1. A simple quantum-mechanical derivation

In order to derive the Bloch theorem, we can simplify the problem by considering a one-dimensional case with a lattice constant a. [The generalization to the three-dimensional case can be done easily according to (2.2.3.15).] The one-dimensional Schrödinger equation is [\left\{-{{\hbar^{2}}\over{2m}}{{d^{2}}\over{dx^{2}}}+V(x)\right\}\psi(x)=E\psi (x),\eqno(2.2.4.2)]where [V(x)] is invariant under translations, i.e. [V(x+a)=V(x)]. We define a translation operator t according to (2.2.3.1) for the translation by one lattice constant as [tx=x+a\eqno(2.2.4.3)]and apply its functional counterpart [\widetilde{t}] to the potential, which gives [according to (2.2.3.4)] [\widetilde{t}V(x)=V(t^{-1}x)=V(x-a)=V(x).\eqno(2.2.4.4)]The first part in [{\bb H}] corresponds to the kinetic energy operator, which is also invariant under translations. Therefore, since [\widetilde{t}\in T] (the lattice translation subgroup) and [\widetilde{t}\in S] (the Schrödinger group), [\widetilde{t}] commutes with [{\bb H}], i.e. the commutator vanishes, [[\widetilde{t},{\bb H}]=0] or [\widetilde{t}{\bb H}={\bb H}\widetilde{t}]. This situation was described above [see (2.2.3.16)–(2.2.3.18)] and leads to the fundamental theorem of quantum mechanics which states that two commuting operators possess a common set of eigenfunctions. Consequently we have [\eqalignno{{\bb H}\psi(x) &=E\psi(x)&(2.2.4.5)\cr \widetilde{t}\psi(x) &=\mu\psi(x),&(2.2.4.6)}]where [\mu] is the eigenvalue corresponding to the translation by the lattice constant a. The second equation can be written explicitly as [\widetilde{t}\psi(x)=\psi(t^{-1}x)=\psi(x-a)=\mu\psi(x)\eqno(2.2.4.7)]and tells us how the wavefunction changes from one unit cell to the neighbouring unit cell. Notice that the electron density must be translationally invariant and thus it follows [\hbox{from }\psi^{\ast}(x-a)\psi(x-a)=\psi^{\ast}(x)\psi(x) \hbox{ that }\mu^{\ast}\mu=1, \eqno(2.2.4.8)]which is a necessary (but not sufficient) condition for defining [\mu].

2.2.4.2. Periodic boundary conditions

We can expect the bulk properties of a crystal to be insensitive to the surface and also to the boundary conditions imposed, which we therefore may choose to be of the most convenient form. Symmetry operations are covering transformations and thus we have an infinite number of translations in T, which is most inconvenient. A way of avoiding this is provided by periodic boundary conditions (Born–von Karman). In the present one-dimensional case this means that the wavefunction [\psi(x)] becomes periodic in a domain [L=Na] (with N an integer number of lattice constants a), i.e. [\psi(x+Na)=\psi(x+L)=\psi(x).\eqno(2.2.4.9)]According to our operator notation (2.2.4.6), we have the following situation when the translation t is applied n times: [\widetilde{t}^{n}\psi(x)=\psi(x-na)=\mu^{n}\psi(x).\eqno(2.2.4.10)]It follows immediately from the periodic boundary condition (2.2.4.9) that [\mu^{N}=1\eqno(2.2.4.11)]with the obvious solution [\mu=\exp[2\pi i({n}/{N})]\quad\hbox{with}\quad n=0,\pm1,\pm2,\ldots.\eqno(2.2.4.12)]Here it is convenient to introduce a notation [k={{2\pi}\over{a}}{{n}\over{N}}\eqno(2.2.4.13)]so that [\mu] is a phase factor of the form [\exp(\pm ika)]. Since n takes both signs, the sign of k is a matter of labelling; we choose [\mu=\exp(-ika)] in (2.2.4.7). Note that k is quantized due to the periodic boundary conditions according to (2.2.4.13). Summarizing, we have the Bloch condition (for the one-dimensional case): [\psi(x+a)=\exp(ika)\psi(x),\eqno(2.2.4.14)]i.e. when we change x by one lattice constant a the wavefunction at x is multiplied by a phase factor [\exp(ika)]. At the moment (2.2.4.13) suggests the use of k as label for the wavefunction [\psi_{k}(x)].
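The quantization of k and the one-dimensional Bloch condition can be illustrated with a short numerical sketch. It assumes N = 8 cells, a lattice constant a = 2.5 in arbitrary units and an arbitrary lattice-periodic factor u(x); restricting n so that k lies between -π/a and π/a is a choice made here for convenience, and the function names are hypothetical.

```python
import cmath
import math

def allowed_k(N, a):
    """k = (2*pi/a)*(n/N), cf. (2.2.4.13), with n chosen so that -pi/a < k <= pi/a."""
    n_min = -(N // 2) + (1 if N % 2 == 0 else 0)
    return [2.0 * math.pi * n / (N * a) for n in range(n_min, N // 2 + 1)]

def bloch_function(k, a):
    """psi_k(x) = exp(i k x) u(x) with an illustrative lattice-periodic u(x)."""
    def psi(x):
        u = 1.0 + 0.3 * math.cos(2.0 * math.pi * x / a)
        return cmath.exp(1j * k * x) * u
    return psi

N, a = 8, 2.5          # assumed example values
ks = allowed_k(N, a)   # exactly N distinct k values

# Every allowed phase factor is an N-th root of unity, cf. (2.2.4.11).
phase_factors = [cmath.exp(1j * k * a) for k in ks]
```

For any of these k values, `bloch_function(k, a)` acquires the factor exp(ika) when x is shifted by a, as in (2.2.4.14), and returns to its original value after a shift by Na, as required by (2.2.4.9).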

Generalization to three dimensions leads to the exponential [\exp(i{\bf kT})] with [2\pi\textstyle\sum\limits_{i=1}^{3}k_{i}n_{i}={\bf k}\cdot{\bf T}\quad\hbox{using (2.2.2.6), (2.2.2.1) and (2.2.2.3)}\eqno(2.2.4.15)]and thus to the Bloch condition [\psi_{{\bf k}}({\bf r+T})=\exp({i{\bf kT}})\psi_{{\bf k}} ({\bf r}), \eqno(2.2.4.16)]or written in terms of the translational operator [\{E|{\bf T}\}] [see (2.2.3.15)] [\{E|{\bf T}\}\psi_{{\bf k}}({\bf r})=\psi_{{\bf k}}({\bf r-T})=\exp({-i{\bf kT}})\psi_{{\bf k}}({\bf r}).\eqno(2.2.4.17)]The eigenfunctions that satisfy (2.2.4.17) are called Bloch functions and have the form [\psi_{{\bf k}}({\bf r})=\exp({i{\bf kr}})u_{{\bf k}}({\bf r}), \eqno(2.2.4.18)]where [u_{{\bf k}}({\bf r})] is a periodic function in the lattice, [u_{{\bf k}}({\bf r})=u_{{\bf k}}({\bf r+T})\quad\hbox{for all }{\bf T},\eqno(2.2.4.19)]and [{\bf k}] is a vector in the reciprocal lattice [see (2.2.2.6)] that plays the role of the quantum number in solids. The [{\bf k}] vector can be chosen in the first BZ, because any [{\bf k}^{\prime}] that differs from [{\bf k}] by just a lattice vector [{\bf K}] of the reciprocal lattice has the same Bloch factor and the corresponding wavefunction [\psi_{{\bf k+K}}({\bf r})] satisfies the Bloch condition again, since [\exp[{i({\bf k+K}){\bf T}}]=\exp({i{\bf kT}})\exp({i{\bf KT}})=\exp({i{\bf kT}}),\eqno(2.2.4.20)]where the factor [\exp({i{\bf KT}})] is unity according to (2.2.2.7). Since these two functions, [\psi_{{\bf k+K}}({\bf r})] and [\psi_{{\bf k}}({\bf r})], belong to the same Bloch factor [\exp({i{\bf kT}})] they are equivalent. A physical interpretation of the Bloch states will be given in Section 2.2.8.

2.2.4.3. A simple group-theoretical approach

Let us repeat a few fundamental definitions of group theory: For any symmetry operation [g_{i}\in G], the product [gg_{i}g^{-1}] can always be formed for any [g\in G] and defines the conjugate element of [g_{i}] by g. Given any operation [g_{i}], its class [C(g_{i})] is defined as the set of all its conjugates under all operations [g\in G]. What we need here is an important property of classes, namely that no two classes have any element in common so that any group can be considered as a sum of classes.

Assuming periodic boundary conditions with [N_{1},N_{2},N_{3}] primitive cells along the axes [{\bf a}_{1},{\bf a}_{2},{\bf a}_{3}], respectively, a lump of crystal with [N=N_{1}N_{2}N_{3}] unit cells is studied. The translation subgroup T contains the general translation operators [{\bf T}], which [using (2.2.3.15)] can be written as [\{E|{\bf T}\}=\{E|n_{1}{\bf a}_{1}\}\{E|n_{2}{\bf a}_{2}\}\{E|n_{3}{\bf a}_{3}\},\eqno(2.2.4.21)]where each factor belongs to one of the three axes. Since T is commutative (Abelian), each operation of T is its own class and thus the number of classes equals its order, namely N. From the general theorem that the sum of the squares of the dimensions of all irreducible representations of a group must equal the order of the group, and from the fact that the number of irreducible representations equals the number of classes, it follows immediately that all N irreducible representations of T must be one-dimensional (see also Section 1.2.3.2 of the present volume). Taking the subgroup along the [{\bf a}_{1}] axis, we must have [N_{1}] different irreducible representations, which we label (for later convenience) by [k_{1}] and denote as [_{k_{1}}\widehat{T}\{E|n_{1}{\bf a}_{1}\}.\eqno(2.2.4.22)]These representations are one-dimensional matrices, i.e. numbers, and must be exponentials, often chosen of the form [\exp(-2\pi ik_{1}n_{1})]. The constant [k_{1}] must be related to the corresponding label of the irreducible representation. In the three-dimensional case, we have the corresponding representation [_{k_{1}k_{2}k_{3}}\widehat{T}\{E|{\bf T}\}=\exp[{-2\pi i(k_{1}n_{1}+k_{2}n_{2}+k_{3}n_{3})}]=\exp({-i{\bf k\cdot T}}),\eqno(2.2.4.23)]where we have used the definitions (2.2.2.6) and (2.2.2.1). Within the present derivation, the vector [{\bf k}] corresponds to the label of the irreducible representation of the lattice translation subgroup.
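The statement that the translations along one axis have N one-dimensional irreducible representations of exponential form can be checked directly. The sketch below assumes a cyclic group of N = 6 translations and labels the representations by an integer m with k = m/N; this labelling and the function names are illustrative assumptions.

```python
import cmath

def character(m, n, N):
    """One-dimensional representation exp(-2*pi*i*k*n) with k = m/N of the n-fold translation."""
    return cmath.exp(-2j * cmath.pi * m * n / N)

def inner_product(m1, m2, N):
    """Sum over the N group elements of chi_m1(n) * conj(chi_m2(n))."""
    return sum(character(m1, n, N) * character(m2, n, N).conjugate()
               for n in range(N))

N = 6  # assumed number of cells along one axis

# N one-dimensional representations: the squares of their dimensions sum to N.
sum_of_squared_dimensions = sum(1 ** 2 for m in range(N))
```

Representations with different labels are orthogonal in this inner product, while each has the norm N, and each `character` is multiplicative in n, as required for a representation of the translation group.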

2.2.5. The free-electron (Sommerfeld) model

The free-electron model corresponds to the special case of taking a constant potential in the Schrödinger equation (2.2.4.1). The physical picture relies on the assumption that the (metallic) valence electrons can move freely in the field of the positively charged nuclei and the tightly bound core electrons. Each valence electron moves in a potential which is nearly constant due to the screening of the remaining valence electrons. This situation can be idealized by assuming the potential to be constant [[V({\bf r})=0]]. This simple picture represents a crude model for simple metals but has its importance mainly because the corresponding equation can be solved analytically. By rewriting equation (2.2.4.1), we have [\nabla^{2}\psi_{{\bf k}}({\bf r})=-{{2mE}\over{\hbar^{2}}}\psi_{{\bf k}}({\bf r})=-|{\bf k}|^{2}\psi_{{\bf k}}({\bf r}),\eqno(2.2.5.1)]where in the last step the constants are abbreviated (for later convenience) by [|{\bf k}|^{2}]. The solutions of this equation are plane waves (PWs) [\psi_{{\bf k}}({\bf r})=C\exp({i{\bf k\cdot r}}), \eqno(2.2.5.2)]where C is a normalization constant which is defined from the integral over one unit cell with volume [\Omega]. The PWs satisfy the Bloch condition and can be written (using the bra–ket notation) as [|{\bf k}\rangle=\psi_{{\bf k}}({\bf r})=\Omega^{-1/2}\exp({i{\bf k\cdot r}}).\eqno(2.2.5.3)]From (2.2.5.1) we see that the corresponding energy (labelled by [{\bf k}]) is given by [E_{{\bf k}}={{\hbar^{2}}\over{2m}}|{\bf k}|^{2}.\eqno(2.2.5.4)]

In this context it is useful to consider the momentum of the electron, which classically is the vector [{\bf p}=m{\bf v}], where m and [{\bf v}] are the mass and velocity, respectively. In quantum mechanics we must replace [{\bf p}] by the corresponding operator [{\bb P}]. [{\bb P}|{\bf k}\rangle={{\hbar}\over{i}}{{\partial}\over{\partial{\bf r}}}|{\bf k}\rangle={{\hbar}\over{i}}i{\bf k}|{\bf k}\rangle=\hbar{\bf k}|{\bf k}\rangle.\eqno(2.2.5.5)]

Thus a PW is an eigenfunction of the momentum operator with eigenvalue [\hbar{\bf k}]. Therefore the [{\bf k}] vector is also called the momentum vector. Note that this is strictly true for a vanishing potential but is otherwise only approximately true (referred to as pseudomomentum).

Another feature of a PW is that its phase is constant in a plane perpendicular to the vector [{\bf k}] (see Fig. 2.2.5.1). For this purpose, consider a periodic function in space and time, [\varphi_{{\bf k}}({\bf r},t)=\exp[{i({\bf k}\cdot{\bf r}-\omega t)}], \eqno(2.2.5.6)]which has a constant phase factor [\exp({-i\omega t})] within such a plane. We can characterize the spatial part by [{\bf r}] within this plane. Taking the nearest parallel plane (with vector [{\bf r}^{\prime}]) for which the same phase factors occur again but at a distance [\lambda] away (with the unit vector [{\bf e}] normal to the plane), [{\bf r}^{\prime}={\bf r}+\lambda{\bf e}={\bf r}+\lambda {{{\bf k}}\over{|{\bf k}|}},\eqno(2.2.5.7)]then [{\bf k}\cdot {\bf r}^{\prime}] must differ from [{\bf k}\cdot {\bf r}] by [2\pi]. This is easily obtained from (2.2.5.7) by multiplication with [{\bf k}] leading to [\eqalignno{{\bf k}\cdot{\bf r}^{\prime}&={\bf k}\cdot{\bf r}+\lambda{{|{\bf k}|^{2}}\over{|{\bf k}|}}={\bf k}\cdot{\bf r}+\lambda|{\bf k}|&(2.2.5.8)\cr {\bf k}\cdot{\bf r}^{\prime}-{\bf k}\cdot{\bf r}&=\lambda|{\bf k}|=2\pi&(2.2.5.9)\cr \lambda&={{2\pi}\over{|{\bf k}|}}\hbox{ or }|{\bf k}|={{2\pi}\over{\lambda}}.&(2.2.5.10)}]Consequently [\lambda] is the wavelength and thus the [{\bf k}] vector is called the wavevector or propagation vector.

Figure 2.2.5.1. Plane waves. The wavevector k and the unit vector e are normal to the two planes and the vectors r in plane 1 and [{\bf r}'] in plane 2.
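The free-electron relations (2.2.5.4) and (2.2.5.10) are easy to evaluate numerically. The following sketch assumes that k is given in Å⁻¹ and uses SI values of the physical constants; the unit choice and the function names are assumptions made for illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s
M_E = 9.1093837015e-31   # electron mass in kg
EV = 1.602176634e-19     # one electronvolt in J

def norm(k):
    return math.sqrt(sum(c * c for c in k))

def free_electron_energy_ev(k):
    """E_k = hbar^2 |k|^2 / (2 m), cf. (2.2.5.4); k in 1/Angstrom, result in eV."""
    k_si = norm(k) * 1.0e10            # convert 1/Angstrom to 1/m
    return HBAR ** 2 * k_si ** 2 / (2.0 * M_E) / EV

def wavelength(k):
    """lambda = 2*pi/|k|, cf. (2.2.5.10); same length unit as 1/|k|."""
    return 2.0 * math.pi / norm(k)

def point_in_next_plane(r, k):
    """r' = r + lambda * k/|k|, cf. (2.2.5.7)."""
    lam, length = wavelength(k), norm(k)
    return tuple(ri + lam * ki / length for ri, ki in zip(r, k))

def phase(k, r):
    """Spatial phase k . r of the plane wave."""
    return sum(ki * ri for ki, ri in zip(k, r))
```

For |k| = 1 Å⁻¹ the energy is about 3.81 eV and the wavelength is 2π Å; moving from r to the point returned by `point_in_next_plane` changes the phase k·r by exactly 2π, in line with (2.2.5.9).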

2.2.6. Space-group symmetry

2.2.6.1. Representations and bases of the space group

The effect of a space-group operation [\{p|{\bf w}\}] on a Bloch function, labelled by [{\bf k}], is to transform it into a Bloch function that corresponds to a vector [p{\bf k}], [\{p|{\bf w}\}\psi_{{\bf k}}=\psi_{p{\bf k}},\eqno(2.2.6.1)]which can be proven by using the multiplication rule of Seitz operators (2.2.3.12) and the definition of a Bloch state (2.2.4.17).

A special case is the inversion operator, which leads to [\{i|{\bf 0}\}\psi_{{\bf k}}=\psi_{-{\bf k}}.\eqno(2.2.6.2)]The Bloch functions [\psi_{{\bf k}}] and [\psi_{p{\bf k}}], where p is any operation of the point group P, belong to the same basis for a representation of the space group G. [ \langle \psi_{{\bf k}}|= \langle \psi_{p{\bf k}}|\hbox{ for all }p\in P\hbox{ for all }p{\bf k}\in {\rm BZ}.\eqno(2.2.6.3)]The same [p{\bf k}] cannot appear in two different bases, thus the two bases [\psi_{{\bf k}}] and [\psi_{{\bf k}^{\prime}}] are either identical or have no [{\bf k}] in common.

Irreducible representations of T are labelled by the N distinct [{\bf k}] vectors in the BZ, which separate into disjoint bases of G (with no [{\bf k}] vector in common). If a [{\bf k}] vector falls on the BZ edge, application of the point-group operation p can lead to an equivalent [{\bf k}^{\prime}] vector that differs from the original by [{\bf K}] (a vector of the reciprocal lattice). The set of all mutually inequivalent [{\bf k}] vectors [p{\bf k}] ([p\in P]) defines the star of the k vector ([S_{{\bf k}}]) (see also Section 1.2.3.3 of the present volume).

The set of all operations that leave a [{\bf k}] vector invariant (or transform it into an equivalent [{\bf k+K}]) forms the group [G_{{\bf k}}] of the [{\bf k}] vector. Application of q, an element of [G_{{\bf k}}], to a Bloch function (Section 2.2.8) gives [q\psi_{{\bf k}}^{j}({\bf r})=\psi_{{\bf k}}^{j^{\prime}} ({\bf r})\hbox{ for }q\in G_{{\bf k}},\eqno(2.2.6.4)]where the band index j (described below) may change to [j^{\prime}]. The Bloch factor stays constant under the operation of q and thus the periodic cell function [u_{{\bf k}}^{j}({\bf r})] must show this symmetry, namely [qu_{{\bf k}}^{j}({\bf r})=u_{{\bf k}}^{j^{\prime}}({\bf r})\hbox{ for }q\in G_{{\bf k}}.\eqno(2.2.6.5)]For example, a [p_{x}]-like orbital may be transformed into a [p_{y}]-like orbital if the two are degenerate, as in a tetragonal lattice.

A star of [{\bf k}] determines an irreducible basis, provided that the functions of the star are symmetrized with respect to the irreducible representations of the group of the [{\bf k}] vector, which are called small representations. The basis functions for the irreducible representations are given according to Seitz (1937) by [ \langle s\psi_{{\bf k}}^{j}|,\hbox{ where }s\in S_{{\bf k}},]written as a row vector [ \langle |] with [j=1,\ldots,n], where n is the dimension of the irreducible representation of [S_{{\bf k}}] with the order [\left| S_{{\bf k}}\right| ]. Such a basis consists of [n\left| S_{{\bf k}}\right| ] functions and forms an [n\left| S_{{\bf k}}\right|]-dimensional irreducible representation of the space group. The degeneracies of these representations come from the star of [{\bf k}] (not crucial for band calculations except for determining the weight of the [{\bf k}] vector) and from [G_{{\bf k}}]. The latter is essential for characterizing the energy bands and using the compatibility relations (Bouckaert et al., 1930; Bradley & Cracknell, 1972).

2.2.6.2. Energy bands

Each irreducible representation of the space group, labelled by [{\bf k}], denotes an energy [E^{j}({\bf k})], where [{\bf k}] varies quasi-continuously over the BZ and the superscript j numbers the band states. The quantization of [{\bf k}] according to (2.2.4.13) and (2.2.4.15) can be done in arbitrarily fine steps by choosing corresponding periodic boundary conditions (see Section 2.2.4.2). Since [{\bf k}] and [{\bf k+K}] belong to the same Bloch state, the energy is periodic in reciprocal space: [E^{j}({\bf k})=E^{j}({\bf k+K}).\eqno(2.2.6.6)]Therefore it is sufficient to consider [{\bf k}] vectors within the first BZ. For a given [{\bf k}], two bands will not have the same energy unless there is a multidimensional small representation in the group of [{\bf k}] or the bands belong to different irreducible representations and thus can have an accidental degeneracy. Consequently, such degeneracies cannot occur for a general [{\bf k}] vector (without symmetry).

2.2.7. The [{\bf k}] vector and the Brillouin zone

2.2.7.1. Various aspects of the [{\bf k}] vector

The [{\bf k}] vector plays a fundamental role in the electronic structure of a solid. In the above, several interpretations have been given for the [{\bf k}] vector that

  • (a) is given in reciprocal space,

  • (b) can be restricted to the first Brillouin zone,

  • (c) is the quantum number for the electronic states in a solid,

  • (d) is quantized due to the periodic boundary conditions,

  • (e) labels the irreducible representation of the lattice translation subgroup T (see Section 2.2.4.3),

  • (f) is related to the momentum [according to (2.2.5.5)] in the free-electron case and

  • (g) is the propagation vector (wavevector) associated with the plane-wave part of the wavefunction (see Fig. 2.2.5.1).

2.2.7.2. The Brillouin zone (BZ)

Starting with one of the 14 Bravais lattices, one can define the reciprocal lattice [according to (2.2.2.4)] and obtain the BZ by the Wigner–Seitz construction as discussed in Section 2.2.2.2. The advantage of using the BZ instead of the parallelepiped spanned by the three unit vectors is its symmetry. Let us take a simple example first, namely an element (say copper) that crystallizes in the face-centred-cubic (f.c.c.) structure. With (2.2.2.4) we easily find that the reciprocal lattice is body-centred-cubic (b.c.c.) and the corresponding BZ is shown in Fig. 2.2.7.1. In this case, f.c.c. Cu has [O_{h}] symmetry with 48 symmetry operations [p\in P] (point group). The energy eigenvalues within a star of [{\bf k}] (i.e. [{\bf k}\in S_{{\bf k}}]) are the same, and therefore it is sufficient to calculate one member in the star. Consequently, it is enough to consider the irreducible wedge of the BZ (called the IBZ). In the present example, this corresponds to 1/48th of the BZ shown in Fig. 2.2.7.1. To count the number of states in the BZ, one counts each [{\bf k}] point in the IBZ with a proper weight [w_{{\bf k}}] to represent the star of this [{\bf k}] vector.

Figure 2.2.7.1. The Brillouin zone (BZ) and the irreducible wedge of the BZ for the f.c.c. direct lattice. After the corresponding figure from the Bilbao Crystallographic Server (http://www.cryst.ehu.es/). The IBZ for any space group can be obtained by using the option KVEC and specifying the space group (in this case No. 225).

2.2.7.3. The symmetry of the Brillouin zone

The BZ is constructed purely from the reciprocal lattice and thus only follows from the translational symmetry (of the 14 Bravais lattices). However, the energy bands [E^{j}({\bf k})], with [{\bf k}] lying within the first BZ, possess a symmetry associated with one of the 230 space groups. Therefore one cannot simply use the geometrical symmetry of the BZ to find its irreducible wedge, although this is tempting. Since the effort of computing energy eigenvalues increases with the number of [{\bf k}] points, one wishes to restrict such calculations to the basic domain, but the latter can only be found by considering the space group of the corresponding crystal (including the basis with all atomic positions).

One possible procedure for finding the IBZ is the following. First a uniform grid in reciprocal space is generated by dividing each of the three reciprocal unit-cell vectors [{\bf b}_{i}] into an integer number of intervals. This is easy to do in the parallelepiped spanned by the three unit-cell vectors, and yields a (more-or-less) uniform grid of [{\bf k}] points. Now one must go through the complete grid of [{\bf k}] points and extract a list of non-equivalent [{\bf k}] points by applying the point-group operations to each [{\bf k}] point in the grid. If a [{\bf k}] point is found to be equivalent to one that is already in the list, the weight of the listed point is increased by 1; otherwise the new point is added to the list. This procedure can easily be programmed and is often used when [{\bf k}] integrations are needed. The disadvantage of this scheme is that the generated [{\bf k}] points in the IBZ are not necessarily in a connected region of the BZ, since one member of the star of [{\bf k}] is chosen arbitrarily, namely the first that is found by going through the complete list.
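The grid-reduction procedure described above can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not part of the text: a simple cubic lattice with the full [O_{h}] point group (so that the 48 operations are signed permutation matrices in fractional coordinates), a grid that includes the zone centre, and hypothetical function names (`cubic_point_group`, `irreducible_kpoints`).

```python
from fractions import Fraction
from itertools import permutations, product


def cubic_point_group():
    """Return the 48 operations of O_h as signed 3x3 permutation matrices."""
    ops = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            ops.append(tuple(
                tuple(signs[row] if col == perm[row] else 0 for col in range(3))
                for row in range(3)
            ))
    return ops


def irreducible_kpoints(n, ops):
    """Reduce an n x n x n grid (fractional coordinates i/n) to non-equivalent points.

    Returns a dict mapping each representative k point to its integer weight,
    i.e. the number of grid points it stands for.
    """
    weights = {}
    for idx in product(range(n), repeat=3):
        k = tuple(Fraction(i, n) for i in idx)
        for op in ops:
            # Apply the operation and fold the image back into the parallelepiped [0, 1).
            image = tuple(
                sum(op[row][col] * k[col] for col in range(3)) % 1
                for row in range(3)
            )
            if image in weights:      # equivalent to a point already in the list
                weights[image] += 1
                break
        else:                         # no equivalent point found: new representative
            weights[k] = 1
    return weights


weights = irreducible_kpoints(4, cubic_point_group())
```

For this 4 x 4 x 4 example the 64 grid points reduce to 10 non-equivalent points whose weights sum to 64. As noted in the text, the representatives are simply the first members of each star encountered in the loop, so they need not lie in a connected wedge. For a lattice with a basis, the operations would have to be taken from the space group of the crystal instead of the holohedry assumed here.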

2.2.8. Bloch functions

We can provide a physical interpretation for a Bloch function by the following considerations. By combining the group-theoretical concepts based on the translational symmetry with the free-electron model, we can rewrite a Bloch function [see (2.2.4.18)[link]] in the form [\psi_{{\bf k}}^{j}({\bf r})=|{\bf k\rangle} u_{{\bf k}} ^{j}({\bf r}),\eqno(2.2.8.1)]where [|{\bf k\rangle}] denotes the plane wave (ignoring normalization) in Dirac's ket notation (2.2.5.3)[link]. The additional superscript j denotes the band index associated with [E^{j}({\bf k})] (see Section 2.2.6.2[link]). The two factors can be interpreted most easily for the two limiting cases, namely:

  • (i) For a constant potential, the first factor corresponds to a plane wave with momentum [\hbar{\bf k}] [see (2.2.5.5)] and the second factor becomes a constant. Note that for a realistic (non-vanishing) potential, the [{\bf k}] vector of a Bloch function is no longer the momentum and is thus often called the pseudomomentum.

  • (ii) If the atoms in a crystal are infinitely separated (i.e. for infinite lattice constants) the BZ collapses to a point, making the first factor a constant. In this case, the second factor must correspond to atomic orbitals and the label j denotes the atomic states 1s, 2s, 2p etc. In the intermediate case, [{\bf k}] is quantized [see (2.2.4.13)[link]] and can take N values (or 2N states including spin) for N cells contained in the volume of the periodic boundary condition [see (2.2.4.21)[link]]. Therefore, as the interatomic distance is reduced from infinity to the equilibrium separations, an atomic level j is broadened into a band [E^{j}({\bf k})] with the quasi-continuous [{\bf k}] vectors and thus shows dispersion.

According to another theorem, the mean velocity of an electron in a Bloch state with wavevector [{\bf k}] and energy [E^{j}({\bf k})] is given by [v^{j}({\bf k})={{1}\over{\hbar}}{{\partial}\over{\partial{\bf k}}} E^{j}({\bf k}).\eqno(2.2.8.2)]If the energy is independent of [{\bf k}], its derivative with respect to [{\bf k}] vanishes and so does the corresponding velocity. This situation corresponds to genuinely isolated atomic levels (with band width zero) and electrons that are tied to individual atoms. If, however, there is any nonzero overlap in the atomic wavefunctions, then [E^{j}({\bf k})] will not be constant throughout the zone.
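As a simple illustration of (2.2.8.2), consider a one-dimensional band of cosine form. This model dispersion and its symbols (band centre [\varepsilon_{0}], nearest-neighbour hopping integral [\beta], lattice constant a) are assumptions introduced only for this example:

```latex
E(k) = \varepsilon_0 + 2\beta\cos(ka),
\qquad
v(k) = \frac{1}{\hbar}\frac{\partial E}{\partial k}
     = -\frac{2\beta a}{\hbar}\sin(ka).
```

The velocity vanishes at the zone centre and at the zone boundary ([k=\pm\pi/a]) and reaches its largest magnitude, [2|\beta|a/\hbar], at [k=\pm\pi/(2a)]. In the limit [\beta\rightarrow 0] (no overlap between neighbouring atoms) the band width [4|\beta|] shrinks to zero and the velocity vanishes for all k, which is the isolated-atom case described above.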

In the general case, different notations are used to characterize band states. Sometimes it is more appropriate to label an energy band by the atomic level from which it originates, especially for narrow bands. In other cases (with a large band width) the free-electron behaviour may be dominant and thus the corresponding free-electron notation is more appropriate.

2.2.9. Quantum-mechanical treatment

A description of the electronic structure of solids requires a quantum-mechanical (QM) treatment which can be parameterized (in semi-empirical schemes) but is often obtained from ab initio calculations. The latter are more demanding in terms of computational effort but they have the advantage that no experimental knowledge is needed in order to adjust parameters. The following brief summary is restricted to the commonly used types of ab initio methods and their main characteristics.

2.2.9.1. Exchange and correlation treatment

Hartree–Fock-based (HF-based) methods (for a general description see, for example, Pisani, 1996) are based on a wavefunction description (with one Slater determinant in the HF method). The single-particle HF equations (written for an atom in Rydberg atomic units) can be written in the following form, which is convenient for further discussions:[\displaylines{\left[-\nabla^2+V_{Ne}({\bf r})+ \sum\limits_{j=1}^N \int\left| \psi_j ^{\rm HF}({\bf r}^\prime)\right| ^2{2\over |{\bf r-r}^\prime|}\,\,{\rm d}{\bf r}^\prime\right.\hfill\cr\left.\quad- \sum\limits_{j=1}^N \int\psi_j^{\rm HF}({\bf r}^\prime)^\ast{1\over|{\bf r-r}^\prime|}P_{rr^\prime}\psi_j^{\rm HF}({\bf r}^\prime)\,\,{\rm d}{\bf r}^\prime\right]\psi_i^{\rm HF}({\bf r})\hfill\cr\quad\quad = \epsilon_i^{\rm HF}\psi_i^{\rm HF}({\bf r}),\hfill(2.2.9.1)}]with terms for the kinetic energy, the nuclear–electronic potential, the classical electrostatic Coulomb potential and the exchange. The exchange term is not an ordinary potential: it involves the permutation operator [P_{rr^{\prime}}], which interchanges the arguments of the subsequent product of two functions. This exchange term cannot be rewritten as a potential times the function [\psi _{i}^{\rm HF}({\bf r})] but is truly non-local (i.e. depends on [{\bf r}] and [{\bf r}^{\prime}]). The interaction of orbital j with itself (contained in the third term) is unphysical, but this self-interaction is exactly cancelled in the fourth term. This is no longer true in the approximate DFT method discussed below. The HF method treats exchange exactly but contains – by definition – no correlation effects. The latter can be added in an approximate form in post-HF procedures such as that proposed by Colle & Salvetti (1990).

Density functional theory (DFT) is an alternative approach in which both effects, exchange and correlation, are treated in a combined scheme but both approximately. Several forms of DFT functionals are available now that have reached high accuracy, so many structural problems can be solved adequately. Further details will be given in Section 2.2.10.[link]

2.2.9.2. The choice of basis sets and wavefunctions

Most calculations of the electronic structure in solids (Pisani, 1996[link]; Singh, 1994[link]; Altmann, 1994[link]) use a linear combination of basis functions in one form or another but differ in the basis sets. Some use a linear combination of atomic orbitals (LCAO) where the AOs are given as Gaussian- or Slater-type orbitals (GTOs or STOs); others use plane-wave (PW) basis sets with or without augmentations; and still others make use of muffin-tin orbitals (MTOs) as in LMTO (linear combination of MTOs; Skriver, 1984[link]) or ASW (augmented spherical wave; Williams et al., 1979[link]). In the former cases, the basis functions are given in analytic form, but in the latter the radial wavefunctions are obtained numerically by integrating the radial Schrödinger equation (Singh, 1994[link]) (see Section 2.2.11[link]).

Closely related to the choice of basis set is the explicit form of the wavefunctions that it can represent well, whether these are nodeless pseudo-wavefunctions or all-electron wavefunctions that include the complete radial nodal structure and a proper description close to the nucleus.

2.2.9.3. The form of the potential

In the muffin-tin or the atomic sphere approximation (MTA or ASA), each atom in the crystal is surrounded by an atomic sphere in which the potential is assumed to be spherically symmetric [see (2.2.12.5)[link] and the discussion thereof]. While these schemes work reasonably well in highly coordinated, closely packed systems (such as face-centred-cubic metals), they become very approximate in all non-isotropic cases (e.g. layered compounds, semiconductors, open structures or molecular crystals). Schemes that make no shape approximation in the form of the potential are termed full-potential schemes (Singh, 1994[link]; Blaha et al., 1990[link]; Schwarz & Blaha, 1996[link]).

With a proper choice of pseudo-potential one can focus on the valence electrons, which are relevant for chemical bonding, and replace the inner part of their wavefunctions by a nodeless pseudo-function that can be expanded in PWs with good convergence.

2.2.9.4. Relativistic effects

If a solid contains only light elements, non-relativistic calculations are well justified, but as soon as heavier elements are present in the system of interest relativistic effects can no longer be neglected. In the medium range of atomic numbers (up to about 54), so-called scalar relativistic schemes are often used (Koelling & Harmon, 1977), which describe the main contraction or expansion of various orbitals (due to the Darwin s-shift or the mass–velocity term) but omit spin–orbit splitting. Unfortunately, the spin–orbit term couples spin-up and spin-down wavefunctions. If one has n basis functions without spin–orbit coupling, then including spin–orbit coupling in the Hamiltonian leads to a [2n\times 2n] matrix equation, which requires about eight times as much computer time to solve (because of the [n^{3}] scaling, since [(2n)^{3}=8n^{3}]). Since the spin–orbit effect is generally small (at least for the valence states), one can simplify the procedure by diagonalizing the Hamiltonian including spin–orbit coupling in the space of the low-lying bands as obtained in a scalar relativistic step. This version is called the second variational method (see e.g. Singh, 1994). For very heavy elements it may be necessary to solve Dirac's equation, which has all these terms (Darwin s-shift, mass–velocity and spin–orbit) included. Additional aspects are illustrated in Section 2.2.14 in connection with the uranium atom.

2.2.10. Density functional theory

The most widely used scheme for calculating the electronic structure of solids is based on density functional theory (DFT). It is described in many excellent books, for example that by Dreizler & Gross (1990[link]), which contains many useful definitions, explanations and references. Hohenberg & Kohn (1964[link]) have shown that for determining the ground-state properties of a system all one needs to know is the electron density [\rho({\bf r})]. This is a tremendous simplification considering the complicated wavefunction of a crystal with (in principle infinitely) many electrons. This means that the total energy of a system (a solid in the present case) is a functional of the density [E[\rho(r)]], which is independent of the external potential provided by all nuclei. At first it was just proved that such a functional exists, but in order to make this fundamental theorem of practical use Kohn & Sham (1965[link]) introduced orbitals and suggested the following procedure.

In the universal approach of DFT to the quantum-mechanical many-body problem, the interacting system is mapped in a unique manner onto an effective non-interacting system of quasi-electrons with the same total density. Therefore the electron density plays the key role in this formalism. The non-interacting particles of this auxiliary system move in an effective local one-particle potential, which consists of a mean-field (Hartree) part and an exchange–correlation part that, in principle, incorporates all correlation effects exactly. However, the functional form of this potential is not known and thus one needs to make approximations.

Magnetic systems (with collinear spin alignments) require a generalization, namely a different treatment for spin-up and spin-down electrons. In this generalized form the key quantities are the spin densities [\rho_{\sigma}(r)], in terms of which the total energy [E_{\rm tot}] is [\eqalignno{E_{\rm tot}(\rho_\uparrow,\rho_\downarrow) &=T_s(\rho_\uparrow, \rho_\downarrow)+E_{ee}(\rho_\uparrow,\rho_\downarrow)+ E_{Ne}(\rho_\uparrow,\rho_\downarrow)&\cr&\quad+E_{xc}(\rho_\uparrow, \rho_\downarrow)+E_{NN}, &(2.2.10.1)}]with the electronic contributions, labelled conventionally as, respectively, the kinetic energy (of the non-interacting particles), the electron–electron repulsion, the nuclear–electron attraction and the exchange–correlation energies. The last term [E_{NN}] is the repulsive Coulomb energy of the fixed nuclei. This expression is still exact but has the advantage that all terms but one can be calculated very accurately and are the dominating (large) quantities. The exception is the exchange–correlation energy [E_{xc}], which is defined by (2.2.10.1)[link] but must be approximated. The first important methods for this were the local density approximation (LDA) or its spin-polarized generalization, the local spin density approximation (LSDA). The latter comprises two assumptions:

  • (i) That [E_{xc}] can be written in terms of a local exchange–correlation energy density [\varepsilon_{xc}] times the total (spin-up plus spin-down) electron density as [E_{xc}=\textstyle\int\varepsilon_{xc}(\rho_{\uparrow},\rho_{\downarrow})\,[\rho_{\uparrow}+\rho_{\downarrow}]\,\,{\rm d}{\bf r}.\eqno(2.2.10.2)]

  • (ii) The particular form chosen for [\varepsilon_{xc}]. For a homogeneous electron gas [\varepsilon_{xc}] is known from quantum Monte Carlo simulations, e.g. by Ceperley & Alder (1984[link]). The LDA can be described in the following way. At each point [{\bf r}] in space we know the electron density [\rho({\bf r})]. If we locally replace the system by a homogeneous electron gas of the same density, then we know its exchange–correlation energy. By integrating over all space we can calculate [E_{xc}].
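The LDA recipe in (ii) can be made concrete for the exchange part alone, for which the homogeneous-electron-gas result is known in closed form (the Dirac expression [\varepsilon_{x}=-{3\over 4}(3/\pi)^{1/3}\rho^{1/3}] in Hartree atomic units). The following sketch is an illustration under stated assumptions, not the procedure of any particular code: it is spin-unpolarized, omits correlation, uses Hartree rather than the Rydberg units of the text (multiply by 2 to convert), and takes the hydrogen 1s density as a test case; the function names are hypothetical.

```python
import math


def lda_exchange_energy(rho, r_max=30.0, n=20000):
    """LDA exchange energy (Hartree) of a spherical density rho(r).

    At each radius the system is replaced locally by a homogeneous electron gas
    of the same density, whose exchange energy density is -c_x * rho**(4/3);
    the result is integrated over all space with the trapezoidal rule.
    """
    c_x = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)
    h = r_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        integrand = 4.0 * math.pi * r * r * rho(r) ** (4.0 / 3.0)
        total += (0.5 if i in (0, n) else 1.0) * integrand
    return -c_x * total * h


def hydrogen_density(r):
    """Electron density of the hydrogen 1s state in atomic units."""
    return math.exp(-2.0 * r) / math.pi


e_x = lda_exchange_energy(hydrogen_density)
```

For this density the integral can also be done analytically and gives about -0.213 Hartree, which the numerical value reproduces.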

The most effective way known to minimize [E_{\rm tot}] by means of the variational principle is to introduce (spin) orbitals [\chi_{jk}^{\sigma}] constrained to construct the spin densities [see (2.2.10.7) below]. According to Kohn and Sham (KS), the variation of [E_{\rm tot}] gives the following effective one-particle Schrödinger equations, the so-called Kohn–Sham equations (Kohn & Sham, 1965) (written for an atom in Rydberg atomic units with the obvious generalization to solids):[[-\nabla^{2}+V_{Ne}+V_{ee}+V_{xc}^{\sigma}]\chi_{jk}^{\sigma }(r)=\epsilon_{jk}^{\sigma}\chi_{jk}^{\sigma}(r),\eqno(2.2.10.3)]with the external potential (the attraction of the electrons by the nucleus) given by[V_{Ne}(r)=-{{2Z}\over{r}},\eqno(2.2.10.4)]the Coulomb potential (the electrostatic interaction between the electrons) given by [V_{ee}({\bf r})=V_{C}({\bf r})=\int{{\rho({\bf r}^{\prime})}\over{|{\bf r-r}^{\prime}|}}\,\,{\rm d}{\bf r}^{\prime}\eqno(2.2.10.5)]and the exchange–correlation potential (due to quantum mechanics) given by the functional derivative[V_{xc}({\bf r})={{\delta E_{xc}[\rho(r)]}\over{\delta\rho}}.\eqno(2.2.10.6)]

In the KS scheme, the (spin) electron densities are obtained by summing over all occupied states, i.e. by filling the KS orbitals (with increasing energy) according to the Aufbau principle. [\rho_{\sigma}(r)=\textstyle\sum\limits_{j,k}\rho_{jk}^{\sigma}|\chi_{jk}^{\sigma} (r)|^{2}.\eqno(2.2.10.7)]Here [\rho_{jk}^{\sigma}] are occupation numbers such that [0\leq\rho _{jk}^{\sigma}\leq1/w_{k}], where [w_{k}] is the symmetry-required weight of point [{\bf k}]. These KS equations (2.2.10.3)[link] must be solved self-consistently in an iterative process, since finding the KS orbitals requires the knowledge of the potentials, which themselves depend on the (spin) density and thus on the orbitals again. Note the similarity to (and difference from) the Hartree–Fock equation (2.2.9.1)[link]. This version of the DFT leads to a (spin) density that is close to the exact density provided that the DFT functional is sufficiently accurate.
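The self-consistency cycle can be illustrated with a deliberately small toy model. The sketch below is not a Kohn–Sham calculation: it assumes one spinless electron on two sites with hopping `t`, a site-energy offset `delta` and a density-dependent on-site potential `u * n`, all of which are hypothetical parameters chosen only to show the loop structure (potential from density, new density from the lowest orbital, linear mixing, convergence test).

```python
import math


def ground_state_density(n1, t=1.0, delta=1.0, u=2.0):
    """Occupation of site 1 in the lowest state of the 2x2 mean-field Hamiltonian.

    The effective site energies depend on the input density n1 (with n2 = 1 - n1),
    which plays the role of the density-dependent potential.
    """
    a = u * n1                     # effective energy of site 1
    d = delta + u * (1.0 - n1)     # effective energy of site 2
    x = 0.5 * (a - d)
    # Closed-form weight of the lowest eigenvector of [[a, -t], [-t, d]] on site 1.
    return 0.5 * (1.0 - x / math.hypot(x, t))


def scf(n1=0.5, alpha=0.3, tol=1e-10, max_iter=200):
    """Iterate to self-consistency with linear mixing; return (density, iterations)."""
    for iteration in range(1, max_iter + 1):
        n1_out = ground_state_density(n1)
        if abs(n1_out - n1) < tol:                  # input and output densities agree
            return n1_out, iteration
        n1 = (1.0 - alpha) * n1 + alpha * n1_out    # mix old and new density
    raise RuntimeError("SCF cycle did not converge")


n1_scf, n_iter = scf()
```

The mixing parameter `alpha` damps the oscillations that would otherwise occur because a larger occupation of a site raises its effective energy. Production codes generally use more elaborate mixing schemes, but the structure of the cycle is the same.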

In early applications, the local density approximation (LDA) was frequently used and several forms of functionals exist in the literature, for example by Hedin & Lundqvist (1971), von Barth & Hedin (1972), Gunnarsson & Lundqvist (1976), Vosko et al. (1980) or accurate fits of the Monte Carlo simulations of Ceperley & Alder (1984). The LDA has some shortcomings, mostly due to its tendency towards overbinding, which causes, for example, lattice constants that are too small. Progress beyond the LSDA has been made by adding gradient terms or higher derivatives ([\nabla\rho] and [\nabla^{2}\rho]) of the electron density to the exchange–correlation energy or its corresponding potential. In this context several physical constraints can be formulated, which an exact theory should obey. Most approximations, however, satisfy only part of them. For example, the exchange density (needed in the construction of these two quantities) should integrate to [-1] according to the Pauli exclusion principle (Fermi hole). Such considerations led to the generalized gradient approximation (GGA), which exists in various parameterizations, e.g. in the one by Perdew et al. (1996). This is an active field of research and thus new functionals are being developed and their accuracy tested in various applications.

The Coulomb potential [V_{C}({\bf r})] in (2.2.10.5)[link] is that of all N electrons. That is, any electron is also moving in its own field, which is physically unrealistic but may be mathematically convenient. Within the HF method (and related schemes) this self-interaction is cancelled exactly by an equivalent term in the exchange interaction [see (2.2.9.1)[link]]. For the currently used approximate density functionals, the self-interaction cancellation is not complete and thus an error remains that may be significant, at least for states (e.g. 4f or 5f) for which the respective orbital is not delocalized. Note that delocalized states have a negligibly small self-interaction. This problem has led to the proposal of self-interaction corrections (SICs), which remove most of this error and have impacts on both the single-particle eigenvalues and the total energy (Parr et al., 1978[link]).

The Hohenberg–Kohn theorems state that the total energy (of the ground state) is a functional of the density, but the KS orbitals (describing quasi-electrons) are only a tool for arriving at this density and consequently the total energy. Rigorously, the Kohn–Sham orbitals are not electronic orbitals and the KS eigenvalues [\varepsilon_{i}] (which correspond to [E_{{\bf k}}] in a solid) are not directly related to electronic excitation energies. From a formal (mathematical) point of view, the [\varepsilon_{i}] are just Lagrange multipliers without a physical meaning.

Nevertheless, it is often a good approximation (and common practice) to partly ignore these formal inconsistencies and use the orbitals and their energies in discussing electronic properties. The gross features of the eigenvalue sequence depend only to a small extent on the details of the potential, whether it is orbital-based as in the HF method or density-based as in DFT. In this sense, the eigenvalues are mainly determined by orthogonality conditions and by the strong nuclear potential, common to DFT and the HF method.

In processes in which one removes (ionization) or adds (electron affinity) an electron, one compares the N electron system with one with [N-1] or [N+1] electrons. Here another conceptual difference occurs between the HF method and DFT. In the HF method one may use Koopmans' theorem, which states that the [\varepsilon_{i}^{\rm HF}] agree with the ionization energies from state i assuming that the corresponding orbitals do not change in the ionization process. In DFT, the [\varepsilon_{i}] can be interpreted according to Janak's theorem (Janak, 1978[link]) as the partial derivative with respect to the occupation number [n_{i}],[\varepsilon_{i}={{\partial E}\over{\partial n_{i}}}.\eqno(2.2.10.8)]Thus in the HF method [\varepsilon_{i}] is the total energy difference for [\Delta n=1], in contrast to DFT where a differential change in the occupation number defines [\varepsilon_{i}], the proper quantity for describing metallic systems. It has been proven that for the exact density functional the eigenvalue of the highest occupied orbital is the first ionization potential (Perdew & Levy, 1983[link]). Roughly, one can state that the further an orbital energy is away from the highest occupied state, the poorer becomes the approximation to use [\varepsilon_{i}] as excitation energy. For core energies the deviation can be significant, but one may use Slater's transition state (Slater, 1974[link]), in which half an electron is removed from the corresponding orbital, and then use the [\varepsilon_{i}^{\rm TS}] to represent the ionization from that orbital.
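The reason why half an electron is removed in the transition-state approach can be seen from a short expansion. As a sketch, assume that the total energy is a smooth function of the occupation number [n_{i}] of the orbital in question (all other occupations fixed) and expand it to second order about [n_{i}=1/2]; primes denote derivatives with respect to [n_{i}]:

```latex
E(n_i) \simeq E(\tfrac{1}{2}) + (n_i - \tfrac{1}{2})\,E'(\tfrac{1}{2})
            + \tfrac{1}{2}(n_i - \tfrac{1}{2})^{2}\,E''(\tfrac{1}{2})
\quad\Longrightarrow\quad
E(1) - E(0) \simeq E'(\tfrac{1}{2})
            = \varepsilon_i\big|_{n_i = 1/2} \equiv \varepsilon_i^{\rm TS}.
```

The second-order terms cancel in the difference, and the last equality uses (2.2.10.8). The ionization energy [E(0)-E(1)] is therefore approximated by [-\varepsilon_{i}^{\rm TS}], with the leading neglected contribution coming from the third derivative of E.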

Another excitation from the valence to the conduction band is given by the energy gap, separating the occupied from the unoccupied single-particle levels. It is well known that the gap is not given well by taking [\Delta\varepsilon_{i}] as excitation energy. Current DFT methods significantly underestimate the gap (often giving only about half the experimental value), whereas the HF method usually overestimates gaps (by a factor of about two). A trivial solution, applying the `scissor operator', is to shift the DFT bands to agree with the experimental gap. An improved but much more elaborate approach for obtaining electronic excitation energies within DFT is the GW method in which quasi-particle energies are calculated (Hybertsen & Louie, 1984; Godby et al., 1986; Perdew, 1986). This scheme is based on calculating the dielectric matrix, which contains information on the response of the system to an external perturbation, such as the excitation of an electron.
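The scissor correction amounts to a rigid shift of the unoccupied levels. A minimal sketch follows, assuming a list of single-particle energies at one [{\bf k}] point and a reference energy inside the gap; the function name and all numerical values are invented for illustration and do not refer to any real material.

```python
def scissor_shift(eigenvalues, e_fermi, gap_exp):
    """Rigidly shift all levels above e_fermi so that the gap equals gap_exp.

    Returns the corrected eigenvalues and the applied shift.
    """
    occupied = [e for e in eigenvalues if e <= e_fermi]
    empty = [e for e in eigenvalues if e > e_fermi]
    gap_dft = min(empty) - max(occupied)      # gap from the calculated levels
    shift = gap_exp - gap_dft                 # amount by which the gap is opened
    shifted = [e + shift if e > e_fermi else e for e in eigenvalues]
    return shifted, shift


# Hypothetical levels (arbitrary energy units) with a calculated gap of 0.6
# and an assumed experimental gap of 1.2.
levels = [-5.0, -1.0, 0.0, 0.6, 2.0]
shifted, shift = scissor_shift(levels, e_fermi=0.3, gap_exp=1.2)
```

The occupied levels are left unchanged and the dispersion of the conduction bands is preserved, which is exactly why this correction is simple but cannot repair errors in the band shapes themselves.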

In some cases, one can rely on the total energy of the states involved. The original Hohenberg–Kohn theorems (Hohenberg & Kohn, 1964[link]) apply only to the ground state. The theorems may, however, be generalized to the energetically lowest state of any symmetry representation for which any property is a functional of the corresponding density. This allows (in cases where applicable) the calculation of excitation energies by taking total energy differences.

Many aspects of DFT from formalism to applications are discussed and many references are given in the book by Springborg (1997[link]).

2.2.11. Band-theory methods

There are several methods for calculating the electronic structure of solids. They have advantages and disadvantages, different accuracies and computational requirements (speed or memory), and are based on different approximations. Some of these aspects have been discussed in Section 2.2.9.[link] This is a rapidly changing field and thus only the basic concepts of a few approaches in current use are outlined below.

2.2.11.1. LCAO (linear combination of atomic orbitals)

For the description of crystalline wavefunctions (Bloch functions), one often starts with a simple concept of placing atomic orbitals (AOs) at each site in a crystal denoted by [|m\rangle], from which one forms Bloch sums in order to have proper translational symmetry: [\chi_{{\bf k}}({\bf r})=\textstyle\sum\limits_{m}\exp({i{\bf kT}_{{m}}})|m{\rangle}.\eqno(2.2.11.1)]Then Bloch functions can be constructed by taking a linear combination of such Bloch sums, where the linear-combination coefficients are determined by the variational principle in which a secular equation must be solved. The LCAO can be used in combination with both the Hartree–Fock method and DFT.

2.2.11.2. TB (tight binding)

A simple version of the LCAO is found by parameterizing the matrix elements [\langle m^{\prime}|{\bb H}|m{\rangle}] and [\langle m^{\prime }|m{\rangle}] in a way similar to the Hückel molecular orbital (HMO) method, where the only non-vanishing matrix elements are the on-site integrals and the nearest-neighbour interactions (hopping integrals). For a particular class of solids the parameters can be adjusted to fit experimental values. With these parameters, the electronic structures of rather complicated solids can be described and yield quite satisfactory results, but only for the class of materials for which such a parametrization is available. Chemical bonding and symmetry aspects can be well described with such schemes, as Hoffmann has illustrated in many applications (Hoffmann, 1988[link]). In more complicated situations, however, such a simple scheme fails.
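A minimal sketch of such a parameterized scheme is given below for the simplest conceivable case: a one-dimensional chain with one orbital per cell, orthonormal orbitals (so that the overlap matrix is the unit matrix) and only an on-site integral and a nearest-neighbour hopping integral. The parameter names `e_onsite` and `beta` and their values are assumptions made for this illustration.

```python
import cmath
import math


def band_energy(k, a=1.0, e_onsite=0.0, beta=-1.0):
    """Band energy E(k) of a nearest-neighbour tight-binding chain.

    With one orthonormal orbital per cell, the Bloch sum gives
    E(k) = sum_m exp(i k T_m) <0|H|m> with T_m = m * a.
    """
    matrix_elements = {0: e_onsite, 1: beta, -1: beta}   # <0|H|m>; zero beyond neighbours
    total = sum(cmath.exp(1j * k * m * a) * h for m, h in matrix_elements.items())
    return total.real     # the imaginary parts of the +m and -m terms cancel


# Sample the band across the first Brillouin zone, -pi/a <= k <= pi/a (here a = 1).
k_points = [-math.pi + 2.0 * math.pi * i / 200 for i in range(201)]
energies = [band_energy(k) for k in k_points]
band_width = max(energies) - min(energies)
```

The result is the cosine band [E(k)=\varepsilon_{0}+2\beta\cos(ka)] with band width [4|\beta|]. With more than one orbital per cell the same matrix elements define a small Hamiltonian matrix at each [{\bf k}], and the band energies follow from solving the corresponding secular equation.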

2.2.11.3. The pseudo-potential schemes

In many respects, core electrons are unimportant for determining the stability, structure and low-energy response properties of crystals. It is a well established practice to modify the one-electron part of the Hamiltonian by replacing the bare nuclear attraction with a pseudo-potential (PP) operator, which allows us to restrict our calculation to the valence electrons. The PP operator must reproduce screened nuclear attractions, but must also account for the Pauli exclusion principle, which requires that valence orbitals are orthogonal to core ones. The PPs are not uniquely defined and thus one seeks to satisfy the following characteristics as well as possible:

  • (1) PP eigenvalues should coincide with the true (all-electron) ones;

  • (2) PP orbitals should resemble as closely as possible the all-electron orbitals in an external region as well as being smooth and nodeless in the core region;

  • (3) PP orbitals should be properly normalized;

  • (4) the functional form of the PP should allow the simplification of their use in computations;

  • (5) the PP should be transferable (independent of the system); and

  • (6) relativistic effects should be taken into account (especially for heavy elements); this concerns mainly the indirect relativistic effects (e.g. core contraction, Darwin s-shift), but not the spin–orbit coupling.

There are many versions of the PP method (norm-conserving, ultrasoft etc.) and the actual accuracy of a calculation is governed by which is used. For standard applications, PP techniques can be quite successful in solid-state calculations. However, there are cases that require higher accuracy, e.g. when core electrons are involved, as in high-pressure studies or electric field gradient calculations (see Section 2.2.15[link]), where the polarization of the charge density close to the nucleus is crucial for describing the physical effects properly.

2.2.11.4. APW (augmented plane wave) and LAPW methods

The partition of space (i.e. the unit cell) between (non-overlapping) atomic spheres and an interstitial region (see Fig. 2.2.12.1[link]) is used in several schemes, one of which is the augmented plane wave (APW) method, originally proposed by Slater (Slater, 1937[link]) and described by Loucks (1967[link]), and its linearized version (the LAPW method), which is chosen as the one representative method that is described in detail in Section 2.2.12[link].

The basis set is constructed using the muffin-tin approximation (MTA) for the potential [see the discussion below in connection with (2.2.12.5)[link]]. In the interstitial region the wavefunction is well described by plane waves, but inside the spheres atomic-like functions are used which are matched continuously (at the sphere boundary) to each plane wave.

2.2.11.5. KKR (Korringa–Kohn–Rostoker) method

In the KKR scheme (Korringa, 1947; Kohn & Rostoker, 1954), the solution of the KS equations (2.2.10.3) uses a Green-function technique and solves a Lippmann–Schwinger integral equation. The basic concepts come from a multiple-scattering approach, which is conceptually different from but mathematically equivalent to the APW method. The building blocks are spherical waves which are products of spherical harmonics and spherical Hankel, Bessel and Neumann functions. Like plane waves, they solve the KS equations for a constant potential. Augmenting the spherical waves with numerical solutions inside the atomic spheres as in the APW method yields the KKR basis set. Compared with methods based on plane waves, spherical waves require fewer basis functions and thus smaller secular equations.

The radial functions in the APW and KKR methods are energy-dependent and so are the corresponding basis functions. This leads to a nonlinear eigenvalue problem that is computationally demanding. Andersen (1975) modelled the weak energy dependence by a Taylor expansion in which only the term linear in the energy is kept and thereby arrived at the so-called linear methods LMTO and LAPW.
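Written out for a radial function [u_{\ell}(r,E)], with [E_{\ell}] a fixed expansion energy and a dot denoting the derivative with respect to energy (the notation of Section 2.2.12), the truncated expansion reads:

```latex
u_\ell(r,E) = u_\ell(r,E_\ell) + (E - E_\ell)\,\dot{u}_\ell(r,E_\ell)
            + O\big((E - E_\ell)^{2}\big),
\qquad
\dot{u}_\ell(r,E_\ell) \equiv
\left.\frac{\partial u_\ell(r,E)}{\partial E}\right|_{E = E_\ell}.
```

Once the basis functions are built from the fixed pair [u_{\ell}(r,E_{\ell})] and [\dot{u}_{\ell}(r,E_{\ell})], they no longer depend on the unknown eigenvalue, so that all eigenvalues at a given [{\bf k}] follow from a single diagonalization. The price is an error in the wavefunction that grows quadratically with the distance of E from [E_{\ell}].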

2.2.11.6. LMTO (linear combination of muffin-tin orbitals) method

The LMTO method (Andersen, 1975[link]; Skriver, 1984[link]) is the linearized counterpart to the KKR method, in the same way as the LAPW method is the linearized counterpart to the APW method. This widely used method originally adopted the atomic sphere approximation (ASA) with overlapping atomic spheres in which the potential was assumed to be spherically symmetric. Although the ASA simplified the computation so that systems with many atoms could be studied, the accuracy was not high enough for application to certain questions in solid-state physics.

Following the ideas of Andersen, the augmented spherical wave (ASW) method was developed by Williams et al. (1979[link]). The ASW method is quite similar to the LMTO scheme.

It should be noted that the MTA and the ASA are not really a restriction on the method. In particular, when employing the MTA only for the construction of the basis functions but including a generally shaped potential in the construction of the matrix elements, one arrives at a scheme of very high accuracy which allows, for instance, the evaluation of elastic properties. Methods using the unrestricted potential together with basis functions developed from the muffin-tin potential are called full-potential methods. Now for almost every method based on the MTA (or ASA) there exists a counterpart employing the full potential.

2.2.11.7. CP (Car–Parrinello) method

Conventional quantum-mechanical calculations are done using the Born–Oppenheimer approximation, in which one assumes (in most cases to a very good approximation) that the electrons are decoupled from the nuclear motion. Therefore the electronic structure is calculated for fixed atomic (nuclear) positions. Car & Parrinello (1985[link]) suggested a new method in which they combined the motion of the nuclei (at finite temperature) with the electronic degrees of freedom. They started with a fictitious Lagrangian in which the wavefunctions follow a dynamics equation of motion. Therefore, the CP method combines the motion of the nuclei (following Newton's equation) with the electrons (described within DFT) into one formalism by solving equations of motion for both subsystems. This simplifies the computational effort and allows ab initio molecular dynamics calculations to be performed in which the forces acting on the atoms are calculated from the wavefunctions within DFT. The CP method has attracted much interest and is widely used, with a plane-wave basis, extended with pseudo-potentials and recently enhanced into an all-electron method using the projector augmented wave (PAW) method (Blöchl, 1994[link]). Such CP schemes can also be used to find equilibrium structures and to explore the electronic structure.
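For orientation, the fictitious Lagrangian can be written in the following commonly quoted form. The notation is introduced here and is not taken from this chapter: [\psi_{i}] are the occupied orbitals, [{\bf R}_{I}] and [M_{I}] the nuclear positions and masses, [\mu] a fictitious mass assigned to the electronic degrees of freedom, and [\Lambda_{ij}] Lagrange multipliers that keep the orbitals orthonormal. Prefactor conventions differ between presentations.

```latex
\mathcal{L}_{\rm CP} =
  \sum_i \tfrac{1}{2}\,\mu \int |\dot{\psi}_i({\bf r})|^{2}\,{\rm d}{\bf r}
+ \sum_I \tfrac{1}{2}\,M_I\,\dot{\bf R}_I^{2}
- E[\{\psi_i\},\{{\bf R}_I\}]
+ \sum_{ij} \Lambda_{ij}\Big(\int \psi_i^{\ast}({\bf r})\,\psi_j({\bf r})\,{\rm d}{\bf r} - \delta_{ij}\Big),
\\[4pt]
\mu\,\ddot{\psi}_i = -\frac{\delta E}{\delta \psi_i^{\ast}} + \sum_j \Lambda_{ij}\,\psi_j,
\qquad
M_I\,\ddot{\bf R}_I = -\frac{\partial E}{\partial {\bf R}_I}.
```

Here E is the DFT total energy. If [\mu] is chosen small enough, the orbitals stay close to the instantaneous electronic ground state while the nuclei move, so that the electronic structure does not have to be reconverged at every nuclear configuration.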

2.2.11.8. Order N schemes

The various techniques outlined so far have one thing in common, namely the scaling. In a system containing N atoms the computational effort scales as [N^{3}], since one must determine a number of orbitals that is proportional to N, which requires diagonalization of [(kN)\times(kN)] matrices, where the prefactor k depends on the basis set and the method used. In recent years much work has been done to devise algorithms whose cost scales linearly with N, at least for very large N (Ordejon et al., 1995). First results are already available and look promising. When such schemes become generally available, it will be possible to study very large systems with relatively little computational effort. This interesting development could drastically change the accessibility of electronic structure results for large systems.

2.2.12. The linearized augmented plane wave method

The electronic structure of solids can be calculated with a variety of methods as described above (Section 2.2.11[link]). One representative example is the (full-potential) linearized augmented plane wave (LAPW) method. The LAPW method is among the most accurate schemes for solving the effective one-particle (the so-called Kohn–Sham) equations (2.2.10.3)[link] and is based on DFT (Section 2.2.10[link]) for the treatment of exchange and correlation.

The LAPW formalism is described in many references, starting with the pioneering work by Andersen (1975[link]) and by Koelling & Arbman (1975[link]), which led to the development and the description of the computer code WIEN (Blaha et al., 1990[link]; Schwarz & Blaha, 1996[link]). An excellent book by Singh (1994[link]) is highly recommended to the interested reader. Here only the basic ideas are summarized, while details are left to the articles and references therein.

In the LAPW method, the unit cell is partitioned into (non-overlapping) atomic spheres centred around the atomic sites (type I) and an interstitial region (II) as shown schematically in Fig. 2.2.12.1[link]. For the construction of basis functions (and only for this purpose), the muffin-tin approximation (MTA) is used. In the MTA, the potential is assumed to be spherically symmetric within the atomic spheres but constant outside; in the former atomic-like functions and in the latter plane waves are used in order to adapt the basis set optimally to the problem. Specifically, the following basis sets are used in the two types of regions:

  • (1) Inside the atomic sphere t of radius [R_{t}] (region I), a linear combination of radial functions times spherical harmonics [Y_{\ell m}(\hat{r})] is used (we omit the index t when it is clear from the context): [\phi_{{\bf k}_{n}}=\textstyle\sum\limits_{\ell m}[A_{\ell m}u_{\ell}(r,E_{\ell})+B_{\ell m} \dot{u}_{\ell}(r,E_{\ell})]Y_{\ell m}(\hat{r}), \eqno(2.2.12.1)]where [\hat{r}] represents the angles [\vartheta] and [\varphi] of the polar coordinates. The radial functions [u_{\ell}(r,E)] depend on the energy E. Within a certain energy range this energy dependence can be accounted for by using a linear combination of the solution [u_{\ell}(r,E_{\ell})] and its energy derivative [\dot{u}_{\ell}(r,E_{\ell})], both taken at the same energy [E_{\ell}] (which is normally chosen at the centre of the band with the corresponding [\ell]-like character). This is the linearization in the LAPW method. These two functions are obtained on a radial mesh inside the atomic sphere by numerical integration of the radial Schrödinger equation using the spherical part of the potential inside sphere t and choosing the solution that is regular at the origin [r=0]. The coefficients [A_{\ell m}] and [B_{\ell m}] are chosen by matching conditions (see below).

    Figure 2.2.12.1. Schematic partitioning of the unit cell into atomic spheres (I) and an interstitial region (II).

  • (2) In the interstitial region (II), a plane-wave expansion (see the Sommerfeld model, Section 2.2.5[link]) is used:[\phi_{{\bf k}_{n}}=({1}/{\sqrt{\Omega}})\exp({i{\bf k}_{n}{\bf r}}), \eqno(2.2.12.2)]where [{\bf k}_{n}={\bf k}+{\bf K}_{n}], [{\bf K}_{n}] are vectors of the reciprocal lattice, [{\bf k}] is the wavevector in the first Brillouin zone and [\Omega] is the unit-cell volume [see (2.2.5.3)[link]]. This corresponds to writing the periodic function [u_{{\bf k}}({\bf r})] (2.2.4.19)[link] as a Fourier series and combining it with the Bloch function (2.2.4.18)[link]. Each plane wave (corresponding to [{\bf k}_{n}]) is augmented by an atomic-like function in every atomic sphere, where the coefficients [A_{\ell m}] and [B_{\ell m}] in (2.2.12.1)[link] are chosen to match (in value and slope) the atomic solution with the corresponding plane-wave basis function of the interstitial region.
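The matching in value and slope at the sphere boundary amounts, for each [\ell m], to a 2 × 2 linear system for [A_{\ell m}] and [B_{\ell m}]. The following is a minimal sketch of that step only, not the procedure of any particular code. The function name `match_coefficients` and all numerical values are hypothetical; in an actual calculation the target value and slope would come from the [\ell m] component of the plane wave at [r=R_{t}], and the radial functions from the numerical integration described in item (1).

```python
def match_coefficients(u, du, udot, dudot, target_value, target_slope):
    """Solve  A*u  + B*udot  = target_value
              A*du + B*dudot = target_slope
    for (A, B), where u, du are the radial function and its radial
    derivative at the sphere boundary, and udot, dudot are the same
    quantities for the energy derivative of the radial function."""
    det = u * dudot - udot * du
    if abs(det) < 1e-14:
        raise ValueError("u and udot are linearly dependent at the boundary")
    # Cramer's rule for the 2x2 system
    a = (target_value * dudot - udot * target_slope) / det
    b = (u * target_slope - du * target_value) / det
    return a, b


# Hypothetical boundary values, chosen only for illustration.
u_R, du_R = 0.8, -0.3
udot_R, dudot_R = 0.5, 0.9
pw_value, pw_slope = 1.0, 0.2

A, B = match_coefficients(u_R, du_R, udot_R, dudot_R, pw_value, pw_slope)
matched_value = A * u_R + B * udot_R
matched_slope = A * du_R + B * dudot_R
```

By construction, `matched_value` and `matched_slope` reproduce the target value and slope, so the basis function is continuous with a continuous first derivative across the sphere boundary.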

The solutions to the Kohn–Sham equations are expanded in this combined basis set of LAPWs, [\psi_{k}=\textstyle\sum\limits_{n}c_{n}\phi_{k_{n}},\eqno(2.2.12.3)]where the coefficients [c_{n}] are determined by the Rayleigh–Ritz variational principle. The convergence of this basis set is controlled by the number of PWs, i.e. by the magnitude of the largest [{\bf K}] vector in equation (2.2.12.3)[link].

In order to improve upon the linearization (i.e. to increase the flexibility of the basis) and to make possible a consistent treatment of semi-core and valence states in one energy window (to ensure orthogonality), additional ([k_{n}]-independent) basis functions can be added. They are called `local orbitals' (Singh, 1994[link]) and consist of a linear combination of two radial functions at two different energies (e.g. at the [3s] and [4s] energy) and one energy derivative (at one of these energies): [\phi_{\ell m}^{\rm LO}=[A_{\ell m}u_{\ell}(r,E_{1,\ell})+B_{\ell m}\dot{u}_{\ell }(r,E_{1,\ell})+C_{\ell m}u_{\ell}(r,E_{2,\ell})]Y_{\ell m}(\hat {r}).\eqno(2.2.12.4)]The coefficients [A_{\ell m}], [B_{\ell m}] and [C_{\ell m}] are determined by the requirements that [\phi^{\rm LO}] be normalized and have zero value and slope at the sphere boundary.

In its general form, the LAPW method expands the potential in the following form:[V(r)=\left\{ \matrix{\textstyle\sum\limits_{LM}V_{LM}(r)K_{LM}(\hat{r})\hfill & {\rm inside \,\,sphere}\hfill\cr \textstyle\sum\limits_{K}V_{K}\exp({iKr})\hfill & {\rm outside \,\, sphere}}\right. \eqno(2.2.12.5)]where [K_{LM}] are the crystal harmonics compatible with the point-group symmetry of the corresponding atom represented in a local coordinate system (see Section 2.2.13[link]). An analogous expression holds for the charge density. Thus no shape approximations are made, a procedure frequently called the `full-potential LAPW' (FLAPW) method.

The muffin-tin approximation (MTA) used in early band calculations corresponds to retaining only the [L=0] and [M=0] component in the first expression of (2.2.12.5)[link] and only the [K=0] component in the second. This (much older) procedure corresponds to taking the spherical average inside the spheres and the volume average in the interstitial region. The MTA was frequently used in the 1970s and works reasonably well in highly coordinated (metallic) systems such as face-centred-cubic (f.c.c.) metals. For covalently bonded solids and for open or layered structures, however, the MTA is a poor approximation and leads to serious discrepancies with experiment. In all these cases a full-potential treatment is essential.

The choice of sphere radii is not very critical in full-potential calculations, in contrast to the MTA, where this choice may affect the results significantly. Furthermore, different radii would be found when one uses one of the two plausible criteria, namely based on the potential (maximum between two adjacent atoms) or the charge density (minimum between two adjacent atoms). Therefore in the MTA one must make a compromise, whereas in full-potential calculations this problem practically disappears.

2.2.13. The local coordinate system

The partition of a crystal into atoms (or molecules) is ambiguous and thus the atomic contribution cannot be defined uniquely. However, whatever the definition, it must follow the relevant site symmetry for each atom. There are at least two reasons why one would want to use a local coordinate system at each atomic site: the concept of crystal harmonics and the interpretation of bonding features.

2.2.13.1. Crystal harmonics

All spatial observables of the bound atom (e.g. the potential or the charge density) must have the crystal symmetry, i.e. the point-group symmetry around an atom. Therefore they must be representable as an expansion in terms of site-symmetrized spherical harmonics. Any point-symmetry operation transforms a spherical harmonic into another of the same [\ell]. We start with the usual complex spherical harmonics, [Y_{\ell m}(\vartheta,\varphi)=N_{\ell m}P_{\ell}^{m}(\cos\vartheta)\exp({im\varphi}),\eqno(2.2.13.1)]which satisfy Laplace's differential equation. The [P_{\ell}^{m} (\cos\vartheta)] are the associated Legendre polynomials and the normalization [N_{\ell m}] is according to the convention of Condon & Shortley (1953[link]). For the [\varphi]-dependent part one can use the real and imaginary part and thus use [\cos(m\varphi)] and [\sin(m\varphi)] instead of the [\exp({im\varphi})] functions, but we must introduce a parity p to distinguish the functions with the same [\left| m\right| ]. For convenience we take real spherical harmonics, since physical observables are real. The even and odd polynomials are given by the combination of the complex spherical harmonics with the parity p either [+] or − by [\eqalignno{y_{\ell mp} &=\left\{ \matrix{ y_{\ell m+}=(1/\sqrt{2})(Y_{\ell m}+Y_{\ell\bar m}) \hfill&+\, \,{\rm parity}\cr y_{\ell m-}=-(i/\sqrt{2})(Y_{\ell m}-Y_{\ell\bar m}) \hfill &-\, {\rm parity}},\right. \,\, m=2n &\cr y_{\ell mp} &=\left\{\matrix{ y_{\ell m+}=-(1/\sqrt{2})(Y_{\ell m}-Y_{\ell\bar m})\hfill&+\,\, {\rm parity}\cr y_{\ell m-}=(i/\sqrt{2})(Y_{\ell m}+Y_{\ell\bar m}) \hfill& -\, {\rm parity}},\right. \,\, m=2n+1.&\cr &&(2.2.13.2)}]
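As a numerical illustration of (2.2.13.2), the sketch below forms the [+] and − combinations for two cases ([\ell=1], [m=1] and [\ell=2], [m=2]) and checks that they are real. It assumes that every term in the combinations carries the index [\ell] and that the Condon & Shortley phase [(-1)^{m}] is included for positive m. The helper names `complex_harmonic` and `real_harmonic` are hypothetical, and only the two [(\ell,|m|)] pairs needed here are tabulated explicitly.

```python
import cmath
import math


def complex_harmonic(l, m, theta, phi):
    """Complex spherical harmonic Y_lm for the pairs (1, +-1) and (2, +-2),
    written out explicitly with the Condon-Shortley phase."""
    s = math.sin(theta)
    if (l, abs(m)) == (1, 1):
        amplitude = math.sqrt(3.0 / (8.0 * math.pi)) * s
        sign = -1.0 if m > 0 else 1.0
    elif (l, abs(m)) == (2, 2):
        amplitude = math.sqrt(15.0 / (32.0 * math.pi)) * s * s
        sign = 1.0
    else:
        raise ValueError("only (1, +-1) and (2, +-2) are tabulated in this sketch")
    return sign * amplitude * cmath.exp(1j * m * phi)


def real_harmonic(l, m, parity, theta, phi):
    """Combination y_{l m p} of Y_{l m} and Y_{l -m} for m > 0, parity '+' or '-'."""
    y_pos = complex_harmonic(l, m, theta, phi)
    y_neg = complex_harmonic(l, -m, theta, phi)
    root2 = math.sqrt(2.0)
    if m % 2 == 0:  # even m
        if parity == "+":
            return (y_pos + y_neg) / root2
        return -1j * (y_pos - y_neg) / root2
    if parity == "+":  # odd m
        return -(y_pos - y_neg) / root2
    return 1j * (y_pos + y_neg) / root2


theta, phi = 0.7, 1.1  # arbitrary test angles
p_x = real_harmonic(1, 1, "+", theta, phi)
p_y = real_harmonic(1, 1, "-", theta, phi)
d_xy = real_harmonic(2, 2, "-", theta, phi)
```

With these conventions the [\ell=1] combinations are proportional to [\sin\vartheta\cos\varphi] and [\sin\vartheta\sin\varphi], i.e. the familiar [p_{x}]- and [p_{y}]-like angular functions, and the imaginary parts vanish to rounding error.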

The expansion of – for example – the charge density [\rho({\bf r})] around an atomic site can be written using the LAPW method [see the analogous equation (2.2.12.5)[link] for the potential] in the form [\rho({\bf r})=\textstyle\sum\limits_{LM}\rho_{LM}(r)K_{LM}(\hat{r})\hbox{ inside an atomic sphere},\eqno(2.2.13.3)]where we use capital letters [LM] for the indices (i) to distinguish this expansion from that of the wavefunctions in which complex spherical harmonics are used [see (2.2.12.1)[link]] and (ii) to include the parity p in the index M (which represents the combined index [mp]). With these conventions, [K_{LM}] can be written as a linear combination of real spherical harmonics [y_{\ell mp}] which are symmetry-adapted to the site symmetry,[K_{LM}(\hat{r})=\left\{\matrix{ y_{\ell mp}\hfill & \hbox{non-cubic}\hfill\cr \textstyle\sum_{j}c_{Lj}y_{\ell jp}\hfill & \hbox{cubic}\hfill}\right. \eqno(2.2.13.4)]i.e. they are either [y_{\ell mp}] [(2.2.13.2)[link]] in the non-cubic cases (Table 2.2.13.1[link]) or are well defined combinations of [y_{\ell mp}]'s in the five cubic cases (Table 2.2.13.2[link]), where the coefficients [c_{Lj}] depend on the normalization of the spherical harmonics and can be found in Kurki-Suonio (1977[link]).

Table 2.2.13.1. Picking rules for the local coordinate axes and the corresponding [LM] combinations ([\ell mp]) of non-cubic groups taken from Kurki-Suonio (1977[link])

Symmetry | Coordinate axes | [\ell, m, p] of [y_{\ell mp}] | Crystal system
1 | Any | All [(\ell,m,\pm)] | Triclinic
[\overline{1}] | Any | [(2\ell,m,\pm)] | Triclinic
2 | [2\parallel z] | [(\ell,2m,\pm)] | Monoclinic
[m] | [m\perp z] | [(\ell,\ell-2m,\pm)] | Monoclinic
[2/m] | [2\parallel z, m\perp z] | [(2\ell,2m,\pm)] | Monoclinic
222 | [2\parallel z, 2\parallel y\,\, (2\parallel x)] | [(2\ell,2m,+), (2\ell+1,2m,-)] | Orthorhombic
[mm2] | [2\parallel z, m\perp y\,\, (2\perp x)] | [(\ell,2m,+)] | Orthorhombic
[mmm] | [2\perp z, m\perp y, 2\perp x] | [(2\ell,2m,+)] | Orthorhombic
4 | [4\parallel z] | [(\ell,4m,\pm)] | Tetragonal
[\overline{4}] | [-4\parallel z] | [(2\ell,4m,\pm), (2\ell+1,4m+2,\pm)] | Tetragonal
[4/m] | [4\parallel z, m\perp z] | [(2\ell,4m,\pm)] | Tetragonal
422 | [4\parallel z, 2\parallel y\,\, (2\parallel x)] | [(2\ell,4m,+), (2\ell+1,4m,-)] | Tetragonal
[4mm] | [4\parallel z, m\perp y\,\, (2\perp x)] | [(\ell,4m,+)] | Tetragonal
[\overline{4}2m] | [-4\parallel z, 2\parallel x \,\,(m=xy\rightarrow yx)] | [(2\ell,4m,+), (2\ell+1,4m+2,-)] | Tetragonal
[4/mmm] | [4\parallel z, m\perp z, m\perp x] | [(2\ell,4m,+)] | Tetragonal
3 | [3\parallel z] | [(\ell,3m,\pm)] | Rhombohedral
[\overline{3}] | [-3\parallel z] | [(2\ell,3m,\pm)] | Rhombohedral
32 | [3\parallel z, 2\parallel y] | [(2\ell,3m,+), (2\ell+1,3m,-)] | Rhombohedral
[3m] | [3\parallel z, m\perp y] | [(\ell,3m,+)] | Rhombohedral
[\overline{3}m] | [-3\parallel z, m\perp y] | [(2\ell,3m,+)] | Rhombohedral
6 | [6\parallel z] | [(\ell,6m,\pm)] | Hexagonal
[\overline{6}] | [-6\parallel z] | [(2\ell,6m,+), (2\ell+1,6m+3,\pm)] | Hexagonal
[6/m] | [6\parallel z, m\perp z] | [(2\ell,6m,\pm)] | Hexagonal
622 | [6\parallel z, 2\parallel y \,\,(2\parallel x)] | [(2\ell,6m,+), (2\ell+1,6m,-)] | Hexagonal
[6mm] | [6\parallel z, m\parallel y\,\, (m\perp x)] | [(\ell,6m,+)] | Hexagonal
[\overline{6}2m] | [-6\parallel z, m\perp y\,\, (2\parallel x)] | [(2\ell,6m,+), (2\ell+1,6m+3,+)] | Hexagonal
[6/mmm] | [6\parallel z, m\perp z, m\perp y\,\, (m\perp x)] | [(2\ell,6m,+)] | Hexagonal

Table 2.2.13.2. LM combinations of cubic groups as linear combinations of [y_{\ell mp}]'s (given in parentheses)

The linear-combination coefficients can be found in Kurki-Suonio (1977[link]).

Symmetry | LM combinations
23 | (0 0), (3 2−), (4 0, 4 4+), (6 0, 6 4+), (6 2+, 6 6+)
[m3] | (0 0), (4 0, 4 4+), (6 0, 6 4+), (6 2+, 6 6+)
432 | (0 0), (4 0, 4 4+), (6 0, 6 4+)
[\overline{4}3m] | (0 0), (3 2−), (4 0, 4 4+), (6 0, 6 4+)
[m3m] | (0 0), (4 0, 4 4+), (6 0, 6 4+)

According to Kurki-Suonio, the number of (non-vanishing) [LM] terms [e.g. in (2.2.13.3)[link]] is minimized by choosing for each atom a local Cartesian coordinate system adapted to its site symmetry. In this case, other [LM] terms would vanish, so using only these terms corresponds to the application of a projection operator, i.e. equivalent to averaging the quantity of interest [e.g. [\rho({\bf r})]] over the star of [{\bf k}]. Note that in another coordinate system (for the L values listed) additional M terms could appear. The group-theoretical derivation led to rules as to how the local coordinate system must be chosen. For example, the z axis is taken along the highest symmetry axis, or the x and y axes are chosen in or perpendicular to mirror planes. Since these coordinate systems are specific for each atom and may differ from the (global) crystal axes, we call them `local' coordinate systems, which can be related by a transformation matrix to the global coordinate system of the crystal.
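The relation between a local and the global coordinate system can be made concrete with a small sketch. It builds an orthonormal, right-handed local frame from a chosen local z axis and applies the resulting transformation matrix to a vector given in global Cartesian coordinates. The function names and the example axes (local z along the global [110] direction, local x along global z) are hypothetical and chosen only for illustration; the actual choice for a given site follows the picking rules of Table 2.2.13.1.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def local_frame(z_axis, x_hint):
    """Rows of the returned matrix are the local x, y, z axes expressed
    in global Cartesian coordinates (Gram-Schmidt on x_hint)."""
    z = normalize(z_axis)
    overlap = dot(x_hint, z)
    x = normalize([x_hint[i] - overlap * z[i] for i in range(3)])
    y = cross(z, x)  # completes a right-handed frame
    return [x, y, z]


def to_local(frame, v_global):
    """Components of a global vector along the local axes."""
    return [dot(axis, v_global) for axis in frame]


frame = local_frame(z_axis=[1.0, 1.0, 0.0], x_hint=[0.0, 0.0, 1.0])
v_local = to_local(frame, [1.0, 1.0, 0.0])
```

In this example a vector along the global [110] direction has only a z component in the local frame, which is the situation one wants when the highest symmetry axis of the site does not coincide with a crystal axis.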

The symmetry constraints according to (2.2.13.4)[link] are summarized by Kurki-Suonio, who has defined picking rules to choose the local coordinate system for any of the 27 non-cubic site symmetries (Table 2.2.13.1[link]) and has listed the [LM] combinations, which are defined by (a linear combination of) functions [y_{\ell mp}] [see (2.2.13.2)[link]]. If the [\pm] parity appears, both the [+] and the − combination must be taken. An application of a local coordinate system to rutile TiO2 is described in Section 2.2.16.2[link].

In the case of the five cubic site symmetries, which all have a threefold axis in (111), a well defined linear combination of [y_{\ell mp}] functions (given in Table 2.2.13.2[link]) leads to the cubic harmonics.

2.2.13.2. Interpretation for bonding

Chemical bonding is often described by considering orbitals (e.g. a [p_{z}] or a [d_{z^{2}}] atomic orbital) which are defined in polar coordinates, where the z axis is special, in contrast to Cartesian coordinates, where x, y and z are equivalent. Consider for example an atom coordinated by ligands (e.g. forming an octahedron). Then the application of group theory, ligand-field theory etc. requires a certain coordinate system provided one wishes to keep the standard notation of the corresponding spherical harmonics. If this octahedron is rotated or tilted with respect to the global (unit-cell) coordinate system, a local coordinate system is needed to allow an easy orbital interpretation of the interactions between the central atom and its ligands. This applies also to spectroscopy or electric field gradients.

The two types of reasons mentioned above may or may not lead to the same choice of a local coordinate system, as is illustrated for the example of rutile in Section 2.2.16.2.[link]

2.2.14. Characterization of Bloch states

The electronic structure of a solid is specified by energy bands [E^{j}({\bf k})] and the corresponding wavefunctions, the Bloch functions [\psi_{{\bf k}}^{j}({\bf r})]. In order to characterize energy bands there are various schemes with quite different emphasis. The most important concepts are described below and are illustrated using selected examples in the following sections.

2.2.14.1. Characterization by group theory

The energy bands are primarily characterized by the wavevector [{\bf k}] in the first BZ that is associated with the translational symmetry according to (2.2.4.23)[link]. The star of [{\bf k}] determines an irreducible basis provided that the functions of the star are symmetrized with respect to the small representations, as discussed in Section 2.2.6.[link] Along symmetry lines in the BZ (e.g. from [\Gamma] along [\Delta] towards X in the BZ shown in Fig. 2.2.7.1[link]), the corresponding group of the [{\bf k}] vector may show a group–subgroup relation, as for example for [\Gamma] and [\Delta]. The corresponding irreducible representations can then be found by deduction (or by induction in the case of a group–supergroup relation). These concepts define the compatibility relations (Bouckaert et al., 1930[link]; Bradley & Cracknell, 1972[link]), which tell us how to connect energy bands. For example, the twofold degenerate representation [\Gamma_{12}] (the [e_{g}] symmetry in a cubic system) splits into the [\Delta_{1}] and [\Delta_{2}] manifold in the [\Delta] direction, both of which are one-dimensional. In addition, one can also find an orbital representation and thus knows from the group-theoretical analysis which orbitals belong to a certain energy band. This is very useful for interpretations.

2.2.14.2. Energy regions

In chemistry and physics it is quite common to separate the electronic states of an atom into those from core and valence electrons, but sometimes this distinction is not well defined, as will be discussed in connection with the so-called semi-core states. For the sake of argument, let us discuss the situation in a solid using the concepts of the LAPW method, keeping in mind that very similar considerations hold for all other band-structure schemes.

A core state is characterized by a low-lying energy (i.e. with a large negative energy value with respect to the Fermi energy) and a corresponding wavefunction that is completely confined inside the sphere of the respective atom. Therefore there is effectively no overlap with the wavefunctions from neighbouring atoms and, consequently, the associated band width is practically zero.

The valence electrons occupy the highest states and have wavefunctions that strongly overlap with their counterparts at adjacent sites, leading to chemical bonding, large dispersion (i.e. a strong variation of the band energy with [{\bf k}]) and a significant band width.

The semi-core states are in between these two categories. For example, the 3s and 3p states of the 3d transition metals belong here. They are about 2–6 Ry (1 Ry = 13.6 eV) below the valence bands and have most of the wavefunctions inside their atomic spheres, but a small fraction (a few per cent) of the corresponding charge lies outside this sphere. This causes weak interactions with neighbouring atoms and a finite width of the corresponding energy bands.

Above the valence states are the unoccupied states, which often (e.g. in DFT or the HF method) require special attention.

2.2.14.3. Decomposition according to wavefunctions

For interpreting chemical bonding or the physical origin of a given Bloch state at [E^{j}({\bf k})], a decomposition according to its wavefunction is extremely useful but always model-dependent. The charge density [\psi_{{\bf k}}^{j}({\bf r})^{\ast}\psi_{{\bf k}}^{j}({\bf r})] corresponding to the Bloch state at [E^{j}({\bf k})] can be normalized to one per unit cell and is (in principle) an observable, while its decomposition depends on the model used. The following considerations are useful in this context:

  • (1) Site-centred orbitals. In many band-structure methods, the Bloch functions are expressed as a linear combination of atomic orbitals (LCAO). These orbitals are centred at the various nuclei that constitute the solid. The linear-combination coefficients determine how much of a given orbital contributes to the wavefunction (Mulliken population analysis).

  • (2) Spatially confined functions. In many schemes (LMTO, LAPW, KKR; see Section 2.2.11[link]), atomic spheres are used in which the wavefunctions are described in terms of atomic-like orbitals. See, for example, the representation (2.2.12.1)[link] in the LAPW method (Section 2.2.12[link]), where inside the atomic sphere the wavefunction is written as an [\ell]-like radial function times spherical harmonics (termed partial waves). The latter require a local coordinate system (Section 2.2.13[link]) which need not be the same as the global coordinate system of the unit cell. The reasons for choosing a special local coordinate system are twofold: one is a simplification due to the use of the point-group symmetry, and the other is the interpretation, as will be illustrated below for TiO2 in the rutile structure (see Section 2.2.16.2[link]).

  • (3) Orbital decomposition. In all cases in which [\ell]-like orbitals are used (they do not require a local coordinate system) to construct the crystalline wavefunction, an [\ell]-like decomposition can be done. This is true for both atom-centred orbitals and spatially confined partial waves. A corresponding decomposition can be done on the basis of partial electronic charges, as discussed below. A further decomposition into the m components can only be done in a local coordinate system with respect to which the spherical harmonics are defined.

  • (4) Bonding character. As in a diatomic molecule with an orbital on atom A and another on atom B, we can form bonding and antibonding states by adding or subtracting the corresponding orbitals. The bonding interaction causes a lowering in energy with respect to the atomic state and corresponds to a constructive interference of the orbitals. For the antibonding state, the interaction raises the energy and leads to a change in sign of the wavefunction, causing a nodal plane that is perpendicular to the line connecting the nuclei. If the symmetry does not allow an interaction between two orbitals, a nonbonding state occurs. Analogous concepts can also be applied to solids.

  • (5) Partial charges. The charge corresponding to a Bloch function of state [E^{j}({\bf k})] – averaged over the star of [{\bf k}] – can be normalized to 1 in the unit cell. A corresponding decomposition of the charge can be done into partial electronic charges. This is illustrated first within the LAPW scheme. Using the resolution of the identity this 1 (unit charge) of each state [E_{{\bf k}}^{j}] can be spatially decomposed into the contribution [q^{\rm out}(E_{{\bf k}}^{j})] from the region outside all atomic spheres (interstitial region II) and a sum over all atomic spheres (with superscript t) which contain the charges [q^{t}(E_{{\bf k}}^{j})] (confined within atomic sphere t). The latter can be further decomposed into the partial [\ell]-like charges [q_{\ell}^{t}(E_{{\bf k}}^{j})], leading to [1=q^{\rm out} (E_{{\bf k}}^{j})+\textstyle\sum_{t}\textstyle\sum_{\ell}q_{\ell}^{t}(E_{{\bf k}}^{j})]. In a site-centred basis a similar decomposition can be done, but without the term [q^{\rm out}(E_{{\bf k}}^{j})]. The interpretation, however, is different, as will be discussed for Cu (see Section 2.2.16[link]). If the site symmetry (point group) permits, another partitioning according to m can be made, e.g. into the [t_{2g}] and [e_{g}] manifold of the fivefold degenerate d orbitals in an octahedral ligand field. The latter scheme requires a local coordinate system in which the spherical harmonics are defined (see Section 2.2.13[link]). In general, the proper m combinations are given by the irreducible representations corresponding to the site symmetry.
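The bookkeeping behind the partial-charge decomposition in item (5) can be sketched as a small data structure. The site labels and all numbers below are invented for illustration and do not come from any calculation; the only point is that the interstitial charge and the [\ell]-resolved sphere charges of one state add up to the unit charge.

```python
# Hypothetical partial charges of a single Bloch state (illustrative numbers only).
q_out = 0.12  # charge in the interstitial region II

# q_l^t: for each atomic sphere t, the charge resolved by l (0 = s, 1 = p, 2 = d)
partial_charges = {
    "atom_A": {0: 0.02, 1: 0.05, 2: 0.55},
    "atom_B": {0: 0.03, 1: 0.23},
}


def sphere_charge(site):
    """q^t: total charge of the state inside atomic sphere t."""
    return sum(partial_charges[site].values())


def total_charge():
    """q_out plus the sum over all spheres t and all l of q_l^t."""
    return q_out + sum(sphere_charge(site) for site in partial_charges)


q_total = total_charge()
q_A = sphere_charge("atom_A")
```

In a site-centred (LCAO-type) basis the same structure would be used without the `q_out` entry, which is one reason why partial charges from the two kinds of scheme cannot be compared directly.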

2.2.14.4. Localized versus itinerant electrons

Simple metals with valence electrons originating from s- and p-type orbitals form wide bands which are approximately free-electron like (with a large band width W). Such a case corresponds to itinerant electrons that are delocalized and thus cause metallic conductivity.

The other extreme case is a system with 4f (and some 5f) electrons, such as the lanthanides. Although the orbital energies of these electrons are in the energy range of the valence electrons, they act more like core electrons and thus are tightly bound to the corresponding atomic site. Such electrons are termed localized, since they do not hop to neighbouring sites (controlled by a hopping parameter t) and thus do not contribute to metallic conductivity. Adding another of these electrons to a given site would increase the Coulomb repulsion U. A large U (i.e. [U\gg t]) prevents them from hopping.

There are – as usual – borderline cases (e.g. the late 3d transition-metal oxides) in which a delicate balance between t and U, i.e. between the energy gain by delocalizing electrons and the Coulomb repulsion, determines whether a system is metallic or insulating. This problem of metal/insulator transitions is an active field of research in solid-state physics that is not discussed here.

In one example, however, the dual role of f electrons is illustrated for the uranium atom using relativistic wavefunctions (with a large and a small component) characterized by the quantum numbers n, [\ell] and j. Fig. 2.2.14.1[link] shows the outermost lobe (the large component) of the electrons beyond the [Xe] core without the 4f and 5d core-like states. One can see the [6s_{1/2}], [6p_{1/2}] and [6p_{3/2}] (semi-core) electrons, and the [6d_{3/2}] and [7s_{1/2}] (valence) electrons.

Figure 2.2.14.1. Relativistic radial wavefunctions (large component) of the uranium atom. Shown are the outer lobes of valence and semi-core states excluding the [Xe] core, and the 4f and 5d core states.

On the one hand, the radial wavefunction of the [5f_{5/2}] orbital has its peak closer to the nucleus than the main lobes of the semi-core states [6s_{1/2}], [6p_{1/2}] and [6p_{3/2}], and thus demonstrates the core nature of these 5f electrons. On the other hand, the [5f_{5/2}] orbital decays (with distance) much less than the semi-core states and electrons in this orbital can thus also play the role of valence electrons, like electrons in the [6d_{3/2}] and [7s_{1/2}] orbitals. This dual role of the f electrons has been discussed, for example, by Schwarz & Herzig (1979[link]).

2.2.14.5. Spin polarization

In a non-fully-relativistic treatment, spin remains a good quantum number. Associated with the spin is a spin magnetic moment. If atoms have net magnetic moments they can couple in various orders in a solid. The simplest cases are the collinear spin alignments as found in ferromagnetic (FM) or antiferromagnetic (AF) systems with parallel (FM) and antiparallel (AF) moments on neighbouring sites. Ferrimagnets have opposite spin alignments but differ in the magnitude of their moments on neighbouring sites, leading to a finite net magnetization. These cases are characterized by the electronic structure of spin-up and spin-down electrons. More complicated spin structures (e.g. canted spins, spin spirals, spin glasses) often require a special treatment beyond simple spin-polarized calculations. In favourable cases, however, as in spin spirals, it is possible to formulate a generalized Bloch theorem and treat such systems by band theory (Sandratskii, 1990[link]).

In a fully relativistic formalism, an additional orbital moment may occur. Note that the orientation of the total magnetic moment (spin and orbital moment) with respect to the crystal axis is only defined in a relativistic treatment including spin–orbit interactions. In a spin-polarized calculation without spin–orbit coupling this is not the case and only the relative orientation (majority-spin and minority-spin) is known. The magnetic structures may lead to a lowering of symmetry, a topic beyond this book.

2.2.14.6. The density of states (DOS)

The density of states (DOS) is the number of one-electron states (in the HF method or DFT) per unit energy interval and per unit cell volume. It is better to start with the integral quantity [I(\varepsilon)], the number of states below a certain energy [\varepsilon],[I(\varepsilon)={{2}\over{V_{\rm BZ}}}\sum_{j}\int_{\rm BZ}\vartheta(\varepsilon -\varepsilon_{{\bf k}}^{j})\,\,{\rm d}{\bf k},\eqno(2.2.14.1)]where [V_{\rm BZ}] is the volume of the BZ, the factor 2 accounts for the occupation with spin-up and spin-down electrons (in a non-spin-polarized case), and [\vartheta(\varepsilon-\varepsilon_{{\bf k}}^{j})] is the step function, the value of which is 1 if [\varepsilon_{{\bf k}}^{j}] is less than [\varepsilon] and 0 otherwise. The sum over [{\bf k}] points has been replaced by an integral over the BZ, since the [{\bf k}] points are uniformly distributed. Both expressions, sum and integral, are used in different derivations or applications. The Fermi energy is defined by imposing that [I(E_F)=N], the number of (valence) electrons per unit cell.

The total DOS is defined as the energy derivative of [I(\varepsilon)] as [n(\varepsilon)={{{\rm d}I(\varepsilon)}\over{{\rm d}\varepsilon}}, \eqno(2.2.14.2)]with the normalization [N=\textstyle\int\limits_{-\infty}^{E_{F}}n(\varepsilon)\,\,{\rm d}\varepsilon,\eqno(2.2.14.3)]where the integral is taken from [-\infty] if all core states are included or from the bottom of the valence bands, often taken to be at zero. This defines the Fermi energy (note that the energy range must be consistent with N). In a bulk material, the origin of the energy scale is arbitrary and thus only relative energies are important. In a realistic case with a surface (i.e. a vacuum) one can take the potential at infinity as the energy zero, but this situation is not discussed here.
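Definitions (2.2.14.1) to (2.2.14.3) can be tried out on a toy model. The sketch below assumes a single one-dimensional tight-binding band, [\varepsilon_{k}=-2t\cos k], which is not discussed in this chapter and serves only as an example; the BZ integral is replaced by a sum over a uniform k mesh, the Fermi energy is found by bisection on [I(E_F)=N], and the DOS is obtained as a finite-difference derivative of [I(\varepsilon)]. All function names and parameters are hypothetical.

```python
import math


def band_energy(k, t=1.0):
    """Toy band: one-dimensional tight-binding dispersion (an assumption)."""
    return -2.0 * t * math.cos(k)


def integrated_states(eps, energies):
    """I(eps): number of states below eps, with the factor 2 for spin."""
    below = sum(1 for e in energies if e < eps)
    return 2.0 * below / len(energies)


def fermi_energy(n_electrons, energies, lo, hi, tol=1e-6):
    """Bisection for the energy at which I(E_F) reaches n_electrons."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integrated_states(mid, energies) < n_electrons:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def density_of_states(eps, energies, d_eps=0.05):
    """n(eps) = dI/d(eps), approximated by a central finite difference."""
    upper = integrated_states(eps + d_eps, energies)
    lower = integrated_states(eps - d_eps, energies)
    return (upper - lower) / (2.0 * d_eps)


# Uniform k mesh over the one-dimensional BZ, -pi <= k < pi.
n_k = 20000
k_mesh = [-math.pi + (i + 0.5) * 2.0 * math.pi / n_k for i in range(n_k)]
energies = [band_energy(k) for k in k_mesh]

e_fermi = fermi_energy(1.0, energies, lo=-2.5, hi=2.5)  # one electron per cell
dos_at_fermi = density_of_states(e_fermi, energies)
```

For this half-filled band the Fermi energy lies at the band centre, and the numerical DOS there is close to the analytic value [1/(\pi t)] (both spins included); the finite-difference width `d_eps` plays the role of a broadening and has to be chosen larger than the energy spacing of the mesh.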

The total DOS [n(\varepsilon)] can be decomposed into a partial (or projected) DOS by using information from the wavefunctions as described above in Section 2.2.14.3.[link] If the charge corresponding to the wavefunction of an energy state is partitioned into contributions from the atoms, a site-projected DOS can be defined as [n^{t}(\varepsilon)], where the superscript t labels the atom t. These quantities can be further decomposed into [\ell]-like contributions within each atom to give [n_{\ell} ^{t}(\varepsilon)]. As discussed above for the partial charges, a further partitioning of the [\ell]-like terms according to the site symmetry (point group) can be done (in certain cases) by taking the proper m combinations, e.g. the [t_{2g}] and [e_{g}] manifold of the fivefold degenerate d orbitals in an octahedral ligand field. The latter scheme requires a local coordinate system in which the spherical harmonics are defined (see Section 2.2.13[link]). In this context all considerations as discussed above for the partial charges apply again. Note in particular the difference between site-centred and spatially decomposed wavefunctions, which affects the partition of the DOS into its wavefunction-dependent contributions. For example, in atomic sphere representations as in LAPW we have the decomposition [n(\varepsilon)=n^{\rm out}(\varepsilon)+\textstyle\sum\limits_{t}\textstyle\sum\limits_{\ell}n_{\ell}^{t} (\varepsilon).\eqno(2.2.14.4)]In the case of spin-polarized calculations, one can also define a spin-projected DOS for spin-up and spin-down electrons.

2.2.15. Electric field gradient tensor

2.2.15.1. Introduction

The study of hyperfine interactions is a powerful way to characterize different atomic sites in a given sample. There are many experimental techniques, such as Mössbauer spectroscopy, nuclear magnetic and nuclear quadrupole resonance (NMR and NQR), perturbed angular correlations (PAC) measurements etc., which access hyperfine parameters in fundamentally different ways. Hyperfine parameters describe the interaction of a nucleus with the electric and magnetic fields created by the chemical environment of the corresponding atom. Hence the resulting level splitting of the nucleus is determined by the product of a nuclear and an extra-nuclear quantity. In the case of quadrupole interactions, the nuclear quantity is the nuclear quadrupole moment (Q) that interacts with the electric field gradient (EFG) produced by the charges outside the nucleus. For a review see, for example, Kaufmann & Vianden (1979[link]).

The EFG tensor is defined by the second derivative of the electrostatic potential V with respect to the Cartesian coordinates [x_{i}], i = 1, 2, 3, taken at the nuclear site n, [\Phi_{ij}={{\partial^{2}V}\over{\partial x_{i}\,\,\partial x_{j}}}\bigg|_{n}-{{1}\over{3}}\delta_{ij}\nabla^{2}V\bigg|_{n},\eqno(2.2.15.1)] where the second term is included to make it a traceless tensor. This is more appropriate, since there is no interaction of a nuclear quadrupole and a potential caused by s electrons. From a theoretical point of view it is more convenient to use the spherical tensor notation because electrostatic potentials (the negative of the potential energy of the electron) and the charge densities are usually given as expansions in terms of spherical harmonics. In this way one automatically deals with traceless tensors (for further details see Herzig, 1985).
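
The subtraction of the trace in (2.2.15.1) can be illustrated with a minimal sketch. The function name and the numerical second derivatives below are hypothetical and serve only to show that the resulting tensor is traceless while its off-diagonal elements are unchanged.

```python
def traceless_efg(hessian):
    """Subtract one third of the Laplacian from the diagonal, as in (2.2.15.1)."""
    laplacian = hessian[0][0] + hessian[1][1] + hessian[2][2]
    return [
        [hessian[i][j] - (laplacian / 3.0 if i == j else 0.0) for j in range(3)]
        for i in range(3)
    ]

# Hypothetical second derivatives of V at a nuclear site (arbitrary units)
example_hessian = [
    [4.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
phi = traceless_efg(example_hessian)
trace = phi[0][0] + phi[1][1] + phi[2][2]
```

Here the Laplacian is 6, so the diagonal becomes (2, -1, -1) and the trace vanishes.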

The analysis of experimental results faces two obstacles: (i) The nuclear quadrupole moments (Pyykkö, 1992[link]) are often known only with a large uncertainty, as this is still an active research field of nuclear physics. (ii) EFGs depend very sensitively on the anisotropy of the charge density close to the nucleus, and thus pose a severe challenge to electronic structure methods, since an accuracy of the density in the per cent range is required.

In the absence of a better tool, a simple point-charge model was used in combination with so-called Sternheimer (anti-) shielding factors in order to interpret the experimental results. However, these early model calculations depended on empirical parameters, were not very reliable and often showed large deviations from experimental values.

In their pioneering work, Blaha et al. (1985[link]) showed that the LAPW method was able to calculate EFGs in solids accurately without empirical parameters. Since then, this method has been applied to a large variety of systems (Schwarz & Blaha, 1992[link]) from insulators (Blaha et al., 1985[link]), metals (Blaha et al., 1988[link]) and superconductors (Schwarz et al., 1990[link]) to minerals (Winkler et al., 1996[link]).

Several other electronic structure methods have been applied to the calculation of EFGs in solids, for example the LMTO method for periodic (Methfessel & Frota-Pessoa, 1990) or non-periodic (Petrilli & Frota-Pessoa, 1990) systems, the KKR method (Akai et al., 1990), the DVM (discrete variational method; Ellis et al., 1983), the PAW method (Petrilli et al., 1998) and others (Meyer et al., 1995). These methods achieve different degrees of accuracy and are more or less suitable for different classes of systems.

As pointed out above, measured EFGs have an intrinsic uncertainty related to the accuracy with which the nuclear quadrupole moment is known. On the other hand, the quadrupole moment can be obtained by comparing experimental hyperfine splittings with very accurate electronic structure calculations. This has recently been done by Dufek et al. (1995a[link]) to determine the quadrupole moment of 57Fe. Hence the calculation of accurate EFGs is to date an active and challenging research field.

2.2.15.2. EFG conversion formulas

The nuclear quadrupole interaction (NQI) represents the interaction of Q (the nuclear quadrupole moment) with the electric field gradient (EFG) created by the charges surrounding the nucleus, as described above. Here we briefly summarize the main ideas (following Petrilli et al., 1998[link]) and provide conversions between experimental NQI splittings and electric field gradients.

Let us consider a nucleus in a state with nuclear spin quantum number [I>1/2] with the corresponding nuclear quadrupole moment [Q_{i,j}=({1}/{{e}})\textstyle\int {\rm d}^{3}r\rho_{n}(r)r_{i}r_{j}], where [\rho_{n}(r)] is the nuclear charge density around point [{\bf r}] and e is the proton's charge. The interaction of this Q with an electric field gradient tensor [V_{i,j}], [H=e\textstyle\sum\limits_{i,j}Q_{i,j}V_{i,j},\eqno(2.2.15.2)] splits the energy levels [E_{Q}] for different magnetic spin quantum numbers [m_{I}=I,I-1,\ldots,-I] of the nucleus according to [E_{Q}={{eQV_{zz}[3m_{I}^{2}-I(I+1)](1+\eta^{2}/3)^{1/2}}\over{4I(2I-1)}} \eqno(2.2.15.3)] in first order of [V_{i,j}], where Q represents the largest component of the nuclear quadrupole moment tensor in the state characterized by [m_{I}=I]. (Note that the quantum-mechanical expectation value of the charge distribution in an angular momentum eigenstate is cylindrical, so that the expectation values of the remaining two components have half the value and opposite sign.) The conventional choice is [|V_{zz}|>|V_{yy}|\geq|V_{xx}|]. Hence, [V_{zz}] is the principal component (the eigenvalue of largest magnitude) of the electric field gradient tensor and the asymmetry parameter [\eta] is defined by the remaining two eigenvalues [V_{xx},V_{yy}] through [\eta={{|(V_{xx}-V_{yy})|}\over{|V_{zz}|}}.\eqno(2.2.15.4)] Equation (2.2.15.3) depends on [m_{I}] only through [m_{I}^{2}], so the electric quadrupole interaction splits the ([2I+1])-fold degenerate energy levels of a nuclear state with spin quantum number I ([I>1/2]) into [I+{1\over 2}] doubly degenerate substates for half-integer I, or into I doubly degenerate substates and one singly degenerate state for integer I. Experiments determine the energy difference [\Delta] between the levels, which is called the quadrupole splitting. The remaining degeneracy can be lifted further using magnetic fields.
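
As a minimal sketch of how (2.2.15.3) is evaluated, the following function lists the energies for all [m_{I}] values of a given spin. The function name is hypothetical, and the product [eQV_{zz}] is passed as a single number in an arbitrary energy unit, which is an assumption made here for illustration.

```python
import math

def quadrupole_levels(spin, e_q_vzz, eta=0.0):
    """Energies E_Q(m_I) from (2.2.15.3); e_q_vzz is the product e*Q*V_zz."""
    two_i = round(2 * spin)
    if two_i < 2:
        raise ValueError("a quadrupole splitting requires I > 1/2")
    prefactor = (e_q_vzz * math.sqrt(1.0 + eta ** 2 / 3.0)
                 / (4.0 * spin * (2.0 * spin - 1.0)))
    levels = {}
    for k in range(two_i + 1):
        m = spin - k  # m_I = I, I-1, ..., -I
        levels[m] = prefactor * (3.0 * m * m - spin * (spin + 1.0))
    return levels

# I = 3/2 with e*Q*V_zz set to 1 (arbitrary unit) and eta = 0
levels = quadrupole_levels(1.5, e_q_vzz=1.0)
splitting = levels[1.5] - levels[0.5]
```

For [I=3/2] the four levels collapse into two doublets, and their separation equals half of [eQV_{zz}] when [\eta=0].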

Next we illustrate these definitions for 57Fe, which is the most common probe nucleus in Mössbauer spectroscopy measurements and thus deserves special attention. For this probe, the nuclear transition occurs between the [I=3/2] excited state and the [I=1/2] ground state, with emission of 14.4 keV [\gamma] radiation. The quadrupole splitting between the [m_{I} =\pm(1/2)] and the [m_{I}=\pm(3/2)] states, which can be measured by exploiting the Doppler shift of the [\gamma] radiation of the vibrating sample, is [\Delta={{V_{zz}eQ(1+\eta^{2}/3)^{1/2}}\over{2}}.\eqno(2.2.15.5)] For systems in which the 57Fe nucleus has a crystalline environment with axial symmetry (a threefold or fourfold rotation axis), the asymmetry parameter [\eta] is zero and [\Delta] is given directly by [\Delta={{V_{zz}eQ}\over{2}}.\eqno(2.2.15.6)] As [\eta] can never be greater than unity, the difference between the values of [\Delta] given by equation (2.2.15.5) and equation (2.2.15.6) cannot be more than about 15%. In the remainder of this section we simplify the expressions, as is often done, by assuming that [\eta=0]. As Mössbauer experiments exploit the Doppler shift of the [\gamma] radiation, the splitting is expressed in terms of the velocity between sample and detector. The quadrupole splitting can be obtained from the velocity, which we denote here by [\Delta_{v}], by [\Delta={{{{E_{\gamma}}}\over{{c}}}}\Delta_{v},\eqno(2.2.15.7)] where [c=2.99792458\times 10^{8}\,\,{\rm m}\,\,{\rm s}^{-1}] is the speed of light and [E_{\gamma}=14.41\times 10^{3}\,\,{\rm eV}] is the energy of the emitted [\gamma] radiation of the 57Fe nucleus.

Finally, we still need to know the nuclear quadrupole moment Q of the Fe nucleus itself. Despite its utmost importance, its value has been heavily debated. Recently, however, Dufek et al. (1995b) have determined the value Q = 0.16 b for 57Fe ([1\,\,{\rm b}=10^{-28}\,\,{\rm m}^{2}]) by comparing, for fifteen different compounds, theoretical [V_{zz}] values obtained using the linearized augmented plane wave (LAPW) method with the measured quadrupole splitting at the Fe site.

Now we relate the electric field gradient [V_{zz}] to the Doppler velocity via [\Delta_{v}={{{{eQc}}\over{{2E_{\gamma}}}}}V_{zz}.\eqno(2.2.15.8)] In the special case of the 57Fe nucleus, we obtain [\eqalignno{V_{zz}\,\,[10^{21}\,\,{\rm V}\,\,{\rm m}^{-2}]&=10^{4}{{2E_{\gamma}\,\,[{\rm eV}]}\over{c\,\, [{\rm m}\,\,{\rm s}^{-1}]Q\,\, [{\rm b}]}}\Delta _{v}\,\,[{\rm mm}\,\,{\rm s}^{-1}]&\cr&\approx 6\Delta_{v}\,\, [{\rm mm}\,\,{\rm s}^{-1}].&(2.2.15.9)}] EFGs can also be obtained by techniques like NMR or NQR, where a convenient measure of the strength of the quadrupole interaction is expressed as a frequency [\nu_{q}], related to [V_{zz}] by [\nu_{q}={{3eQV_{zz}}\over{2hI(2I-1)}}.\eqno(2.2.15.10)] The value [V_{zz}] can then be calculated from the frequency in MHz by [V_{zz}\,\,[10^{21}\,\,{\rm V}\,\,{\rm m}^{-2}]=0.02757{{I(2I-1)}\over{Q\,\,[{\rm b}]}}\nu_{q}\,\,[{\rm MHz}],\eqno(2.2.15.11)] where the prefactor is [(2/3)(h/e)] expressed in these units, with [(h/e)=4.1356692\times 10^{-15}\,\,{\rm V}\,\,{\rm Hz}^{-1}]. The principal component [V_{zz}] is also often denoted as [eq=V_{zz}].
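
The two unit conversions can be sketched as follows. The function names are hypothetical; the constants are those quoted in the text, and the NQR prefactor is evaluated directly from (2.2.15.10) rather than hard-coded, which is a choice made for this illustration. Both functions assume [\eta=0].

```python
E_GAMMA_EV = 14.41e3      # gamma energy of 57Fe in eV
C_M_PER_S = 2.99792458e8  # speed of light in m/s
Q_FE57_BARN = 0.16        # quadrupole moment of 57Fe in barn
H_OVER_E = 4.1356692e-15  # h/e in V/Hz

def vzz_from_doppler(delta_v_mm_s, q_barn=Q_FE57_BARN, e_gamma_ev=E_GAMMA_EV):
    """V_zz in 10^21 V m^-2 from a Moessbauer splitting in mm/s, eq. (2.2.15.9)."""
    return 1.0e4 * 2.0 * e_gamma_ev / (C_M_PER_S * q_barn) * delta_v_mm_s

def vzz_from_nqr(nu_q_mhz, spin, q_barn):
    """V_zz in 10^21 V m^-2 from nu_q in MHz by inverting eq. (2.2.15.10)."""
    # Unit factors: 1 MHz = 1e6 Hz, 1 b = 1e-28 m^2, result in units of 1e21 V m^-2
    prefactor = (2.0 / 3.0) * H_OVER_E * 1.0e6 / 1.0e-28 / 1.0e21
    return prefactor * spin * (2.0 * spin - 1.0) / q_barn * nu_q_mhz

vzz_one_mm_s = vzz_from_doppler(1.0)
```

For a splitting of 1 mm s−1 at the 57Fe nucleus the first function returns about 6.0, in line with the approximate factor in (2.2.15.9).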

In the literature, two conflicting definitions of [\nu_{q}] are in use. One is given by (2.2.15.10)[link], and the other, defined as [\nu_{q}\,\,[{\rm Hz}]={{e^{2}qQ}\over{2h}},\eqno(2.2.15.12)]differs from the first by a factor of 2 and assumes the value [I=3/2]. Finally, the definition of [q=V_{zz}/e] has been introduced here. In order to avoid confusion, we will refer here only to the definition given by (2.2.15.10)[link]. Furthermore, we also adopt the same sign convention for [V_{zz}] as Schwarz et al. (1990[link]) because it has been found to be consistent with the majority of experimental results.

2.2.15.3. Theoretical approach

Since the EFG is a ground-state property that is uniquely determined by the charge density distribution (of electrons and nuclei), it can be calculated within DFT without further approximations. Here we describe the basic formalism to calculate EFGs with the LAPW method (see Section 2.2.12[link]). In the LAPW method, the unit cell is divided into non-overlapping atomic spheres and an interstitial region. Inside each sphere the charge density (and analogously the potential) is written as radial functions [\rho_{LM}(r)] times crystal harmonics (2.2.13.4)[link] and in the interstitial region as Fourier series: [\rho(r)=\left\{ \matrix{\textstyle \sum\limits_{LM}\rho_{LM}(r)K_{LM}(\hat{r})\hfill & \hbox{inside sphere}\hfill\cr \textstyle\sum\limits_{K}\rho_{K}\exp({iKr})\hfill & \hbox{outside sphere}\hfill}\right. \eqno(2.2.15.13)]The charge density coefficients [\rho_{LM}(r)] can be obtained from the wavefunctions (KS orbitals) by (in shorthand notation) [\rho_{LM}(r)=\textstyle\sum\limits_{E_{k}^{j} \,\lt\, E_{F}}\textstyle\sum\limits_{\ell m}\textstyle\sum\limits_{\ell^{\prime}m^{\prime} }R_{\ell m}(r)R_{\ell^{\prime}m^{\prime}}(r)G_{L\ell\ell^{\prime}}^{Mmm^{\prime}},\eqno(2.2.15.14)]where [G_{L\ell\ell^{\prime}}^{Mmm^{\prime}}]are Gaunt numbers (integrals over three spherical harmonics) and [R_{\ell m}(r)] denote the LAPW radial functions [see (2.2.12.1)[link]] of the occupied states [E_{k}^{j}] below the Fermi energy [E_{F}]. The dependence on the energy bands in [R_{\ell m}(r)] has been omitted in order to simplify the notation.

For a given charge density, the Coulomb potential is obtained numerically by solving Poisson's equation in the form of a boundary-value problem using a method proposed by Weinert (1981). This yields the Coulomb potential coefficients [v_{LM}(r)] in analogy to (2.2.15.13) [see also (2.2.12.5)]. The most important contribution to the EFG comes from a region close to the nucleus of interest, where only the [L=2] terms are needed (Herzig, 1985). In the limit [r\rightarrow0] (the position of the nucleus), the asymptotic form of the potential [r^{L}v_{LM}K_{LM}] can be used and this procedure yields (Schwarz et al., 1990) for [L=2]: [\eqalignno{V_{2M} & =-C_{2M}\int_{0}^{R}{{\rho_{2M}(r)}\over{r^{3}}}r^{2}\,\,{\rm d}r+C_{2M}\int _{0}^{R}{{\rho_{2M}(r)}\over{r}}\left({{r}\over{R}}\right)^{5}\,\,{\rm d}r&\cr&\quad +5{{C_{2M}}\over{R^{2}}}\sum_{K}V(K)j_{2}(KR)K_{2M}(K),&(2.2.15.15)}] with [C_{20}=2\sqrt{{4\pi}/{5}}], [C_{22}=\sqrt{{3}/{4}}C_{20}] and the spherical Bessel function [j_{2}]. The first term in (2.2.15.15) (called the valence EFG) corresponds to the integral over the respective atomic sphere (with radius R). The second and third terms in (2.2.15.15) (called the lattice EFG) arise from the boundary-value problem and from the charge distribution outside the sphere considered. Note that our definition of the lattice EFG differs from that based on the point-charge model (Kaufmann & Vianden, 1979). With these definitions the tensor components are given as [\eqalignno{V_{xx} & =C\left[V_{22+}-\left({1}/{\sqrt{3}}\right)V_{20}\right]&\cr V_{yy} & =C \left[-V_{22+}-\left({1}/{\sqrt{3}}\right)V_{20}\right]&\cr V_{zz} & =C \left({2}/{\sqrt{3}}\right)V_{20}&\cr V_{xy} & =C V_{22-}&\cr V_{xz} & =C V_{21+}&\cr V_{yz} & =C V_{21-}&(2.2.15.16)}] where [C=\sqrt{{{15}/{4\pi}}}] and the index M combines m and the parity p (e.g. [2+]). Note that the prefactors depend on the normalization used for the spherical harmonics.
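
The assembly of the Cartesian tensor from the [L=2] coefficients in (2.2.15.16) can be sketched as below. The function and argument names are hypothetical, as are the numerical coefficients; the sketch only shows that the resulting tensor is symmetric and traceless for any input.

```python
import math

C_NORM = math.sqrt(15.0 / (4.0 * math.pi))  # the prefactor C of (2.2.15.16)

def efg_cartesian(v20, v22p=0.0, v22m=0.0, v21p=0.0, v21m=0.0):
    """Symmetric Cartesian EFG tensor from V_20, V_22+, V_22-, V_21+, V_21-."""
    vxx = C_NORM * (v22p - v20 / math.sqrt(3.0))
    vyy = C_NORM * (-v22p - v20 / math.sqrt(3.0))
    vzz = C_NORM * (2.0 / math.sqrt(3.0)) * v20
    vxy = C_NORM * v22m
    vxz = C_NORM * v21p
    vyz = C_NORM * v21m
    return [[vxx, vxy, vxz], [vxy, vyy, vyz], [vxz, vyz, vzz]]

# Hypothetical coefficients, chosen only for illustration
tensor = efg_cartesian(v20=1.0, v22p=0.3)
trace = tensor[0][0] + tensor[1][1] + tensor[2][2]
```

The [V_{20}] contributions to the three diagonal elements cancel in the sum, and the [V_{22+}] terms cancel between [V_{xx}] and [V_{yy}], so the trace is zero by construction.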

The non-spherical components of the potential [v_{LM}] come from the non-spherical charge density [\rho_{LM}]. For the EFG only the [L=2] terms (in the potential) are needed. If the site symmetry does not contain such a non-vanishing term (as for example in a cubic system with [L=4] in the lowest [LM] combination), the corresponding EFG vanishes by definition. According to the Gaunt numbers in (2.2.15.14)[link] only a few non-vanishing terms remain (ignoring f orbitals), such as the pp, dd or sd combinations (for f orbitals, pf and ff would appear), where this shorthand notation denotes the products of the two radial functions [R_{\ell m}(r)R_{\ell^{\prime }m^{\prime}}(r)]. The sd term is often small and thus is not relevant to the interpretation. This decomposition of the density can be used to partition the EFG (illustrated for the [V_{zz}] component), [V_{zz}\approx V_{zz}^{p}+V_{zz}^{d}+\hbox{small contributions},\eqno(2.2.15.17)]where the superscripts p and d are a shorthand notation for the product of two p- or d-like functions.

From our experience we find that the first term in (2.2.15.15)[link] is usually by far the most important and often a radial range up to the first node in the corresponding radial function is all that contributes. In this case the contribution from the other two terms is rather small (a few per cent). For first-row elements, however, which have no node in their 2p functions, this is no longer true and thus the first term amounts only to about 50–70%.

In some cases interpretation is simplified by defining a so-called asymmetry count, illustrated below for the oxygen sites in YBa2Cu3O7 (Schwarz et al., 1990[link]), the unit cell of which is shown in Fig. 2.2.15.1[link].

Figure 2.2.15.1. Unit cell of the high-temperature superconductor YBa2Cu3O7 with four non-equivalent oxygen sites.

In this case essentially only the O 2p orbitals contribute to the O EFG. Inside the oxygen spheres (all taken with a radius of 0.82 Å) we can determine the partial charges [q_{i}] corresponding to the [p_{x}], [p_{y}] and [p_{z}] orbitals, denoted in short as [p_{x}], [p_{y}] and [p_{z}] charges.

With these definitions we can define the p-like asymmetry count as [\Delta n_{p}={\textstyle{1\over 2}}(p_{x}+p_{y})-p_{z}\eqno(2.2.15.18)]and obtain the proportionality [V_{zz}\propto\left\langle {{1}/{r^{3}}}\right\rangle _{p}\Delta n_{p}, \eqno(2.2.15.19)]where [\left\langle {{1}/{r^{3}}}\right\rangle _{p}] is the expectation value taken with the p orbitals. A similar equation can be defined for the d orbitals. The factor [1/r^{3}] enhances the EFG contribution from the density anisotropies close to the nucleus. Since the radial wavefunctions have an asymptotic behaviour near the origin as [r^{\ell}], the p orbitals are more sensitive than the d orbitals. Therefore even a very small p anisotropy can cause an EFG contribution, provided that the asymmetry count is enhanced by a large expectation value.

Often the anisotropy in the [p_{x}], [p_{y}] and [p_{z}] occupation numbers can be traced back to the electronic structure. Such a physical interpretation is illustrated below for the four non-equivalent oxygen sites in YBa2Cu3O7 (Table 2.2.15.1). Let us focus first on O1, the oxygen atom that forms the linear chain with the Cu1 atoms along the b axis. In this case, the [p_{y}] orbital of O1 points towards Cu1 and forms a covalent bond, leading to bonding and antibonding states, whereas the other two p orbitals have no bonding partner and thus are essentially nonbonding. Part of the corresponding antibonding states lies above the Fermi energy and thus is not occupied, leading to a smaller [p_{y}] charge of 0.91 e, in contrast to the fully occupied nonbonding states with occupation numbers around 1.2 e. (Note that only a fraction of the charge stemming from the oxygen 2p orbitals is found inside the relatively small oxygen sphere.) This anisotropy causes a finite asymmetry count [(2.2.15.18)] that leads, according to (2.2.15.19), to a corresponding EFG.

Table 2.2.15.1. Partial O 2p charges (in electrons) and electric field gradient tensor O EFG (in [10^{21}\,\,{\rm V}\,\,{\rm m}^{-2}]) for YBa2Cu3O7

Numbers marked with an asterisk represent the main deviation from spherical symmetry in the [2p] charges and the related principal component of the EFG tensor.

Atom   [p_{x}]   [p_{y}]   [p_{z}]   [V_{aa}]   [V_{bb}]   [V_{cc}]
O1     1.18      0.91*     1.25      −6.1       18.3*      −12.2
O2     1.01*     1.21      1.18      11.8*      −7.0       −4.8
O3     1.21      1.00*     1.18      −7.0       11.9*      −4.9
O4     1.18      1.19      0.99*     −4.7       −7.0       11.7*

In this simple case, the anisotropy in the charge distribution, given here by the different p occupation numbers, is directly proportional to the EFG, which is given with respect to the crystal axes and is thus labelled [V_{aa}], [V_{bb}] and [V_{cc}] (Table 2.2.15.1[link]). The principal component of the EFG is in the direction where the p occupation number is smallest, i.e. where the density has its highest anisotropy. The other oxygen atoms behave very similarly: O2, O3 and O4 have a near neighbour in the a, b and c direction, respectively, but not in the other two directions. Consequently, the occupation number is lower in the direction in which the bond is formed, whereas it is normal (around 1.2 e) in the other two directions. The principal axis falls in the direction of the low occupation. The higher the anisotropy, the larger the EFG (compare O1 with the other three oxygen sites). Excellent agreement with experiment is found (Schwarz et al., 1990[link]). In a more complicated situation, where p and d contributions to the EFG occur [see (2.2.15.17)[link]], which often have opposite sign, the interpretation can be more difficult [see e.g. the copper sites in YBa2Cu3O7; Schwarz et al. (1990[link])].
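
The proportionality (2.2.15.19) can be checked against the numbers of Table 2.2.15.1 with a short sketch. Equation (2.2.15.18) is written with z as the special direction; applying it to another axis by relabelling the orbitals is an assumption made here, and the function names are hypothetical.

```python
# Partial O 2p charges (p_x, p_y, p_z) and EFG components (V_aa, V_bb, V_cc)
# copied from Table 2.2.15.1
P_CHARGES = {
    "O1": (1.18, 0.91, 1.25),
    "O2": (1.01, 1.21, 1.18),
    "O3": (1.21, 1.00, 1.18),
    "O4": (1.18, 1.19, 0.99),
}
EFG_DIAGONAL = {
    "O1": (-6.1, 18.3, -12.2),
    "O2": (11.8, -7.0, -4.8),
    "O3": (-7.0, 11.9, -4.9),
    "O4": (-4.7, -7.0, 11.7),
}

def asymmetry_count(charges, axis):
    """Mean of the two other p charges minus the p charge along `axis` (0, 1, 2)."""
    others = [q for i, q in enumerate(charges) if i != axis]
    return 0.5 * sum(others) - charges[axis]

def lowest_occupation_axis(charges):
    """Index of the p orbital with the smallest occupation number."""
    return min(range(3), key=lambda i: charges[i])

# Ratio of the principal EFG component to the asymmetry count for each site
ratios = {}
for site, charges in P_CHARGES.items():
    axis = lowest_occupation_axis(charges)
    ratios[site] = EFG_DIAGONAL[site][axis] / asymmetry_count(charges, axis)
```

With the tabulated values the ratio comes out between about 60 and 64 (in the units of the table) for all four sites, which is what a common [\left\langle {{1}/{r^{3}}}\right\rangle _{p}] factor would imply.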

The importance of semi-core states has been illustrated for rutile, where the proper treatment of 3p and 4p states is essential to finding good agreement with experiment (Blaha et al., 1992[link]). The orthogonality between [\ell]-like bands belonging to different principal quantum numbers (3p and 4p) is important and can be treated, for example, by means of local orbitals [see (2.2.12.4)[link]].

In many simple cases, the off-diagonal elements of the EFG tensor vanish due to symmetry, but if they do not, diagonalization of the EFG tensor is required, which defines the orientation of the principal axes of the tensor. Note that in this case the orientation is given with respect to the local coordinate axes (see Section 2.2.13) in which the [LM] components are defined.
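
A minimal sketch of the diagonalization step is given below. It uses the trigonometric closed form for the eigenvalues of a real symmetric 3 × 3 matrix (a standard technique, not one prescribed by the text), orders them by the convention of Section 2.2.15.2 and evaluates [\eta]. It returns eigenvalues only; the principal-axis directions would require the eigenvectors as well. All names are hypothetical.

```python
import math

def symmetric_eigenvalues(t):
    """Eigenvalues of a real symmetric 3x3 matrix via the trigonometric formula."""
    q = (t[0][0] + t[1][1] + t[2][2]) / 3.0
    b = [[t[i][j] - (q if i == j else 0.0) for j in range(3)] for i in range(3)]
    p = math.sqrt(sum(b[i][j] ** 2 for i in range(3) for j in range(3)) / 6.0)
    if p == 0.0:
        return [q, q, q]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det / (2.0 * p ** 3)))  # clamp rounding noise
    phi = math.acos(r) / 3.0
    return [q + 2.0 * p * math.cos(phi + 2.0 * math.pi * k / 3.0) for k in range(3)]

def principal_components(t):
    """Return (V_xx, V_yy, V_zz, eta) with |V_zz| >= |V_yy| >= |V_xx|."""
    vxx, vyy, vzz = sorted(symmetric_eigenvalues(t), key=abs)
    eta = abs(vxx - vyy) / abs(vzz) if vzz != 0.0 else 0.0
    return vxx, vyy, vzz, eta

# O1 tensor of Table 2.2.15.1, already diagonal in the crystal axes
o1 = [[-6.1, 0.0, 0.0], [0.0, 18.3, 0.0], [0.0, 0.0, -12.2]]
vxx, vyy, vzz, eta = principal_components(o1)
```

For the O1 site this gives a principal component of 18.3 and an asymmetry parameter of one third.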

2.2.16. Examples

The general concepts described above are used in many band-structure applications and thus can be found in the corresponding literature. Here only a few examples are given in order to illustrate certain aspects.

2.2.16.1. F.c.c. copper

For the simple case of an element, namely copper in the f.c.c. structure, the band structure is shown in Fig. 2.2.16.1 along the [\Delta] symmetry direction from [\Gamma] to X. The character of the bands can be illustrated by showing for each band state the crucial information that is contained in the wavefunctions. In the LAPW method (Section 2.2.12), the wavefunction is expanded in atomic-like functions inside the atomic spheres (partial waves), and thus a spatial decomposition of the associated charge and its portion of [\ell]-like charge (s-, p-, d-like) inside the Cu sphere, [q_{\ell} ^{Cu}(E_{{\bf k}}^{j})], provides such a quantity. Fig. 2.2.16.1 shows for each state [E_{{\bf k}}^{j}] a circle, the radius of which is proportional to the [\ell]-like charge of that state. The band originating from the Cu 4s and 4p orbitals shows an approximately free-electron behaviour and thus a [k^{2}] energy dependence, but it hybridizes with one of the d bands in the middle of the [\Delta] direction and thus the [\ell]-like character changes along the [\Delta] direction.

Figure 2.2.16.1. Character of energy bands of f.c.c. copper in the [\Delta] direction. The radius of each circle is proportional to the respective partial charge of the given state.

This can easily be understood from a group-theoretical point of view. Since the d states in an octahedral environment split into the [e_{g}] and [t_{2g}] manifolds, the d bands can be further partitioned into the two subsets as illustrated in Fig. 2.2.16.2. The s band ranges from about 9.5 eV below [E_{F}] to about 2 eV above it. In the [\Delta] direction, the s band has [\Delta_{1}] symmetry, the same as one of the d bands from the [e_{g}] manifold, which consists of [\Delta_{1}] and [\Delta_{2}]. As a consequence of the `non-crossing rule', the two states, both with [\Delta_{1}] symmetry, must split due to the quantum-mechanical interaction between states with the same symmetry. This leads to the avoided crossing seen in the middle of the [\Delta] direction (Fig. 2.2.16.1). Therefore the lowest band starts out as an `s band' but ends near X as a `d band'. This also shows that bands belonging to different irreducible representations (small representations) may cross. The fact that [\Gamma_{12}] splits into the irreducible representations [\Delta_{1}] and [\Delta_{2}] is an example of the compatibility relations. In addition, group-theoretical arguments can be used (Altmann, 1994) to show that in certain symmetry directions the bands must enter the face of the BZ with zero slope.
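
The avoided crossing and the accompanying change of character can be mimicked by a two-level model, sketched below. This is a generic textbook illustration and not the LAPW calculation behind the figures: the flat d level at −3 eV, the coupling of 0.5 eV and the parabolic s band are hypothetical choices, with only the s-band end points (−9.5 eV and 2 eV relative to [E_{F}]) taken from the text.

```python
import math

def lower_band(e_s, e_d, coupling):
    """Lower eigenvalue of [[e_s, coupling], [coupling, e_d]] and its s weight."""
    mean = 0.5 * (e_s + e_d)
    half_gap = math.sqrt((0.5 * (e_s - e_d)) ** 2 + coupling ** 2)
    energy = mean - half_gap
    # Unnormalized eigenvector is (coupling, energy - e_s); requires coupling != 0
    s_weight = coupling ** 2 / (coupling ** 2 + (energy - e_s) ** 2)
    return energy, s_weight

def band_gap(e_s, e_d, coupling):
    """Separation of the two eigenvalues of the 2x2 model."""
    return 2.0 * math.sqrt((0.5 * (e_s - e_d)) ** 2 + coupling ** 2)

E_D = -3.0       # hypothetical flat d level (eV)
COUPLING = 0.5   # hypothetical s-d coupling (eV)

def e_s(k):
    """Free-electron-like s band, k from 0 (Gamma) to 1 (X), energies in eV."""
    return -9.5 + 11.5 * k * k

_, s_weight_gamma = lower_band(e_s(0.0), E_D, COUPLING)
_, s_weight_x = lower_band(e_s(1.0), E_D, COUPLING)
k_cross = math.sqrt(6.5 / 11.5)  # where the uncoupled bands would cross
gap_at_crossing = band_gap(e_s(k_cross), E_D, COUPLING)
```

In this model the lower band is almost purely s-like at [\Gamma] and almost purely d-like at X, and the two bands never come closer than twice the coupling.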

Figure 2.2.16.2. Decomposition of the Cu d bands into the [e_{g}] and [t_{2g}] manifold. The radius of each circle is proportional to the corresponding partial charge.

Note that in a site-centred description of the wavefunctions a similar [\ell]-like decomposition of the charge can be defined as [1=\textstyle\sum_{t} \textstyle\sum_{\ell}q_{\ell}^{t}] (without the [q^{\rm out}] term), but here the partial charges have a different meaning than in the spatial decomposition. In one case (e.g. LAPW), [q_{\ell}^{t}] refers to the partial charge of [\ell]-like character inside sphere t, while in the other case (LCAO), it means [\ell]-like charge coming from orbitals centred at site t. For the main components (for example Cu d) these two procedures will give roughly similar results, but the small components have quite a different interpretation. For this purpose consider an orbital that is centred on the neighbouring site j, but whose tail enters the atomic sphere i. In the spatial representation this tail coming from the j site must be represented by the (s, p, d etc.) partial waves inside sphere i and consequently will be associated with site i, leading to a small partial charge component. This situation is sometimes called the off-site component, in contrast to the on-site component, which will appear at its own site or in its own sphere, depending on the representation, site-centred or spatially confined.

2.2.16.2. The rutile TiO2

The well known rutile structure (e.g. TiO2) is tetragonal (see Fig. 2.2.16.3) with the basis consisting of the metal atoms at the [2a] Wyckoff positions, ([0, 0, 0]) and ([\textstyle{{1}\over{2}},\textstyle{{1}\over{2}},\textstyle{{1}\over{2}}]), and anions at the [4f] position, located at ([\pm u,\pm u,0]) and ([\textstyle{{1}\over{2}}\pm u,\textstyle{{1}\over{2}}\mp u,\textstyle{{1}\over{2}}]) with a typical value of about 0.3 for the internal coordinate u. Rutile belongs to the non-symmorphic space group [P4_{2}/mnm] ([D_{4h}^{14}]) in which the metal positions are transformed into each other by a rotation by 90° around the crystal c axis followed by a non-primitive translation of ([\textstyle{{1}\over{2}},\textstyle{{1}\over{2}},\textstyle{{1}\over{2}}]). The two metal positions at the centre and at the corner of the unit cell are equivalent when the surrounding octahedra are properly rotated. The metal atoms are octahedrally coordinated by anions which, however, do not form an ideal octahedron. The distortion depends on the structure parameters a, [c/a] and u, and results in two different metal–anion distances, namely the apical distance [d_{a}] and the equatorial distance [d_{e}], the height (z axis) and the basal spacing of the octahedron. For a certain value [u^{*}] the two distances [d_{a}] and [d_{e}] become equal: [u=u^{*}=\textstyle{{1}\over{4}}[1+\textstyle{{1}\over{2}}({{c}/{a}})^{2}].\eqno(2.2.16.1)] For this special value [u^{*}] and an ideal [{{c}/{a}}] ratio, the basal plane of the octahedron is square and the two distances are equal. An ideal octahedral coordination is thus obtained with [\displaylines{\hfill d_{a}=d_{e},\quad c=\sqrt{2}(1-2u)a\hfill(2.2.16.2)\cr \hfill u_{\rm ideal}=\textstyle{{1}\over{2}}(2-\sqrt{2})=0.293\hfill(2.2.16.3)\cr \hfill (c/a)_{\rm ideal}=2-\sqrt{2}=0.586.\hfill(2.2.16.4)}] Although the actual coordination of the metal atoms deviates from the ideal octahedron (as in all other systems that crystallize in the rutile structure), we still use this concept for symmetry arguments and call it octahedral coordination.
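
Relations (2.2.16.1) to (2.2.16.4) can be verified numerically from the Wyckoff positions given above. The sketch assumes that the apical anions of the metal atom at the origin are those at ([\pm u,\pm u,0]) and that the equatorial ones are lattice translates of ([\textstyle{{1}\over{2}}\pm u,\textstyle{{1}\over{2}}\mp u,\textstyle{{1}\over{2}}]); the function names are hypothetical and the lattice parameter a is set to 1.

```python
import math

def apical_distance(a, u):
    """Distance from the metal at (0, 0, 0) to the anion at (u, u, 0)."""
    return math.sqrt(2.0) * u * a

def equatorial_distance(a, c, u):
    """Distance from the metal at (0, 0, 0) to the anion at (u - 1/2, 1/2 - u, 1/2)."""
    return math.sqrt(2.0 * (0.5 - u) ** 2 * a * a + 0.25 * c * c)

def u_star(c_over_a):
    """Internal coordinate for which both distances are equal, eq. (2.2.16.1)."""
    return 0.25 * (1.0 + 0.5 * c_over_a ** 2)

C_OVER_A_IDEAL = 2.0 - math.sqrt(2.0)      # eq. (2.2.16.4)
U_IDEAL = 0.5 * (2.0 - math.sqrt(2.0))     # eq. (2.2.16.3)

a = 1.0
c = C_OVER_A_IDEAL * a
d_a = apical_distance(a, U_IDEAL)
d_e = equatorial_distance(a, c, U_IDEAL)
```

With the ideal parameters both distances coincide, and [u^{*}] evaluated at the ideal [c/a] ratio reproduces [u_{\rm ideal}].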

Figure 2.2.16.3. The local coordinate system in rutile for titanium (small spheres) and oxygen (large spheres).

The concept of a local coordinate system is illustrated for rutile (TiO2) from two different aspects, namely the crystal harmonics expansion (see Section 2.2.13[link]) and the interpretation of chemical bonding (for further details see Sorantin & Schwarz, 1992[link]).

  • (i) The expansion in crystal harmonics. We know that titanium occupies the Wyckoff position [2a] with point group [mmm]. From Table 2.2.13.1[link] we see that for point group [mmm] (listed under the orthorhombic structure) we must choose the x axis parallel to [[\overline{1}10]], the y axis parallel to [[110]] and the z axis parallel to [[001]]. We can transform the global coordinate system (i.e. that of the unit cell) into the local coordinate system around Ti. The following first LM combinations appear in the series (2.2.12.5)[link]: [(LM) =] [(0,0),] [(2,0),] [(2,2),] [(4,0),] [(4,2),] [(4,4), \ldots], etc.

  • (ii) The interpretation of bonding. The second reason for choosing a local coordinate system is that it allows the use of symmetry-adapted orbitals for interpreting bonding, interactions or crystal-field effects. For this purpose, one likes to have the axes pointing to the six oxygen ligands, i.e. the x and y axes towards the oxygen atoms in the octahedral basal plane, and the z axis towards the apical oxygen (Fig. 2.2.16.3[link]). The Cartesian x and y axes, however, are not exactly (but approximately) directed toward the oxygen ligands due to the rectangular distortion of the octahedral basal plane.

    For oxygen in TiO2 with point group [mm2], the two types of local systems are identical and are shown in Fig. 2.2.16.3[link] for the position ([\textstyle{{1}\over{2}}-u,\textstyle{{1}\over{2}}+u,\textstyle{{1}\over{2}}]). The z axis coincides with that of the Ti atom, while it points to the neighbouring oxygen of the basal plane in the octahedron around Ti at the origin. Only in this local coordinate system are the orbitals arranged in the usual way for an octahedron, where the d orbitals split (into the three orbitals of [t_{2g}] and the two of [e_{g}] symmetry) and thus allow an easy interpretation of the interactions, e.g. one of the two [e_{g}] orbitals, namely the Ti [d_{z^{2}}] can form a [\sigma] bond with the O [p_{z}] orbital.

2.2.16.3. Core electron spectra

In excitations involving core electrons, simplifications are possible that allow an easier interpretation. As one example, (soft) X-ray emission (XES) or absorption (XAS) spectra are briefly discussed. In the one-electron picture, the XES process can be described as sketched in Fig. 2.2.16.4[link]. First a core electron of atom A in state [n^{\prime}\ell^{\prime}] is knocked out (by electrons or photons), and then a transition occurs between the occupied valence states at energy [\varepsilon] and the core hole (the transitions between inner core levels are ignored).

Figure 2.2.16.4. Schematic transitions in X-ray emission and absorption spectra.

According to Fermi's golden rule, the intensity of such a transition can be described by [I_{An^{\prime}\ell^{\prime}}(\nu)=\nu^{3}\textstyle\sum\limits_{\ell}W_{\ell\ell^{\prime} }n_{\ell}^{A}(\varepsilon)M_{A}(\ell,n^{\prime}\ell^{\prime},\varepsilon)^{2}\delta(\varepsilon-E_{n^{\prime}\ell^{\prime}}^{A}-h\nu),\eqno(2.2.16.5)] where [W_{\ell\ell^{\prime}}] comes from the integral over the angular components (Table 2.2.16.1) and contains the [\Delta\ell=\pm1] selection rule, [n_{\ell}^{A}(\varepsilon)] is the local (within atomic sphere A) partial ([\ell]-like) DOS, [M_{A}(\ell,n^{\prime}\ell^{\prime},\varepsilon)^{2}] is the radial transition probability [see (2.2.16.6) below], and the last term takes the energy conservation into account.

Table 2.2.16.1. [W_{\ell\ell^{\prime}}] factors for X-ray emission spectra showing the [\Delta\ell=\pm 1] selection rule

[\ell']   [\ell]=0   1      2      3      4
0                    1/3
1         1                 2/5
2                    2/3           3/7
3                           3/5           4/11

The [M_{A}(\ell,n^{\prime}\ell^{\prime},\varepsilon)^{2}] are defined as the probability of the dipole transition (with the dipole operator r) between the valence state at [\varepsilon] and the core state characterized by quantum numbers [n^{\prime}\ell^{\prime}], [M_{A}(\ell,n^{\prime}\ell^{\prime},\varepsilon)^{2}={{[\int_{0}^{R_{A}}u_{\ell}^{A}(r,\varepsilon)r^{3}R_{n^{\prime}\ell^{\prime}}^{A\,{\rm core}}(r)\,\,{\rm d}r]^{2}}\over{\int_{0}^{R_{A}}[u_{\ell}^{A}(r,\varepsilon)]^{2}r^{2}\,\,{\rm d}r}}.\eqno(2.2.16.6)] In this derivation one makes use of the fact that core states are completely confined inside the atomic sphere. Therefore the integral, which should be taken over the entire space, can be restricted to one atomic sphere (namely A), since the core wavefunction [R_{n^{\prime}\ell^{\prime}}^{A\,{\rm core}}(r)] and thus the integrand vanish outside this sphere. This is also the reason why XES (or XAS) are related to [n_{\ell}^{A}(\varepsilon)], the local DOS weighted with the [\ell]-like charge within the atomic sphere A.

The interpretation of XES intensities is as follows. Besides the [\nu^{3}] factor from Fermi's golden rule, the intensity is governed by the [\Delta \ell=\pm1] selection rule and the energy conservation. In addition, it depends on the number of available states at [\varepsilon] which reside inside sphere A and have an [\ell]-like contribution, times the probability for the transition to take place from the valence state to the core hole under energy conservation. For an application, see for example the comparison between theory and experiment for the compounds NbC and NbN (Schwarz, 1977).

Note again that the present description is based on an atomic sphere representation with partial waves inside the spheres, in contrast to an LCAO-like treatment with site-centred basis functions. In the latter, an equivalent formalism can be defined which differs in details, especially for the small components (off-site contributions). If the tails of an orbital enter a neighbouring sphere and are crucial for the interpretation of XES, there is a semantic difference between the two schemes as discussed above in connection with f.c.c. Cu in Section 2.2.16.1[link]. In the present framework, all contributions come exclusively from the sphere where the core hole resides, whereas in an LCAO representation `cross transitions' from the valence states on one atom to the core hole of a neighbouring atom may be important. The latter contributions must be (and are) included in the partial waves within the sphere in schemes such as LAPW. There is no physical difference between the two descriptions.

In XES, spectra are interpreted on the basis of results from ground-state calculations, although there could be relaxations due to the presence of a core hole. As early as 1979, von Barth and Grossmann formulated a `final state rule' for XES in metallic systems (von Barth & Grossmann, 1979[link]). In this case, the initial state is one with a missing core electron (core hole), whereas the final state is close to the ground state, since the hole in the valence bands (after a valence electron has filled the core hole) has a very short lifetime and is very quickly filled by other valence electrons. They applied time-dependent perturbation theory and could show by model calculations that the main XES spectrum can be explained by ground-state results, whereas the satellite spectrum (starting with two core holes and ending with one) requires a treatment of the core-hole relaxation. This example illustrates the importance of identifying the relevant physical process in experiments related to the energy-band structure: it may not always be just the ground state that is involved, and sometimes excited states must be considered.

2.2.17. Conclusion

There are many more applications of band theory to solids and thus an enormous amount of literature has not been covered here. In this chapter, an attempt has been made to collect relevant concepts, definitions and examples from group theory, solid-state physics and crystallography in order to understand symmetry aspects in combination with a quantum-mechanical treatment of the electronic structure of solids.

Acknowledgements

The author wishes to thank the following persons who contributed to this chapter: P. Blaha, the main author of WIEN; J. Luitz, for help with the figures; and P. Herzig, with whom the author discussed the group-theoretical aspects.

References

Akai, H., Akai, M., Blügel, S., Drittler, B., Ebert, H., Terakura, K., Zeller, R. & Dederichs, P. H. (1990). Theory of hyperfine interactions in metals. Prog. Theor. Phys. Suppl. 101, 11–77.
Altmann, S. L. (1994). Band theory of solids: An introduction from the view of symmetry. Oxford: Clarendon Press.
Andersen, O. K. (1975). Linear methods in band theory. Phys. Rev. B, 12, 3060–3083.
Barth, U. von & Grossmann, G. (1979). The effect of the core hole on X-ray emission spectra in simple metals. Solid State Commun. 32, 645–649.
Barth, U. von & Hedin, L. (1972). A local exchange-correlation potential for the spin-polarized case: I. J. Phys. C, 5, 1629–1642.
Blaha, P., Schwarz, K. & Dederichs, P. H. (1988). First-principles calculation of the electric field gradient in hcp metals. Phys. Rev. B, 37, 2792–2796.
Blaha, P., Schwarz, K. & Herzig, P. (1985). First-principles calculation of the electric field gradient of Li3N. Phys. Rev. Lett. 54, 1192–1195.
Blaha, P., Schwarz, K., Sorantin, P. I. & Trickey, S. B. (1990). Full-potential linearized augmented plane wave programs for crystalline systems. Comput. Phys. Commun. 59, 399–415.
Blaha, P., Singh, D. J., Sorantin, P. I. & Schwarz, K. (1992). Electric field gradient calculations for systems with large extended core state contributions. Phys. Rev. B, 46, 5849–5852.
Blöchl, P. E. (1994). Projector augmented-wave method. Phys. Rev. B, 50, 17953–17979.
Bouckaert, L. P., Smoluchowski, R. & Wigner, E. (1936). Theory of Brillouin zones and symmetry properties of wavefunctions in crystals. Phys. Rev. 50, 58–67.
Bradley, C. J. & Cracknell, A. P. (1972). The mathematical theory of symmetry in solids. Oxford: Clarendon Press.
Car, R. & Parrinello, M. (1985). Unified approach for molecular dynamics and density-functional theory. Phys. Rev. Lett. 55, 2471–2474.
Ceperley, D. M. & Alder, B. J. (1980). Ground state of the electron gas by a stochastic method. Phys. Rev. Lett. 45, 566–569.
Colle, R. & Salvetti, O. (1990). Generalisation of the Colle–Salvetti correlation energy method to a many determinant wavefunction. J. Chem. Phys. 93, 534–544.
Condon, E. U. & Shortley, G. H. (1953). The theory of atomic spectra. Cambridge University Press.
Dreizler, R. M. & Gross, E. K. U. (1990). Density functional theory. Berlin, Heidelberg, New York: Springer-Verlag.
Dufek, P., Blaha, P. & Schwarz, K. (1995a). Theoretical investigation of the pressure induced metallization and the collapse of the antiferromagnetic states in NiI2. Phys. Rev. B, 51, 4122–4127.
Dufek, P., Blaha, P. & Schwarz, K. (1995b). Determination of the nuclear quadrupole moment of 57Fe. Phys. Rev. Lett. 75, 3545–3548.
Ellis, D. E., Guenzburger, D. & Jansen, H. B. (1983). Electric field gradient and electronic structure of linear-bonded halide compounds. Phys. Rev. B, 28, 3697–3705.
Godby, R. W., Schlüter, M. & Sham, L. J. (1986). Accurate exchange-correlation potential for silicon and its discontinuity on addition of an electron. Phys. Rev. Lett. 56, 2415–2418.
Gunnarsson, O. & Lundqvist, B. I. (1976). Exchange and correlation in atoms, molecules, and solids by the spin-density-functional formalism. Phys. Rev. B, 13, 4274–4298.
Hedin, L. & Lundqvist, B. I. (1971). Explicit local exchange-correlation potentials. J. Phys. C, 4, 2064–2083.
Herzig, P. (1985). Electrostatic potentials, fields and field gradients from a general crystalline charge density. Theoret. Chim. Acta, 67, 323–333.
Hoffmann, R. (1988). Solids and surfaces: A chemist's view of bonding in extended structures. New York: VCH Publishers, Inc.
Hohenberg, P. & Kohn, W. (1964). Inhomogeneous electron gas. Phys. Rev. 136, B864–B871.
Hybertsen, M. S. & Louie, S. G. (1984). Non-local density functional theory for the electronic and structural properties of semiconductors. Solid State Commun. 51, 451–454.
International Tables for Crystallography (2001). Vol. B. Reciprocal space, edited by U. Shmueli, 2nd ed. Dordrecht: Kluwer Academic Publishers.
International Tables for Crystallography (2005). Vol. A. Space-group symmetry, edited by Th. Hahn, 5th ed. Heidelberg: Springer.
Janak, J. F. (1978). Proof that [\partial E/\partial n_i = \epsilon_i] in density-functional theory. Phys. Rev. B, 18, 7165–7168.
Kaufmann, E. N. & Vianden, R. J. (1979). The electric field gradient in noncubic metals. Rev. Mod. Phys. 51, 161–214.
Koelling, D. D. & Arbman, G. O. (1975). Use of energy derivative of the radial solution in an augmented plane wave method: application to copper. J. Phys. F Metal Phys. 5, 2041–2054.
Koelling, D. D. & Harmon, B. N. (1977). A technique for relativistic spin-polarized calculations. J. Phys. C Solid State Phys. 10, 3107–3114.
Kohn, W. & Rostoker, N. (1954). Solution of the Schrödinger equation in periodic lattices with an application to metallic lithium. Phys. Rev. 94, 1111–1120.
Kohn, W. & Sham, L. J. (1965). Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133–A1138.
Korringa, J. (1947). On the calculation of the energy of a Bloch wave in a metal. Physica, 13, 392–400.
Kurki-Suonio, K. (1977). Symmetry and its implications. Isr. J. Chem. 16, 115–123.
Loucks, T. L. (1967). Augmented plane wave method. New York, Amsterdam: W. A. Benjamin, Inc.
Methfessel, M. & Frota-Pessoa, S. (1990). Real-space method for calculation of the electric-field gradient: Comparison with K-space results. J. Phys. Condens. Matter, 2, 149–158.
Meyer, B., Hummler, K., Elsässer, C. & Fähnle, M. (1995). Reconstruction of the true wavefunction from the pseudo-wavefunctions in a crystal and calculation of electric field gradients. J. Phys. Condens. Matter, 7, 9201–9218.
Ordejón, P., Drabold, D. A., Martin, R. M. & Grumbach, M. P. (1995). Linear system-size scaling methods for electronic-structure calculations. Phys. Rev. B, 51, 1456–1476.
Parr, R., Donnelly, R. A., Levy, M. & Palke, W. A. (1978). Electronegativity: the density functional viewpoint. J. Chem. Phys. 68, 3801–3807.
Perdew, J. P. (1986). Density functional theory and the band gap problem. Int. J. Quantum Chem. 19, 497–523.
Perdew, J. P., Burke, K. & Ernzerhof, M. (1996). Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868.
Perdew, J. P. & Levy, M. (1983). Physical content of the exact Kohn–Sham orbital energies: band gaps and derivative discontinuities. Phys. Rev. Lett. 51, 1884–1887.
Petrilli, H. M., Blöchl, P. E., Blaha, P. & Schwarz, K. (1998). Electric-field-gradient calculations using the projector augmented wave method. Phys. Rev. B, 57, 14690–14697.
Petrilli, H. M. & Frota-Pessoa, S. (1990). Real-space method for calculation of the electric field gradient in systems without symmetry. J. Phys. Condens. Matter, 2, 135–147.
Pisani, C. (1996). Quantum-mechanical ab-initio calculation of properties of crystalline materials. Lecture notes in chemistry, 67, 1–327. Berlin, Heidelberg, New York: Springer-Verlag.
Pyykkö, P. (1992). The nuclear quadrupole moments of the 20 first elements: High-precision calculations on atoms and small molecules. Z. Naturforsch. A, 47, 189–196.
Sandratskii, L. M. (1990). Symmetry properties of electronic states of crystals with spiral magnetic order. Solid State Commun. 75, 527–529.
Schwarz, K. (1977). The electronic structure of NbC and NbN. J. Phys. C Solid State Phys. 10, 195–210.
Schwarz, K., Ambrosch-Draxl, C. & Blaha, P. (1990). Charge distribution and electric field gradients in YBa2Cu3O7−x. Phys. Rev. B, 42, 2051–2061.
Schwarz, K. & Blaha, P. (1992). Ab initio calculations of the electric field gradients in solids in relation to the charge distribution. Z. Naturforsch. A, 47, 197–202.
Schwarz, K. & Blaha, P. (1996). Description of an LAPW DF program (WIEN95). In Lecture notes in chemistry, Vol. 67, Quantum-mechanical ab initio calculation of properties of crystalline materials, edited by C. Pisani. Berlin, Heidelberg, New York: Springer-Verlag.
Schwarz, K. & Herzig, P. (1979). The sensitivity of partially filled f bands to configuration and relativistic effects. J. Phys. C Solid State Phys. 12, 2277–2288.
Seitz, F. (1937). On the reduction of space groups. Ann. Math. 37, 17–28.
Singh, D. J. (1994). Plane waves, pseudopotentials and the LAPW method. Boston, Dordrecht, London: Kluwer Academic Publishers.
Skriver, H. L. (1984). The LMTO method. Springer series in solid-state sciences, Vol. 41. Berlin, Heidelberg, New York, Tokyo: Springer.
Slater, J. C. (1937). Wavefunctions in a periodic crystal. Phys. Rev. 51, 846–851.
Slater, J. C. (1974). The self-consistent field for molecules and solids. New York: McGraw-Hill.
Sorantin, P. & Schwarz, K. (1992). Chemical bonding in rutile-type compounds. Inorg. Chem. 31, 567–576.
Springborg, M. (1997). Density-functional methods in chemistry and material science. Chichester, New York, Weinheim, Brisbane, Singapore, Toronto: John Wiley and Sons Ltd.
Vosko, S. H., Wilk, L. & Nusair, M. (1980). Accurate spin-dependent electron liquid correlation energies for local spin density calculations. Can. J. Phys. 58, 1200–1211.
Weinert, M. (1981). Solution of Poisson's equation: beyond Ewald-type methods. J. Math. Phys. 22, 2433–2439.
Williams, A. R., Kübler, J. & Gelatt, C. D. Jr (1979). Cohesive properties of metallic compounds: Augmented-spherical-wave calculations. Phys. Rev. B, 19, 6094–6118.
Winkler, B., Blaha, P. & Schwarz, K. (1996). Ab initio calculation of electric field gradient tensors of forsterite. Am. Mineral. 81, 545–549.







































