Cluster expansions

This page provides a short introduction to cluster expansions. For more extensive descriptions, please consult for example [SanDucGra84], [Wal09] or [AngMunRah19]. A simpler introduction to the subject can be found in the first part of this tutorial.

Formalism

In the following, we are concerned with configurations corresponding to a distribution of \(M\) different species over \(N\) sites that can be written as a vector \(\boldsymbol{\sigma}\), whose elements can assume \(M\) different values, e.g., from \(S=\{-m, -m+1, \ldots m-1, m\}\) where \(M=2m\) (for even \(M\)) or \(M=2m+1\) (for odd \(M\)) [SanDucGra84]. One now seeks to represent a property \(Q\) of the system, such as the total energy, as a function of \(\boldsymbol{\sigma}\), i.e. \(Q = f(\boldsymbol{\sigma})\). To this end, one can construct a complete orthonormal basis of cluster functions \(\Gamma_{\alpha}(\boldsymbol{\sigma})\) [SanDucGra84] [San10], which allows one to express \(Q\) in the form [Wal09]

\[Q = \sum_\alpha m_\alpha J_\alpha \left<\Gamma_{\alpha'}(\boldsymbol{\sigma})\right>_{\alpha}.\]

Here, the sum extends over all symmetry equivalent clusters (orbit) \(\alpha\), \(m_{\alpha}\) denotes the multiplicity, whereas the coefficients \(J_{\alpha}\) are the so-called effective cluster interactions (ECIs). The last term in the above expression represents the average over cluster functions \(\Gamma_{\alpha}(\boldsymbol{\sigma})\) belonging to symmetry equivalent clusters. The cluster functions themselves are obtained as a product over point functions \(\Theta\). In icet, the point functions from [Wal09] are used:

\[\begin{split}\Theta_{n}(\sigma_p) &= \begin{cases} 1 & \qquad \text{if $n=0$} \\ - \cos\left(\pi(n+1)\sigma_p/M\right) & \qquad \text{if $n$ is odd} \\ -\sin\left(\pi n \sigma_p/M\right) & \qquad \text{if $n$ is even}, \end{cases}\end{split}\]

These point functions form an orthogonal set over all possible occupation numbers. For further details, please consult [Wal09] or [AngMunRah19].

In icet, the formalism is handled internally by the ClusterSpace class.

Cluster expansion construction

The task of training a CE can be formally written as a linear problem

\[\mathbf{\bar{\Pi}} \boldsymbol{J} = \boldsymbol{Q}\]

where \(\boldsymbol{Q}\) is the vector of target properties, the vector \(\boldsymbol{J}\) represents the ECIs, and \(\mathbf{\bar{\Pi}}\) is a matrix that is obtained by stacking the vectors that represent each structure (also referred to in this documentation as cluster vectors) of the training set. Here, the multiplicities \(m_{\alpha}\) have been included in \(\mathbf{\bar{\Pi}}\).

This problem can be approached by choosing the number of structures \(n_{\boldsymbol{Q}}\) (and thus the dimensionality of \(\boldsymbol{Q}\)), to be (much) larger than the number of ECIs \(n_{\boldsymbol{J}}\) (and thus the dimensionality of \(\boldsymbol{J}\), (\(n_{\boldsymbol{Q}}>n_{\boldsymbol{J}}\)). The set of equations is thus overdetermined. The optimal set of ECIs for fixed training set and cluster function basis is then obtained by minimizing the \(l_2\)-norm of \(\mathbf{\bar{\Pi}} \boldsymbol{J} - \boldsymbol{Q}\)

\[\boldsymbol{J}_{\text{opt}} = \arg\min_{\boldsymbol{J}} \left\{ || \mathbf{\bar{\Pi}} \boldsymbol{J} - \boldsymbol{Q} ||_2 \right\}.\]

Traditional algorithms [Wal09] then proceed by generating a series of CEs corresponding to different basis set choices, i.e. different values of \(n_{\boldsymbol{J}}\). By comparing the performance of each CE by means of a cross validation (CV) score the best performing CE is selected. An alternative approach is to choose the number of ECIs larger than the number of structures (\(n_{\boldsymbol{J}}>n_{\boldsymbol{Q}}\)). The resulting problem is then underdetermined, but can be solved by means of compressive sensing [CanWak08] [NelHarZho13] [NelOzoRee13]. In practice, one obtains a sparse solution, i.e., a solution in which many ECIs are zero. It is also possible to employ Bayesian approaches [MueCed09] or genetic algorithms [BluHarWal05].

icet is agnostic to a particular optimization approach and can in principle be used in conjunction with any of these techniques. For many applications, the trainstation package, the documentation of which can be found here, provides a particular simple interface and many of our examples use this package.