Training with constraints and weights

In some cases, one might want to manipulate the linear problem by introducing constraints and/or weights for certain properties or structures. In this tutorial, we show how this can be achieved within the icet framework, covering both exact constraints and more flexible weighted constraints.

A more comprehensive tutorial on using weighted constraints to improve surface segregation energies can be found here.

Throughout, constraints are understood in a broad sense, namely as any condition that can be expressed as a function of one or several cluster vectors.

Exact constraints

First, we show how exact linear constraints can be enforced. This entails reformulating the linear problem such that a certain condition is always fulfilled, which can be achieved using the Constraints class. Note that this is only possible for conditions that can be formulated such that the right-hand side of the linear problem equals zero.

The constraint is enforced via a matrix \(M\) with dimensions \((n_\mathrm{const}, n_\mathrm{params})\), where \(n_\mathrm{const}\) is the number of constraints and \(n_\mathrm{params}\) is the number of effective cluster interactions (ECIs). Element \(M_{ij}\) determines how the \(j\)-th ECI contributes to constraint \(i\).

In the example below, we demonstrate fitting of a cluster expansion under the constraint that parameters 2 and 4 should be equal. This entails setting up a matrix with elements \(M_{02}=1\) and \(M_{04}=-1\), such that the constraint applied to the linear problem reads \(p_2 - p_4 = 0\), where \(p_j\) is the \(j\)-th ECI.

[1]:
import numpy as np
from icet.tools import Constraints
from trainstation import Optimizer

# Set up random sensing matrix and target "energies"
n_params = 10
n_energies = 20
A = np.random.random((n_energies, n_params))
y = np.random.random(n_energies)

# Define constraints
c = Constraints(n_params=n_params)
M = np.zeros((1, n_params))
M[0, 2] = 1
M[0, 4] = -1
c.add_constraint(M)

# Do the actual fit and finally extract parameters
A_constrained = c.transform(A)
opt = Optimizer((A_constrained, y), fit_method='ridge')
opt.train()
parameters = c.inverse_transform(opt.parameters)

print(parameters[2], parameters[4])
0.06936944354493091 0.06936944354493094

We see that the parameters with indices 2 and 4 are exactly equal, as expected.
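One way to understand what transform and inverse_transform do is the following: any parameter vector satisfying \(Mp = 0\) can be written as \(p = Nq\), where the columns of \(N\) span the null space of \(M\). The fit is then carried out for the reduced parameters \(q\) against the transformed sensing matrix \(AN\), after which \(p = Nq\) is recovered. The sketch below illustrates this idea with plain numpy/scipy for the matrix \(M\) from the cell above; it is a conceptual illustration, not necessarily the exact internals of the Constraints class.

from scipy.linalg import null_space

# Columns of N span the null space of M, i.e., M @ N = 0;
# each constraint removes one degree of freedom
N = null_space(M)  # shape (n_params, n_params - 1)

# Any p = N @ q automatically satisfies M @ p = 0
q = np.random.random(N.shape[1])
p = N @ q
print(np.allclose(M @ p, 0))  # True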

Another example is to use this feature to constrain the mixing energy to zero at the pure phases. Below, we show how this is achieved for a toy system with random cluster vectors and random target energies.

[2]:
import numpy as np
from ase.build import bulk
from icet import ClusterSpace, ClusterExpansion
from icet.tools import get_mixing_energy_constraints
from trainstation import Optimizer

# Set up cluster space along with random sensing matrix and target "energies"
Au, Ag = bulk('Au', a=4.0), bulk('Ag', a=4.0)
cs = ClusterSpace(Au, cutoffs=[6.0, 5.0], chemical_symbols=['Au', 'Ag'])
n_params = len(cs)
n_energies = 20
A = np.random.random((n_energies, n_params))
y = np.random.random(n_energies)

# Define constraints
c = get_mixing_energy_constraints(cs)

# Do the actual fit and finally extract parameters
A_constrained = c.transform(A)
opt = Optimizer((A_constrained, y), fit_method='ridge')
opt.train()
parameters = c.inverse_transform(opt.parameters)

# Construct cluster expansion
ce = ClusterExpansion(cs, parameters)
print('Au: ', ce.predict(Au))
print('Ag: ', ce.predict(Ag))
Au:  7.771561172376096e-16
Ag:  1.1102230246251565e-16

We see that the prediction for the pure elements is zero to within floating-point precision.
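Conceptually, this constraint requires the predicted energy of each pure phase to vanish, i.e., \(p \cdot \mathrm{cv} = 0\), where \(\mathrm{cv}\) is the cluster vector of the pure structure. Assuming this interpretation, an equivalent constraint matrix can be set up manually with the Constraints class, as sketched below (reusing cs, Au and Ag from the cell above); get_mixing_energy_constraints is a convenient shortcut for this kind of construction.

from icet.tools import Constraints

# One constraint row per pure phase: the cluster vector of the pure
# structure, so that M @ p = 0 forces its predicted energy to zero
M = np.array([cs.get_cluster_vector(Au),
              cs.get_cluster_vector(Ag)])
c_manual = Constraints(n_params=len(cs))
c_manual.add_constraint(M)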

Weighted constraints

In some cases, one might want to introduce more flexible constraints that are weighted and/or have a non-zero right-hand side. This can be achieved by manually manipulating the sensing matrix.

First, we show the simple example of adding a weight to structure 3 in the training set using the toy system from the example above. This is achieved by setting up a weight vector \(W\) where element \(W_i\) corresponds to the weight of structure \(i\).
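Note that scaling row \(i\) of the sensing matrix and element \(i\) of the target vector by \(W_i\) turns the least-squares objective into \(\sum_i W_i^2 \left(\sum_j A_{ij} p_j - y_i\right)^2\), i.e., the squared residual of structure \(i\) is effectively weighted by \(W_i^2\).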

[3]:
W = np.ones(n_energies)
W[3] = 5.0

# Multiply each row of A and element of y with the corresponding weight
A_weighted = np.multiply(A, W.reshape(-1,1))
y_weighted = W * y

# Do the training
opt = Optimizer((A_weighted, y_weighted), fit_method='ridge')
opt.train()
parameters = opt.parameters

# Check the errors for the first five structures
for i in range(5):
    print(f'Structure {i} error: {parameters.dot(A[i,:]) - y[i]}')

Structure 0 error: -0.23922788020875074
Structure 1 error: 0.010046343959063009
Structure 2 error: -0.089031447377665
Structure 3 error: -0.0023861248597936147
Structure 4 error: 0.009128818679495665

We see that the error for structure 3 is much smaller than the other errors, since this structure is given higher weight in the training.

One can also add weighted constraints corresponding to any function of any number of cluster vectors. Here, we show an example where we want the energy of structures 2 and 3 to be similar. This is achieved by adding a new row to the sensing matrix.
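In terms of the least-squares objective, appending the weighted row \(w(A_2 - A_3)\) with target value \(w \cdot 0 = 0\) adds a penalty term \(w^2 \left(\sum_j (A_{2j} - A_{3j}) p_j\right)^2\), which pushes the predicted energies of structures 2 and 3 towards each other without enforcing exact equality; a larger weight \(w\) makes the constraint stiffer.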

[4]:
# Set up a new row for sensing matrix corresponding to the difference in
# cluster vector for structures 2 and 3 and a corresponding target value 0.0
A_add = A[2,:] - A[3,:]
y_add = 0.0

# Select a weight for the constraint
w = 5.0

# Stack sensing matrices and target vectors
A_constrained = np.vstack((A, w * A_add))
y_constrained = np.hstack((y, w * y_add))

# First train without the constraint
opt = Optimizer((A, y), fit_method='ridge')
opt.train()
parameters_ref = opt.parameters

# Then train with the constraint
opt = Optimizer((A_constrained, y_constrained), fit_method='ridge')
opt.train()
parameters = opt.parameters

# Check the predicted and target energy for the first five structures
for i in [2,3]:
    print(f'Structure {i} w/o constr.: {parameters_ref.dot(A[i,:]):.4f}, '
          f'w/ constr: {parameters.dot(A[i,:]):.4f}, target: {y[i]:.4f}')
Structure 2 w/o constr.: 0.5882, w/ constr: 0.5951, target: 0.6392
Structure 3 w/o constr.: 0.7107, w/ constr: 0.5914, target: 0.7764

We see that with the added constraint, the predicted energies of structures 2 and 3 are much closer to each other than without it.

This approach is useful, e.g., for constraining the surface segregation energy for surface slabs as demonstrated here.