# Tools

## Mapping structures

icet.tools.map_structure_to_reference(structure, reference, inert_species=None, tol_positions=0.0001, suppress_warnings=False, assume_no_cell_relaxation=False)

Maps a structure onto a reference structure. This is often desirable when, for example, a structure has been relaxed using DFT, and one wants to use it as a training structure in a cluster expansion.

The function returns a tuple comprising the ideal supercell most closely matching the input structure and a dictionary with supplementary information concerning the mapping. The latter includes for example the largest deviation of any position in the input structure from its reference position (drmax), the average deviation of the positions in the input structure from the reference positions (dravg), and the strain tensor for the input structure relative to the reference structure (strain_tensor).

The returned Atoms object provides further supplemental information via custom per-atom arrays including the atomic displacements (Displacement, Displacement_Magnitude), the distances to the three closest sites (Minimum_Distances), as well as a mapping between the indices of the returned Atoms object and those of the input structure (IndexMapping).

Parameters
• structure (Atoms) – input structure, typically a relaxed structure

• reference (Atoms) – reference structure, which can but need not be the primitive structure

• inert_species (Optional[List[str]]) – list of chemical symbols (e.g., ['Au', 'Pd']) that are never substituted for a vacancy; the number of inert sites is used to rescale the volume of the input structure to match the reference structure.

• tol_positions (float) – tolerance factor applied when scanning for overlapping positions in Angstrom (forwarded to ase.build.make_supercell())

• suppress_warnings (bool) – if True, print no warnings of large strain or relaxation distances

• assume_no_cell_relaxation (bool) – if False, the volume and cell metric of the input structure are rescaled to match the reference structure; this can be unnecessary (and counterproductive) for some structures, e.g., those with many vacancies

Note: When setting this parameter to True, the cell metric of the input structure must be obtainable via an integer transformation matrix from the reference cell metric. In other words, the input structure should not involve relaxations of the volume or the cell metric.

Example

The following code snippet illustrates the general usage. It first creates a primitive FCC cell, which is later used as the reference structure. To emulate a relaxed structure obtained from, e.g., a density functional theory calculation, the code then creates a 4x4x4 conventional FCC supercell, which is populated with two different atom types, has distorted cell vectors, and has random displacements applied to the atoms. Finally, the present function is used to map the structure back onto the ideal lattice:

>>> from ase.build import bulk
>>> from icet.tools import map_structure_to_reference
>>> reference = bulk('Au', a=4.09)
>>> structure = bulk('Au', cubic=True, a=4.09).repeat(4)
>>> structure.set_chemical_symbols(10 * ['Ag'] + (len(structure) - 10) * ['Au'])
>>> structure.set_cell(structure.cell * 1.02, scale_atoms=True)
>>> structure.rattle(0.1)
>>> mapped_structure, info = map_structure_to_reference(structure, reference)

Return type

Tuple[Atoms, dict]
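The summary quantities drmax and dravg can be pictured with a minimal sketch. It assumes that every relaxed position has already been paired with its ideal site and it ignores periodic boundary conditions; the helper name `displacement_stats` is hypothetical and not part of icet:

```python
import math


def displacement_stats(relaxed, ideal):
    """Return (drmax, dravg) for already-paired Cartesian positions.

    Hypothetical helper for illustration only; it does not apply the
    minimum image convention and is not the icet implementation.
    """
    if len(relaxed) != len(ideal) or not relaxed:
        raise ValueError('need two non-empty position lists of equal length')
    # Displacement magnitude of each atom from its reference site
    magnitudes = [math.dist(r, i) for r, i in zip(relaxed, ideal)]
    return max(magnitudes), sum(magnitudes) / len(magnitudes)


ideal = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
relaxed = [(0.0, 0.0, 0.3), (2.0, 0.4, 0.0), (0.0, 2.0, 0.0)]
drmax, dravg = displacement_stats(relaxed, ideal)
print(f'drmax = {drmax:.3f}, dravg = {dravg:.3f}')
```

In the actual function the pairing of atoms with sites, the strain removal, and the handling of periodic images are done for you; the sketch only shows how the two numbers relate to the per-atom displacement magnitudes.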

## Structure enumeration

icet.tools.enumerate_structures(structure, sizes, chemical_symbols, concentration_restrictions=None, niggli_reduce=None, symprec=1e-05, position_tolerance=None)

Yields a sequence of enumerated structures. The function generates all inequivalent structures that are permissible given a certain lattice. Using the chemical_symbols and concentration_restrictions keyword arguments it is possible to specify which chemical species are to be included on which site and in which concentration range.

The function is sensitive to the boundary conditions of the input structure. An enumeration of, for example, a surface can thus be performed by setting structure.pbc = [True, True, False].

The algorithm implemented here was developed by Gus L. W. Hart and Rodney W. Forcade in Phys. Rev. B 77, 224115 (2008) [HarFor08] and Phys. Rev. B 80, 014120 (2009) [HarFor09].

Parameters
• structure (Atoms) – primitive structure from which derivative superstructures should be generated

• sizes (Union[List[int], range]) – number of sites (included in enumeration)

• chemical_symbols (list) – chemical species with which to decorate the structure, e.g., ['Au', 'Ag']; see below for more examples

• concentration_restrictions (Optional[dict]) – allowed concentration range for one or more element in chemical_symbols, e.g., {'Au': (0, 0.2)} will only enumerate structures in which the Au content is between 0 and 20 %; here, concentration is always defined as the number of atoms of the specified kind divided by the number of all atoms.

• niggli_reduce – if True perform a Niggli reduction with spglib for each structure; the default is True if structure is periodic in all directions, False otherwise.

• symprec (float) – tolerance imposed when analyzing the symmetry using spglib

• position_tolerance (Optional[float]) – tolerance applied when comparing positions in Cartesian coordinates; by default this value is set equal to symprec

Examples

The following code snippet illustrates how to enumerate structures with up to 4 atoms in the unit cell for a binary alloy without any constraints:

>>> from ase.build import bulk
>>> from icet.tools import enumerate_structures
>>> prim = bulk('Ag')
>>> for structure in enumerate_structures(structure=prim,
...                                       sizes=range(1, 5),
...                                       chemical_symbols=['Ag', 'Au']):
...     pass # Do something with the structure


To limit the concentration range to 10 to 40% Au the code should be modified as follows:

>>> conc_restr = {'Au': (0.1, 0.4)}
>>> for structure in enumerate_structures(structure=prim,
...                                       sizes=range(1, 5),
...                                       chemical_symbols=['Ag', 'Au'],
...                                       concentration_restrictions=conc_restr):
...     pass # Do something with the structure


Often one would like to consider mixing on only one sublattice. This can be achieved as illustrated for a Ga(1-x)Al(x)As alloy as follows:

>>> prim = bulk('GaAs', crystalstructure='zincblende', a=5.65)
>>> for structure in enumerate_structures(structure=prim,
...                                       sizes=range(1, 9),
...                                       chemical_symbols=[['Ga', 'Al'], ['As']]):
...     pass # Do something with the structure

Return type

Atoms
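The definition of concentration used by concentration_restrictions (number of atoms of a species divided by the total number of atoms) can be made concrete with a small sketch. The function name `satisfies_restrictions` is hypothetical, and treating both bounds as inclusive is an assumption rather than documented behavior:

```python
from collections import Counter


def satisfies_restrictions(symbols, concentration_restrictions, tol=1e-9):
    """Check a list of chemical symbols against per-species ranges.

    Hypothetical helper; bounds are assumed to be inclusive.
    """
    counts = Counter(symbols)
    for symbol, (c_min, c_max) in concentration_restrictions.items():
        # Concentration = atoms of this kind / all atoms
        concentration = counts[symbol] / len(symbols)
        if concentration < c_min - tol or concentration > c_max + tol:
            return False
    return True


conc_restr = {'Au': (0.1, 0.4)}
one_in_four = satisfies_restrictions(['Ag', 'Ag', 'Ag', 'Au'], conc_restr)
three_in_four = satisfies_restrictions(['Ag', 'Au', 'Au', 'Au'], conc_restr)
print(one_in_four, three_in_four)
```

A four-atom cell with one Au atom (25 %) passes the 10 to 40 % window, whereas a cell with three Au atoms (75 %) does not.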

icet.tools.enumerate_supercells(structure, sizes, niggli_reduce=None, symprec=1e-05, position_tolerance=None)

Yields a sequence of enumerated supercells. The function generates all inequivalent supercells that are permissible given a certain lattice. Any supercell can be reduced to one of the supercells generated.

The function is sensitive to the boundary conditions of the input structure. An enumeration of, for example, a surface can thus be performed by setting structure.pbc = [True, True, False].

The algorithm is based on Gus L. W. Hart and Rodney W. Forcade in Phys. Rev. B 77, 224115 (2008) [HarFor08] and Phys. Rev. B 80, 014120 (2009) [HarFor09].

Parameters
• structure (Atoms) – primitive structure from which supercells should be generated

• sizes (Union[List[int], range]) – number of sites (included in enumeration)

• niggli_reduce – if True perform a Niggli reduction with spglib for each supercell; the default is True if structure is periodic in all directions, False otherwise.

• symprec (float) – tolerance imposed when analyzing the symmetry using spglib

• position_tolerance (Optional[float]) – tolerance applied when comparing positions in Cartesian coordinates; by default this value is set equal to symprec

Examples

The following code snippet illustrates how to enumerate supercells with up to 6 atoms in the unit cell:

>>> from ase.build import bulk
>>> from icet.tools import enumerate_supercells
>>> prim = bulk('Ag')
>>> for supercell in enumerate_supercells(structure=prim, sizes=range(1, 7)):
...     pass # Do something with the supercell

Return type

Atoms
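In the approach of Hart and Forcade, candidate supercells are represented by integer transformation matrices in Hermite normal form (HNF). The sketch below lists all such matrices for a given size, before any symmetry reduction; removing symmetry-equivalent cells, which is what makes the final list "inequivalent", is not shown. The function name is hypothetical and the code is not taken from icet:

```python
def hermite_normal_forms(size):
    """Yield all 3x3 lower-triangular HNF matrices with determinant `size`.

    Illustration only: no symmetry reduction is applied.
    """
    for a in range(1, size + 1):
        if size % a:
            continue
        for c in range(1, size // a + 1):
            if (size // a) % c:
                continue
            f = size // (a * c)
            # Off-diagonal entries are bounded by the diagonal of their row
            for b in range(c):
                for d in range(f):
                    for e in range(f):
                        yield ((a, 0, 0), (b, c, 0), (d, e, f))


counts = [sum(1 for _ in hermite_normal_forms(n)) for n in range(1, 5)]
print(counts)
```

The number of matrices grows quickly with size, which is why the subsequent reduction by lattice symmetry matters in practice.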

## Generation of special structures

icet.tools.structure_generation.generate_sqs(cluster_space, max_size, target_concentrations, include_smaller_cells=True, pbc=None, T_start=5.0, T_stop=0.001, n_steps=None, optimality_weight=1.0, random_seed=None, tol=1e-05)

Given a cluster_space, generate a special quasirandom structure (SQS), i.e., a structure that for a given supercell size provides the best possible approximation to a random alloy [ZunWeiFer90].

In the present case, this means that the generated structure will have a cluster vector that matches as closely as possible the cluster vector of an infinitely large randomly occupied supercell. Internally the function uses a simulated annealing algorithm and the difference between two cluster vectors is calculated with the measure suggested by A. van de Walle et al. in Calphad 42, 13-18 (2013) [WalTiwJon13] (for more information, see mchammer.calculators.TargetVectorCalculator).

Parameters
• cluster_space (ClusterSpace) – a cluster space defining the lattice to be occupied

• max_size (int) – maximum supercell size

• target_concentrations (dict) – concentration of each species in the target structure, per sublattice (for example {'Au': 0.5, 'Pd': 0.5} for a single sublattice Au-Pd structure, or {'A': {'Au': 0.5, 'Pd': 0.5}, 'B': {'H': 0.25, 'X': 0.75}} for a system with two sublattices). The symbols defining sublattices ('A', 'B', etc.) can be found by printing the cluster_space

• include_smaller_cells (bool) – if True, search among all supercell sizes up to and including max_size, else search only among those exactly matching max_size

• pbc (Union[Tuple[bool, bool, bool], Tuple[int, int, int], None]) – Periodic boundary conditions for each direction, e.g., (True, True, False). The axes are defined by the cell of cluster_space.primitive_structure. Default is periodic boundary in all directions.

• T_start (float) – artificial temperature at which the simulated annealing starts

• T_stop (float) – artificial temperature at which the simulated annealing stops

• n_steps (Optional[int]) – total number of Monte Carlo steps in the simulation

• optimality_weight (float) – controls weighting $$L$$ of perfect correlations, see mchammer.calculators.TargetVectorCalculator

• random_seed (Optional[int]) – seed for the random number generator used in the Monte Carlo simulation

• tol (float) – Numerical tolerance

Return type

Atoms
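The simulated annealing idea can be sketched on a toy problem: a binary ring of sites whose pair correlations should approach those of a random alloy. Everything here is an illustrative assumption, not the icet implementation: the ring lattice, the function names, the exponential temperature ramp, and the cost function (a plain sum of absolute deviations instead of the measure of van de Walle et al.):

```python
import math
import random


def pair_correlations(spins, max_distance):
    """Average product of spins separated by 1..max_distance on a ring."""
    n = len(spins)
    return [sum(spins[i] * spins[(i + d) % n] for i in range(n)) / n
            for d in range(1, max_distance + 1)]


def cost(spins, target, max_distance):
    """Toy distance between the current and the target cluster vector."""
    correlations = pair_correlations(spins, max_distance)
    return sum(abs(c - t) for c, t in zip(correlations, target))


def anneal(spins, target, max_distance=3, T_start=5.0, T_stop=0.001,
           n_steps=2000, random_seed=42):
    rng = random.Random(random_seed)
    current = list(spins)
    current_cost = cost(current, target, max_distance)
    best, best_cost = list(current), current_cost
    for step in range(n_steps):
        # Exponential ramp from T_start down to T_stop (an assumption)
        T = T_start * (T_stop / T_start) ** (step / (n_steps - 1))
        i, j = rng.sample(range(len(current)), 2)
        if current[i] == current[j]:
            continue  # swapping identical species changes nothing
        current[i], current[j] = current[j], current[i]
        new_cost = cost(current, target, max_distance)
        delta = new_cost - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / T):
            current_cost = new_cost  # accept the swap
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        else:
            current[i], current[j] = current[j], current[i]  # reject

    return best, best_cost


start = [1] * 8 + [-1] * 8        # fully segregated 50/50 ring
target = [0.0, 0.0, 0.0]          # random alloy at 50 %: zero correlations
best, best_cost = anneal(start, target)
print(cost(start, target, 3), best_cost)
```

Swapping two sites conserves the concentration, which mirrors the role of target_concentrations; keeping track of the best structure seen guarantees that the result is never worse than the starting point.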

icet.tools.structure_generation.generate_sqs_by_enumeration(cluster_space, max_size, target_concentrations, include_smaller_cells=True, pbc=None, optimality_weight=1.0, tol=1e-05)

Given a cluster_space, generate a special quasirandom structure (SQS), i.e., a structure that for a given supercell size provides the best possible approximation to a random alloy [ZunWeiFer90].

In the present case, this means that the generated structure will have a cluster vector that matches as closely as possible the cluster vector of an infinitely large randomly occupied supercell. The difference between two cluster vectors is calculated with the measure suggested by A. van de Walle et al. in Calphad 42, 13-18 (2013) [WalTiwJon13] (for more information, see mchammer.calculators.TargetVectorCalculator).

This function generates SQS cells by exhaustive enumeration, which means that the generated SQS cell is guaranteed to be optimal with regard to the specified measure and cell size.

Parameters
• cluster_space (ClusterSpace) – a cluster space defining the lattice to be occupied

• max_size (int) – maximum supercell size

• target_concentrations (dict) – concentration of each species in the target structure, per sublattice (for example {'Au': 0.5, 'Pd': 0.5} for a single sublattice Au-Pd structure, or {'A': {'Au': 0.5, 'Pd': 0.5}, 'B': {'H': 0.25, 'X': 0.75}} for a system with two sublattices). The symbols defining sublattices ('A', 'B', etc.) can be found by printing the cluster_space

• include_smaller_cells (bool) – if True, search among all supercell sizes up to and including max_size, else search only among those exactly matching max_size

• pbc (Union[Tuple[bool, bool, bool], Tuple[int, int, int], None]) – Periodic boundary conditions for each direction, e.g., (True, True, False). The axes are defined by the cell of cluster_space.primitive_structure. Default is periodic boundary in all directions.

• optimality_weight (float) – controls weighting $$L$$ of perfect correlations, see mchammer.calculators.TargetVectorCalculator

• tol (float) – Numerical tolerance

Return type

Atoms
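Why exhaustive enumeration guarantees optimality can be seen in a toy setting: every occupation of a small binary ring at fixed composition is visited and the one with the lowest cost is kept. The ring lattice, the function names, and the simple cost (sum of absolute deviations of pair correlations) are assumptions made for illustration; no symmetry reduction is performed, and this is not the icet code:

```python
from itertools import combinations


def pair_correlations(spins, max_distance):
    """Average product of spins separated by 1..max_distance on a ring."""
    n = len(spins)
    return [sum(spins[i] * spins[(i + d) % n] for i in range(n)) / n
            for d in range(1, max_distance + 1)]


def best_by_enumeration(n_sites, n_up, target, max_distance=2):
    """Visit every occupation with n_up 'up' sites and keep the best one."""
    best, best_cost, n_candidates = None, float('inf'), 0
    for up_sites in combinations(range(n_sites), n_up):
        spins = [1 if i in up_sites else -1 for i in range(n_sites)]
        correlations = pair_correlations(spins, max_distance)
        cost = sum(abs(c - t) for c, t in zip(correlations, target))
        n_candidates += 1
        if cost < best_cost:
            best, best_cost = spins, cost
    return best, best_cost, n_candidates


best, best_cost, n_candidates = best_by_enumeration(8, 4, target=[0.0, 0.0])
print(n_candidates, best_cost)
```

The number of candidates grows combinatorially with the number of sites, so this strategy is only practical for small cells; for larger cells a stochastic search such as simulated annealing is the usual alternative.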

icet.tools.structure_generation.generate_sqs_from_supercells(cluster_space, supercells, target_concentrations, T_start=5.0, T_stop=0.001, n_steps=None, optimality_weight=1.0, random_seed=None, random_start=True, tol=1e-05)

Given a cluster_space and one or more supercells, generate a special quasirandom structure (SQS), i.e., a structure that for the provided supercells provides the best possible approximation to a random alloy [ZunWeiFer90].

In the present case, this means that the generated structure will have a cluster vector that matches as closely as possible the cluster vector of an infinitely large randomly occupied supercell. Internally the function uses a simulated annealing algorithm and the difference between two cluster vectors is calculated with the measure suggested by A. van de Walle et al. in Calphad 42, 13-18 (2013) [WalTiwJon13] (for more information, see mchammer.calculators.TargetVectorCalculator).

Parameters
• cluster_space (ClusterSpace) – a cluster space defining the lattice to be occupied

• supercells (List[Atoms]) – list of one or more supercells among which an optimal structure will be searched for

• target_concentrations (dict) – concentration of each species in the target structure, per sublattice (for example {'Au': 0.5, 'Pd': 0.5} for a single sublattice Au-Pd structure, or {'A': {'Au': 0.5, 'Pd': 0.5}, 'B': {'H': 0.25, 'X': 0.75}} for a system with two sublattices). The symbols defining sublattices ('A', 'B', etc.) can be found by printing the cluster_space

• T_start (float) – artificial temperature at which the simulated annealing starts

• T_stop (float) – artificial temperature at which the simulated annealing stops

• n_steps (Optional[int]) – total number of Monte Carlo steps in the simulation

• optimality_weight (float) – controls weighting $$L$$ of perfect correlations, see mchammer.calculators.TargetVectorCalculator

• random_seed (Optional[int]) – seed for the random number generator used in the Monte Carlo simulation and used for initializing the occupation of the supercells if random_start is True

• random_start (bool) – randomly occupy starting structure, can be disabled if the user prefers to pass an initial structure

• tol (float) – Numerical tolerance

Return type

Atoms

icet.tools.structure_generation.generate_target_structure(cluster_space, max_size, target_concentrations, target_cluster_vector, include_smaller_cells=True, pbc=None, T_start=5.0, T_stop=0.001, n_steps=None, optimality_weight=1.0, random_seed=None, tol=1e-05)

Given a cluster_space and a target_cluster_vector, generate a structure that matches that cluster vector as closely as possible. The search is performed among all inequivalent supercell shapes up to a certain size.

Internally the function uses a simulated annealing algorithm and the difference between two cluster vectors is calculated with the measure suggested by A. van de Walle et al. in Calphad 42, 13-18 (2013) [WalTiwJon13] (for more information, see mchammer.calculators.TargetVectorCalculator).

Parameters
• cluster_space (ClusterSpace) – a cluster space defining the lattice to be occupied

• max_size (int) – maximum supercell size

• target_concentrations (dict) – concentration of each species in the target structure, per sublattice (for example {'Au': 0.5, 'Pd': 0.5} for a single sublattice Au-Pd structure, or {'A': {'Au': 0.5, 'Pd': 0.5}, 'B': {'H': 0.25, 'X': 0.75}} for a system with two sublattices). The symbols defining sublattices ('A', 'B', etc.) can be found by printing the cluster_space

• target_cluster_vector (List[float]) – cluster vector that the generated structure should match as closely as possible

• include_smaller_cells (bool) – if True, search among all supercell sizes up to and including max_size, else search only among those exactly matching max_size

• pbc (Union[Tuple[bool, bool, bool], Tuple[int, int, int], None]) – Periodic boundary conditions for each direction, e.g., (True, True, False). The axes are defined by the cell of cluster_space.primitive_structure. Default is periodic boundary in all directions.

• T_start (float) – artificial temperature at which the simulated annealing starts

• T_stop (float) – artificial temperature at which the simulated annealing stops

• n_steps (Optional[int]) – total number of Monte Carlo steps in the simulation

• optimality_weight (float) – controls weighting $$L$$ of perfect correlations, see mchammer.calculators.TargetVectorCalculator

• random_seed (Optional[int]) – seed for the random number generator used in the Monte Carlo simulation

• tol (float) – Numerical tolerance

Return type

Atoms

icet.tools.structure_generation.generate_target_structure_from_supercells(cluster_space, supercells, target_concentrations, target_cluster_vector, T_start=5.0, T_stop=0.001, n_steps=None, optimality_weight=1.0, random_seed=None, random_start=True, tol=1e-05)

Given a cluster_space and a target_cluster_vector and one or more supercells, generate a structure that as closely as possible matches that cluster vector.

Internally the function uses a simulated annealing algorithm and the difference between two cluster vectors is calculated with the measure suggested by A. van de Walle et al. in Calphad 42, 13-18 (2013) [WalTiwJon13] (for more information, see mchammer.calculators.TargetVectorCalculator).

Parameters
• cluster_space (ClusterSpace) – a cluster space defining the lattice to be occupied

• supercells (List[Atoms]) – list of one or more supercells among which an optimal structure will be searched for

• target_concentrations (dict) – concentration of each species in the target structure, per sublattice (for example {'Au': 0.5, 'Pd': 0.5} for a single sublattice Au-Pd structure, or {'A': {'Au': 0.5, 'Pd': 0.5}, 'B': {'H': 0.25, 'X': 0.75}} for a system with two sublattices). The symbols defining sublattices ('A', 'B', etc.) can be found by printing the cluster_space

• target_cluster_vector (List[float]) – cluster vector that the generated structure should match as closely as possible

• T_start (float) – artificial temperature at which the simulated annealing starts

• T_stop (float) – artificial temperature at which the simulated annealing stops

• n_steps (Optional[int]) – total number of Monte Carlo steps in the simulation

• optimality_weight (float) – controls weighting $$L$$ of perfect correlations, see mchammer.calculators.TargetVectorCalculator

• random_seed (Optional[int]) – seed for the random number generator used in the Monte Carlo simulation and used for initializing the occupation of the supercells if random_start is True

• random_start (bool) – randomly occupy starting structure, can be disabled if the user prefers to pass an initial structure

• tol (float) – Numerical tolerance

Return type

Atoms

icet.tools.structure_generation.occupy_structure_randomly(structure, cluster_space, target_concentrations, random_seed=None)

Occupies a structure at random while fulfilling target_concentrations. Since nothing is returned, the structure passed in is the object that gets modified.

Parameters
• structure (Atoms) – ASE Atoms object that will be occupied randomly

• cluster_space (ClusterSpace) – cluster space (needed as it carries information about sublattices)

• target_concentrations (dict) – concentration of each species in the target structure, per sublattice (for example {'Au': 0.5, 'Pd': 0.5} for a single sublattice Au-Pd structure, or {'A': {'Au': 0.5, 'Pd': 0.5}, 'B': {'H': 0.25, 'X': 0.75}} for a system with two sublattices). The symbols defining sublattices ('A', 'B', etc.) can be found by printing the cluster_space

• random_seed (Optional[int]) – seed for the random number generator

Return type

None
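The relation between target concentrations and a concrete occupation can be sketched for a single sublattice as follows. This is a simplified stand-in that returns a list of symbols rather than modifying an Atoms object, knows nothing about sublattices, and uses a hypothetical function name:

```python
import random
from collections import Counter


def occupy_randomly(n_sites, target_concentrations, random_seed=None,
                    tol=1e-6):
    """Return a shuffled list of symbols matching the concentrations.

    Hypothetical single-sublattice helper, not the icet implementation.
    """
    symbols = []
    for symbol, concentration in target_concentrations.items():
        count = concentration * n_sites
        # The concentration must correspond to a whole number of atoms
        if abs(count - round(count)) > tol:
            raise ValueError(f'{symbol}: concentration {concentration} is '
                             f'not compatible with {n_sites} sites')
        symbols.extend([symbol] * round(count))
    if len(symbols) != n_sites:
        raise ValueError('concentrations must sum to one')
    random.Random(random_seed).shuffle(symbols)
    return symbols


occupation = occupy_randomly(8, {'Au': 0.25, 'Pd': 0.75}, random_seed=1)
print(Counter(occupation))
```

The check on whole numbers of atoms illustrates why not every combination of supercell size and target concentration can be realized.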

## Ground state finder

class icet.tools.ground_state_finder.GroundStateFinder(cluster_expansion, structure, solver_name=None, verbose=True)

This class provides functionality for determining the ground states using a binary cluster expansion. This is efficiently achieved through the use of mixed integer programming (MIP) as developed by Larsen et al. in Phys. Rev. Lett. 120, 256101 (2018).

This class relies on the Python-MIP package. Python-MIP can be used together with Gurobi, which is not open source but issues academic licenses free of charge. Please note that Gurobi needs to be installed separately. The GroundStateFinder also works without Gurobi, but if performance is critical, Gurobi is highly recommended.

Warning

In order to be able to use Gurobi with Python-MIP, one must ensure that GUROBI_HOME points to the installation directory (<installdir>):

export GUROBI_HOME=<installdir>


Note

The current implementation only works for binary systems.

Parameters
• cluster_expansion (ClusterExpansion) – cluster expansion for which to find ground states

• structure (Atoms) – atomic configuration

• solver_name (str, optional) – 'gurobi' (alternatively 'grb') or 'cbc'; if not specified, the available solvers are searched

• verbose (bool, optional) – whether to display solver messages on the screen (default: True)

Example

The following snippet illustrates how to determine the ground state for a Au-Ag alloy. Here, the parameters of the cluster expansion are set to emulate a simple Ising model in order to obtain an example that can be run without modification. In practice, one should of course use a proper cluster expansion:

>>> from ase.build import bulk
>>> from icet import ClusterExpansion, ClusterSpace
>>> from icet.tools.ground_state_finder import GroundStateFinder

>>> # prepare cluster expansion
>>> # the setup emulates a second nearest-neighbor (NN) Ising model
>>> # (zerolet and singlet parameters are zero; only first and second neighbor
>>> # pairs are included)
>>> prim = bulk('Au')
>>> chemical_symbols = ['Ag', 'Au']
>>> cs = ClusterSpace(prim, cutoffs=[4.3], chemical_symbols=chemical_symbols)
>>> ce = ClusterExpansion(cs, [0, 0, 0.1, -0.02])

>>> # prepare initial configuration
>>> structure = prim.repeat(3)

>>> # set up the ground state finder and calculate the ground state energy
>>> gsf = GroundStateFinder(ce, structure)
>>> ground_state = gsf.get_ground_state({'Ag': 5})
>>> print('Ground state energy:', ce.predict(ground_state))


get_ground_state(species_count=None, max_seconds=inf, threads=0)

Finds the ground state for a given structure and species count. The count refers to count_species, if provided when initializing the instance of this class, or otherwise to the first species in the list of chemical symbols for the active sublattice.

Parameters
• species_count (Optional[Dict[str, int]]) – dictionary with count for one of the species on each active sublattice. If no count is provided for a sublattice, the concentration is allowed to vary.

• max_seconds (float) – maximum runtime in seconds (default: inf)

• threads (int) – number of threads to be used when solving the problem, given that a positive integer has been provided. If set to 0 the solver default configuration is used while -1 corresponds to all available processing cores.

Return type

Atoms

property model: Model

Python-MIP model

Return type

Model

property optimization_status: OptimizationStatus

Optimization status

Return type

OptimizationStatus

## Convex hull construction

class icet.tools.ConvexHull(concentrations, energies)

This class provides functionality for extracting the convex hull of the (free) energy of mixing. It is based on the convex hull calculator in SciPy.

Parameters
• concentrations (list(float) or list(list(float))) – concentrations for each structure listed as [[c1, c2], [c1, c2], ...]; for binaries, in which case there is only one independent concentration, the format [c1, c2, c3, ...] works as well.

• energies (list(float)) – energy (or energy of mixing) for each structure

concentrations

concentrations of the N structures on the convex hull

Type

np.ndarray

energies

energies of the N structures on the convex hull

Type

np.ndarray

dimensions

number of independent concentrations needed to specify a point in concentration space (1 for binaries, 2 for ternaries etc.)

Type

int

structures

indices of structures that constitute the convex hull (indices are defined by the order in which their concentrations and energies are fed when initializing the ConvexHull object)

Type

list(int)

Examples

A ConvexHull object is easily initialized by providing lists of concentrations and energies:

>>> from icet.tools import ConvexHull
>>> data = {'concentration': [0,    0.2,  0.2,  0.3,  0.4,  0.5,  0.8,  1.0],
...         'mixing_energy': [0.1, -0.2, -0.1, -0.2,  0.2, -0.4, -0.2, -0.1]}
>>> hull = ConvexHull(data['concentration'], data['mixing_energy'])


Now one can for example access the points along the convex hull directly:

>>> for c, e in zip(hull.concentrations, hull.energies):
...     print(c, e)
0.0 0.1
0.2 -0.2
0.5 -0.4
1.0 -0.1


or plot the convex hull along with the original data using, e.g., matplotlib:

>>> import matplotlib.pyplot as plt
>>> plt.scatter(data['concentration'], data['mixing_energy'], color='darkred')
>>> plt.plot(hull.concentrations, hull.energies)
>>> plt.show(block=False)


It is also possible to extract structures at or close to the convex hull:

>>> low_energy_structures = hull.extract_low_energy_structures(
...     data['concentration'], data['mixing_energy'],
...     energy_tolerance=0.005)


A complete example can be found in the basic tutorial.

extract_low_energy_structures(concentrations, energies, energy_tolerance)

Returns the indices of energies that lie within a certain tolerance of the convex hull.

Parameters
• concentrations (Union[List[float], List[List[float]]]) –

concentrations of candidate structures

If there is one independent concentration, a list of floats is sufficient. Otherwise, the concentrations must be provided as a list of lists, such as [[0.1, 0.2], [0.3, 0.1], ...].

• energies (List[float]) – energies of candidate structures

• energy_tolerance (float) – include structures with an energy that is at most this far from the convex hull

Return type

List[int]

get_energy_at_convex_hull(target_concentrations)

Returns the energy of the convex hull at specified concentrations. If any concentration is outside the allowed range, NaN is returned.

Parameters

target_concentrations (Union[List[float], List[List[float]]]) –

concentrations at target points

If there is one independent concentration, a list of floats is sufficient. Otherwise, the concentrations ought to be provided as a list of lists, such as [[0.1, 0.2], [0.3, 0.1], ...].

Return type

ndarray
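For the binary case, the three operations described in this section (building the hull, evaluating its energy at a concentration, and selecting structures close to it) can be sketched in plain Python. The class itself relies on SciPy; the monotone-chain construction and all function names below are stand-ins chosen for illustration:

```python
import math


def lower_hull(concentrations, energies):
    """Vertices of the lower convex hull, sorted by concentration."""
    hull = []
    for p in sorted(zip(concentrations, energies)):
        if hull and p[0] == hull[-1][0]:
            continue  # same concentration, higher energy: not on the hull
        # Pop the last vertex while it lies on or above the line to p
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross > 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def hull_energy(hull, concentration):
    """Linearly interpolated hull energy; NaN outside the covered range."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= concentration <= x2:
            return y1 + (concentration - x1) / (x2 - x1) * (y2 - y1)
    return math.nan


def low_energy_indices(concentrations, energies, hull, energy_tolerance):
    """Indices of structures at most energy_tolerance above the hull."""
    return [i for i, (c, e) in enumerate(zip(concentrations, energies))
            if e - hull_energy(hull, c) <= energy_tolerance]


concentrations = [0, 0.2, 0.2, 0.3, 0.4, 0.5, 0.8, 1.0]
mixing_energies = [0.1, -0.2, -0.1, -0.2, 0.2, -0.4, -0.2, -0.1]
hull = lower_hull(concentrations, mixing_energies)
close = low_energy_indices(concentrations, mixing_energies, hull, 0.005)
print(hull)
print(close)
```

With the data from the example above, the sketch recovers the same four hull points, and the structures within 0.005 of the hull are exactly those four.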

## Fitting with constraints

class icet.tools.constraints.Constraints(n_params)

Class for handling linear constraints with right hand side equal to zero.

Parameters

n_params (int) – number of parameters in model

Example

The following example demonstrates fitting of a cluster expansion under the constraint that parameter 2 and parameter 4 should be equal:

>>> import numpy as np
>>> from icet.tools import Constraints
>>> from trainstation import Optimizer

>>> # Set up random sensing matrix and target "energies"
>>> n_params = 10
>>> n_energies = 20
>>> A = np.random.random((n_energies, n_params))
>>> y = np.random.random(n_energies)

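>>> # Define constraints (each row r of M enforces r . parameters = 0)
>>> c = Constraints(n_params=n_params)
>>> M = np.zeros((1, n_params))
>>> M[0, 2] = 1
>>> M[0, 4] = -1  # parameter 2 - parameter 4 = 0
>>> c.add_constraint(M)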

>>> # Do the actual fit and finally extract parameters
>>> A_constrained = c.transform(A)
>>> opt = Optimizer((A_constrained, y), fit_method='ridge')
>>> opt.train()
>>> parameters = c.inverse_transform(opt.parameters)


add_constraint(M)

Add a constraint matrix and re-solve for the constraint space

Parameters

M (ndarray) – Constraint matrix with each constraint as a row. The rows can be (but need not be) cluster vectors.

Return type

None

inverse_transform(A)

Inverse transform array from constrained parameter space to unconstrained space

Parameters

A (ndarray) – array to be inverse transformed

Return type

ndarray

transform(A)

Transform array to constrained parameter space

Parameters

A (ndarray) – array to be transformed

Return type

ndarray
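One way to picture transform and inverse_transform is through a basis of the parameter subspace that satisfies the constraints: the sensing matrix is projected onto that basis, the fit is done in the reduced space, and the result is expanded back. Whether icet proceeds exactly like this is an assumption; the sketch below hard-codes the basis for a single equality constraint (parameter i equals parameter j) and uses hypothetical function names:

```python
def equality_basis(n_params, i, j):
    """Basis vectors spanning {p : p[i] - p[j] = 0} (hand-built)."""
    basis = []
    for k in range(n_params):
        if k not in (i, j):
            vector = [0.0] * n_params
            vector[k] = 1.0
            basis.append(vector)
    shared = [0.0] * n_params
    shared[i] = shared[j] = 1.0   # parameters i and j move together
    basis.append(shared)
    return basis


def transform(A, basis):
    """Project each row of the sensing matrix onto the basis vectors."""
    return [[sum(a * b for a, b in zip(row, vector)) for vector in basis]
            for row in A]


def inverse_transform(reduced_parameters, basis):
    """Expand reduced parameters back to the full parameter vector."""
    n_params = len(basis[0])
    return [sum(q * vector[k] for q, vector in zip(reduced_parameters, basis))
            for k in range(n_params)]


basis = equality_basis(5, 2, 4)
A = [[1.0, 2.0, 3.0, 4.0, 5.0]]
A_constrained = transform(A, basis)
reduced = [0.5, -1.0, 2.0, 3.0]          # stand-in for fitted parameters
parameters = inverse_transform(reduced, basis)
print(A_constrained, parameters)
```

The reduced problem has one parameter fewer than the original, and any solution expanded back automatically has parameters 2 and 4 equal, while predictions are unchanged by the change of basis.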

icet.tools.constraints.get_mixing_energy_constraints(cluster_space)

A cluster expansion of mixing energy should ideally predict zero energy for concentration 0 and 1. This function constructs a Constraints object that enforces that condition during fitting.

Parameters

cluster_space (ClusterSpace) – Cluster space corresponding to cluster expansion for which constraints should be imposed

Example

This example demonstrates how to constrain the mixing energy to zero at the pure phases in a toy example with random cluster vectors and random target energies:

>>> import numpy as np
>>> from ase.build import bulk
>>> from icet import ClusterSpace
>>> from icet.tools import get_mixing_energy_constraints
>>> from trainstation import Optimizer

>>> # Set up cluster space along with random sensing matrix and target "energies"
>>> prim = bulk('Au')
>>> cs = ClusterSpace(prim, cutoffs=[6.0, 5.0], chemical_symbols=['Au', 'Ag'])
>>> n_params = len(cs)
>>> n_energies = 20
>>> A = np.random.random((n_energies, n_params))
>>> y = np.random.random(n_energies)

>>> # Define constraints
>>> c = get_mixing_energy_constraints(cs)

>>> # Do the actual fit and finally extract parameters
>>> A_constrained = c.transform(A)
>>> opt = Optimizer((A_constrained, y), fit_method='ridge')
>>> opt.train()
>>> parameters = c.inverse_transform(opt.parameters)


Warning

Constraining the energy of one structure is always done at the expense of the fit quality of the others. Always expect that your cross-validation scores will increase somewhat when using this function.

Return type

Constraints

## Constituent strain

class icet.tools.ConstituentStrain(supercell, primitive_structure, chemical_symbols, concentration_symbol, strain_energy_function, k_to_parameter_function=None, damping=1.0, tol=1e-06)

Class for handling constituent strain in cluster expansions (see Laks et al., Phys. Rev. B 46, 12587 (1992) [LakFerFro92]). This makes it possible to use cluster expansions to describe systems with strain due to, for example, coherent phase separation. For an extensive example on how to use this module, please see this example.

Parameters
• supercell (Atoms) – Defines supercell that will be used when calculating constituent strain.

• primitive_structure (Atoms) – The primitive structure that supercell is based on

• chemical_symbols (List[str]) – List with chemical symbols involved, such as ['Ag', 'Cu']

• concentration_symbol (str) – Chemical symbol used to define concentration, such as 'Ag'

• strain_energy_function (Callable[[float, List[float]], float]) – A function that takes two arguments, a list of parameters and concentration (e.g., [0.5, 0.5, 0.5] and 0.3), and returns the corresponding strain energy. The parameters are in turn determined by k_to_parameter_function (see below). If k_to_parameter_function is None, the parameters list will be the k point. For more information, see this example.

• k_to_parameter_function (Optional[Callable[[List[float]], List[float]]]) – A function that takes a k point as a list of three floats and returns a parameter vector that will be fed into the strain_energy_function (see above). If None, the k point itself will be the parameter vector to strain_energy_function. The purpose of this function is to be able to precompute any factor in the strain energy that depends on k point but not concentration. For more information, see this example.

• damping (float) – Damping factor $$\eta$$ (in units of Angstrom) used to suppress the impact of large-magnitude k points by multiplying the strain with $$\exp(-(\eta \mathbf{k})^2)$$

• tol (float) – Numerical tolerance when comparing k points (units of inverse Angstrom)
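As a minimal sketch of the two callables described above, the following defines a toy pair of functions together with the damping factor. The functional form, the constants `A` and `B`, and the helper `damping_weight` are illustrative assumptions and not part of icet; a physically meaningful strain energy has to be fitted for the system at hand.

```python
import math
from typing import List

# Hypothetical constants of a toy model (not fitted to any material).
A = 0.02
B = 0.05


def k_to_parameter_function(k: List[float]) -> List[float]:
    """Precompute the k-dependent part: here, the unit vector along k."""
    norm = math.sqrt(sum(ki ** 2 for ki in k))
    if norm == 0.0:
        return [0.0, 0.0, 0.0]
    return [ki / norm for ki in k]


def strain_energy_function(parameters: List[float],
                           concentration: float) -> float:
    """Toy strain energy that vanishes at c = 0 and c = 1."""
    x, y, z = parameters
    # Simple direction-dependent term (an assumption of this sketch).
    anisotropy = x ** 2 * y ** 2 + y ** 2 * z ** 2 + z ** 2 * x ** 2
    return concentration * (1.0 - concentration) * (A + B * anisotropy)


def damping_weight(k: List[float], damping: float = 1.0) -> float:
    """Factor exp(-(eta * |k|)^2) that suppresses large-magnitude k points."""
    k_squared = sum(ki ** 2 for ki in k)
    return math.exp(-damping ** 2 * k_squared)
```

Functions of this shape could then be passed via the `strain_energy_function` and `k_to_parameter_function` arguments when constructing a ConstituentStrain object.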

accept_change()[source]

Update the structure factor for each k point to the value in structure_factor_after. This makes it possible to efficiently calculate changes in constituent strain with the get_constituent_strain_change function. Call this function if the occupations last passed to get_constituent_strain_change should be the starting point for the next call of get_constituent_strain_change. The Monte Carlo simulations in mchammer take care of this automatically.

Return type

None

get_concentration(occupations)[source]

Calculate current concentration.

Parameters
• occupations – Current occupations

Return type

float

get_constituent_strain(occupations)[source]

Calculate total constituent strain.

Parameters
• occupations – Current occupations

Return type

float

get_constituent_strain_change(occupations, atom_index)[source]

Calculate change in constituent strain upon change of the occupation of one site.

Warning

This function depends on the internal state of the ConstituentStrain object and should typically only be used internally by mchammer. Specifically, the structure factor is saved internally to speed up computation. The first time this function is called, occupations must be the same array as was used to initialize the ConstituentStrain object, or the same as was last used when get_constituent_strain was called. After the present function has been called, the same occupations vector needs to be used the next time as well, unless accept_change has been called, in which case occupations should incorporate the changes implied by the previous call to the function.

Parameters
• occupations (ndarray) – Occupations before change

• atom_index (int) – Index of site the occupation of which is to be changed

Return type

float
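The calling protocol described in the warning can be illustrated with a toy stand-in. The class below is a hypothetical one-dimensional sketch, not icet's implementation: it caches a single structure factor, computes a trial value in `get_change`, and only commits it in `accept_change`. All names (`ToyStructureFactor`, `full`, `get_change`, `value_after`) are assumptions made for illustration.

```python
import cmath
from typing import List


class ToyStructureFactor:
    """Hypothetical 1D stand-in that caches one structure factor."""

    def __init__(self, positions: List[float], occupations: List[int],
                 k: float):
        self.positions = positions
        self.k = k
        self.value = self.full(occupations)  # committed state
        self.value_after = self.value        # trial state

    def full(self, occupations: List[int]) -> complex:
        """Recompute the structure factor from scratch (slow path)."""
        n = len(self.positions)
        return sum(occ * cmath.exp(-1j * self.k * x)
                   for occ, x in zip(occupations, self.positions)) / n

    def get_change(self, occupations: List[int], atom_index: int) -> float:
        """Change of |S|^2 if site atom_index flips between 0 and 1.

        `occupations` must match the committed state (before the flip).
        """
        delta = 1 - 2 * occupations[atom_index]
        phase = cmath.exp(-1j * self.k * self.positions[atom_index])
        self.value_after = self.value + delta * phase / len(self.positions)
        return abs(self.value_after) ** 2 - abs(self.value) ** 2

    def accept_change(self) -> None:
        """Commit the trial state so the next call starts from it."""
        self.value = self.value_after


positions = [0.0, 1.0, 2.0, 3.0]
occupations = [1, 0, 1, 0]
sf = ToyStructureFactor(positions, occupations, k=cmath.pi / 2)

change = sf.get_change(occupations, atom_index=1)  # trial flip of site 1
sf.accept_change()                                 # commit the trial state
occupations[1] = 1                                 # keep occupations in sync
```

If `accept_change` were skipped, the next call would again have to receive the original occupations, because the cached value would still describe the state before the flip.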

## Other structure tools¶

icet.tools.get_primitive_structure(structure, no_idealize=True, to_primitive=True, symprec=1e-05)[source]

Returns the primitive structure using spglib.

Parameters
• structure (Atoms) – input atomic structure

• no_idealize (bool) – if True, lengths and angles are not idealized

• to_primitive (bool) – convert to primitive structure

• symprec (float) – tolerance imposed when analyzing the symmetry using spglib

Return type

Atoms

icet.tools.get_wyckoff_sites(structure, map_occupations=None, symprec=1e-05)[source]

Returns the Wyckoff symbols of the input structure. The Wyckoff sites are of general interest for symmetry analysis but can be especially useful when setting up, e.g., a SiteOccupancyObserver. The Wyckoff labels can be conveniently attached as an array to the structure object as demonstrated in the examples section below.

By default the occupation of the sites is part of the symmetry analysis. If a chemically disordered structure is provided this will usually reduce the symmetry substantially. If one is interested in the symmetry of the underlying structure one can control how occupations are handled. To this end, one can provide the map_occupations keyword argument. The latter must be a list, each entry of which is a list of species that should be treated as indistinguishable. As a shortcut, if all species should be treated as indistinguishable one can provide an empty list. Examples that illustrate the usage of the keyword are given below.
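The grouping rule for map_occupations can be sketched in plain Python. The helper below is hypothetical and only illustrates the idea of relabeling species before a symmetry analysis; it is not icet's implementation. The placeholder label `'X'` and the choice to leave species that appear in no group unchanged are assumptions of this sketch.

```python
from typing import List, Optional


def relabel_symbols(symbols: List[str],
                    map_occupations: Optional[List[List[str]]] = None
                    ) -> List[str]:
    """Hypothetical helper: replace each symbol by a group representative."""
    if map_occupations is None:
        return list(symbols)         # occupations are kept as they are
    if len(map_occupations) == 0:
        return ['X'] * len(symbols)  # all species indistinguishable
    representative = {}
    for group in map_occupations:
        for symbol in group:
            # The first entry of each group stands in for the whole group.
            representative[symbol] = group[0]
    # Assumption: species that appear in no group keep their own label.
    return [representative.get(symbol, symbol) for symbol in symbols]


symbols = ['Ga', 'As', 'Al', 'As']
print(relabel_symbols(symbols, [['Ga', 'Al'], ['As']]))
# ['Ga', 'As', 'Ga', 'As']
print(relabel_symbols(symbols, []))
# ['X', 'X', 'X', 'X']
```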

Parameters
• structure (Atoms) – input structure, note that the occupation of the sites is included in the symmetry analysis

• map_occupations (Optional[List[List[str]]]) – each sublist in this list specifies a group of chemical species that shall be treated as indistinguishable for the purpose of the symmetry analysis

• symprec (float) – tolerance imposed when analyzing the symmetry using spglib

Examples

Wyckoff sites of a hexagonal close-packed structure:

>>> from ase.build import bulk
>>> from icet.tools import get_wyckoff_sites
>>> structure = bulk('Ti')
>>> wyckoff_sites = get_wyckoff_sites(structure)
>>> print(wyckoff_sites)
['2d', '2d']


The Wyckoff labels can also be attached as an array to the structure, in which case the information is also included when storing the Atoms object:

>>> from ase.io import write
>>> structure.new_array('wyckoff_sites', wyckoff_sites, str)
>>> write('structure.xyz', structure)


The function can also be applied to supercells:

>>> structure = bulk('GaAs', crystalstructure='zincblende', a=3.0).repeat(2)
>>> wyckoff_sites = get_wyckoff_sites(structure)
>>> print(wyckoff_sites)
['4a', '4c', '4a', '4c', '4a', '4c', '4a', '4c',
'4a', '4c', '4a', '4c', '4a', '4c', '4a', '4c']


Now assume that one is given a supercell of a (Ga,Al)As alloy. Applying the function directly yields much lower symmetry since the symmetry of the original structure is broken:

>>> structure.set_chemical_symbols(
...        ['Ga', 'As', 'Al', 'As', 'Ga', 'As', 'Al', 'As',
...         'Ga', 'As', 'Ga', 'As', 'Al', 'As', 'Ga', 'As'])
>>> print(get_wyckoff_sites(structure))
['8g', '8i', '4e', '8i', '8g', '8i', '2c', '8i',
'2d', '8i', '8g', '8i', '4e', '8i', '8g', '8i']


Since Ga and Al occupy the same sublattice, they should, however, be treated as indistinguishable for the purpose of the symmetry analysis, which can be achieved via the map_occupations keyword:

>>> print(get_wyckoff_sites(structure, map_occupations=[['Ga', 'Al'], ['As']]))
['4a', '4c', '4a', '4c', '4a', '4c', '4a', '4c',
'4a', '4c', '4a', '4c', '4a', '4c', '4a', '4c']


If occupations are to be ignored entirely, one can simply provide an empty list. In the present case, this turns the zincblende lattice into a diamond lattice, in which case there is only one Wyckoff site:

>>> print(get_wyckoff_sites(structure, map_occupations=[]))
['8a', '8a', '8a', '8a', '8a', '8a', '8a', '8a',
'8a', '8a', '8a', '8a', '8a', '8a', '8a', '8a']

Return type

List[str]