Structure containers

class icet.StructureContainer(cluster_space)[source]

This class serves as a container for structure objects as well as their properties and cluster vectors.

Parameters:

cluster_space (ClusterSpace) – Cluster space used for evaluating the cluster vectors.

Example

The following snippet illustrates the initialization and usage of a StructureContainer object. A structure container provides convenient means for compiling the data needed to train a cluster expansion, i.e., a sensing matrix and target property values:

>>> from ase.build import bulk
>>> from icet import ClusterSpace, StructureContainer
>>> from icet.tools import enumerate_structures
>>> from random import random

>>> # create cluster space
>>> prim = bulk('Au')
>>> cs = ClusterSpace(prim, cutoffs=[7.0, 5.0],
...                   chemical_symbols=[['Au', 'Pd']])

>>> # build structure container
>>> sc = StructureContainer(cs)
>>> for structure in enumerate_structures(prim, range(5), ['Au', 'Pd']):
...     sc.add_structure(structure,
...                      properties={'my_random_energy': random()})
>>> print(sc)

>>> # fetch sensing matrix and target energies
>>> A, y = sc.get_fit_data(key='my_random_energy')

add_structure(structure, user_tag=None, properties=None, allow_duplicate=True, sanity_check=True)[source]

Adds a structure to the structure container.

Parameters:
  • structure (Atoms) – Atomic structure to be added.

  • user_tag (Optional[str]) – User tag for labeling structure.

  • properties (Optional[dict]) – Scalar properties. If no properties are specified, the structure object is checked for an attached ASE calculator object with a calculated potential energy.

  • allow_duplicate (bool) – Whether to add the structure if the container already holds a structure with an identical cluster vector.

  • sanity_check (bool) – Whether or not to carry out a sanity check before adding the structure. This includes checking occupations and volume.

property available_properties: List[str]

List of the available properties.

property cluster_space: ClusterSpace

Cluster space used to calculate the cluster vectors.

get_condition_number(structure_indices=None)[source]

Returns the condition number for the sensing matrix.

A very large condition number can be a sign of multicollinearity. More information can be found [here](https://en.wikipedia.org/wiki/Condition_number).

Parameters:

structure_indices (Optional[List[int]]) – List of indices of the structures to include. By default (None) all structures in the container are included.

Return type:

float

Returns:

Condition number of the sensing matrix.
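The link between multicollinearity and a large condition number can be illustrated without icet. The following is a minimal sketch, assuming the condition number is defined in the 2-norm, i.e., as the ratio of the largest to the smallest singular value of the sensing matrix. The helper `condition_number_two_columns` and the toy matrices are hypothetical and only handle matrices with two columns:

```python
import math


def condition_number_two_columns(rows):
    """2-norm condition number of an n x 2 matrix given as a list of (a, b) rows.

    Hypothetical helper for illustration; not part of icet.
    """
    # Entries of the symmetric 2 x 2 Gram matrix A^T A.
    g11 = sum(a * a for a, _ in rows)
    g22 = sum(b * b for _, b in rows)
    g12 = sum(a * b for a, b in rows)
    # Closed-form eigenvalues of a symmetric 2 x 2 matrix.
    mean = 0.5 * (g11 + g22)
    radius = math.hypot(0.5 * (g11 - g22), g12)
    lam_max, lam_min = mean + radius, mean - radius
    if lam_min <= 0.0:
        return math.inf  # columns are exactly linearly dependent
    # Singular values of A are the square roots of the eigenvalues of A^T A.
    return math.sqrt(lam_max / lam_min)


# Toy sensing matrices: one row per structure, one column per parameter.
independent = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
nearly_collinear = [(1.0, 1.0), (2.0, 2.001), (3.0, 3.0)]

cond_independent = condition_number_two_columns(independent)
cond_collinear = condition_number_two_columns(nearly_collinear)
print(f'independent columns:      {cond_independent:.3f}')
print(f'nearly collinear columns: {cond_collinear:.1f}')
```

The second matrix has two almost identical columns, so its condition number is several orders of magnitude larger than that of the first one.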

get_fit_data(structure_indices=None, key='energy')[source]

Returns fit data for the selected structures (all structures by default). The cluster vectors and target properties of these structures are stacked into numpy arrays.

Parameters:
  • structure_indices (Optional[List[int]]) – List of structure indices. By default (None) the method will return all fit data available.

  • key (str) – Name of the property to use. If None, property values are not included, which can be useful if only the fit matrix is needed.

Return type:

Tuple[ndarray, ndarray]

Returns:

Cluster vectors and target properties for desired structures.

get_structure_indices(user_tag=None)[source]

Returns indices of structures with the given user tag. This method provides a simple means for filtering structures. The user_tag is assigned when adding structures via the add_structure() method.

Parameters:

user_tag (Optional[str]) – The indices of structures with this user tag are returned.

Return type:

List[int]

Returns:

List of structure indices.

static read(infile)[source]

Reads StructureContainer object from file.

Parameters:

infile (Union[str, BinaryIO, TextIO]) – File from which to read.

to_dataframe()[source]

Returns a summary of the StructureContainer object as a DataFrame.

Return type:

DataFrame

write(outfile)[source]

Writes structure container to a file.

Parameters:

outfile (Union[str, BinaryIO, TextIO]) – Output file name or file object.

Return type:

None