# Structure enumeration

To train a cluster expansion, one needs a set of symmetrically inequivalent structures and their corresponding energies (or another property of interest). It is sometimes possible to generate such structures by random occupation of a supercell. In many systems, however, a much better approach is to generate all symmetrically inequivalent occupations up to a certain supercell size. This process is usually referred to as structure enumeration. Enumeration is useful both for generating a set of small supercells without duplicates and for systematically searching for ground states once a cluster expansion has been fitted. This tutorial demonstrates the use of the structure enumeration tool in icet.
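
To make the notion of symmetrically inequivalent occupations concrete, the following toy sketch enumerates binary occupations of a one-dimensional ring of sites and keeps one representative per symmetry class. It uses only the Python standard library and is not meant to reflect how icet works internally. The function names `canonical_form` and `enumerate_ring` are hypothetical, and the symmetry group (cyclic shifts, optionally combined with a mirror) is an assumption chosen for illustration.

```python
from itertools import product


def canonical_form(occupation):
    """Return the lexicographically smallest symmetry image of an occupation.

    The symmetry operations considered are all cyclic shifts of the ring,
    each optionally combined with a mirror (reversal).
    """
    n = len(occupation)
    images = []
    for candidate in (occupation, occupation[::-1]):
        for shift in range(n):
            images.append(candidate[shift:] + candidate[:shift])
    return min(images)


def enumerate_ring(n_sites, species=('Au', 'Pd')):
    """Return one representative per class of equivalent ring occupations."""
    representatives = set()
    for occupation in product(species, repeat=n_sites):
        representatives.add(canonical_form(occupation))
    return sorted(representatives)


for n_sites in range(1, 7):
    n_raw = 2 ** n_sites
    n_unique = len(enumerate_ring(n_sites))
    print(f'{n_sites} sites: {n_raw} raw occupations, {n_unique} inequivalent')
```

Even in this small example the number of inequivalent occupations is much smaller than the number of raw occupations. A real enumeration additionally has to consider different supercell shapes and the full symmetry of the lattice.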

## Import modules

The enumerate_structures function needs to be imported together with some additional functions from ASE.

from ase import Atom
from ase.build import bulk, fcc111
from ase.db import connect
from icet.tools import enumerate_structures



## Generate binary structures

Before being able to perform the structural enumeration, it is first necessary to generate a primitive structure. In this case, an Au fcc ase.Atoms object is created using the bulk() function. Then a database AuPd-fcc.db is initialized, in which the enumerated structures will be stored. All possible binary Au/Pd structures with up to 6 atoms per unit cell are subsequently generated and stored in this database.

primitive_structure = bulk('Au')
db = connect('AuPd-fcc.db')
for structure in enumerate_structures(primitive_structure,
                                      range(1, 7),
                                      ['Pd', 'Au']):
    db.write(structure)



## Generate binary structures in the dilute limit

The number of distinct structures grows extremely quickly with the size of the supercell, so enumeration is only feasible up to a limited cell size. As the number of structures grows, an increasingly large proportion of them will have similar amounts of the constituent elements (e.g., most structures will have concentrations close to 50% in binary systems). Structures in the dilute limit may thus be underrepresented. To overcome this problem, it is possible to enumerate structures in a specified concentration regime by providing a dict in which the range of allowed concentrations is specified for one or more of the elements in the system. Concentration is here always defined as the number of atoms of the specified element divided by the total number of atoms in the structure, irrespective of site restrictions. Please note that for very large systems, concentration-restricted enumeration may still be prohibitively time- or memory-consuming even if the number of structures in the specified concentration regime is small.

conc_rest = {'Au': (0, 0.1)}
for structure in enumerate_structures(primitive_structure,
                                      range(10, 14),
                                      ['Pd', 'Au'],
                                      concentration_restrictions=conc_rest):
    db.write(structure)
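
The statement that dilute compositions become underrepresented can be checked with a simple count. The sketch below considers raw (not symmetry-reduced) binary occupations of a cell with a given number of sites and computes the fraction that falls inside a concentration window, using the definition given above (atoms of one element divided by the total number of atoms). The helper name `fraction_in_window` is hypothetical, treating the window as a closed interval is an assumption, and symmetry is ignored, so the numbers indicate a trend rather than the output of enumerate_structures.

```python
from fractions import Fraction
from math import comb


def fraction_in_window(n_sites, c_min, c_max):
    """Fraction of the 2**n_sites raw binary occupations whose concentration
    of the selected element lies in the closed interval [c_min, c_max]."""
    allowed = 0
    for n_selected in range(n_sites + 1):
        # exact rational arithmetic avoids floating-point edge cases
        concentration = Fraction(n_selected, n_sites)
        if c_min <= concentration <= c_max:
            allowed += comb(n_sites, n_selected)
    return Fraction(allowed, 2 ** n_sites)


window = (Fraction(0), Fraction(1, 10))
for n_sites in range(10, 14):
    share = fraction_in_window(n_sites, *window)
    print(f'{n_sites} sites: {float(share):.4%} of raw occupations in window')
```

Only a small share of all occupations lies in the dilute window, which is why restricting the concentration range up front is preferable to enumerating everything and filtering afterwards.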



## Generate structures with vacancies

The steps above are now repeated to enumerate all palladium hydride structures based on up to four primitive cells, i.e., structures that contain up to 4 Pd atoms and between 0 and 4 H atoms. Vacancies on the hydrogen sites are represented by vanadium (V), which formally turns the problem into a ternary system. The structures thus obtained are stored in a database named PdHVac-fcc.db.

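a = 4.0
# The element used here only defines the fcc lattice; the species that may
# occupy each site are given by the species list below
primitive_structure = bulk('Au', a=a)
# Add a second site at the center of the conventional cell for H or a vacancy
primitive_structure.append(Atom('H', (a / 2, a / 2, a / 2)))
# First site: always Pd; second site: H or V (vanadium stands in for a vacancy)
species = [['Pd'], ['H', 'V']]
db = connect('PdHVac-fcc.db')
for structure in enumerate_structures(primitive_structure, range(1, 5), species):
    db.write(structure)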
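
The nested species list assigns a set of allowed species to each site of the primitive structure, in order. As a rough illustration of what this implies for a supercell, the sketch below lists the raw occupations of a cell built from a given number of primitive cells, before any symmetry reduction. The helper `raw_occupations` is hypothetical and uses only the standard library. It ignores supercell shape and symmetry, both of which a real enumeration has to handle.

```python
from collections import Counter
from itertools import product

# Allowed species per primitive-cell site, as in the example above:
# the first site is always Pd, the second is H or a vacancy stand-in (V).
species = [['Pd'], ['H', 'V']]


def raw_occupations(species, n_cells):
    """Yield every assignment of allowed species to the sites of a cell
    consisting of n_cells copies of the primitive cell."""
    allowed_per_site = species * n_cells  # repeat the per-site lists
    yield from product(*allowed_per_site)


for n_cells in range(1, 5):
    h_counts = Counter(occ.count('H')
                       for occ in raw_occupations(species, n_cells))
    print(f'{n_cells} cell(s): raw occupations per number of H atoms:',
          dict(sorted(h_counts.items())))
```

For four primitive cells the H count ranges from 0 to 4 while the Pd count is fixed at 4, consistent with the description above.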



## Generate surface slabs with adsorbates

Lower-dimensional systems can also be enumerated. Here, this is demonstrated with a copper surface with oxygen atoms adsorbed in hollow sites on a {111} surface. The key to triggering a two- or one-dimensional enumeration is to make sure that the periodic boundary conditions of the input structure reflect the desired behavior. For the surface system, this means that the boundary conditions are not periodic in the direction normal to the surface. This is the default behavior of ASE's surface building functions, but it is enforced explicitly in the example below for clarity.

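from ase.build import add_adsorbate

primitive_structure = fcc111('Cu', (1, 1, 5), vacuum=10.0)
primitive_structure.pbc = [True, True, False]
# Add adsorbate sites in the fcc and hcp hollow positions; the height of
# 1.2 Å is an illustrative assumption, not a fitted value
add_adsorbate(primitive_structure, 'O', 1.2, 'fcc')
add_adsorbate(primitive_structure, 'O', 1.2, 'hcp')
species = []
for atom in primitive_structure:
    if atom.symbol == 'Cu':
        species.append(['Cu'])
    else:
        species.append(['O', 'H'])
for structure in enumerate_structures(primitive_structure, range(1, 5), species):
    db.write(structure)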
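
To see why the periodic boundary conditions matter, it helps to count candidate supercells. A common way to enumerate supercells is via integer matrices in Hermite normal form (HNF). Whether icet proceeds exactly like this internally is not stated here, so the following should be read as an illustrative sketch with a hypothetical helper `count_hnf`. It counts HNF matrices with a given determinant (the number of primitive cells) in three and two dimensions, before any reduction by symmetry.

```python
def count_hnf(n_cells, dimensions):
    """Count lower-triangular HNF matrices with determinant n_cells.

    A diagonal entry d in row i (0-based) allows d choices for each of the
    i off-diagonal entries in that row, hence a factor d**i per row.
    """
    def recurse(remaining, row):
        if row == dimensions - 1:
            # the last diagonal entry is fixed by the determinant
            return remaining ** row
        total = 0
        for diagonal in range(1, remaining + 1):
            if remaining % diagonal == 0:
                total += diagonal ** row * recurse(remaining // diagonal,
                                                   row + 1)
        return total

    return recurse(n_cells, 0)


for n_cells in range(1, 5):
    print(f'{n_cells} cell(s): {count_hnf(n_cells, 3)} supercells in 3D, '
          f'{count_hnf(n_cells, 2)} in 2D')
```

Restricting the repetition to the two in-plane directions noticeably reduces the number of supercells that need to be decorated, in addition to keeping the slab geometry intact.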


## Source code

The complete source code is available in examples/enumerate_structures.py
"""
This example demonstrates how to enumerate structures, i.e. how to
generate all inequivalent structures derived from a primitive
structure up to a certain size.
"""

# Import modules
from ase import Atom
from ase.build import bulk, fcc111
from ase.db import connect
from icet.tools import enumerate_structures

# Generate all binary fcc structures with up to 6 atoms/cell
# and save them in a database
primitive_structure = bulk('Au')
db = connect('AuPd-fcc.db')
for structure in enumerate_structures(primitive_structure,
                                      range(1, 7),
                                      ['Pd', 'Au']):
    db.write(structure)

# Generate fcc structures in the dilute limit
conc_rest = {'Au': (0, 0.1)}
for structure in enumerate_structures(primitive_structure,
                                      range(10, 14),
                                      ['Pd', 'Au'],
                                      concentration_restrictions=conc_rest):
    db.write(structure)

# Enumerate all palladium hydride structures with up to 4 primitive
# cells (= up to 4 Pd atoms and between 0 and 4 H atoms). We want to
# specify that one site should always be Pd while the other can be
# either a hydrogen or a vacancy (vanadium will serve as our "vacancy")
a = 4.0
primitive_structure = bulk('Au', a=a)
primitive_structure.append(Atom('H', (a / 2, a / 2, a / 2)))
species = [['Pd'], ['H', 'V']]
db = connect('PdHVac-fcc.db')
for structure in enumerate_structures(primitive_structure, range(1, 5), species):
    db.write(structure)

# Enumerate a copper surface with oxygen adsorbates (or vacancies) in
# fcc and hcp hollow sites.
primitive_structure = fcc111('Cu', (1, 1, 5), vacuum=10.0)
primitive_structure.pbc = [True, True, False]