Preparation of reference data

In this tutorial we will be using reference data for training and comparison. The present section provides a short description of the code to generate these data.

General preparations

Several ASE functions are required for generating a database with reference data. Specifically, ase.db.connect() and ase.build.bulk() are needed to initialize the database and the create the primitive structure, respectively, while ase.calculators.emt.EMT() and ase.optimize.BFGS() will be used relax and optimize structures. To construct the latter the icet function enumerate_structures will be employed.

from ase.build import bulk
from ase.db import connect
from ase.calculators.emt import EMT
from ase.optimize import BFGS
from icet.tools import enumerate_structures

Connect to database

The first step is to initialize the database, which will be called structures.db, as well as the primitive structure, in the form of an gold bulk unit cell. Additionally, it is decided that the enumerated structures, created in the next step, will be randomly populated with gold and silver atoms.

db = connect('structures.db')
prim = bulk('Au')
subelements = ['Ag', 'Au']

Enumeration and relaxation

The second step is to generate, relax and then add the structures to the database. This is achieved by looping over the ASE Atoms instances obtained by calling the enumerate_structures function with the primitive structure and the list of elements specified earlier as well as the list sizes, which specifies the permissible number of atoms per cell, as input arguments. Note that the original positions of all atoms are recorded. Next, a ase.calculators.emt.EMT() calculator is attached and the structure is relaxed until all forces are smaller than 0.01 eV/atom. Lastly, the relaxed structure is added to the database.

sizes = range(1, 7)
for k, atoms in enumerate(enumerate_structures(prim, sizes, subelements)):
    # skip if structure is already present in database
    if any(db.select(structure_id=k)):
        continue
    # remember the original (unrelaxed) positions
    original_positions = atoms.get_positions()
    # relax the structure
    atoms.calc = EMT()
    dyn = BFGS(atoms)
    dyn.run(fmax=0.01)
    # store energy
    relaxed_energy = atoms.get_potential_energy()
    # restore ideal positions
    atoms.set_positions(original_positions)
    # add the structure to the database
    db.write(atoms, structure_id=k, relaxed_energy=relaxed_energy)

When running this script, the ase.optimize.BFGS.run() function prints the time, energy and maximum forces after each step of the structure relaxation. In particular, the output should be a long list of entries, similar to the following:

...
      Step     Time          Energy         fmax
BFGS:    0 17:12:42       -0.024699        0.1726
BFGS:    1 17:12:42       -0.025870        0.1556
BFGS:    2 17:12:42       -0.031735        0.0157
BFGS:    3 17:12:42       -0.031748        0.0107
BFGS:    4 17:12:42       -0.031755        0.0073
      Step     Time          Energy         fmax
BFGS:    0 17:12:42       -0.020691        0.0691
BFGS:    1 17:12:42       -0.020822        0.0636
BFGS:    2 17:12:42       -0.021549        0.0004
      Step     Time          Energy         fmax
BFGS:    0 17:12:42       -0.023467        0.0000
...

Note that after relaxation the energy is stored and the positions are reset to the original ones. This is done in order to simplify the access to the original positions during construction of the cluster expansion.

Computation of mixing energy and concentration

Finally, for the sake of clarity and convenience when constructing the cluster expansion in the next step of this tutorial, the mixing energy and concentration are computed and added as fields to the database.

eref = {}
for elem in subelements:
    for row in db.select('{}=1'.format(elem), natoms=1):
        eref[elem] = row.relaxed_energy / row.natoms
        break

# step 4: compute mixing energies and add them to the database
for row in db.select():
    conc = float(row.count_atoms().get('Ag', 0)) / row.natoms
    emix = row.relaxed_energy / row.natoms
    emix -= conc * eref['Ag'] + (1.0 - conc) * eref['Au']
    db.update(row.id, emix=emix, conc=conc)

Source code

The complete source code is available in tutorial/basic/1_prepare_reference_data.py
from ase.build import bulk
from ase.db import connect
from ase.calculators.emt import EMT
from ase.optimize import BFGS
from icet.tools import enumerate_structures

# step 1: Prepare database and set up basic structure
db = connect('structures.db')
prim = bulk('Au')
subelements = ['Ag', 'Au']

# step 2: Enumerate structures, then relax and add them to the database
sizes = range(1, 7)
for k, atoms in enumerate(enumerate_structures(prim, sizes, subelements)):
    # skip if structure is already present in database
    if any(db.select(structure_id=k)):
        continue
    # remember the original (unrelaxed) positions
    original_positions = atoms.get_positions()
    # relax the structure
    atoms.calc = EMT()
    dyn = BFGS(atoms)
    dyn.run(fmax=0.01)
    # store energy
    relaxed_energy = atoms.get_potential_energy()
    # restore ideal positions
    atoms.set_positions(original_positions)
    # add the structure to the database
    db.write(atoms, structure_id=k, relaxed_energy=relaxed_energy)

# step 3: get reference energies for the elements
eref = {}
for elem in subelements:
    for row in db.select('{}=1'.format(elem), natoms=1):
        eref[elem] = row.relaxed_energy / row.natoms
        break

# step 4: compute mixing energies and add them to the database
for row in db.select():
    conc = float(row.count_atoms().get('Ag', 0)) / row.natoms
    emix = row.relaxed_energy / row.natoms
    emix -= conc * eref['Ag'] + (1.0 - conc) * eref['Au']
    db.update(row.id, emix=emix, conc=conc)