Wang-Landau simulations#

This example illustrates how to carry out Wang-Landau (WL) simulations with icet. Compared to Monte Carlo (MC) simulations in conventional thermodynamic ensembles there are a few differences to be aware of. It is therefore strongly recommended to work through this entire example before running your own WL simulations.

Background#

When sampling a thermodynamic ensemble, say the canonical or semi-grand canonical ones, one conducts MC simulations at a given temperature, commonly in order to gather thermodynamic averages of some observables. In order to obtain the temperature dependence of these quantities one has to conduct a series of simulations at different temperatures. In particular in the vicinity of continuous phase transitions these simulations can become tedious or close to impossible to converge.

Here, the WL algorithm provides an alternative approach. It allows one to extract the microcanonical density of states, from which many other thermodynamic quantities can be calculated [WanLan01a] [LanTsaExl04]. Usually the density of states (DOS) is acquired as a function of the energy. To this end, the WL algorithm accumulates both the microcanonical entropy \(S(E)\) and a histogram \(H(E)\) on an energy grid with a predefined spacing (see argument energy_spacing of WangLandauEnsemble).

The algorithm is initialized as follows.

Generate an initial configuration.
Initialize counters for the microcanonical entropy \(S(E)\) and the histogram \(H(E)\) to zero.
Set the fill factor \(f=1\).

It then proceeds as follows.

Propose a new configuration (see argument trial_move of WangLandauEnsemble).
Accept or reject the new configuration with probability

\[P = \min \{ 1, \, \exp [ S(E_\mathrm{new}) - S(E_\mathrm{cur}) ] \},\]

where \(E_\mathrm{cur}\) and \(E_\mathrm{new}\) are the energies of the current and new configurations, respectively.
Update the microcanonical entropy \(S(E)\leftarrow S(E) + f\) and histogram \(H(E) \leftarrow H(E) + 1\) where \(E\) is the energy of the system at the end of the move.
Check the flatness of the histogram \(H(E)\). If \(H(E) > \chi \langle H(E)\rangle\,\forall E\) reset the histogram \(H(E) = 0\) and reduce the fill factor \(f \leftarrow f / 2\). The parameter \(\chi\) is set via the argument flatness_limit of WangLandauEnsemble.
If \(f\) is smaller than a certain threshold (commonly between \(10^{-8}\) and \(10^{-6}\), see argument fill_factor_limit of WangLandauEnsemble), terminate the loop, otherwise return to 1.

The microcanonical entropy \(S(E)\) and the histogram along with related information are written to the data container every time \(f\) is updated. Using the density \(\rho(E) = \exp S(E)\) one can then readily compute various thermodynamic quantities, including, e.g., the average energy:

\[\left<E\right>_T = \frac{\sum_E E \rho(E) \exp(-E / k_B T)}{ \sum_E \rho(E) \exp(-E / k_B T)}\]

Similarly, it is also possible to compute averages and standard deviations of any other observable in the data container.

2D Ising model#

The two-dimensional Ising model is well suited for demonstrating the utility of the WL algorithm and has been extensively studied in the literature [WanLan01a] [WanLan01b] [LanTsaExl04]. The model exhibits a continuous phase transition, which occurs at \(T_c = 2 J / k_B \ln (1 + \sqrt{2}) \approx 2.26919 J / k_B\) in the infinite system-size limit. In this example, we use the 2D Ising model as a computationally inexpensive toy model.

The following code generates a cluster expansion that represents the 2D Ising model. Here, Au and Ag are used as dummy species. Internally they are represented by \(0\) and \(1\), while in the Ising model the species are represented by spins of \(+1\) and \(-1\). The effective cluster interaction for the first-nearest neighbor pair is therefore set to \(J=2\) (as opposed to \(J=1\)). By means of this transformation the energy/temperature scales remains invariant.

prim = Atoms('Au', positions=[[0, 0, 0]], cell=[1, 1, 10], pbc=True)
cs = ClusterSpace(prim, cutoffs=[1.01], chemical_symbols=['Ag', 'Au'])
ce = ClusterExpansion(cs, [0, 0, 2])

Running a WL simulation#

For computational convenience, we consider a very small system of only \(4\times4 = 16\) sites.

structure = prim.repeat((4, 4, 1))
for k in range(8):
    structure[k].symbol = 'Ag'

A WL simulation is set up similar to a thermodynamic ensemble.

calculator = ClusterExpansionCalculator(structure, ce)
mc = WangLandauEnsemble(structure=structure,
                        calculator=calculator,
                        energy_spacing=1,
                        dc_filename='wl_n16.dc',
                        ensemble_data_write_interval=len(structure)*10,
                        trajectory_write_interval=len(structure)*100,
                        data_container_write_period=120)

In the case of this very simple model, the energy spectrum is discrete and the choice of the energy spacing is straightforward (energy_spacing=1). While for arbitrary CEs the energy spectrum is technically discrete it is practically continuous and the choice of the energy spacing requires more care, as illustrated below.

We also note that the intervals at which data is written to the data container are set to 10 MCS (ensemble_data_write_interval) and 100 MCS (trajectory_write_interval. Often WL simulations need to be run for a considerable number of steps. It is not uncommon that one requires on the order of 10 to 100 million MCS (see below). As a result, data containers can become quite large, which can lead to a notable memory footprint and slow down analysis. It is therefore recommend to choose the write_interval parameters not too small.

We also attach a short-range order observer in order to illustrate the analysis of observables other than the energy:

obs = BinaryShortRangeOrderObserver(cluster_space=cs,
                                    structure=structure,
                                    interval=len(structure)*10,
                                    radius=1.01)
mc.attach_observer(obs)

And then run the WL simulation for a number of MCS:

mc.run(number_of_trial_steps=len(structure)*int(2e5))

Analyzing a WL simulation#

The data container file contains all the information that is needed for extracting thermodynamic data from a WL simulation. In addition to the regular fields found in most data containers (e.g., mctrial, potential, or acceptance_ratio), a WL data container contains the fields fill_factor, which is the current value of \(f\), histogram, and entropy. These data are not written at every step as they consume quite a lot of space. Rather they are only added to the data container when the fill factor \(f\) is updated or the maximum number of MC trial steps has been reached. While the entropy data and fill_factor values are recorded during the entire simulation, the histogram data is only stored for the last state.

icet provides support functions to make the simplify analysis of these data. One can for example extract the (relative) entropy as well as the DOS. In particular, it is possible to check the convergence of the properties of interest by specifying a fill_factor_limit to the analysis function, in this case get_density_of_states_wl. By doing so, the results obained corresponds to data collected up to the point when the fill_factor reached below the specified limit.

fill_factor_limits = [10 ** -i for i in range(1, 6)]
for ffl in fill_factor_limits:
    df, _ = get_density_of_states_wl(dc, fill_factor_limit=ffl)

For the present, very small system, the number of energy bins is small and the DOS is relatively rough.

../_images/wang_landau_density.svg — Density of states from Wang-Landau simulation of \(4\times4\) two-dimensional Ising model.#

Availability of the DOS enables one for example to compute the heat capacity as a function of temperature using the following relations:

\[\begin{split}\left<E^n\right> = \frac{\sum_E E^n \rho(E) \exp(-E / k_B T)}{ \sum_E \rho(E) \exp(-E / k_B T)} \\ C_v = \left( \left<E^2\right> - \left<E\right>^2 \right) \Big/ k_B T^2\end{split}\]

The mean and the standard deviation of the energy as well as thermodynamic averages of other observables can be extracted using get_average_observables_wl:

for f, ffl in enumerate(fill_factor_limits):
    df = get_average_observables_wl(dc,
                                    temperatures=np.arange(0.4, 6, 0.05),
                                    observables=['sro_Ag_1'],
                                    boltzmann_constant=1,
                                    fill_factor_limit=ffl)

The heat capacity obtained using the WL approach matches those from a series of MC simulations in the canonical ensemble (not described here). The availability of the full DOS enables one, however, to extract the heat capacity as a practically continuous function of temperature.

../_images/wang_landau_heat_capacity_sro.svg — Heat capacity (top) and short-range order parameter (bottom) from Wang-Landau simulation of a \(4\times4\) two-dimensional Ising model.#

Using the same approach one can also extract thermodynamic averages of other observables. This is illustrated here by the first-nearest neighbor short-range order parameter. Notably the standard deviation of the latter correlates with the heat capacity, corresponding to the increase (and for an infinite system, divergence) of the correlation length at the transition temperature. Since both the density of states, the heat capacity, and the short-ranged order show small variations with respect to the fill factor limit, the calculations can be considered already rather well converged. If this had not been the case, one could have continued the simulations by rerunning the script, after having reduced the value on the fill_factor_limit.

Representative configurations#

In addition to common thermodynamic observables one can also be interested in the evolution of the structure itself as a function of temperature. To this end, icet provides the get_average_cluster_vectors_wl function, which allows one to obtain the average cluster vectors at different temperatures:

df = get_average_cluster_vectors_wl(dc,
                                    cluster_space=cs,
                                    temperatures=arange(0.1, 6.01, 0.1),
                                    boltzmann_constant=1)

The result shows a systematic variation of the pair term of the cluster vector with temperature.

../_images/wang_landau_cluster_vector.svg — Pair term of the cluster vector of the average structure as a function of temperature.#

Using the generate_target_structure function, one can then furthermore translate the cluster vector into a structure:

temperature = 0.1
target_cluster_vector = list(df[df.temperature == temperature].cv_mean[0])
target_concentrations = {'Ag': 0.5, 'Au': 0.5}
structure = generate_target_structure(cluster_space=cs,
                                      max_size=4,
                                      target_cluster_vector=target_cluster_vector,
                                      target_concentrations=target_concentrations)

The generation of structures from cluster vectors is described in much more detail in this example.

Size-dependence of transition temperature#

For the system size considered above one obtains a critical temperature of \(T_c(N=16) = 2.742 J / k_B\). This is still quite far off from the value of \(T_c(N\rightarrow\infty) = 2.26919 J / k_B\) that one obtains analytically for the infinite-size limit. Using the WL algorithm this system size dependence can be explored rather efficiently.

../_images/wang_landau_heat_capacity_size.svg — Heat capacity from Wang-Landau simulations of the two-dimensional Ising model for different system sizes.#

With increasing system size the peak in the heat capacity both sharpens and shifts to the left, approaching the infinite-size limit in the expected fashion.

Parallelizing Wang-Landau simulations#

The larger the system, the wider the energy range, the larger the number of bins. Moreover, in larger systems the ratio between maximum and minimum of the DOS increases as well. It therefore becomes increasingly tedious to converge a WL simulation. To combat this challenge one can split up the energy range in multiple segments and sample each segment in a separate WL simulation. After the algorithm has reached convergence for each segment, the DOS can be patched together. This approach has at least two crucial advantages:

The extraction of the DOS can be parallelized.
The number of steps required for converging one segment is smaller than the number of steps required to converge the DOS without binning.

icet allows one to run binned simulations by setting the energy_limit_left and energy_limit_right keyword arguments of the WangLandauEnsemble class. For very small systems such as the one considered above, binning the energy range is actually ineffective. The number of energy levels is simply too small. To demonstrate the benefit of binning, here, we therefore show results from a set of simulations for a system with \(8\times8=64\) sites, which is set up in the same way as before.

prim = Atoms('Au', positions=[[0, 0, 0]], cell=[1, 1, 10], pbc=True)
cs = ClusterSpace(prim, cutoffs=[1.01], chemical_symbols=['Ag', 'Au'])
ce = ClusterExpansion(cs, [0, 0, 2])

# Prepare initial configuration
structure = prim.repeat((8, 8, 1))
for k in range(len(structure) // 2):
    structure[k].symbol = 'Ag'

# Initalize calculator
calculator = ClusterExpansionCalculator(structure, ce)

As is detailed in a separate example, Monte Carlo simulations can be run in parallel using the multiprocessing package. The same approach can be applied in the present case, i.e. to run multiple WL simulations for separate energy bins, provided that the run script presented in the previous sections has been appropriately modified. Firstly, one needs to define a wrapper function, which sets up and runs a single simulation using the WangLandauEnsemble class, given a dictionary containing all the required arguments:

def run_simulation(args: dict) -> None:
    mc = WangLandauEnsemble(structure=args['structure'],
                            calculator=calculator,
                            energy_spacing=args['energy_spacing'],
                            energy_limit_left=args['energy_limit_left'],
                            energy_limit_right=args['energy_limit_right'],
                            fill_factor_limit=args['fill_factor_limit'],
                            flatness_limit=args['flatness_limit'],
                            dc_filename=args['dc_filename'],
                            ensemble_data_write_interval=args['ensemble_data_write_interval'],
                            trajectory_write_interval=args['trajectory_write_interval'],
                            data_container_write_period=args['data_container_write_period'])
    mc.run(number_of_trial_steps=args['number_of_trial_steps'])

In order to be able to define the input parameters for the individual simulations, one should begin by generating sets of overlapping energy bins using, e.g., the get_bins_for_parallel_simulations function. The output from the latter, in the form of a list of tuples containing the lower and upper energy limits for each of the bins, are then collected, together with all common arguments, in a list of dictionaries:

energy_spacing = 1
bins = get_bins_for_parallel_simulations(n_bins=6, energy_spacing=energy_spacing,
                                         minimum_energy=-128, maximum_energy=96)
args = []
for k, (energy_limit_left, energy_limit_right) in enumerate(bins):
    args.append({'structure': structure,
                 'energy_spacing': energy_spacing,
                 'energy_limit_left': energy_limit_left,
                 'energy_limit_right': energy_limit_right,
                 'fill_factor_limit': 1e-4,
                 'flatness_limit': 0.7,
                 'dc_filename': 'wl_k{}.dc'.format(k),
                 'ensemble_data_write_interval': len(structure)*100,
                 'trajectory_write_interval': len(structure)*1000,
                 'data_container_write_period': 120,
                 'number_of_trial_steps': len(structure)*int(2e9)})

The final step is to initiate a multiprocessing Pool object and, at the same time, decide the number of processes that will be run simultaneously. Although the latter could, potentially, be set equal to the number of available cores, one should also take the memory requirements into account. Next, the simulations are started by mapping the list of input arguments to the wrapper function:

pool = Pool(processes=4)
results = pool.map_async(run_simulation, args)
results.get()

Note

It is strongly recommended to use the functions for asynchronus mapping, specifically Pool.map_async and Pool.starmap_async. The reason for this is that these, in contrast to Pool.map and Pool.starmap, do not block the main process, which can cause some of the child processes to hang when running Monte Carlo simulations using functionalities imported from the mchammer module.

The functions that were previosuly used for analyzing the output from the serial simulations can be applied in this case as well. More precisely, this is achieved by providing a dictionary with the WangLandauDataContainer objects obtained from the separate runs to, e.g., get_density_of_states_wl, for the purpose of collecting the DOS:

dcs = {}
for i in range(6):
    dc = WangLandauDataContainer.read('wl_k{}.dc'.format(i))
    dcs[i] = dc

df, _ = get_density_of_states_wl(dcs)

An analysis of the resulting diagrams reveals that one recovers the correct density of states, as well as the heat capacity, regardless of the number of bins. In particular the distinctive features in the low-energy range, which are visible in the middle panel and correspond to the occurrence of ordered structures, are reproduced for all bin sizes.

../_images/wang_landau_binned_density.svg — Density of states from binned Wang-Landau simulations for a \(8\times8\) system using a varying number of bins \(n_b\).#

The number of Monte Carlo sweeps (MCS) required to reach convergence, here \(f<10^{-6}\), in a certain energy range, scales approximately inversely with the DOS in the respective bin. The low and high-energy regions therefore require a much larger number of MCSs than the central regions (see figure below). Moreover, the smaller the energy range (and hence the smaller the number of distinct energies) the lower the number of MCSs required to achieve convergence.

../_images/wang_landau_binned_nsteps_vs_energy.svg

../_images/wang_landau_binned_nsteps_vs_nbins.svg — Number of MCSs required to achieve convergence (\(f<10^{-6}\)) as a function of the energy from simulations using a varying number of bins \(n_b\) (top) as well as with and without a cutoff (bottom) for a \(8\times8\) system.#

This demonstrates how binning can be beneficial not only since it allows for parallelization but since a limited energy range can be sampled more quickly, leading to faster convergence.

A further reduction of computational cost can be achieved by realizing that one often does not require the DOS to be converged in the entire energy range. It is usually the low energy range that dominates under most thermodynamically relevant conditions. Under such circumstances, it is thus sufficient to only include bins up to a certain cutoff energy \(E_c\). In practice, this is achieved by only including data containers that correspond to bins for which the upper energy limit is smaller than \(E_c\). One can, for instance, use the get_average_observables_wl, to calculate the heat capacity for a range of different cutoffs:

cutoffs = [-80, -40, 0, 40]
for cutoff in cutoffs:
    dcs_cutoff = {}
    for key, dc in dcs.items():
        elr = dc.ensemble_parameters['energy_limit_right']
        if not np.isnan(elr) and elr <= cutoff:
            dcs_cutoff[key] = dc
        else:
            break
    if len(dcs_cutoff) < 1:
        continue
    df = get_average_observables_wl(dcs_cutoff,
                                    temperatures=np.arange(0.4, 6, 0.05),
                                    boltzmann_constant=1)

This analysis reveals (see diagram below) that a very good agreement is achieved up to a temperature of approximately 6, compared with the data obtained using the full energy range, even if the integration over the DOS is cut off beyond \(E_c=0\).

../_images/wang_landau_binned_heat_capacity_ecut.svg — Heat capacity obtained by integrating over the DOS up to a certain cutoff energy \(E_c\) for a \(8\times8\) system.#

This translates to a further reduction in the number of MCSs required for convergence (orange symbols in the figure presented earlier). The data above suggest that further optimization could be possible by using an inhomogeneous distribution of bin sizes with smaller/larger bin sizes at the boundaries/center of the energy range. An option that is not further explored here.

Further tips#

It is a good idea to write the data container to file in regular intervals. This can be achieved by setting the data_container_write_period keyword argument of WangLandauEnsemble.

Simulations can be easily restarted by rerunning the original run script provided a data container file has been written. In that case, the last state of the simulation will be restored from the data container.

At the beginning of the simulation, the potential of the initial configuration is not guaranteed to fall within the specified energy range. In that case, a WL simulation over the entire energy range is carried out until the targeted energy range has been reached. In order to shorten this initial “equilibration” period, one can start the simulation from suitable configurations. Since it easier to find high-density configurations, a suitable approach is to start from known ground state (or very low energy) configurations to sample the lower-energy region.

Source code#

from ase import Atoms
from icet import ClusterExpansion, ClusterSpace
from mchammer.calculators import ClusterExpansionCalculator
from mchammer.ensembles import WangLandauEnsemble
from mchammer.observers import BinaryShortRangeOrderObserver

# Prepare cluster expansion
prim = Atoms('Au', positions=[[0, 0, 0]], cell=[1, 1, 10], pbc=True)
cs = ClusterSpace(prim, cutoffs=[1.01], chemical_symbols=['Ag', 'Au'])
ce = ClusterExpansion(cs, [0, 0, 2])

# Prepare initial configuration
structure = prim.repeat((4, 4, 1))
for k in range(8):
    structure[k].symbol = 'Ag'

# Set up WL simulation
calculator = ClusterExpansionCalculator(structure, ce)
mc = WangLandauEnsemble(structure=structure,
                        calculator=calculator,
                        energy_spacing=1,
                        dc_filename='wl_n16.dc',
                        ensemble_data_write_interval=len(structure)*10,
                        trajectory_write_interval=len(structure)*100,
                        data_container_write_period=120)

# Add short-range order observer
obs = BinaryShortRangeOrderObserver(cluster_space=cs,
                                    structure=structure,
                                    interval=len(structure)*10,
                                    radius=1.01)
mc.attach_observer(obs)

# Run WL simulation
mc.run(number_of_trial_steps=len(structure)*int(2e5))

import matplotlib.pyplot as plt
import numpy as np
from mchammer import WangLandauDataContainer
from mchammer.data_containers import (get_average_observables_wl,
                                      get_density_of_states_wl)

# Read data container
dc = WangLandauDataContainer.read('wl_n16.dc')

# Set up DOS plot
_, ax = plt.subplots()

# Extract and plot the DOS
fill_factor_limits = [10 ** -i for i in range(1, 6)]
for ffl in fill_factor_limits:
    df, _ = get_density_of_states_wl(dc, fill_factor_limit=ffl)

    # Plot DOS
    ax.semilogy(df.energy, df.density, marker='o',
                label=r'$f\leq10^{{{:0.0f}}}$'.format(np.log10(ffl)))

# Add labels and legends
ax.set_xlabel('Energy')
ax.set_ylabel('Density of states')
ax.legend()
plt.savefig('wang_landau_density.svg', bbox_inches='tight')

# Set up plot of heat capacity and short-range order parameter
fig, axes = plt.subplots(nrows=2, sharex=True)
n_atoms = dc.ensemble_parameters['n_atoms']

# Compute thermodynamic averages and plot the results
for f, ffl in enumerate(fill_factor_limits):
    df = get_average_observables_wl(dc,
                                    temperatures=np.arange(0.4, 6, 0.05),
                                    observables=['sro_Ag_1'],
                                    boltzmann_constant=1,
                                    fill_factor_limit=ffl)

    # Plot heat capacity
    line = axes[0].plot(df.temperature,
                        df.potential_std ** 2 / df.temperature ** 2 / n_atoms,
                        label=r'$f\leq10^{{{:0.0f}}}$'.format(np.log10(ffl)))

    # Plot short-range order parameters
    color = line[0].get_color()
    if f == 0:
        axes[1].plot(df.temperature, df.sro_Ag_1_mean, color=color,
                     linestyle='-', label='mean')
        axes[1].plot(df.temperature, df.sro_Ag_1_std, color=color,
                     linestyle='--', label='stddev')
    else:
        axes[1].plot(df.temperature, df.sro_Ag_1_mean, color=color,
                     linestyle='-')
        axes[1].plot(df.temperature, df.sro_Ag_1_std, color=color,
                     linestyle='--')

# Add labels and legends
axes[0].set_xlabel('Temperature')
axes[0].set_ylabel('Heat capacity')
axes[1].set_xlabel('Temperature')
axes[1].set_ylabel('Short-range order parameter')
axes[0].legend()
axes[1].legend()
plt.subplots_adjust(hspace=0)
plt.savefig('wang_landau_heat_capacity_sro.svg', bbox_inches='tight')

import matplotlib.pyplot as plt
import numpy as np
from ase import Atoms
from icet import ClusterSpace
from icet.tools.structure_generation import generate_target_structure
from mchammer import WangLandauDataContainer
from mchammer.data_containers import get_average_cluster_vectors_wl
from numpy import arange

# Read data container
dc = WangLandauDataContainer.read('wl_n16.dc')

# Set up cluster space
prim = Atoms('Au', positions=[[0, 0, 0]], cell=[1, 1, 10], pbc=True)
cs = ClusterSpace(prim, cutoffs=[1.01], chemical_symbols=['Ag', 'Au'])

# Get average cluster vectors
df = get_average_cluster_vectors_wl(dc,
                                    cluster_space=cs,
                                    temperatures=arange(0.1, 6.01, 0.1),
                                    boltzmann_constant=1)

# Plot pair
_, ax = plt.subplots()
ax.set_xlabel('Temperature')
ax.set_ylabel('Pair term of cluster vector')
ax.plot(df.temperature, np.array(list(df.cv_std)).T[2], label='stddev')
ax.plot(df.temperature, np.array(list(df.cv_mean)).T[2], label='mean')
ax.legend()
plt.savefig('wang_landau_cluster_vector_vs_temperature.svg', bbox_inches='tight')

# Get low(est) energy structure
temperature = 0.1
target_cluster_vector = list(df[df.temperature == temperature].cv_mean[0])
target_concentrations = {'Ag': 0.5, 'Au': 0.5}
structure = generate_target_structure(cluster_space=cs,
                                      max_size=4,
                                      target_cluster_vector=target_cluster_vector,
                                      target_concentrations=target_concentrations)

from multiprocessing import Pool
from ase import Atoms
from icet import ClusterExpansion, ClusterSpace
from mchammer.calculators import ClusterExpansionCalculator
from mchammer.ensembles import WangLandauEnsemble
from mchammer.ensembles.wang_landau_ensemble import get_bins_for_parallel_simulations


# Define a function that runs a WL simulation for one set of parameters
def run_simulation(args: dict) -> None:
    mc = WangLandauEnsemble(structure=args['structure'],
                            calculator=calculator,
                            energy_spacing=args['energy_spacing'],
                            energy_limit_left=args['energy_limit_left'],
                            energy_limit_right=args['energy_limit_right'],
                            fill_factor_limit=args['fill_factor_limit'],
                            flatness_limit=args['flatness_limit'],
                            dc_filename=args['dc_filename'],
                            ensemble_data_write_interval=args['ensemble_data_write_interval'],
                            trajectory_write_interval=args['trajectory_write_interval'],
                            data_container_write_period=args['data_container_write_period'])
    mc.run(number_of_trial_steps=args['number_of_trial_steps'])


# Prepare cluster expansion
prim = Atoms('Au', positions=[[0, 0, 0]], cell=[1, 1, 10], pbc=True)
cs = ClusterSpace(prim, cutoffs=[1.01], chemical_symbols=['Ag', 'Au'])
ce = ClusterExpansion(cs, [0, 0, 2])

# Prepare initial configuration
structure = prim.repeat((8, 8, 1))
for k in range(len(structure) // 2):
    structure[k].symbol = 'Ag'

# Initalize calculator
calculator = ClusterExpansionCalculator(structure, ce)

# Define parameters for simulations that correspond to different bins
energy_spacing = 1
bins = get_bins_for_parallel_simulations(n_bins=6, energy_spacing=energy_spacing,
                                         minimum_energy=-128, maximum_energy=96)
args = []
for k, (energy_limit_left, energy_limit_right) in enumerate(bins):
    args.append({'structure': structure,
                 'energy_spacing': energy_spacing,
                 'energy_limit_left': energy_limit_left,
                 'energy_limit_right': energy_limit_right,
                 'fill_factor_limit': 1e-4,
                 'flatness_limit': 0.7,
                 'dc_filename': 'wl_k{}.dc'.format(k),
                 'ensemble_data_write_interval': len(structure)*100,
                 'trajectory_write_interval': len(structure)*1000,
                 'data_container_write_period': 120,
                 'number_of_trial_steps': len(structure)*int(2e9)})

# Initiate a Pool object with the desired number of processes and run
pool = Pool(processes=4)
results = pool.map_async(run_simulation, args)
results.get()

import matplotlib.pyplot as plt
import numpy as np
from mchammer import WangLandauDataContainer
from mchammer.data_containers import (get_average_observables_wl,
                                      get_density_of_states_wl)

# Get density and entropy
dcs = {}
for i in range(6):
    dc = WangLandauDataContainer.read('wl_k{}.dc'.format(i))
    dcs[i] = dc

df, _ = get_density_of_states_wl(dcs)

# Plot density
_, ax = plt.subplots(1, 3)
for i in range(len(ax)):
    ax[i].semilogy(df.energy, df.density / df.density.min())
    ax[i].set_xlabel('Energy')
ax[0].set_ylabel('Density')
ax[0].set_xlim(-140, 110)
ax[0].set_ylim(1e0, 1e18)
ax[1].set_xlim(-130, -90)
ax[1].set_ylim(1e0, 1e7)
ax[2].set_xlim(78, 98)
ax[2].set_ylim(1e0, 1e7)
plt.tight_layout()

plt.savefig('wang_landau_binned_density.svg', bbox_inches='tight')

# Compute thermodynamic averages
df = get_average_observables_wl(dcs,
                                temperatures=np.arange(0.4, 6, 0.05),
                                boltzmann_constant=1)

# Plot reference heat capacity
n_atoms = dcs[i].ensemble_parameters['n_atoms']
_, ax = plt.subplots()
ax.plot(df.temperature, df.potential_std ** 2 / df.temperature ** 2 / n_atoms, lw=5.0,
        color='lightgray', label='reference')

# Determine the heat capacity for different cutoffs
cutoffs = [-80, -40, 0, 40]
for cutoff in cutoffs:
    dcs_cutoff = {}
    for key, dc in dcs.items():
        elr = dc.ensemble_parameters['energy_limit_right']
        if not np.isnan(elr) and elr <= cutoff:
            dcs_cutoff[key] = dc
        else:
            break
    if len(dcs_cutoff) < 1:
        continue
    df = get_average_observables_wl(dcs_cutoff,
                                    temperatures=np.arange(0.4, 6, 0.05),
                                    boltzmann_constant=1)

    # Plot the heat capacity for the given cutoff
    ax.plot(df.temperature, df.potential_std ** 2 / df.temperature ** 2 / n_atoms,
            label=r'$E_c={:0.0f}$'.format(cutoff))

ax.set_xlabel('Temperature')
ax.set_ylabel('Heat capacity')
ax.legend()
plt.savefig('wang_landau_binned_heat_capacity_ecut.svg', bbox_inches='tight')