# Data container

The results of an MC simulation are stored in a `DataContainer` object, which
can be accessed via the `data_container` property of the MC ensemble. If a file
name is provided during ensemble initialization via the `data_container`
parameter, the data container is also written to file. The file can then be
read at a later time via the `read` function of the `DataContainer`.

The `DataContainer` class provides functionality for processing data and
extracting various observables, which is briefly introduced in this section.

## Extracting data

The raw data as a function of MC trial step can be obtained via the `get_data`
function, which also allows slicing the data by specifying an initial and final
MC step. This is useful, e.g., for discarding the equilibration part of a
simulation:

```
energy = dc.get_data('potential', start=5000)
```
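To make the effect of the `start` argument concrete, the following is a minimal, library-independent sketch of discarding the equilibration part of a recorded series. The record layout and the helper `slice_by_step` are hypothetical illustrations and not the `DataContainer` implementation; in particular, treating both bounds as inclusive is an assumption of this sketch.

```python
# Toy stand-in for recorded data: one (mctrial, potential) pair per observation.
# The layout and the helper below are illustrative assumptions only.
records = [(step, -1.0 - 0.001 * step) for step in range(0, 10001, 1000)]


def slice_by_step(records, start=None, stop=None):
    """Return the values whose MC trial step lies in [start, stop]."""
    return [
        value
        for step, value in records
        if (start is None or step >= start) and (stop is None or step <= stop)
    ]


# Keep only observations recorded at or after MC trial step 5000.
energy = slice_by_step(records, start=5000)
print(len(energy))  # 6 values remain: steps 5000, 6000, ..., 10000
```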

The `get_data` function also allows extracting several observables in parallel:

```
mctrial, energy, sro = dc.get_data('mctrial', 'potential', 'sro_Ag_1')
```

The available observables can be checked using the `observables` attribute.

## Extracting trajectory

The atomic configurations can be extracted using the `get_trajectory` function:

```
traj = dc.get_trajectory()
```

Alternatively, the trajectory can be obtained via the `get_data` function,
which also allows pairing the snapshots in the trajectory with observables in
the data container:

```
E_mix, traj = dc.get_data('potential', 'trajectory')
```

## Updating data container

Normally, observers are attached to an ensemble at the beginning of an MC
simulation via the `attach_observer` function. They can, however, also be
applied after the fact via the `apply_observer` function, provided the
trajectory is available via a `DataContainer` object.

```
obs = ClusterExpansionObserver(ce, tag='new_obs')
dc = DataContainer.read('my_dc.dc')
dc.apply_observer(obs)
# The new data is retrieved via the tag assigned to the observer above
new_obs_data = dc.get_data('new_obs')
```

Afterwards, the data container, including the new data, can be written back to
file using the `write` function.
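Conceptually, applying an observer after the fact amounts to evaluating it on every stored snapshot and storing the results under the observer's tag. The following is a minimal sketch of that idea using plain Python containers; the toy trajectory, the observable `count_unlike_neighbors`, and this standalone `apply_observer` helper are hypothetical stand-ins and do not reflect the actual `DataContainer` internals.

```python
# Toy trajectory: each snapshot is a list of chemical symbols on an open 1D chain.
# Both the snapshots and the observer are hypothetical stand-ins.
trajectory = [
    ["Ag", "Ag", "Pd", "Pd"],
    ["Ag", "Pd", "Ag", "Pd"],
    ["Pd", "Ag", "Ag", "Pd"],
]
data = {"mctrial": [0, 100, 200]}


def count_unlike_neighbors(snapshot):
    """Toy observable: number of unlike nearest-neighbor pairs along the chain."""
    return sum(a != b for a, b in zip(snapshot, snapshot[1:]))


def apply_observer(data, trajectory, observer, tag):
    """Evaluate `observer` on every stored snapshot and store the result under `tag`."""
    data[tag] = [observer(snapshot) for snapshot in trajectory]


apply_observer(data, trajectory, count_unlike_neighbors, tag="new_obs")
print(data["new_obs"])  # [1, 3, 2]
```

The sketch also shows why the trajectory must be available: without the stored snapshots there is nothing for the observer to be evaluated on.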

## Data analysis

Data containers also allow more detailed analysis. The `analyze_data` function
computes the average, standard deviation, correlation length, and 95% error
estimate of the average for a given observable.

```
summary = dc.analyze_data('potential')
print(summary)
```

Here, the correlation length, \(s\), is estimated from the autocorrelation function (ACF). Observations separated by a lag at which the ACF has decayed below \(\mathrm{e}^{-2}\) are said to be uncorrelated, and the smallest such lag provides an estimate of the correlation length.
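The criterion can be illustrated with a short, self-contained sketch in pure Python. It computes a normalized ACF and returns the first lag at which it falls below \(\mathrm{e}^{-2}\). The function names, the `max_lag` cutoff, and the synthetic test series are assumptions made for illustration; this is not necessarily how the library computes the ACF.

```python
import math
import random


def autocorrelation(data, max_lag):
    """Normalized autocorrelation function for lags 0..max_lag."""
    n = len(data)
    mean = sum(data) / n
    centered = [x - mean for x in data]
    variance = sum(c * c for c in centered) / n
    acf = []
    for lag in range(max_lag + 1):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        acf.append(cov / variance)
    return acf


def correlation_length(data, max_lag=100):
    """First lag at which the ACF drops below e^-2, or None if it never does."""
    threshold = math.exp(-2)
    for lag, value in enumerate(autocorrelation(data, max_lag)):
        if value < threshold:
            return lag
    return None


# Synthetic correlated series (AR(1) process), used purely for illustration.
random.seed(0)
phi = 0.9
series = [0.0]
for _ in range(20000):
    series.append(phi * series[-1] + random.gauss(0.0, 1.0))

s = correlation_length(series)
print(s)
```

For an AR(1) process with coefficient 0.9, the exact ACF is \(0.9^k\) at lag \(k\), which drops below \(\mathrm{e}^{-2}\) at \(k = 19\); the value estimated from a finite series scatters around that number.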

An error estimate of the average can be calculated via

\[
\mathrm{error} = \frac{t \sigma}{\sqrt{N / s}},
\]

where \(\sigma\) is the standard deviation, \(N\) the number of samples, \(s\) the correlation length, and \(t\) the t-factor, which can be adjusted depending on the desired confidence interval.
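As a worked example, the sketch below assumes that the error estimate takes the form \(t \sigma / \sqrt{N / s}\), i.e., the standard error of the mean evaluated for \(N / s\) effectively independent samples. The function name `error_estimate`, the use of the population standard deviation, and the default `t_factor=1.96` (the large-sample value for a 95% confidence interval) are assumptions of this sketch and not taken from the library.

```python
import math
import statistics


def error_estimate(data, correlation_length, t_factor=1.96):
    """Error estimate of the mean, t * sigma / sqrt(N / s)."""
    sigma = statistics.pstdev(data)  # population standard deviation
    n_independent = len(data) / correlation_length  # N / s
    return t_factor * sigma / math.sqrt(n_independent)


# N = 100 samples alternating between 1 and 3, so that sigma = 1.
data = [1.0, 3.0] * 50

# With s = 4 there are 25 effectively independent samples:
# 1.96 * 1 / sqrt(25) is approximately 0.392.
print(error_estimate(data, correlation_length=4))
```

Note that a larger correlation length reduces the number of effectively independent samples and therefore increases the error estimate for the same raw data.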

The autocorrelation function can be obtained directly, and error estimates can be carried out, using functionality provided in the `data_analysis` module.