Coverage for mchammer/ensembles/hybrid_ensemble.py: 95%

142 statements  

« prev     ^ index     » next       coverage.py v7.5.0, created at 2024-12-26 04:12 +0000

1import random 

2from typing import Dict, List, Any 

3 

4from math import isclose 

5 

6from ase import Atoms 

7from ase.units import kB 

8from ase.data import atomic_numbers, chemical_symbols 

9 

10from .. import DataContainer 

11from ..calculators.base_calculator import BaseCalculator 

12from .thermodynamic_base_ensemble import ThermodynamicBaseEnsemble 

13from .vcsgc_ensemble import get_phis 

14from .semi_grand_canonical_ensemble import get_chemical_potentials 

15 

16 

17class HybridEnsemble(ThermodynamicBaseEnsemble): 

18 r""" 

19 Instances of this class allows one to combine multiple ensembles. 

20 In particular, a dictionary should be provided for each ensemble, 

21 which must include the type (:attr:`ensemble`) as well as the index of 

22 the sublattice (:attr:`sublattice_index`). In addition, it is possible 

23 to provide lists of allowed symbols (:attr:`allowed_symbols`) and site 

24 indices (:attr:`allowed_sites`) for the trial steps. Here the allowed 

25 symbols must represent a subset of the elements that can occupy the sites on 

26 the specified sublattice. Note that additional arguments are 

27 required for the SGC and VCSGC ensembles, namely chemical 

28 potentials (:attr:`chemical_potentials`) for the former and constraint 

29 parameters (:attr:`phis` and :attr:`kappa`) for the latter. For more detailed 

30 information regarding the different ensembles, please see 

31 :class:`CanonicalEnsemble <mchammer.ensembles.CanonicalEnsemble>`, 

32 :class:`SemiGrandCanonicalEnsemble 

33 <mchammer.ensembles.SemiGrandCanonicalEnsemble>`, and 

34 :class:`VCSGCEnsemble <mchammer.ensembles.VCSGCEnsemble>`. 

35 

36 This class is particularly useful for effectively sampling complex 

37 multi-component systems with several active sublattices, in which 

38 case different ensembles can be defined for each of the latter. 

39 The fact that it is possible to set the allowed chemical symbols 

40 means that one can vary the concentrations of a few selected 

41 species, with the help of a VCSGC or semi-grand canonical 

42 ensemble, while still allowing swaps between any two sites, using 

43 a canonical ensemble (see also the below example). It is advisable 

44 to carefully consider how to define the ensemble probabilities. By 

45 default the ensembles are weighted by the sizes of the 

46 corresponding sublattices, which should give suitable 

47 probabilities in most cases. As is shown in the example below, it 

48 might be prudent to provide different values if allowed symbols 

49 are provided as well as for cases where there are several 

50 ensembles that are active on different sublattices. 

51 

52 Parameters 

53 ---------- 

54 

55 structure 

56 Atomic configuration to be used in the Monte Carlo simulation; 

57 also defines the initial occupation vector. 

58 calculator 

59 Calculator to be used for calculating the potential changes 

60 that enter the evaluation of the Metropolis criterion. 

61 temperature 

62 Temperature :math:`T` in appropriate units, commonly Kelvin. 

63 ensemble_specs 

64 A list of dictionaries, which should contain the following items: 

65 

66 * :attr:`ensemble`: Can be either ``'vcsgc'``, ``'semi-grand'`` 

67 or ``'canonical'``, lowercase and uppercase letters or any 

68 combination thereof are accepted (required). 

69 * :attr:`sublattice_index`: Index for the sublattice of 

70 interest (required). 

71 * :attr:`allowed_symbols`: List of allowed chemical symbols. 

72 Default: read from :class:`ClusterSpace`. 

73 * :attr:`allowed_sites`: List of allowed sites. 

74 Default: all sites. 

75 * :attr:`chemical_potentials`: Dictionary of chemical 

76 potentials for each species :math:`\mu_i`. 

77 The key denotes the species, the value 

78 specifies the chemical potential in units that are 

79 consistent with the underlying cluster expansion. 

80 Only applicable and required for SGC ensembles. 

81 * :attr:`phis`: Dictionary with average constraint parameters 

82 :math:`\phi_i`. The key denotes the species. For a 

83 :math:`N`-component sublattice, there should be :math:`N - 1` 

84 different :math:`\phi_i` (referred to as :math:`\bar{\phi}` in [SadErh12]_). 

85 Only applicable and required for VCSGC ensembles. 

86 * :attr:`kappa`: Parameter that constrains the variance of the 

87 concentration (referred to as :math:`\bar{\kappa}` in [SadErh12]_). 

88 Only applicable and required for VCSGC ensembles. 

89 

90 probabilities 

91 Probabilities for choosing a 

92 particular ensemble with the same length as ensemble specs. 

93 If left unspecified the probabilties are scaled to match the 

94 sizes of the associated sublattices. 

95 boltzmann_constant 

96 Boltzmann constant :math:`k_B` in appropriate units, i.e., 

97 units that are consistent with the underlying cluster 

98 expansion and the temperature units. Default: eV/K. 

99 user_tag 

100 Human-readable tag for ensemble. Default: ``None``. 

101 random_seed 

102 Seed for the random number generator used in the Monte Carlo simulation. 

103 dc_filename 

104 Name of file the data container associated with the 

105 ensemble will be written to. If the file 

106 exists it will be read, the data container will be appended, 

107 and the file will be updated/overwritten. 

108 data_container_write_period 

109 Period in units of seconds at which the data container is 

110 written to file. Writing periodically to file provides both 

111 a way to examine the progress of the simulation and to 

112 back up the data. Default: 600 s. 

113 ensemble_data_write_interval 

114 Interval at which data is written to the data container. 

115 This includes for example the current value of the 

116 calculator (i.e., usually the energy) as well as ensembles 

117 specific fields such as temperature or the number of atoms 

118 of different species. 

119 Default: Number of sites in the :attr:`structure`. 

120 trajectory_write_interval 

121 Interval at which the current occupation vector of the 

122 atomic configuration is written to the data container. 

123 Default: Number of sites in the :attr:`structure`. 

124 

125 Example 

126 ------- 

127 The following snippet illustrates how to carry out a simple Monte Carlo 

128 simulation using a combination of one canonical and one VCSGC ensemble. 

129 Specifically, the concentration of one species (Au) is kept constant 

130 while the others (Ag and Pd) are varied, while swaps are still allowed. 

131 Here, the parameters of the cluster expansion are set to emulate a simple 

132 Ising model in order to obtain an example that can be run without 

133 modification. In practice, one should of course use a proper cluster 

134 expansion:: 

135 

136 >>> from ase.build import bulk 

137 >>> from icet import ClusterExpansion, ClusterSpace 

138 >>> from mchammer.calculators import ClusterExpansionCalculator 

139 

140 >>> # prepare cluster expansion 

141 >>> # the setup emulates a second nearest-neighbor (NN) Ising model 

142 >>> # (zerolet and singlet ECIs are zero; only first and second neighbor 

143 >>> # pairs are included) 

144 >>> prim = bulk('Au') 

145 >>> cs = ClusterSpace(prim, cutoffs=[4.3], 

146 ... chemical_symbols=['Ag', 'Au', 'Pd']) 

147 >>> ce = ClusterExpansion( 

148 ... cs, [0, 0, 0, 0.1, 0.1, 0.1, -0.02, -0.02, -0.02]) 

149 

150 >>> # define structure object 

151 >>> structure = prim.repeat(3) 

152 >>> for i, atom in enumerate(structure): 

153 >>> if i % 2 == 0: 

154 >>> atom.symbol = 'Ag' 

155 >>> elif i % 3 == 0: 

156 >>> atom.symbol = 'Pd' 

157 

158 >>> # the default probabilities for this case would be [0.5, 0.5], but 

159 >>> # since the VCSGC ensemble only performs flips on a subset of all 

160 >>> # sites on the sublattice, namely those originally occupied by Ag 

161 >>> # and Pd atoms, specific values will be provided 

162 >>> weights = [len(structure), 

163 ... len([s for s in structure.get_chemical_symbols() if s != 'Au'])] 

164 >>> norm = sum(weights) 

165 >>> probabilities = [w / norm for w in weights] 

166 

167 >>> # set up and run MC simulation 

168 >>> calc = ClusterExpansionCalculator(structure, ce) 

169 >>> ensemble_specs = [ 

170 ... {'ensemble': 'canonical', 'sublattice_index': 0}, 

171 ... {'ensemble': 'vcsgc', 'sublattice_index': 0, 

172 ... 'phis': {'Ag': -0.2}, 'kappa': 200, 

173 ... 'allowed_symbols':['Ag', 'Pd']}] 

174 >>> mc = HybridEnsemble(structure=structure, calculator=calc, 

175 ... ensemble_specs=ensemble_specs, 

176 ... temperature=600, probabilities=probabilities, 

177 ... dc_filename='myrun_hybrid.dc') 

178 >>> mc.run(100) # carry out 100 trial steps 

179 """ 

180 

181 def __init__(self, 

182 structure: Atoms, 

183 calculator: BaseCalculator, 

184 temperature: float, 

185 ensemble_specs: List[Dict], 

186 probabilities: List[float] = None, 

187 boltzmann_constant: float = kB, 

188 user_tag: str = None, 

189 random_seed: int = None, 

190 dc_filename: str = None, 

191 data_container: str = None, 

192 data_container_write_period: float = 600, 

193 ensemble_data_write_interval: int = None, 

194 trajectory_write_interval: int = None) -> None: 

195 

196 # define available ensembles 

197 self._ensemble_trial_steps = dict([ 

198 ('canonical', self.do_canonical_swap), 

199 ('semi-grand', self.do_sgc_flip), 

200 ('vcsgc', self.do_vcsgc_flip), 

201 ]) 

202 

203 self._ensemble_parameters = dict(temperature=temperature) 

204 

205 self._trial_steps_per_ensemble = {'ensemble_{}'.format(i): 0 for i in 

206 range(len(ensemble_specs))} 

207 

208 # process the list of ensembles and parameters 

209 self._process_ensemble_specs(ensemble_specs) 

210 

211 super().__init__( 

212 structure=structure, 

213 calculator=calculator, 

214 user_tag=user_tag, 

215 random_seed=random_seed, 

216 boltzmann_constant=boltzmann_constant, 

217 dc_filename=dc_filename, 

218 data_container=data_container, 

219 data_container_class=DataContainer, 

220 data_container_write_period=data_container_write_period, 

221 ensemble_data_write_interval=ensemble_data_write_interval, 

222 trajectory_write_interval=trajectory_write_interval) 

223 

224 # postprocess the list of ensembles and parameters 

225 self._postprocess_ensemble_args() 

226 

227 # set the probabilities 

228 self._process_probabilities(probabilities) 

229 

230 @property 

231 def temperature(self) -> float: 

232 """ Current temperature. """ 

233 return self._ensemble_parameters['temperature'] 

234 

235 @property 

236 def probabilities(self) -> List[float]: 

237 """ Ensemble propabilities. """ 

238 return self._probabilities 

239 

240 @property 

241 def trial_steps_per_ensemble(self) -> Dict[str, int]: 

242 """ Number of Monte Carlo trial steps for each ensemble. """ 

243 return self._trial_steps_per_ensemble 

244 

245 def _process_ensemble_specs(self, ensemble_specs: List[Dict]) -> None: 

246 r"""Process the list of ensembles and parameters. 

247 

248 Parameters 

249 ---------- 

250 ensemble_specs: List[Dict] 

251 A list of dictionaries, which should contain the following items: 

252 * 'ensemble', which could be either 'vcsgc'; 'semi-grand' or 'canonical', lowercase 

253 and upercase letters or any combination thereof are accepted 

254 * 'sublattice_index', index for the sublattice of interest 

255 * 'allowed_symbols', list of allowed chemical symbols 

256 * 'allowed_sites', list of indices of allowed sites 

257 * 'chemical_potentials', a dictionary of chemical potentials for each species 

258 :math:`\mu_i`; the key denotes the species, the value specifies the chemical 

259 potential in units that are consistent with the underlying cluster expansion 

260 * 'phis ', dictionary with average constraint parameters :math:`\phi_i`; the key 

261 denotes the species; for a N-component sublattice, there should be N-1 

262 different `\phi_i` 

263 * 'kappa', parameter that constrains the variance of the concentration 

264 """ 

265 

266 ensemble_args = [] 

267 

268 for ind, ensemble_spec in enumerate(ensemble_specs): 

269 

270 ensemble_arg: Dict[str, Any] = {} 

271 tag = f'ensemble_{ind}' 

272 ensemble_arg['tag'] = tag 

273 

274 # check the ensemble name 

275 if 'ensemble' not in ensemble_spec: 

276 raise ValueError( 

277 f"The dictionary {ensemble_spec} lacks the required key 'ensemble'") 

278 ensemble = ensemble_spec['ensemble'].lower() 

279 if ensemble not in self._ensemble_trial_steps.keys(): 

280 msg = ['Ensemble not available'] 

281 msg += ['Please choose one of the following:'] 

282 for key in self._ensemble_trial_steps.keys(): 

283 msg += [' * ' + key] 

284 raise ValueError('\n'.join(msg)) 

285 ensemble_arg['ensemble'] = ensemble 

286 self._ensemble_parameters[tag] = ensemble 

287 

288 # check that all required keys, and no unknown keys, are present 

289 keys = ['ensemble', 'sublattice_index', 'allowed_symbols', 'allowed_sites'] 

290 if ensemble == 'semi-grand': 

291 keys = ['chemical_potentials'] + keys 

292 elif ensemble == 'vcsgc': 

293 keys = ['phis', 'kappa'] + keys 

294 for key in keys[:-2]: 

295 if key not in ensemble_spec: 

296 raise ValueError(f"The dictionary {ensemble_spec} lacks the key '{key}'," 

297 f' which is required for {ensemble} ensembles') 

298 for key in ensemble_spec.keys(): 

299 if key not in keys: 

300 raise ValueError(f"Unknown key '{key}', for a {ensemble} ensemble," 

301 f' in the dictionary {ensemble_spec}') 

302 

303 # record the sublattice index 

304 ensemble_arg['sublattice_index'] = ensemble_spec['sublattice_index'] 

305 

306 # process chemical potentials 

307 if 'chemical_potentials' in ensemble_spec: 

308 chemical_potentials = get_chemical_potentials(ensemble_spec['chemical_potentials']) 

309 ensemble_arg['chemical_potentials'] = chemical_potentials 

310 for atnum, chempot in chemical_potentials.items(): 

311 mu_sym = '{}_mu_{}'.format(tag, chemical_symbols[atnum]) 

312 self._ensemble_parameters[mu_sym] = chempot 

313 

314 # process phis 

315 if 'phis' in ensemble_spec: 

316 phis = get_phis(ensemble_spec['phis']) 

317 ensemble_arg['phis'] = phis 

318 for sym, phi in phis.items(): 

319 if isinstance(sym, str): 319 ↛ 320line 319 didn't jump to line 320, because the condition on line 319 was never true

320 chemical_symbol = sym 

321 else: 

322 chemical_symbol = chemical_symbols[sym] 

323 phi_sym = '{}_phi_{}'.format(tag, chemical_symbol) 

324 self._ensemble_parameters[phi_sym] = phi 

325 

326 # process kappa 

327 if 'kappa' in ensemble_spec: 

328 ensemble_arg['kappa'] = ensemble_spec['kappa'] 

329 self._ensemble_parameters['{}_kappa'.format(tag)] = ensemble_spec['kappa'] 

330 

331 # record the allowed chemical symbols 

332 if 'allowed_symbols' in ensemble_spec: 

333 ensemble_arg['allowed_symbols'] = ensemble_spec['allowed_symbols'] 

334 

335 # record the allowed sites 

336 if 'allowed_sites' in ensemble_spec: 336 ↛ 337line 336 didn't jump to line 337, because the condition on line 336 was never true

337 ensemble_arg['allowed_sites'] = ensemble_spec['allowed_sites'] 

338 

339 ensemble_args.append(ensemble_arg) 

340 

341 self._ensemble_args = ensemble_args 

342 

343 def _postprocess_ensemble_args(self): 

344 """Process the list of dictionaries with ensemble specific parameters. 

345 """ 

346 

347 for i in range(len(self._ensemble_args)): 

348 

349 # check the sublattice index 

350 self._check_sublattice_index(self._ensemble_args[i]['sublattice_index']) 

351 

352 # extract the allowed species 

353 if 'allowed_symbols' in self._ensemble_args[i]: 

354 self._ensemble_args[i]['allowed_species'] =\ 

355 self._extract_allowed_species(self._ensemble_args[i]['allowed_symbols'], 

356 self._ensemble_args[i]['sublattice_index']) 

357 del self._ensemble_args[i]['allowed_symbols'] 

358 else: 

359 self._ensemble_args[i]['allowed_species'] = None 

360 

361 # handle lack of allowed sites 

362 if 'allowed_sites' not in self._ensemble_args[i]: 362 ↛ 365line 362 didn't jump to line 365, because the condition on line 362 was never false

363 self._ensemble_args[i]['allowed_sites'] = None 

364 

365 if self._ensemble_args[i]['ensemble'] == 'vcsgc': 

366 # Check that each sublattice has N - 1 phis 

367 count_specified_elements = 0 

368 if self._ensemble_args[i]['allowed_species'] is None: 368 ↛ 372line 368 didn't jump to line 372, because the condition on line 368 was never false

369 allowed_species =\ 

370 self.sublattices[self._ensemble_args[i]['sublattice_index']].atomic_numbers 

371 else: 

372 allowed_species = self._ensemble_args[i]['allowed_species'] 

373 for number in allowed_species: 

374 if number in self._ensemble_args[i]['phis'].keys(): 

375 count_specified_elements += 1 

376 if count_specified_elements != len(allowed_species) - 1: 

377 raise ValueError('phis must be set for N - 1 elements on a sublattice with' 

378 ' N elements') 

379 

380 def _check_sublattice_index(self, sublattice_index: int): 

381 """Check the :attr:`sublattice_index` item in the :attr:`ensemble_spec` dictionary. 

382 

383 Parameters 

384 ---------- 

385 sublattice_index 

386 Specific sublattice to consider provided as as an index or a symbol. 

387 """ 

388 

389 if not isinstance(sublattice_index, int): 

390 raise TypeError("'sublattice_index' must be an integer, not" 

391 f' {format(type(sublattice_index))}') 

392 

393 # check that the sublattice exists 

394 if sublattice_index not in range(len(self.sublattices)): 

395 raise ValueError('There is no sublattice with index {}'.format(sublattice_index)) 

396 

397 # check that the sublattice is active 

398 if len(self.sublattices[sublattice_index].chemical_symbols) == 1: 

399 raise ValueError('The sublattice {} is inactive'.format(sublattice_index)) 

400 

401 def _extract_allowed_species( 

402 self, 

403 allowed_symbols: List[str], 

404 sublattice_index: int, 

405 ) -> List[int]: 

406 """Check and extract the allowed species from the 

407 :attr:`allowed_symbols` in the :attr:`ensemble_spec` 

408 dictionary. 

409 

410 Parameters 

411 ---------- 

412 allowed_symbols 

413 List of allowed chemical symbols. 

414 sublattice_index 

415 Index for the relevant sublattice. 

416 

417 """ 

418 

419 if not isinstance(allowed_symbols, list) or not all( 

420 [isinstance(i, str) for i in allowed_symbols]): 

421 raise TypeError( 

422 f"'allowed_symbols' must be a List[str], not {type(allowed_symbols)}") 

423 for symbol in allowed_symbols: 

424 if symbol not in self.sublattices[sublattice_index].chemical_symbols: 

425 raise ValueError('The species {} is not allowed on sublattice' 

426 ' {}'.format(symbol, sublattice_index)) 

427 

428 return [atomic_numbers[s] for s in allowed_symbols] 

429 

430 def _process_probabilities(self, probabilities: List[float]): 

431 """Process the list of probabilities. 

432 

433 Parameters 

434 ---------- 

435 probabilities 

436 Probabilities for choosing a particular ensemble with the same 

437 length as :attr:`self._ensemble_args`. 

438 """ 

439 

440 if probabilities is None: 

441 # use the sizes of the associated sublattices when calculating the ensemble 

442 # probabilities 

443 weights = [len(self.sublattices[ensemble_arg['sublattice_index']].indices) for 

444 ensemble_arg in self._ensemble_args] 

445 norm = sum(weights) 

446 probabilities = [w / norm for w in weights] 

447 else: 

448 if len(probabilities) != len(self._ensemble_args): 

449 raise ValueError('The number of probabilities must be match the number of' 

450 ' ensembles') 

451 

452 if not isclose(sum(probabilities), 1.0): 

453 raise ValueError('The sum of all probabilities must be equal to 1') 

454 

455 self._probabilities = probabilities 

456 

457 def _do_trial_step(self): 

458 """ Carries out one Monte Carlo trial step. """ 

459 # randomly pick an ensemble 

460 ensemble_arg = random.choices(self._ensemble_args, weights=self._probabilities)[0] 

461 

462 # count number of trial steps for each ensemble 

463 self._trial_steps_per_ensemble[ensemble_arg['tag']] += 1 

464 

465 if ensemble_arg['ensemble'] == 'canonical' and not self.configuration.is_swap_possible( 465 ↛ 467line 465 didn't jump to line 467, because the condition on line 465 was never true

466 ensemble_arg['sublattice_index'], ensemble_arg['allowed_species']): 

467 return 0 

468 else: 

469 arguments = {key: val for key, val in ensemble_arg.items() if key not in 

470 ['ensemble', 'tag']} 

471 return self._ensemble_trial_steps[ensemble_arg['ensemble']](**arguments) 

472 

473 def _get_ensemble_data(self) -> Dict: 

474 """ 

475 Returns a dict with the default data of the ensemble. This includes 

476 atom counts and free energy derivative. 

477 """ 

478 data = super()._get_ensemble_data() 

479 

480 ensemble_types = [e['ensemble'] for e in self._ensemble_args] 

481 

482 # free energy derivative 

483 if 'vcsgc' in ensemble_types: 483 ↛ 491line 483 didn't jump to line 491, because the condition on line 483 was never false

484 for ensemble_arg in self._ensemble_args: 

485 if 'vcsgc' == ensemble_arg['ensemble']: 

486 data.update(self._get_vcsgc_free_energy_derivatives( 

487 ensemble_arg['phis'], ensemble_arg['kappa'], 

488 ensemble_arg['sublattice_index'])) 

489 

490 # species counts 

491 if any([e in ensemble_types for e in ['vcsgc', 'semi-grand']]): 491 ↛ 494line 491 didn't jump to line 494, because the condition on line 491 was never false

492 data.update(self._get_species_counts()) 

493 

494 return data