mmaps

-> What does this program do?

This is the multicomponent version of the maps code.

It gradually constructs a increasingly more accurate cluster expansion.
A user-provided script running concurrently is responsible for notifying
maps when computer time is available. maps creates files describing
structures whose energy should be calculated. The user-provided script
sets up the runs needed to calculate the energy of these structures.
As maps becomes aware of more and more structural energies, it gradually
improves the precision of the cluster expansion, which is continously
written to an output file.
The code terminates when a stop file is created by typing, for instance,
touch stop

NOTE: Fully functional scripts are included with the package:
pollmach and runstruct_vasp.
For for more information type
  pollmach
  runstruct_vasp -h

-> Format of the input file defining the lattice (specified by the -l option)

First, the coordinate system a,b,c is specified, either as
[a] [b] [c] [alpha] [beta] [gamma]
or as:
[ax] [ay] [az]
[bx] [by] [bz]
[cx] [cy] [cz]
Then the lattice vectors u,v,w are listed, expressed in the coordinate system just defined:
[ua] [ub] [uc]
[va] [vb] [vc]
[wa] [wb] [wc]
Finally, atom positions and types are given, expressed in the same coordinate system
as the lattice vectors:
[atom1a] [atom1b] [atom1c] [atom1types]
[atom2a] [atom2b] [atom2c] [atom2types]
etc.

-The atom type is a comma-separated list of the atomic
 symbols of the atoms that can sit the lattice site.
-When only one symbol is listed, this site is ignored for the purpose
 of calculating correlations, but not for determining symmetry.

Examples:

The fcc lattice of the Cu-Au system:
3.8 3.8 3.8 90 90 90
0   0.5 0.5
0.5 0   0.5
0.5 0.5 0
0 0 0 Cu,Au

A lattice for the Li_x Co_y Al_(1-y) O_2 system:
 0.707 0.707 6.928 90 90 120
 0.3333  0.6667 0.3333
-0.6667 -0.3333 0.3333
 0.3333 -0.3333 0.3333
 0       0      0       Li,Vac
 0.3333  0.6667 0.0833  O
 0.6667  0.3333 0.1667  Co,Al
 0       0      0.25    O

Optional input file: ref_energy.in

  Contains the reference energy (per active site, i.e. those that can host more than one specie)
   to be subtracted to get formation energies.
  The atomic reference energies must be in the same order as in the atoms.out file.
  If this file is omitted, the energies of the structures with the most
  extreme compositions are used to determine the reference energies,
  which are then output to the ref_energy.out file.

Optional input file: nbclusters.in
  Allows the user to manually select which clusters to include in the fit.
  This file should contains:
    number of pairs to include
    number of triplets to include
    etc.
  This file can be changed while maps is running. However, you must type
   touch refresh
  to tell maps to reread it.

Optional input file: crange.in  (or as specified by the -cr option)
  If present, this file
  selects the range of concentration over which the cluster expansion
  is to be fitted. This controls both where the correct ground states
  are inforced in the fitting process and the range of concentration
  of the generated structures. Occasionally, a structure outside that
  range is generated, to verify the ground state hull.
  Here is an example of such file:
   1.0*Al -1.0*Li +0.1*Co >= 0.5
  Multiple constraints can be listed (on separate lines).
  Make sure to include a numerical prefactor for each species even
  if it is 1.0. Do not put a space between '-' and a number.
  Note that the crange.in file is read (if present) even if the -cr option is
  not specified.

Optional input file: weights.in
  Forces specific weights to assign to each structure when fitting its energy.
  These weights may be updated by the code internally to better reproduce the ground state line.
  Format: one weight per line, in the same order as in fit.out
  This should only be used to fine-tune the fit at the end when
  no new structures will be added (since weights.in would have
  to change every time a new structure us added).

-> Output files

maps.log
Contains possible warnings:
  'Not enough known energies to fit CE'
  'True ground states not = fitted ground states'
  'New ground states predicted, see predstr.out'

These warning should disappear as more structural energies become available
and the following messages should be displayed:
  'Among structures of known energy, true and predicted ground states agree.'
  'No other ground states of xx atoms/unit cell or less exist.'

This file also gives the crossvalidation score of the current fit
(before the weighting is turned on in order to get the correct ground states).

atoms.out
Lists all atomic species given in the input files.

fit.out
Contains the results of the fit, one structure per line and each line
has the following information:
  concentration energy fitted_energy (energy-fitted_energy) weight index
'concentration' a vector of the atom fraction of all species (in the same
                order as in atoms.out).
'energy' is per site (a site is a place where more than one atom type can sit)
'weight' is the weight of this structure in the fit.
'index' is the name of the directory associated with this structure.

predstr.out
Contains the predicted energy (per site) of all structures maps has in memory but
whose true energy is unknown.
Format: one structure per line, and each line has the following information:
  concentration predicted_energy index status
index is the structure number (or -1 if not written to disk yet).
status is either
   b for busy (being calculated),
   e for error,
   u for unknown (not yet calculated) or
   g if that structure is predicted to be a ground state (g can be combined with the above).
To list all predicted ground states, type
grep 'g' predstr.out

gs.out
Lists the ground state energies, one structure per line and each line
has the following information:
  concentration energy fitted_energy (energy-fitted_energy) index

gs_connect.out
Indicates which ground states touch each face of the ground state convex hull.

gs_str.out
Lists the ground state structures, in the same format as the n/str.out files
(see below). Each structure is terminated by the word 'end' on a line by itself,
followed by a blank line.

chempot.out
This file contains 
(i) the values of the chemical potentials that stabilize all
    fixed-composition multiphase equilibria of the system at 0K.
    (e.g. in a n-nary system, all the n-phase equilibria)
(ii) values of the chemical potentials that stabilize each of the ground states at 0K.
This data is useful to set up Monte Carlo simulations with the memc2 code.

eci.out
Lists the eci. (They have already been divided by multiplicity.)
The corresponding clusters are in clusters.out

clusters.out
For each cluster, the first line is the multiplicity, the second line is the
cluster diameter, and the third line is the number of points in the cluster.
The remaining lines are the coordinates of the points in the cluster
(in the coordinate system specified in the input file defining the lattice).
A blank line separates each cluster.

n/str.out
Same format as the lattice file, except that
-The coordinate system is always written as 3x3 matrix
-Only one atom is listed for each site.

ref_energy.out
Reference energies used to calculate formation energies.
(Usually: energy of the pure end members OR values given in
ref_energy.in if provided.)

The standard output reports the current progress of the calculations.
During the fit of the cluster expansion, each line of numbers displayed has the following meaning:
1) The current number of point, pair, triplet, etc. clusters
2) An indicator of whether the predicted ground states agree with the true ones (1) or not (0)
3) The CV score (per atom, as of version 2.88; per cell earlier).

-> Communication protocol between maps and the script driving the
   energy method code (e.g. ab initio code)
   (Only those who want to customize the code need to read this section.
   The scripts described in this section are provided with the atat
   distribution in the glue/ subdirectory.)

Unless otherwise specified all files mentioned reside in the directory where
maps was started. All paths are relative to the startup directory.

+The script should first wait for computer time to be available before creating
 a file called 'ready'.
-Upon detecting that the 'ready' file has been created,
 maps responds by creating a subdirectory 'n' (where 'n' is a number) and a file
 'n/str.out' containing a description of a structure whose energy needs to be
 calculated.
-maps creates a file called 'n/wait' to distinguish this directory
 from other ones created earlier.
-maps deletes the 'ready' file.
+Upon detecting that the 'ready' file has disappeared,
 the script should now look for the 'n/wait' file, start the calculations
 in the directory 'n' and delete file 'n/wait'.
+If anything goes wrong in the calculations, the script should create a file
 'n/error'.
+When the calculations terminate successfully, the energy per unit cell of the structure
 should be copied to the file 'n/energy'.
 (NOTE: use energy per unit cell of the structure NOT per unit cell of the lattice).
-maps continuously scans all the subdirectories 'n' for 'n/energy' or 'n/error'
 files and updates the cluster expansion accordingly.
-maps updates the cluster expansion whenever a file called 'refresh' is created
 (maps then deletes it).
-maps terminates when a 'stop' file is created.

Note that the script can ask maps to create new structure directories even before
the energy of the current structure has been found.
Note that human intervention is allowed: an 'n/error' file can be
manually created if an error is later found in a run.
Users can also manually step up all runs if they wish so, as long
as they follow the protocol.

Example of script
(portions in /* */ have to be filled in with the appropriate code):

#!/bin/csh

while (! -e stop)
  /* check machine load here */
  if ( /* load low enough */ ) then
    touch ready
    while (-e ready)
      sleep 30
    end
    cd `ls */wait | sed 's+/.*++g' | head -1`
    rm -f wait
    /* convert str.out to the native format of ab initio code */
    /* in background: run code and create either energy file or error file */
    cd ..
  endif
  sleep 180
end

-> Using maps with vasp

The script runstruct_vasp, when run from within directory 'n',
1) converts 'vasp.wrap' and 'n/str.out' into all the necessary files to run vasp,
2) runs vasp
3) extract all the information from the output files and writes in a format
   readable by maps.

An example of vasp.wrap is:
[INCAR]
PREC = high
ISMEAR = -1
SIGMA = 0.1
NSW=41
IBRION = 2
ISIF = 3
KPPRA = 1000
DOSTATIC

See ezvasp documentation for more information.

-> Importing structures into maps

MAPS continuously scans all the first-level subdirectories containing
a file called str.out and tries to map them onto superstructures of the
lattice provided. This lets you 'import' structures from another source
into MAPS. A word of caution: the imported structures must be
unrelaxed and no effort is made to rotate or scale them in order to
match the lattice (aside from space group symmetry operations).

-> Newly added Bayesian algorithm note:

To activate Bayesian algorithm, type -fa=bayesian 
All the input and output files are maintained in the same format   
If the Bayesian algorithm starts from an existing cluster expansion
(ATAT default method)
Then it will tag a few larger structures that are linearly dependent with other
structures as error before start the fit. Detailed documentation of the code can
be viewed at 
https://doi.org/10.1016/j.commatsci.2023.112571



[email protected] Wed, Dec 6, 2023 12:55:16 PM