mdse.parser

YAML configuration file parser for MDSE.

This module reads and interprets YAML files that define molecular dynamics simulations. It provides functionality to:

Read a base YAML configuration file.
“Un-nest” or expand simulation parameters. If a parameter like temperature is defined as a list or a range, this module generates a separate, complete simulation configuration for each value.
Handle command-line overwrites for specific configuration values.
Expand file paths, creating multiple simulations if a path points to a directory.

Simulation configuration files must follow a specific format. The structure looks like this:

[Name]:
  CRYSTAL:
    TYPE: [DATABASE, FILE or BULK]
    Name: [name of material or sim]
    Supercell: (optional)
      - [x]
      - [y]
      - [z]
    (only required for FILE and DATABASE)
    Filepath: [relative path from this file to crystal file or database]

    [The following are only required for DATABASE]
    Structure_folder: [path to generated structure files]
    Query:
      [key]: [value]

    [The following are only required for BULK]
    Structure: [fcc, bcc, etc.]
    Lattice_a: [Å]
    Lattice_b: [Å] (optional)
    Lattice_c: [Å] (optional)
    Cubic: [True/False, whether to use a cubic unit cell]

  ENSAMBLE:
    Ensamble: [NVT, NVE, NPT]
    Temp: [temperature in K]
    [required for NVT]
    ThermoTime: [steps between thermostat use]
    [required for NPT]
    Pressure: [eV/Å^3]
    BaroTime: [steps between barostat use]
  SIMULATION:
    Timestep: [timestep in fs]
    Length: [number of timesteps]
    TrajInterval: [frequency of trajectory writes]
    Calculator: [MACE, LennardJones, EMT]
    CalculatorParams:
        [param]: [value] (see documentation of specific calculator)
    Create_traj: [Whether to create a trajectory file for sim]
  RESULT:
    Properties:
      - [material property or all]
      - [prop]

The Temp and Pressure settings can also be given with a _list or _range suffix, which looks like

Temp_list:
  - [K]
  - ...

or

Temp_range:
  start: [K]
  stop: [K]
  step: [stepsize]

Either form creates one simulation for each element in the list or range.

Functions

`expand_parameter`(simulation_to_expand, parameter)	Expands a simulation based on a single parameter specified as a list or range.
`get_files`(simulations, config_file_path)	Expand simulations where the crystal source is a directory.
`main_read`(filename[, overwrite_config])	Reads from a YAML file, then un-nests the MD simulations by expanding any parameters specified as lists or ranges.
`nest_lennard_jones`(og_config, overwrite_config)	Special handler to overwrite nested Lennard-Jones parameters.
`read_yaml_simulations`(filename)	Reads a YAML file containing MD simulation configurations and returns it as a dictionary.
`unnest_simulation_parameters`(all_simulations)	Expands nested parameters (lists or ranges) in the simulation configurations and applies command-line overwrites.

mdse.parser.expand_parameter(simulation_to_expand, parameter, overwrite_config={})

Expands a simulation based on a single parameter specified as a list or range.

If the specified parameter is found in the simulation configuration as a list (e.g., Temp_list) or a range (e.g., Temp_range), this function generates a new simulation configuration for each value. If the parameter is a single value or not present, it returns the original simulation.

Parameters:

simulation_to_expand (tuple[str, dict]) – A tuple containing the simulation name and its configuration dictionary.
parameter (str) – The base name of the parameter to expand (e.g., ‘Temp’, ‘Pressure’).
overwrite_config (dict, optional) – A dictionary of command-line overwrites that may apply to the list or range itself. Defaults to an empty dictionary.

Returns:

A list of expanded simulations, each as a (name, config) tuple.

Return type:

list[tuple[str, dict]]
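The expansion described above can be sketched in a few lines. This is an illustrative re-implementation, not the mdse source: the function name `expand_parameter_sketch` is hypothetical, and the sketch assumes that the parameter lives in the ENSAMBLE section, that `stop` is exclusive (as in Python's `range`), and that the value is appended to the simulation name to keep names unique. None of these details are confirmed by the documentation.

```python
from copy import deepcopy


def expand_parameter_sketch(simulation, parameter):
    """Return one (name, config) tuple per value of `parameter` (hypothetical helper)."""
    name, config = simulation
    ensemble = config.get("ENSAMBLE", {})  # assumption: Temp/Pressure live here
    list_key, range_key = f"{parameter}_list", f"{parameter}_range"

    if list_key in ensemble:
        values = list(ensemble[list_key])
    elif range_key in ensemble:
        spec = ensemble[range_key]
        if spec["step"] <= 0:
            raise ValueError("step must be positive")
        # Assumption: `stop` is exclusive, mirroring Python's range().
        values, value = [], spec["start"]
        while value < spec["stop"]:
            values.append(value)
            value += spec["step"]
    else:
        # Single value or parameter absent: pass the simulation through.
        return [simulation]

    expanded = []
    for value in values:
        new_config = deepcopy(config)  # never mutate the caller's config
        new_ensemble = new_config["ENSAMBLE"]
        new_ensemble.pop(list_key, None)
        new_ensemble.pop(range_key, None)
        new_ensemble[parameter] = value
        # Assumption: the naming scheme below is invented for this sketch.
        expanded.append((f"{name}_{parameter}{value}", new_config))
    return expanded


sim = ("Cu", {"ENSAMBLE": {"Ensamble": "NVT",
                           "Temp_range": {"start": 100, "stop": 400, "step": 100}}})
expanded = expand_parameter_sketch(sim, "Temp")
```

Under these assumptions, `expanded` holds three simulations with Temp set to 100, 200 and 300, and the original `sim` is left unchanged.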

mdse.parser.get_files(simulations, config_file_path)

Expand simulations where the crystal source is a directory.

If a simulation’s CRYSTAL.Filepath points to a directory, this function creates a new, separate simulation configuration for each file found within that directory. If the path is a single file or not a file-based crystal source, the simulation is passed through unchanged.

Parameters:

simulations (list[dict]) – A list of simulation configurations (potentially with directory paths).
config_file_path (str or Path) – The path to the original YAML config file, used to resolve relative directory paths.

Returns:

An expanded list of simulation configurations where all directory paths have been resolved to individual files.

Return type:

list[dict]
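A minimal sketch of the directory expansion, using only `pathlib`. It is not the mdse implementation: the name `get_files_sketch` is hypothetical, and the sketch assumes that each simulation dictionary carries its CRYSTAL section at the top level, that only TYPE `FILE` is expanded, that relative paths are resolved against the folder containing the YAML file, and that the expanded Filepath is stored as an absolute path string.

```python
from copy import deepcopy
from pathlib import Path


def get_files_sketch(simulations, config_file_path):
    """Create one simulation per file when CRYSTAL.Filepath is a directory."""
    base_dir = Path(config_file_path).resolve().parent
    expanded = []
    for sim in simulations:
        crystal = sim.get("CRYSTAL", {})
        filepath = crystal.get("Filepath")
        # Assumption: only file-based sources are candidates for expansion.
        if crystal.get("TYPE") != "FILE" or filepath is None:
            expanded.append(sim)
            continue
        target = base_dir / filepath
        if not target.is_dir():
            expanded.append(sim)  # a single file passes through unchanged
            continue
        # One simulation per regular file, sorted for a reproducible order.
        for entry in sorted(p for p in target.iterdir() if p.is_file()):
            new_sim = deepcopy(sim)
            new_sim["CRYSTAL"]["Filepath"] = str(entry)
            expanded.append(new_sim)
    return expanded
```

Sorting the directory entries is a choice made for this sketch so that repeated runs produce the simulations in the same order; the documentation does not state what order mdse uses.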

mdse.parser.main_read(filename, overwrite_config={})

Reads from a YAML file, then un-nests the MD simulations by expanding any parameters specified as lists or ranges.

This is the main entry point for parsing. It orchestrates reading the YAML file, expanding parameter lists/ranges, applying overwrites, and finally expanding directory paths into individual file paths.

Parameters:

filename (str or Path) – Path to the main YAML configuration file.
overwrite_config (dict, optional) – A dictionary of parameters to overwrite. Defaults to an empty dict.

Returns:

A final list of fully specified, individual simulation configurations.

Return type:

list[dict]

mdse.parser.nest_lennard_jones(og_config, overwrite_config)

Special handler to overwrite nested Lennard-Jones parameters.

The LennardJones calculator can have parameters like epsilon and sigma defined as nested lists (matrices). This function correctly applies overwrites to specific elements within these nested structures.

Examples

To overwrite the first row of the epsilon matrix for a Lennard-Jones potential, you can target a specific index in the CalculatorParams.

Command:

mdse simulate ... -c CalculatorParams.epsilon.0=0.3

Effect on Configuration:

This command transforms the CalculatorParams from:

{
    "elements": [0],
    "epsilon": [[0.226738]],
    "sigma": [[0.70641]],
    "rCut": [[1.3]]
}

to:

{
    "elements": [0],
    "epsilon": [[0.3]],
    "sigma": [[0.70641]],
    "rCut": [[1.3]]
}

Parameters:

og_config (dict) – The original Lennard-Jones parameter dictionary from the simulation configuration.
overwrite_config (dict) – The overwrite values for the Lennard-Jones parameters, typically parsed from the command line.

Returns:

The updated Lennard-Jones parameter dictionary.

Return type:

dict

Notes

This function modifies a copy of the original config and returns it.
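The copy-then-overwrite behaviour can be sketched as follows. This is a hypothetical re-implementation (`nest_lennard_jones_sketch` is not part of mdse). It assumes that the overwrite dictionary uses keys of the form `"epsilon.0"` (parameter name, then row index) and that a scalar value fills every entry of the targeted row. The documentation confirms only the before/after result of the example above, not these details.

```python
from copy import deepcopy


def nest_lennard_jones_sketch(og_config, overwrite_config):
    """Apply row-level overwrites to nested Lennard-Jones parameters."""
    updated = deepcopy(og_config)  # leave the caller's dictionary untouched
    for key, value in overwrite_config.items():
        # Assumption: keys look like "epsilon.0" (name, then row index).
        name, _, index = key.partition(".")
        if index == "":
            updated[name] = value  # no index: replace the whole parameter
            continue
        row = updated[name][int(index)]
        # Assumption: a scalar overwrite fills every entry of the row.
        updated[name][int(index)] = [value] * len(row)
    return updated


params = {"elements": [0], "epsilon": [[0.226738]],
          "sigma": [[0.70641]], "rCut": [[1.3]]}
updated = nest_lennard_jones_sketch(params, {"epsilon.0": 0.3})
```

With the single-element matrix from the example, `updated["epsilon"]` becomes `[[0.3]]` while `params` keeps its original value.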

mdse.parser.read_yaml_simulations(filename)

Reads a YAML file containing MD simulation configurations and returns it as a dictionary.

Parameters:

filename (str or Path) – Path to the configuration YAML file.

Returns:

A dictionary containing the simulation configurations parsed from the YAML file.

Return type:

dict

mdse.parser.unnest_simulation_parameters(all_simulations, overwrite_config={})

Expands nested parameters (lists or ranges) in the simulation configurations and applies command-line overwrites.

This function iterates through each simulation defined in the input dictionary. For parameters specified as a list (e.g., Temp_list) or a range (e.g., Pressure_range), it generates distinct simulation configurations for each value.

After expansion, it applies any overwrite_config values, modifying the generated configurations.

Parameters:

all_simulations (dict) – A dictionary of simulation configurations, as read from the YAML file.
overwrite_config (dict, optional) – A dictionary of parameters to overwrite in the configurations. Defaults to an empty dictionary.

Returns:

A list of fully expanded and overwritten simulation configurations.

Return type:

list[dict]
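When a simulation defines more than one expandable parameter, expanding them one after another yields every combination of values. The sketch below illustrates this with `itertools.product`. It is a simplified, hypothetical stand-in (`unnest_lists_sketch` is not an mdse function): it handles only the `_list` form, ignores ranges and overwrites, and assumes the parameters live in the ENSAMBLE section and that combinations are formed as a Cartesian product.

```python
from copy import deepcopy
from itertools import product


def unnest_lists_sketch(all_simulations, parameters=("Temp", "Pressure")):
    """Expand every `<param>_list` into one configuration per combination."""
    expanded = []
    for name, config in all_simulations.items():
        ensemble = config.get("ENSAMBLE", {})
        # Candidate values per parameter; [None] marks "nothing to expand".
        choices = [ensemble.get(f"{p}_list", [None]) for p in parameters]
        for combo in product(*choices):
            new_config = deepcopy(config)
            for p, value in zip(parameters, combo):
                if value is not None:
                    new_config["ENSAMBLE"].pop(f"{p}_list")
                    new_config["ENSAMBLE"][p] = value
            expanded.append(new_config)
    return expanded


sims = {
    "Cu": {"ENSAMBLE": {"Ensamble": "NPT",
                        "Temp_list": [300, 600],
                        "Pressure_list": [0.0, 0.1, 0.2]}},
    "Al": {"ENSAMBLE": {"Ensamble": "NVE", "Temp": 300}},
}
configs = unnest_lists_sketch(sims)
```

Here the "Cu" entry expands into 2 × 3 = 6 configurations and "Al" passes through unchanged, so `configs` has seven entries.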