site_analysis.tools

Utility functions for site analysis.

This module provides helper functions for finding coordination environments, mapping atoms between structures, and handling periodic boundary conditions. Most functions accept numpy arrays and species lists rather than pymatgen Structure objects.

Key functions include:

Coordination and site analysis:
  • get_coordination_indices: Find atoms with specific coordination environments

  • indices_for_species: Get atom indices matching a species string

Structure mapping and comparison:
  • site_index_mapping: Map site indices between two structures

  • calculate_species_distances: Calculate distances between matching atoms

Periodic boundary handling:
  • x_pbc: Generate fractional coordinates for all periodic images in neighbouring cells

These utilities provide low-level functionality that can be used directly and that is also used internally by the higher-level site and trajectory analysis classes.

calculate_species_distances(frac_coords1: ndarray, frac_coords2: ndarray, lattice_matrix: ndarray, species1: list[str], species2: list[str], species: list[str] | None = None) → tuple[dict[str, list[float]], list[float]]

Calculate minimum distances between atoms of the same species.

For each species, computes the distance from each atom of that species in frac_coords1 to the nearest atom of the same species in frac_coords2.

Parameters:
  • frac_coords1 – Fractional coordinates of first structure, shape (N, 3).

  • frac_coords2 – Fractional coordinates of second structure, shape (M, 3).

  • lattice_matrix – (3, 3) lattice matrix where rows are lattice vectors.

  • species1 – Species strings for each atom in frac_coords1.

  • species2 – Species strings for each atom in frac_coords2.

  • species – Optional filter: only include these species. If None, includes all species present in both structures.

Returns:

A tuple of (species_distances, all_distances) where species_distances maps species to lists of minimum distances, and all_distances is a flat list of all minimum distances.
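The per-species nearest-distance calculation described above can be sketched with plain numpy. This is an illustrative sketch, not the module's implementation: the helper names `min_image_distances` and `species_min_distances` are hypothetical, and the minimum image is found by rounding fractional differences, which is reliable for orthorhombic cells but may not give the true minimum for strongly skewed cells.

```python
import numpy as np

def min_image_distances(frac1, frac2, lattice):
    """Pairwise minimum-image distances, shape (N, M). Hypothetical helper."""
    frac1 = np.asarray(frac1, dtype=float)
    frac2 = np.asarray(frac2, dtype=float)
    # Fractional displacement for every pair, wrapped to the nearest image.
    d = frac1[:, None, :] - frac2[None, :, :]
    d -= np.round(d)
    # Rows of `lattice` are lattice vectors, so Cartesian = fractional @ lattice.
    return np.linalg.norm(d @ np.asarray(lattice, dtype=float), axis=-1)

def species_min_distances(frac1, frac2, lattice, species1, species2, species=None):
    """Nearest same-species distance for each atom in frac1 (sketch only)."""
    frac1 = np.asarray(frac1, dtype=float)
    frac2 = np.asarray(frac2, dtype=float)
    labels1 = np.asarray(species1)
    labels2 = np.asarray(species2)
    if species is None:
        # Default: every species present in both structures.
        species = sorted(set(species1) & set(species2))
    per_species = {}
    all_distances = []
    for sp in species:
        i1 = np.flatnonzero(labels1 == sp)
        i2 = np.flatnonzero(labels2 == sp)
        if len(i1) == 0 or len(i2) == 0:
            continue  # nothing to compare for this species
        nearest = min_image_distances(frac1[i1], frac2[i2], lattice).min(axis=1)
        per_species[sp] = nearest.tolist()
        all_distances.extend(nearest.tolist())
    return per_species, all_distances
```

In a 10 Å cubic cell, an atom at fractional x = 0.05 and one at x = 0.95 are about 1 Å apart under this scheme, because the displacement is taken across the cell boundary.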

get_coordination_indices(frac_coords: ndarray, lattice_matrix: ndarray, species: list[str], centre_species: str, coordination_species: str | list[str], cutoff: float, n_coord: int | list[int]) → dict[int, list[int]]

Find atoms with exactly the specified coordination environment.

For each atom of centre_species, finds environments with exactly n_coord coordinating atoms of coordination_species within the cutoff distance.

Parameters:
  • frac_coords – Fractional coordinates for all atoms, shape (N, 3).

  • lattice_matrix – Lattice matrix (3x3) for distance calculations.

  • species – Species strings for each atom.

  • centre_species – Species string identifying the atoms at the centres.

  • coordination_species – Species string or list of strings identifying the coordinating atoms.

  • cutoff – Distance cutoff for neighbour search in Angstroms.

  • n_coord – Number(s) of coordinating atoms required for each environment. If an int is provided, the same number is used for all centre atoms. If a list is provided, it should have the same length as the number of centre atoms found.

Returns:

Dictionary mapping centre atom indices to lists of coordinating atom indices. Only includes environments with exactly n_coord coordinating atoms within cutoff.

Raises:

ValueError – If no centre atoms are found, or if a list of n_coord has incorrect length.
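The "exactly n_coord neighbours within cutoff" filter can be illustrated with a small numpy sketch. The function name `exact_coordination` is hypothetical and the code is not the module's implementation. It assumes a single integer n_coord, an inclusive cutoff, and a minimum image obtained by rounding fractional differences (reliable for orthorhombic cells).

```python
import numpy as np

def exact_coordination(frac_coords, lattice, species, centre, coordinating,
                       cutoff, n_coord):
    """Map each centre index to its neighbours if it has exactly n_coord."""
    frac_coords = np.asarray(frac_coords, dtype=float)
    labels = np.asarray(species)
    centres = np.flatnonzero(labels == centre)
    # `coordinating` may be one species string or a list of them.
    neighbours = np.flatnonzero(np.isin(labels, np.atleast_1d(coordinating)))
    if len(centres) == 0:
        raise ValueError("no centre atoms found")
    # Minimum-image distances from every centre to every candidate neighbour.
    d = frac_coords[centres][:, None, :] - frac_coords[neighbours][None, :, :]
    d -= np.round(d)
    dist = np.linalg.norm(d @ np.asarray(lattice, dtype=float), axis=-1)
    result = {}
    for row, c in enumerate(centres):
        # Candidates inside the cutoff, excluding the centre atom itself.
        within = neighbours[(dist[row] <= cutoff) & (neighbours != c)]
        if len(within) == n_coord:
            result[int(c)] = within.tolist()
    return result
```

Centres with fewer or more neighbours than n_coord are simply left out of the returned dictionary, matching the behaviour described under Returns.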

get_nearest_neighbour_indices(structure: Structure, ref_structure: Structure, vertex_species: list[str], n_coord: int) → list[list[int]]

Returns the atom indices for the N nearest neighbours to each site in a reference structure.

Parameters:
  • structure (pymatgen.Structure) – A pymatgen Structure object, used to select the nearest neighbour indices.

  • ref_structure (pymatgen.Structure) – A pymatgen Structure object. Each site is used to find the set of N nearest neighbours (of the specified atomic species) in structure.

  • vertex_species (list(str)) – List of strings specifying the atomic species of the vertex atoms, e.g. ['S', 'I'].

  • n_coord (int) – Number of matching nearest neighbours to return for each site in ref_structure.

Returns:

N_sites x N_neighbours nested list of vertex atom indices.

Return type:

list(list(int))

Raises:

ValueError – If structure or ref_structure is empty, if vertex_species is empty, if n_coord is not positive, if no atoms match vertex_species, or if there are fewer matching atoms than n_coord.
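The nearest-neighbour selection can be sketched on raw arrays. This is an illustration only: the real function takes pymatgen Structure objects, whereas the hypothetical `n_nearest` below takes fractional coordinates, species labels, and a lattice matrix directly, and uses a rounding-based minimum image (reliable for orthorhombic cells).

```python
import numpy as np

def n_nearest(frac_coords, species, ref_frac_coords, lattice,
              vertex_species, n_coord):
    """For each reference site, indices of the n_coord closest vertex atoms."""
    frac_coords = np.asarray(frac_coords, dtype=float)
    ref_frac_coords = np.asarray(ref_frac_coords, dtype=float)
    # Only atoms of the requested species are candidates.
    candidates = np.flatnonzero(np.isin(np.asarray(species), vertex_species))
    if len(candidates) < n_coord:
        raise ValueError("fewer matching atoms than n_coord")
    d = ref_frac_coords[:, None, :] - frac_coords[candidates][None, :, :]
    d -= np.round(d)
    dist = np.linalg.norm(d @ np.asarray(lattice, dtype=float), axis=-1)
    # Sort candidates by distance for each reference site, keep the first n_coord.
    order = np.argsort(dist, axis=1)[:, :n_coord]
    return candidates[order].tolist()
```

The result is an N_sites x N_neighbours nested list whose entries index into the full atom list, not into the filtered candidate list.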

get_vertex_indices(structure: Structure, centre_species: str, vertex_species: str | list[str], cutoff: float, n_vertices: int | list[int]) → list[list[int]]

DEPRECATED: Find the atom indices for atoms defining the vertices of coordination polyhedra.

This function is deprecated and will be removed in a future version.

Please use one of the following alternatives:
  • get_coordination_indices(): For finding atoms with exact coordination environments

  • ReferenceBasedSites workflow: For generating sites based on reference structures

Given the elemental species of a set of central atoms, A, and of the polyhedral vertices, B, this function finds, for each A, the N closest neighbours B (within some cutoff). The number of neighbours found per central atom can be a single value for all A, or can be provided as a list of values, one for each A.

Parameters:
  • structure – A pymatgen Structure object.

  • centre_species – Species string identifying the atoms at the centres.

  • vertex_species – Species string or list of strings identifying the vertex atoms.

  • cutoff – Distance cutoff for neighbour search.

  • n_vertices – Number(s) of nearest neighbours to return for each set of vertices. If an int is passed, the same number is used for every centre atom. If a list is passed, it should be the same length as the number of atoms of centre species A.

Returns:

Nested list of integers, giving the atom indices for each coordination environment.

Return type:

list(list(int))

indices_for_species(all_species: list[str], target: str) → list[int]

Return indices where all_species matches target.

Parameters:
  • all_species – List of species strings for all atoms.

  • target – Species string to match.

Returns:

List of indices where species matches target.
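The behaviour described is equivalent to a simple list comprehension. The sketch below uses a hypothetical name, `indices_matching`, and assumes exact string equality between each species label and the target.

```python
def indices_matching(all_species, target):
    """Indices of entries in all_species equal to target (illustrative sketch)."""
    return [i for i, label in enumerate(all_species) if label == target]
```

For the species list `["Li", "O", "Li", "P"]` and target `"Li"`, this yields the indices 0 and 2. A target that does not occur gives an empty list.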

site_index_mapping(frac_coords1: ndarray, frac_coords2: ndarray, lattice_matrix: ndarray, species1: list[str], species2: list[str], species1_filter: str | list[str] | None = None, species2_filter: str | list[str] | None = None, one_to_one_mapping: bool = True, return_mapping_distances: bool = False) → ndarray | tuple[ndarray, ndarray]

Compute the site index mapping based on closest distances.

For each selected site in frac_coords1 (filtered by species1_filter), finds the closest site in frac_coords2 (filtered by species2_filter) using minimum-image convention distances.

Parameters:
  • frac_coords1 – Fractional coordinates to map from, shape (N, 3).

  • frac_coords2 – Fractional coordinates to map to, shape (M, 3).

  • lattice_matrix – Lattice matrix (3x3) for distance calculations.

  • species1 – Species strings for each site in frac_coords1.

  • species2 – Species strings for each site in frac_coords2.

  • species1_filter – If given, only map from sites whose species are in this list. Defaults to all species in species1.

  • species2_filter – If given, only map to sites whose species are in this list. Defaults to all species in species2.

  • one_to_one_mapping – If True, raise ValueError when the mapping is not one-to-one. Default is True.

  • return_mapping_distances – If True, also return the distances for each mapped pair.

Returns:

Array of mapped indices into frac_coords2. If return_mapping_distances is True, returns a tuple of (mapping, distances).

Raises:

ValueError – If one_to_one_mapping is True and the mapping is not one-to-one.
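The closest-site mapping and the one-to-one check can be sketched as follows. This is not the module's implementation: `closest_site_mapping` is a hypothetical name, species filtering is omitted for brevity, and the minimum image is found by rounding fractional differences (reliable for orthorhombic cells).

```python
import numpy as np

def closest_site_mapping(frac1, frac2, lattice, one_to_one=True):
    """Index of the closest site in frac2 for each site in frac1, with distances."""
    frac1 = np.asarray(frac1, dtype=float)
    frac2 = np.asarray(frac2, dtype=float)
    d = frac1[:, None, :] - frac2[None, :, :]
    d -= np.round(d)  # wrap each displacement to its nearest periodic image
    dist = np.linalg.norm(d @ np.asarray(lattice, dtype=float), axis=-1)
    mapping = dist.argmin(axis=1)
    # A one-to-one mapping never sends two source sites to the same target.
    if one_to_one and len(np.unique(mapping)) != len(mapping):
        raise ValueError("mapping is not one-to-one")
    return mapping, dist[np.arange(len(mapping)), mapping]
```

Checking the mapped distances is a useful sanity test when matching a relaxed structure to its reference: unexpectedly large values suggest that atoms have moved between sites or that the two structures are ordered differently.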

species_string_from_site(site: Site) → str

Extract the species string from a pymatgen Site object.

Parameters:

site – A pymatgen Site object

Returns:

String representation of the site’s species

x_pbc(x: ndarray)

Return an array of fractional coordinates mapped into all positive neighbouring periodic cells.

Parameters:

x (np.array) – Input fractional coordinates.

Returns:

(8, 3) numpy array of all mapped fractional coordinates, including the original coordinates in the origin calculation cell.

Return type:

np.array

Example

>>> x = np.array([0.1, 0.2, 0.3])
>>> x_pbc(x)
array([[0.1, 0.2, 0.3],
       [1.1, 0.2, 0.3],
       [0.1, 1.2, 0.3],
       [0.1, 0.2, 1.3],
       [1.1, 1.2, 0.3],
       [1.1, 0.2, 1.3],
       [0.1, 1.2, 1.3],
       [1.1, 1.2, 1.3]])
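One way to generate these eight images is to add a fixed table of 0/1 cell shifts to the input point. The sketch below is an assumption about how such a function could be written, not the module's source. The name `positive_images` is hypothetical, and the shift order is chosen to reproduce the row order of the example above.

```python
import numpy as np

# Cell shifts for the origin cell and its seven positive neighbours,
# listed in the same order as the example output above.
POSITIVE_SHIFTS = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
], dtype=float)

def positive_images(x):
    """Return the (8, 3) array of x shifted by each 0/1 lattice translation."""
    # Broadcasting adds the single (3,) point to every row of the shift table.
    return np.asarray(x, dtype=float) + POSITIVE_SHIFTS
```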