API reference
This section contains the automatic API reference for the Cif and CifEnsemble modules in the cifkit package.
cifkit
Cif
CN_best_methods
property
Determines the optimal coordination method for each atomic site.
For each atomic site, the coordination polyhedron is generated for each method in self.CN_max_gap_per_site. The method with the smallest value of polyhedron_metrics["distance_from_avg_point_to_center"], indicating the highest symmetry of the polyhedron, is selected as the "best method" among the four methods used to determine the CN gap in self.CN_max_gap_per_site.
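The selection rule can be sketched as follows. This is a minimal illustration, not cifkit's implementation: the `pick_best_method` helper and the `site_metrics` input are hypothetical, and the metric values are invented purely to show the comparison.

```python
def pick_best_method(metrics_by_method: dict[str, dict[str, float]]) -> str:
    """Return the method whose polyhedron has the smallest
    "distance_from_avg_point_to_center" (treated as the most symmetric)."""
    return min(
        metrics_by_method,
        key=lambda m: metrics_by_method[m]["distance_from_avg_point_to_center"],
    )


# Hypothetical metrics for one atomic site (values invented for illustration).
site_metrics = {
    "dist_by_shortest_dist": {"distance_from_avg_point_to_center": 0.12},
    "dist_by_CIF_radius_sum": {"distance_from_avg_point_to_center": 0.05},
    "dist_by_CIF_radius_refined_sum": {"distance_from_avg_point_to_center": 0.31},
    "dist_by_Pauling_radius_sum": {"distance_from_avg_point_to_center": 0.09},
}
best = pick_best_method(site_metrics)
```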
Returns:
Type | Description |
---|---|
dict[str, dict[str, float | int | str]]
|
Dictionary where each key represents an atomic site, and the corresponding value is a dictionary containing:
|
Examples:
CN_max_gap_per_site
property
Determines the maximum gap in coordination number (CN) for each atomic site.
For each atomic site, considers the first 20 nearest neighbors. The distances to these neighbors are normalized based on four methods:
- dist_by_shortest_dist: Normalization by the shortest distance from the site.
- dist_by_CIF_radius_sum: Normalization by the sum of CIF radii.
- dist_by_CIF_radius_refined_sum: Normalization by the sum of refined CIF radii.
- dist_by_Pauling_radius_sum: Normalization by the sum of Pauling radii.
The radius sums are calculated for each element pair involved. For each normalization method, the maximum gap is determined as the largest difference between consecutive normalized distances (i.e., the difference between the nth and (n-1)th neighbors).
This CN gap provides insight into the bonding relevance for each site.
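The gap search can be sketched for the dist_by_shortest_dist normalization as follows. This is a minimal sketch under stated assumptions: `max_gap_and_cn` is a hypothetical helper, the distances are invented, and the CN is assumed to be the number of neighbors that come before the largest gap.

```python
def max_gap_and_cn(distances: list[float], max_neighbors: int = 20) -> tuple[float, int]:
    """Return (largest normalized gap, number of neighbors before that gap)."""
    ordered = sorted(distances)[:max_neighbors]
    shortest = ordered[0]
    # Normalize every distance by the shortest one (dist_by_shortest_dist).
    normalized = [d / shortest for d in ordered]
    # gaps[i] is the jump between neighbor i + 1 and neighbor i + 2 (1-based).
    gaps = [b - a for a, b in zip(normalized, normalized[1:])]
    widest = max(range(len(gaps)), key=gaps.__getitem__)
    return round(gaps[widest], 3), widest + 1


# Invented distances in Angstroms: four close neighbors, then a clear jump.
max_gap, cn = max_gap_and_cn([2.0, 2.0, 2.0, 2.0, 3.0, 3.1])
```

The radius-based methods would differ only in the divisor: each distance would be divided by the radius sum of the element pair involved instead of by the shortest distance.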
Returns:
Type | Description |
---|---|
dict of dict of dict | A dictionary where each key represents an atomic site, mapping to another dictionary with normalization methods as keys. Each normalization method contains a dictionary with: |
Examples:
>>> cif.CN_max_gap_per_site
{
    "In1": {
        "dist_by_shortest_dist": {"max_gap": 0.306, "CN": 14},
        "dist_by_CIF_radius_sum": {"max_gap": 0.39, "CN": 14},
        "dist_by_CIF_radius_refined_sum": {"max_gap": 0.341, "CN": 12},
        "dist_by_Pauling_radius_sum": {"max_gap": 0.398, "CN": 14},
    },
    "U1": {
        "dist_by_shortest_dist": {"max_gap": 0.197, "CN": 11},
        "dist_by_CIF_radius_sum": {"max_gap": 0.312, "CN": 11},
        "dist_by_CIF_radius_refined_sum": {"max_gap": 0.27, "CN": 17},
        "dist_by_Pauling_radius_sum": {"max_gap": 0.256, "CN": 17},
    },
    "Rh1": {
        "dist_by_shortest_dist": {"max_gap": 0.315, "CN": 9},
        "dist_by_CIF_radius_sum": {"max_gap": 0.347, "CN": 9},
        "dist_by_CIF_radius_refined_sum": {"max_gap": 0.418, "CN": 9},
        "dist_by_Pauling_radius_sum": {"max_gap": 0.402, "CN": 9},
    },
    "Rh2": {
        "dist_by_shortest_dist": {"max_gap": 0.31, "CN": 9},
        "dist_by_CIF_radius_sum": {"max_gap": 0.324, "CN": 9},
        "dist_by_CIF_radius_refined_sum": {"max_gap": 0.397, "CN": 9},
        "dist_by_Pauling_radius_sum": {"max_gap": 0.380, "CN": 9},
    },
}
connections_flattened
property
Transform site connections into a sorted list of tuples, each containing a pair of alphabetically sorted element symbols and the distance between them.
Returns:
Type | Description |
---|---|
list[tuple[tuple[str, str], float]] | A sorted list of tuples, each containing a pair of alphabetically sorted element symbols and the distance between them. |
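The flattening step can be sketched as follows. This is an illustration under stated assumptions: the shape of the `connections` input (site label mapped to a list of neighbor label and distance), the `element_of` and `flatten_connections` helpers, and the choice to sort by distance are all hypothetical, and the distances are illustrative.

```python
import re


def element_of(site_label: str) -> str:
    """Extract the element symbol from a site label such as "In1" (assumed format)."""
    match = re.match(r"[A-Z][a-z]?", site_label)
    if match is None:
        raise ValueError(f"Cannot parse element from {site_label!r}")
    return match.group(0)


def flatten_connections(
    connections: dict[str, list[tuple[str, float]]],
) -> list[tuple[tuple[str, str], float]]:
    flattened = []
    for site_label, neighbors in connections.items():
        for neighbor_label, distance in neighbors:
            # Sort the two element symbols alphabetically within the pair.
            pair = tuple(sorted((element_of(site_label), element_of(neighbor_label))))
            flattened.append((pair, distance))
    # Assumed ordering: by distance, then by pair to break ties.
    return sorted(flattened, key=lambda item: (item[1], item[0]))


# Hypothetical connection data with illustrative distances in Angstroms.
connections = {
    "In1": [("Rh1", 2.697), ("U1", 3.2)],
    "U1": [("Rh1", 2.983)],
}
flat = flatten_connections(connections)
```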
Examples:
radius_sum
property
Retrieve the sums of the CIF radius, CIF_refined radius, and Pauling CN12 radius for the shortest bonding pairs of elements.
Returns:
Type | Description |
---|---|
dict[str, dict[str, float]] | Dictionary where each key is a radius type and the value is a dictionary with the key being a bond pair of elements and the value being the total radius in Angstroms. |
Examples:
>>> cif.radius_sum
{
    "CIF_radius_sum": {
        "In-In": 3.248,
        "In-Rh": 2.969,
        "In-U": 3.001,
        "Rh-Rh": 2.69,
        "Rh-U": 2.722,
        "U-U": 2.754,
    },
    "CIF_radius_refined_sum": {
        "In-In": 2.657,
        "In-Rh": 2.697,
        "In-U": 2.943,
        "Rh-Rh": 2.737,
        "Rh-U": 2.983,
        "U-U": 3.229,
    },
    "Pauling_radius_sum": {
        "In-In": 3.32,
        "In-Rh": 3.002,
        "In-U": 3.176,
        "Rh-Rh": 2.684,
        "Rh-U": 2.858,
        "U-U": 3.032,
    },
}
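The pairwise sums in the example output above can be reproduced with a short sketch. The per-element CIF radii below are not taken from cifkit's data files; they are inferred as half of each same-element sum in the example (for instance, In = 3.248 / 2), and `radius_sums` is a hypothetical helper.

```python
from itertools import combinations_with_replacement

# Per-element CIF radii in Angstroms, inferred from the X-X sums in the example.
cif_radius = {"In": 1.624, "Rh": 1.345, "U": 1.377}


def radius_sums(radii: dict[str, float]) -> dict[str, float]:
    """Sum the radii for every alphabetically ordered element pair."""
    return {
        f"{a}-{b}": round(radii[a] + radii[b], 3)
        for a, b in combinations_with_replacement(sorted(radii), 2)
    }


sums = radius_sums(cif_radius)
```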
radius_values
property
Retrieve CIF radius, CIF_refined radius, and Pauling CN12 radius for each element.
This property uses lazy loading to compute or retrieve radius values only when needed, optimizing performance. The CIF radius and Pauling CN12 radius are standard values sourced from data/radius.py for each element. In contrast, the CIF_refined radius is calculated based on bonding distances to ensure accuracy across different environments.
- CIF_radius: The standard radius value commonly determined from elemental .cif files, representing the approximate size of an atom within a crystal structure.
- CIF_radius_refined: An optimized radius calculated so that, across all bonding pairs, the sum of the two radii in a bonded pair matches the shortest unique observed bond distance as closely as possible. This refinement is designed to improve packing efficiency within a coordination polyhedron.
- Pauling_radius_CN12: The Pauling radius of the element, calculated with a coordination number (CN) of 12, providing a basis for comparison with other radius types.
Returns:
Type | Description |
---|---|
dict[str, dict[str, float]] | A dictionary where each key is an atomic label (e.g., "In", "Rh", "U"), and the corresponding value is a dictionary with radius information in Angstroms: |
Examples:
shortest_bond_pair_distance
property
Determine the minimum distance for every possible unique pair of elements. This property uses lazily loaded connections to compute the distances if they are not already available.
Returns:
Type | Description |
---|---|
dict[tuple[str, str], float] | Dictionary where each key is a tuple of element symbols and the float value is the shortest distance between that pair of elements in Angstroms. |
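The reduction to one minimum per element pair can be sketched as follows. This is a hedged illustration: `shortest_distance_per_pair` is a hypothetical helper, and the input is assumed to be a list of (element pair, distance) tuples with illustrative distances.

```python
def shortest_distance_per_pair(
    flattened: list[tuple[tuple[str, str], float]],
) -> dict[tuple[str, str], float]:
    """Keep only the smallest distance seen for each element pair."""
    shortest: dict[tuple[str, str], float] = {}
    for pair, distance in flattened:
        if pair not in shortest or distance < shortest[pair]:
            shortest[pair] = distance
    return shortest


# Illustrative input: two In-Rh contacts and one Rh-U contact (Angstroms).
pairs = [(("In", "Rh"), 2.9), (("In", "Rh"), 2.697), (("Rh", "U"), 2.983)]
result = shortest_distance_per_pair(pairs)
```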
Examples:
shortest_distance
property
Retrieve the shortest atomic distance within the crystal structure. This property is lazily loaded: the @ensure_connections decorator ensures that all necessary connections are computed before the value is read. The result is the minimum distance between any pair of atoms in the connection data.
Returns:
Type | Description |
---|---|
float | The shortest distance between any two connected atoms in the crystal structure, in Angstroms. |
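The lazy-loading pattern described here can be sketched with a small decorator. Only the name `ensure_connections` comes from this reference; the decorator body, the `LazyStructure` class, and its data are hypothetical and are not cifkit's code.

```python
import functools


def ensure_connections(func):
    """Compute connections on first use, then call the wrapped function."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if self.connections is None:
            self.compute_connections()
        return func(self, *args, **kwargs)

    return wrapper


class LazyStructure:
    """Hypothetical stand-in for a class with lazily computed connections."""

    def __init__(self):
        self.connections = None
        self.compute_calls = 0

    def compute_connections(self):
        # Stand-in for the expensive neighbor search (illustrative data).
        self.compute_calls += 1
        self.connections = {"In1": [("Rh1", 2.697)], "Rh1": [("In1", 2.697)]}

    @property
    @ensure_connections
    def shortest_distance(self):
        return min(d for nbrs in self.connections.values() for _, d in nbrs)


structure = LazyStructure()
first = structure.shortest_distance   # triggers compute_connections()
second = structure.shortest_distance  # reuses the stored connections
```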
shortest_site_pair_distance
property
Retrieves the shortest distance from each unique atomic site in the crystal structure. This property uses lazily loaded connections to compute these distances if they are not already available.
Returns:
Type | Description |
---|---|
dict[str, tuple[str, float]] | Dictionary where each key is an atomic label and the value is a tuple containing the label of the closest atomic site and the shortest distance to it in Angstroms. |
Examples:
__init__(file_path, is_formatted=False, logging_enabled=False)
Initializes an object from a .cif file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_path | str | Path to the .cif file. | required |
is_formatted | bool | If False, preprocess the .cif file to ensure compatibility with the gemmi library. Default is False. | False |
logging_enabled | bool | Enables detailed logging during initialization and for distance calculations. Default is False. | False |
Attributes:
Name | Type | Description |
---|---|---|
file_path | str | Path to the CIF file from which data is loaded. |
logging_enabled | bool | Enables detailed logging for initialization and distance calculations if set to True. |
file_name | str | Base name of the CIF file, extracted from |
file_name_without_ext | str | File name without its extension, useful for referencing or generating derivative files. |
db_source | str | Source database (e.g., ICSD, MP, CCDC, PCD) from which the CIF file originates, determined at runtime. |
unitcell_lengths | list[float] | List of unit cell lengths for the crystal structure, typically in Angstroms. |
unitcell_angles | list[float] | List of unit cell angles in radians, ordered by alpha, beta, gamma. |
site_labels | list[str] | Lists all unique atomic site labels. |
unique_elements | set[str] | Set of unique chemical elements present in the CIF file. |
atom_site_info | dict[str, any] | Dictionary containing detailed information about each atomic site including element, site occupancy, fractional coordinates, symmetry, and multiplicity. |
composition_type | int | Number of unique elements present in the .cif file, e.g., 1 for unary, 2 for binary, etc. |
tag | str | Additional tag associated with the CIF data, parsed from the third line of PCD .cif files. |
bond_pairs | set[tuple[str, str]] | Set of tuples representing bonded pairs of elements. |
site_label_pairs | set[tuple[str, str]] | Set of tuples representing pairs of atomic site labels. |
bond_pairs_sorted_by_mendeleev | set[tuple[str, str]] | Set of bonded pairs sorted according to Mendeleev Numbers. |
site_label_pairs_sorted_by_mendeleev | set[tuple[str, str]] | Set of site label pairs sorted by Mendeleev Numbers. |
site_mixing_type | str | Descriptor of the mixing type, categorized into four types. Full occupancy: a single atomic site occupies the fractional coordinate with an occupancy value of 1. Full occupancy with mixing: multiple atomic sites collectively occupy the fractional coordinate with occupancies summing to 1. Deficiency without mixing: a single atomic site occupies the fractional coordinate with an occupancy less than 1. Deficiency with atomic mixing: multiple atomic sites occupy the fractional coordinate with occupancies summing to less than 1. |
is_radius_data_available | bool | Indicates whether Pauling and CIF atomic radii are available for all elements in the .cif file. |
mixing_info_per_label_pair | dict | Dictionary mapping pairs of labels to their mixing information. |
mixing_info_per_label_pair_sorted_by_mendeleev | dict | Same as |
unitcell_points | list[list[tuple[float, float, float, str]]] | List of points defining the unit cell; each point contains fractional coordinates and a site label. |
supercell_points | list[list[tuple[float, float, float, str]]] | List of points defining the supercell. For each .cif file, a unit cell is generated by applying the symmetry operations. A supercell is generated by applying ±1 shifts from the unit cell. |
unitcell_atom_count | int | Total count of atoms within the unit cell. |
supercell_atom_count | int | Total count of atoms within the generated supercell incorporating ±1, ±1, ±1 translations. |
connections | None or dict | Initially None, intended to store connection data related to the crystal structure. Connections are computed lazily and are only calculated when first needed by a method or property requiring them. |
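The four site_mixing_type categories in the attributes table can be expressed as a small decision rule. This is a sketch under stated assumptions: `classify_position` is a hypothetical helper that takes the occupancies of all atomic sites sharing one fractional coordinate, the returned strings are the category names from the description rather than cifkit's actual return values, and the tolerance is an arbitrary choice.

```python
import math


def classify_position(occupancies: list[float], tol: float = 1e-6) -> str:
    """Classify one fractional coordinate from the occupancies of its sites."""
    total = sum(occupancies)
    if total > 1.0 + tol:
        raise ValueError(f"Occupancies sum to {total}, which exceeds 1")
    is_full = math.isclose(total, 1.0, abs_tol=tol)
    is_mixed = len(occupancies) > 1  # more than one site on the coordinate
    if is_full:
        return "Full occupancy with mixing" if is_mixed else "Full occupancy"
    if is_mixed:
        return "Deficiency with atomic mixing"
    return "Deficiency without mixing"


labels = [
    classify_position([1.0]),
    classify_position([0.5, 0.5]),
    classify_position([0.8]),
    classify_position([0.3, 0.4]),
]
```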
compute_connections(cutoff_radius=10.0)
Compute the connection network, shortest distances, bond counts, and coordination numbers (CN). These properties are lazily loaded to avoid unnecessary computation during the initialization and pre-processing steps.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cutoff_radius | float | The distance threshold in Angstroms used to consider two atoms as connected, by default 10.0 | 10.0 |
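The role of cutoff_radius can be sketched with a brute-force neighbor search. This is an illustration only: `neighbors_within_cutoff` is a hypothetical helper, the points are assumed to be already converted to Cartesian coordinates in Angstroms (the stored points use fractional coordinates), and the coordinates are invented.

```python
import math

# (x, y, z, site label), with Cartesian coordinates in Angstroms (assumption).
Point = tuple[float, float, float, str]


def neighbors_within_cutoff(
    center: Point, points: list[Point], cutoff_radius: float = 10.0
) -> list[tuple[str, float]]:
    """Return (label, distance) for every point within cutoff_radius of center."""
    found = []
    for x, y, z, label in points:
        distance = math.dist(center[:3], (x, y, z))
        # Skip the center itself (zero distance) and anything past the cutoff.
        if 0.0 < distance <= cutoff_radius:
            found.append((label, round(distance, 3)))
    return sorted(found, key=lambda item: item[1])


center = (0.0, 0.0, 0.0, "In1")
points = [
    center,
    (3.0, 0.0, 0.0, "Rh1"),
    (0.0, 4.0, 0.0, "U1"),
    (0.0, 0.0, 12.0, "Rh2"),  # beyond the default 10.0 cutoff
]
neighbors = neighbors_within_cutoff(center, points)
```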
plot_polyhedron(site_label, show_labels=True, is_displayed=False, output_dir=None)
Plot a polyhedron structure and optionally save it.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
site_label | str | Central site label for the polyhedron | required |
show_labels | bool | Whether to display vertex labels, by default True | True |
is_displayed | bool | Display plot interactively, by default False | False |
output_dir | str | Directory to save the plot, by default None | None |
CifEnsemble
CN_unique_values_by_best_methods: set[str]
property
Returns:
Type | Description |
---|---|
set[str] | Unique coordination number values determined by the best methods across all .cif files. |
CN_unique_values_by_min_dist_method: set[str]
property
Returns:
Type | Description |
---|---|
set[str] | Unique coordination number values by minimum distance method from all .cif files. |
unique_composition_types: set[int]
property
unique_elements: set[str]
property
unique_formulas: set[str]
property
unique_site_mixing_types: set[int]
property
unique_space_group_names: set[str]
property
unique_space_group_numbers: set[str]
property
unique_structures: set[str]
property
unique_tags: set[str]
property
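The unique_* properties above all return a set aggregated over the Cif objects in the ensemble. A minimal sketch of that aggregation, assuming a hypothetical `ParsedCif` stand-in with only two fields and illustrative formulas, might look like this:

```python
from dataclasses import dataclass


@dataclass
class ParsedCif:
    """Hypothetical stand-in for a parsed Cif object."""

    formula: str
    unique_elements: set[str]


def unique_formulas(cifs: list[ParsedCif]) -> set[str]:
    # One entry per distinct formula, regardless of how many files share it.
    return {cif.formula for cif in cifs}


def unique_elements(cifs: list[ParsedCif]) -> set[str]:
    # Union of the per-file element sets.
    return set().union(*(cif.unique_elements for cif in cifs))


cifs = [
    ParsedCif("URhIn", {"U", "Rh", "In"}),
    ParsedCif("URhIn", {"U", "Rh", "In"}),
    ParsedCif("LaRu2Ge2", {"La", "Ru", "Ge"}),
]
formulas = unique_formulas(cifs)
elements = unique_elements(cifs)
```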
__init__(cif_dir_path, add_nested_files=False, preprocess=True, logging_enabled=False)
Initialize a CifEnsemble object, containing a collection of Cif objects.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cif_dir_path | str | Path to the folder containing .cif file(s). | required |
add_nested_files | bool | Option to include .cif files contained in sub-directories within cif_dir_path, by default False | False |
preprocess | bool | Option to edit .cif files before initializing each into a Cif object, by default True. Preprocess modifies atomic site labels in atom_site_label. Some site labels may contain a comma or a symbol like M due to atomic mixing. It reformats each atom_site_label so it can be parsed into an element type matching atom_site_type_symbol. For PCD databases, addresses in publ_author_address often have an incorrect format requiring manual modifications. It also relocates any ill-formatted files, such as those with duplicate labels in atom_site_label, missing fractional coordinates, or files requiring supercell generation. | True |
logging_enabled | bool | Option to log while pre-processing Cif objects, by default False | False |
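The effect of add_nested_files can be sketched with pathlib. This is a hedged illustration of the file discovery step only: `find_cif_files` is a hypothetical helper, not cifkit's function, and it performs none of the preprocessing described above.

```python
from pathlib import Path


def find_cif_files(cif_dir_path: str, add_nested_files: bool = False) -> list[str]:
    """List .cif files in a folder, optionally including sub-directories."""
    root = Path(cif_dir_path)
    # "**/*.cif" matches files in the folder itself and in any sub-directory.
    pattern = "**/*.cif" if add_nested_files else "*.cif"
    return sorted(str(path) for path in root.glob(pattern))
```

With a folder holding `a.cif` and `sub/b.cif`, the default call would return only `a.cif`, while `add_nested_files=True` would return both.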
Attributes:
Name | Type | Description |
---|---|---|
dir_path | str | Path to the folder containing .cif files |
file_paths | list[str] | List of file paths to .cif files |
cifs | list[Cif] | List of Cif objects |
file_count | int | Number of .cif files in the folder |
logging_enabled | bool | Option to log while pre-processing Cif objects |
copy_cif_files(file_paths, to_directory_path)
Copy a set of CIF files to a destination directory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_paths | set[str] | Set of file paths to CIF files. | required |
to_directory_path | str | Destination directory path. | required |
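The copy operation can be sketched with the standard library. This is an illustration rather than cifkit's implementation: `copy_files` is a hypothetical helper, and creating the destination directory when it is missing is an assumption.

```python
import shutil
from pathlib import Path


def copy_files(file_paths: set[str], to_directory_path: str) -> None:
    """Copy each file into the destination directory, keeping its file name."""
    destination = Path(to_directory_path)
    # Assumption: create the destination if it does not exist yet.
    destination.mkdir(parents=True, exist_ok=True)
    for file_path in file_paths:
        shutil.copy(file_path, destination / Path(file_path).name)
```

Replacing `shutil.copy` with `shutil.move` in the same loop would give the corresponding move behavior, where the source files no longer remain in place.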
Examples:
generate_CN_by_best_methods_histogram(display=False, output_dir=None)
Generate a histogram of the 'CN_by_best_methods' property from CIF files.
This method creates a histogram based on the 'CN_by_best_methods' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_CN_by_min_dist_method_histogram(display=False, output_dir=None)
Generate a histogram of the 'CN_by_min' property from CIF files.
This method creates a histogram based on the 'CN_by_min' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_composition_type_histogram(display=False, output_dir=None)
Generate a histogram of the 'composition_type' property from CIF files.
This method creates a histogram based on the 'composition_type' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_elements_histogram(display=False, output_dir=None)
Generate a histogram of the 'unique_elements' property from CIF files.
This method creates a histogram based on the 'unique_elements' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_formula_histogram(display=False, output_dir=None)
Generate a histogram of the 'formula' property from CIF files.
This method creates a histogram based on the 'formula' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_site_mixing_type_histogram(display=False, output_dir=None)
Generate a histogram of the 'site_mixing_type' property from CIF files.
This method creates a histogram based on the 'site_mixing_type' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_space_group_name_histogram(display=False, output_dir=None)
Generate a histogram of the 'space_group_name' property from CIF files.
This method creates a histogram based on the 'space_group_name' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_space_group_number_histogram(display=False, output_dir=None)
Generate a histogram of the 'space_group_number' property from CIF files.
This method creates a histogram based on the 'space_group_number' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_structure_histogram(display=False, output_dir=None)
Generate a histogram of the 'structure' property from CIF files.
This method creates a histogram based on the 'structure' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_supercell_size_histogram(display=False, output_dir=None)
Generate a histogram of the 'supercell_count' property from CIF files.
This method creates a histogram based on the 'supercell_count' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
generate_tag_histogram(display=False, output_dir=None)
Generate a histogram of the 'tag' property from CIF files.
This method creates a histogram based on the 'tag' statistics of the CIF files. If 'output_dir' is specified, the histogram image (.png) will be saved to that directory. If 'output_dir' is not specified, the image will be saved to the directory specified by 'self.dir_path'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
display | bool | If True, the plot is displayed using plt.show(). Default is False. | False |
output_dir | str | The directory path where the histogram should be saved. If None, the histogram is saved in the directory defined by 'self.dir_path'. | None |
move_cif_files(file_paths, to_directory_path)
Move a set of CIF files to a destination directory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_paths | set[str] | Set of file paths to CIF files. | required |
to_directory_path | str | Destination directory path. | required |
Examples: