Getting started
The recommended way to use SAF is through the SAF app, a command-line interface (CLI) application that automatically detects folders containing .cif files.
Method 1. Using SAF Application
First, download the SAF application from the GitHub repository. You can clone (download) the files using the following command:
git clone https://github.com/bobleesj/structure-analyzer-featurizer-app.git
Note
Alternatively, you can download the ZIP file from the GitHub repository (https://github.com/bobleesj/structure-analyzer-featurizer-app) by clicking the green Code button and Download ZIP. After downloading, extract the contents of the ZIP file to a directory of your choice.
Next, navigate to the directory and install the required package using pip:
cd structure-analyzer-featurizer-app
pip install structure-analyzer-featurizer
You can then run the application by executing the following command:
python main.py
After you run python main.py, the application lists the folders in which it detected .cif files and asks whether to process them:
Folders with .cif files detected:
1. 20240902_PCD_demo_files (20 files)
2. 20240902_ICSD_demo_files (20 files)
Would you like to process each folder above sequentially?
(Default: Y) [Y/n]:
Press Enter to generate structure features for the .cif files in the chosen folder. When processing finishes, .csv files are saved in the chosen project directory, including csv/<composition-type>_features.csv and csv/universal_features.csv.
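To check what was written, a short script can list the generated files. This is a minimal sketch that assumes only the csv/ subfolder layout and the *_features.csv naming described above; the helper names find_feature_csvs and read_header are hypothetical and not part of SAF.

```python
import csv
from pathlib import Path


def find_feature_csvs(project_dir):
    """Return the *_features.csv files under <project_dir>/csv, sorted by name."""
    csv_dir = Path(project_dir) / "csv"
    if not csv_dir.is_dir():
        return []
    return sorted(csv_dir.glob("*_features.csv"))


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle), [])


if __name__ == "__main__":
    # Replace with the folder you processed, e.g. 20240902_PCD_demo_files.
    for path in find_feature_csvs("20240902_PCD_demo_files"):
        print(path.name, len(read_header(path)), "columns")
```

The script uses only the standard library, so it runs in the same environment as the SAF app without extra installs.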
Note
Are you having trouble running code? Learn to use conda environments by following the instructions provided here.
Method 2. Import SAF in a Python file or Jupyter notebook
To generate features without the SAF application, install the package with pip:
pip install structure-analyzer-featurizer
This will install key packages such as cifkit and bobleesj.utils that are required to run the SAF package. Then, you can generate features by calling the functions provided in the SAF package directly.
from cifkit import Cif
from SAF.features.generator import (
    compute_binary_features,
    compute_quaternary_features,
    compute_ternary_features,
)

file_path = "path/to/your/cif_file.cif"

# Collect the feature sets by the number of unique elements
binary_data, ternary_data, quaternary_data = [], [], []

try:
    cif = Cif(file_path)
    num_elements = len(cif.unique_elements)
    if num_elements == 2:
        features, uni_features = compute_binary_features(file_path)
        binary_data.append(features)
    elif num_elements == 3:
        features, uni_features = compute_ternary_features(file_path)
        ternary_data.append(features)
    elif num_elements == 4:
        features, uni_features = compute_quaternary_features(file_path)
        quaternary_data.append(features)
except Exception as e:
    print(f"Error found for {file_path}. Reason: {e}")
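The same pattern extends to a whole folder of .cif files. The sketch below is one possible way to organize that loop, not part of SAF: featurize_folder, count_elements, and featurizers are hypothetical names. The element counter and the feature functions are passed in as arguments, so in practice you would supply a counter built on cifkit and the three compute_* functions imported above.

```python
from pathlib import Path


def featurize_folder(folder, count_elements, featurizers):
    """Run the matching featurizer on every .cif file in `folder`.

    count_elements: callable taking a file path and returning the number of
        unique elements (assumption: e.g. len(Cif(path).unique_elements)).
    featurizers: dict mapping an element count to a function that takes a
        file path and returns (features, universal_features).
    """
    results = {n: [] for n in featurizers}
    skipped = []
    for path in sorted(Path(folder).glob("*.cif")):
        try:
            n = count_elements(str(path))
            if n not in featurizers:
                # No featurizer registered for this number of elements
                skipped.append(path.name)
                continue
            features, _uni_features = featurizers[n](str(path))
            results[n].append(features)
        except Exception as e:
            print(f"Error found for {path}. Reason: {e}")
            skipped.append(path.name)
    return results, skipped


# Hypothetical usage with the imports shown above:
# results, skipped = featurize_folder(
#     "20240902_PCD_demo_files",
#     count_elements=lambda p: len(Cif(p).unique_elements),
#     featurizers={
#         2: compute_binary_features,
#         3: compute_ternary_features,
#         4: compute_quaternary_features,
#     },
# )
```

Keeping the dispatch in a dictionary avoids repeating the if/elif chain and makes it easy to see which files were skipped.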
How can I specify the elements for A, B in binary, R, M, X in ternary, and A, B, C, D in quaternary systems?
By default, SAF automatically orders the elements from highest to lowest Mendeleev number. The Mendeleev number for each element is obtained from the bobleesj.utils Python package. If you want to specify the order of the elements, you can provide a custom label mapping dictionary to the compute_binary_features, compute_ternary_features, or compute_quaternary_features functions, as shown below.
custom_labels = {
    2: {"A": ["Fe", "Co"], "B": ["Si", "Ga"]},
    3: {"R": ["Sc", "Y"], "M": ["Fe", "Co"], "X": ["Si", "Ga"]},
    4: {
        "A": ["Sc", "Y"],
        "B": ["Fe", "Co"],
        "C": ["Si", "Ga"],
        "D": ["Gd", "Tb", "Dy"],
    },
}
file_path = "path/to/your/cif_file.cif"
compute_binary_features(file_path, custom_labels=custom_labels)
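To make the shape of the dictionary concrete, the sketch below shows how such a mapping could be read: the outer key is the number of elements, and each label lists the elements allowed to take that position. The function assign_labels is a hypothetical illustration and is not SAF's internal implementation.

```python
def assign_labels(elements, label_mapping):
    """Map each label (e.g. "A", "B") to the one element listed under it.

    Illustrative only; SAF may resolve labels differently.
    """
    n = len(elements)
    if n not in label_mapping:
        raise KeyError(f"No label mapping for {n}-element systems")
    assigned = {}
    for label, candidates in label_mapping[n].items():
        matches = [el for el in elements if el in candidates]
        if len(matches) != 1:
            raise ValueError(
                f"Expected exactly one element for label {label!r}, found {matches}"
            )
        assigned[label] = matches[0]
    return assigned


custom_labels = {
    2: {"A": ["Fe", "Co"], "B": ["Si", "Ga"]},
    3: {"R": ["Sc", "Y"], "M": ["Fe", "Co"], "X": ["Si", "Ga"]},
}

print(assign_labels(["Si", "Fe"], custom_labels))  # {'A': 'Fe', 'B': 'Si'}
```

Under this reading, a compound containing two elements from the same list (for example Fe and Co in a binary system) has no unambiguous assignment, which the sketch reports as an error.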
Alternatively, you can provide a custom label mapping dictionary using this template Excel file and the ElementSorter class from the bobleesj.utils.sorters.element_sorter module:
from bobleesj.utils.sorters.element_sorter import ElementSorter
excel_file = "path/to/your/custom_labels.xlsx"
element_sorter = ElementSorter(excel_path=excel_file)
custom_labels = element_sorter.label_mapping
compute_binary_features(file_path, custom_labels=custom_labels)