Overview
cctk: a Python-based computational chemistry toolkit.
cctk simplifies routine tasks in computational chemistry: preparing input files with scripts, checking whether jobs ran successfully, extracting energies and geometries, etc. All cctk operations are carried out using Python scripts. The prototypical workflow involves:
1. Reading in output files from a quantum chemistry program like Gaussian.
2. Analyzing the extracted data (e.g., determining which structure is lowest in energy).
3. Writing out new input files for further calculations.
4. Further analysis or visualization with pandas or matplotlib.
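Once energies have been extracted, the analysis step is often plain Python. The following is a minimal sketch, not cctk code: it assumes the energies are already available in a dictionary, the filenames are hypothetical, and the two energies are borrowed from the methane example later on this page purely as sample numbers.

```python
# Hypothetical filenames mapped to energies in hartree (sample values only).
energies = {
    "structure_1.out": -40.5169484082,
    "structure_2.out": -40.5183831835,
}

HARTREE_TO_KCAL = 627.509  # approximate conversion factor, kcal/mol per hartree

# The lowest-energy structure is the one with the most negative energy.
lowest = min(energies, key=energies.get)

# Relative energies in kcal/mol, referenced to the lowest-energy structure.
relative = {
    name: (energy - energies[lowest]) * HARTREE_TO_KCAL
    for name, energy in energies.items()
}
```

The resulting `relative` dictionary can be handed to pandas or matplotlib for the final step of the workflow.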
cctk Objects
Use these three main classes to interact with external quantum chemistry programs:
1. Molecule
A single molecular geometry.
| Field | Description |
|---|---|
| `atomic_numbers` | the atomic number for each atom |
| `geometry` | xyz coordinates |
| `bonds` | the connectivity as a networkx graph |
| `charge` | the overall charge |
| `multiplicity` | the spin multiplicity |
All arrays that refer to atoms in cctk are 1-indexed (i.e., 1, 2, …, n). Thus, both the `atomic_numbers` and `geometry` fields are 1-indexed. In contrast, all arrays that refer to non-atoms are 0-indexed.

Various methods are available to measure or set geometric parameters (bond distances, bond angles, or dihedral angles).
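To make the geometric parameters concrete, here is a minimal sketch of how a bond distance and a bond angle can be computed from coordinates addressed by 1-indexed atom labels. It uses only the standard library and does not call cctk; the function names, the padding trick, and the approximate water geometry are assumptions made for illustration.

```python
import math

# Approximate water geometry in angstroms (illustrative values).
# Index 0 is padding so that atom labels start at 1.
geometry = [
    None,
    (0.000, 0.000, 0.000),   # atom 1: O
    (0.957, 0.000, 0.000),   # atom 2: H
    (-0.240, 0.927, 0.000),  # atom 3: H
]


def distance(geometry, i, j):
    """Distance between atoms i and j (1-indexed labels)."""
    return math.dist(geometry[i], geometry[j])


def angle(geometry, i, j, k):
    """Angle i-j-k in degrees, with atom j at the vertex."""
    v1 = [a - b for a, b in zip(geometry[i], geometry[j])]
    v2 = [a - b for a, b in zip(geometry[k], geometry[j])]
    dot = sum(a * b for a, b in zip(v1, v2))
    cosine = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


oh_distance = distance(geometry, 1, 2)
hoh_angle = angle(geometry, 2, 1, 3)
```

A dihedral angle follows the same pattern with four atom labels and the normals of the two planes they define.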
2. Ensemble
A collection of molecules and associated properties.
Each `Molecule` in the `Ensemble` is associated with its properties (filenames, energies, NMR shieldings, etc.) using a dictionary. For example, a conformation of pentane might be mapped to this `dict`:

    properties_dict = {
        'energy': -0.0552410743198,
        'scf_iterations': 2,
        'link1_idx': 0,
        'filename': 'test/static/pentane_conformation_1.out',
        ...
    }

To access `Ensemble` information, use the following syntax:
| Syntax | Result |
|---|---|
| | iterator over all molecules |
| | the i-th molecule (0-indexed) |
| | the second and third molecules as a list |
| | the last molecule |
| | iterator over (molecule, property dictionary) tuples |
| | the property dictionary associated with |
| | one-dimensional array of energies, with |
| | two-dimensional array of filenames and energies, with |
| | list of molecules |
| | list of the property dictionaries |
| | |
| | |
Thus, Ensembles can be indexed or sliced to return smaller Ensembles. Note that while all such sub-Ensembles are new `Ensemble` objects, they are essentially views of the original `Ensemble`, rather than deep copies.

A `ConformationalEnsemble` is a special case of an `Ensemble` in which each structure corresponds to the same molecule. This allows for RMSD calculation, structural alignment, and redundant conformer elimination to be carried out as desired (see tutorials).
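The indexing behaviour described in this section can be pictured with a small toy class. This is a sketch under stated assumptions, not cctk's implementation: `ToyEnsemble`, its `add` method, and the choice to return `None` for a missing property are all hypothetical, and plain strings stand in for molecule objects.

```python
class ToyEnsemble:
    """Illustrative stand-in for an ensemble; not cctk's Ensemble class."""

    def __init__(self):
        self.molecules = []     # 0-indexed list of molecule objects
        self._properties = []   # parallel list of property dictionaries

    def add(self, molecule, properties=None):
        self.molecules.append(molecule)
        self._properties.append(dict(properties or {}))

    def items(self):
        """Iterate over (molecule, property dictionary) pairs."""
        return zip(self.molecules, self._properties)

    def __getitem__(self, key):
        # ensemble[rows, "name"] -> property values for the selected molecules
        if isinstance(key, tuple):
            rows, name = key
            if isinstance(rows, slice):
                return [p.get(name) for p in self._properties[rows]]
            return self._properties[rows].get(name)
        # ensemble[i] or ensemble[i:j] -> smaller ensemble sharing the same
        # molecule objects and dictionaries (a view, not a deep copy)
        sub = ToyEnsemble()
        if isinstance(key, slice):
            sub.molecules = self.molecules[key]
            sub._properties = self._properties[key]
        else:
            sub.molecules = [self.molecules[key]]
            sub._properties = [self._properties[key]]
        return sub


ensemble = ToyEnsemble()
ensemble.add("structure A", {"energy": -40.5169484082})
ensemble.add("structure B", {"energy": -40.5183831835})
ensemble.add("structure C")  # no properties recorded

energies = ensemble[:, "energy"]     # one value per molecule
last_energy = ensemble[-1, "energy"]  # property of the last molecule
subset = ensemble[0:2]                # smaller ensemble with two molecules
```

Because the sub-ensemble shares its dictionaries with the parent, editing a property through `subset` is visible from `ensemble` as well, which is the practical meaning of "view" here.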
3. GaussianFile
The results of a Gaussian job or the contents of an input file:

    gaussian_file = cctk.GaussianFile.read_file(filename)

`filename` may be a Gaussian output file (`.out`/`.log`) or a Gaussian input file (`.gjf`/`.com`).

Important: cctk assumes that all Gaussian jobs will be run in verbose mode (`#p` in the route card). Parsing will not work correctly without `#p`.

As usual, molecules and their properties are stored in `gaussian_file.ensemble` (here, `first_link` and `second_link` are the two Link1 sections of the two-step job introduced below):

    ensemble = first_link.ensemble
    energies = list(ensemble[:,"energy"])
    # [-40.5169484082, -40.5183831835, -40.5183831835]

    ensemble = second_link.ensemble
    shieldings = ensemble[-1,"isotropic_shielding"]
    # [192.9242, 31.8851, 31.8851, 31.8851, 31.8851]

Per cctk convention (see Indexing below), `energies` is 0-indexed, but `shieldings` is 1-indexed. (The `-1` refers to the last geometry.)

Note: if a Gaussian input file is read, no properties will be available, so the `properties_dict` for each molecule will be empty.
Some Gaussian output files are composites of multiple jobs using the Link1 directive. In that case, `GaussianFile.read_file(filename)` will return a list containing one `GaussianFile` object per Link1 section. For example, this is a two-step job:

    gaussian_file = cctk.GaussianFile.read_file("test/static/methane2.out")
    assert len(gaussian_file) == 2
    first_link = gaussian_file[0]
    second_link = gaussian_file[1]

cctk will also interpret common job types via the `cctk.JobType` enum:

    # first_link.job_types = [JobType.OPT, JobType.FREQ, JobType.SP]
| Field | Description |
|---|---|
| | |
| | list of what kind of jobs were run |
| | number of successful terminations |
| | dictionary containing Link0 information (memory, processors, checkpoint filename, etc.) |
| | route card (must start with `#p`) |
| | title of Gaussian file |
| | footer (optional) |
| | how long this |
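The route card requirement and the idea of job types can be illustrated with a short sketch. Everything here is an assumption for illustration: `ToyJobType`, `is_verbose`, and `guess_job_types` are hypothetical names, and the keyword matching is a simplistic rule, not the logic cctk uses to populate its own `JobType` values.

```python
import re
from enum import Enum


class ToyJobType(Enum):
    """Hypothetical stand-in for a job-type enum; not cctk's JobType."""
    SP = "sp"
    OPT = "opt"
    FREQ = "freq"


def is_verbose(route_card):
    """True if the route card requests verbose output with #p."""
    return route_card.strip().lower().startswith("#p")


def guess_job_types(route_card):
    """Guess job types from route card keywords (a simplistic assumption)."""
    # Split on whitespace, then drop options such as "=(calcfc,ts)".
    keywords = {
        re.split(r"[=(]", token)[0] for token in route_card.lower().split()
    }
    found = [
        job for job in (ToyJobType.OPT, ToyJobType.FREQ) if job.value in keywords
    ]
    # With no optimization or frequency keyword, treat the job as a single point.
    return found or [ToyJobType.SP]


route = "#p opt=(calcfc,ts) freq b3lyp/6-31g(d)"
verbose = is_verbose(route)
job_types = guess_job_types(route)
```

A check like `is_verbose` can be run on input files before submission, so that a job lacking `#p` is caught before its output needs to be parsed.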
Limited support for other file formats is available (see Features section of documentation).
Indexing
In cctk, arrays whose contents refer to atoms are always 1-indexed; other arrays are 0-indexed.
Thus, arrays of atomic numbers, positions, or NMR shieldings are 1-indexed, while arrays of molecules, files, or molecular property values are 0-indexed.
1-indexed arrays are implemented via `cctk.OneIndexedArray`, a custom subclass of `np.ndarray`.
For example, `molecule.geometry[1]` will return the coordinates of the first atom of the `Molecule`. However, `ensemble.molecules[0]` returns the first molecule of the `Ensemble`.
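The 1-indexing convention can be sketched with a small pure-Python wrapper. This is an illustration only: cctk's real implementation is the NumPy subclass named above, while `OneIndexedList` is a hypothetical class, and its decision to reject index 0 is an assumption rather than documented cctk behaviour.

```python
class OneIndexedList:
    """Illustrative 1-indexed sequence; not cctk's OneIndexedArray."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        # Valid atom labels run from 1 to n inclusive.
        if not 1 <= index <= len(self._values):
            raise IndexError(
                f"atom index {index} out of range 1..{len(self._values)}"
            )
        return self._values[index - 1]


# Atomic numbers for methane: one carbon followed by four hydrogens.
atomic_numbers = OneIndexedList([6, 1, 1, 1, 1])

first_atom = atomic_numbers[1]   # carbon
last_atom = atomic_numbers[5]    # hydrogen

# A list of molecules, by contrast, stays an ordinary 0-indexed Python list.
molecules = ["molecule A", "molecule B"]
first_molecule = molecules[0]
```

Keeping atom labels 1-indexed means they match the numbering shown in Gaussian output and in most molecular viewers, while containers of molecules or files follow normal Python conventions.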