cctk.Molecule¶

class
cctk.
Molecule
(atomic_numbers, geometry, name=None, bonds=None, charge=0, multiplicity=1, checks=True)[source]¶ Bases:
object
Class representing a single molecular geometry.
In contrast to typical Python behavior,
atomic_numbers
andgeometry
are indexed from one, to simplify interfacing with computational chemistry programs. This has been done by defining a custom wrapper fornumpy.ndarray
calledcctk.OneIndexedArray
.All other datatypes are indexed from 0.

name
¶ for identification, optional
 Type
str

atomic_numbers
¶ list of atomic numbers
 Type
cctk.OneIndexedArray, dtype=np.int8

geometry
¶ list of 3tuples of xyz coordinates  same ordering as
atomic_numbers
 Type

bonds
¶ connectivity graph or list of 2tuples, with each element representing the 1indexed atom number of a bonded pair
 Type
nx.Graph or list of tuples

charge
¶ the charge of the molecule
 Type
int

multiplicity
¶ the spin state of the molecule (1 corresponds to singlet, 2 to doublet, 3 to triplet, etc. – so a multiplicity of 1 is equivalent to S=0)
 Type
int

vibrational_modes
¶ vibrational modes
 Type
list of cctk.VibrationalMode

__init__
(atomic_numbers, geometry, name=None, bonds=None, charge=0, multiplicity=1, checks=True)[source]¶ Create new Molecule object, and assign connectivity if needed.
bonds
must be a list of edges (i.e. an n x 2numpy
array).If
checks
is True, the atomic numbers in bonds will all be checked for consistency. This option can be disabled by settingchecks
to False, but this is not recommended for external data.
Methods
__init__
(atomic_numbers, geometry[, name, …])Create new Molecule object, and assign connectivity if needed.
add_atom
(symbol, coordinates)Add an atom with symbol
symbol
at positioncoordinates
.add_atom_at_centroid
(symbol, atom_numbers[, …])Adds atom with symbol
symbol
at the centroid of the atoms inatom_numbers
.add_bond
(atom1, atom2[, bond_order, check])Adds a new bond to the bond graph, or updates the existing bond order.
are_connected
(atom1, atom2)Wrapper to tell if two atoms are connected.
assign_connectivity
([cutoff, …])Automatically recalculates bonds based on covalent radii.
atom_string
(atom)Returns the elemental symbol and the atom number for a given atom.
Return list of atomic symbols.
atoms_moving_in_imaginary
([max_num, …])Returns atoms moving in imaginary, ranked by how much they’re moving.
calculate_mass_spectrum
(**kwargs)Generates list of m/z values.
center
()Moves the centroid to the origin.
center_periodic
(center, side_length)Adjusts a molecule to be in the center of a cube, moving all other molecules accordingly.
check_for_conflicts
([min_buffer, group1, group2])Automatically checks for conflicts based on covalent radii.
combine_molecules
(molecule1, molecule2)Combine two molecules into one final molecule.
compute_energy
([nprocs])Compute energy of molecule at the GFN2xtb level of theory.
coulomb_analysis
(atoms1, atoms2, charges)Computes the net Coulomb forces between atoms
atoms1
and atomsatoms2
.csearch
([nprocs, constraints, logfile, …])Optimize molecule at the GFN2xtb level of theory.
epimerize
(center_atom, substituent1, …)Epimerizes
center_atom
by exchanging the groups corresponding tosubstituent1
andsubstituent2
.equal
(mol1, mol2)Atomic numbers, geometry, charge, and multiplicity all must match.
formula
([return_dict])Returns the atomic formula.
fragment
()Returns list of
cctk.Molecule
objects based on the bondconnected components ofself
.from_string
(string[, check_version])Loads a
cctk.Molecule
object from a string.get_adjacent_atoms
(atom)Returns a list of the neighbors of
atom
.get_angle
([atom1, atom2, atom3, check, …])Wrapper to compute angle between three atoms.
get_atomic_number
(atom)Get the atomic number for a given atom.
get_atomic_symbol
(atom)Get the atomic symbol for a given atom.
Get a list of atomic symbols for this Molecule.
get_atoms_by_symbol
(symbol)Returns all the numbers of atoms of type
symbol
in the molecule.get_bond_order
(atom1, atom2)Wrapper to get bond order between two atoms.
Returns a list of all the connected components in a molecule.
get_dihedral
([atom1, atom2, atom3, atom4, …])Wrapper to compute dihedral angle between four atoms.
get_distance
([atom1, atom2, check, _dist, atoms])Wrapper to compute distance between two atoms.
Returns a list of all the heavy atoms in the molecule (i.e., not hydrogen), for array indexing.
Determine how many atoms are in this Molecule.
get_sq_distance
(atom1, atom2[, check])Wrapper to compute squared distance between two atoms – optimized for speed!
Returns lists of symmetric atoms, as defined in
cctk.load_group
.get_vector
(atom[, atom2, check])Get the geometry vector for a given atom.
is_atom_in_ring
(atom)limit_solvent_shell
([solute, num_atoms, …])Automatically detects solvent molecules and removes them until you have a set number of solvents or atoms.
new_from_name
(name)Create a new
Molecule
instance usingrdkit
.new_from_smiles
(smiles)Create a new
Molecule
instance usingrdkit
.Returns a list of the number of neighbors of each atom.
optimize
([inplace, nprocs, return_energy])Optimize molecule at the GFN2xtb level of theory.
optimize_dihedral
(atom1, atom2, atom3, atom4)Minimizes the value of
self.rms_distance_between_atoms
for the given dihedral, in onedegree increments.perturb
([size])This function can be used to generate a slightly different molecule in cases where numerical (or geometric) converge is problematic.
remove_atom
(number)Remove the atom with number
number
.remove_bond
(atom1, atom2)Alias for
self.add_bond(atom1, atom2, bond_order=0)
– more intuitive nomenclature.renumber_to_match
(model[, check_chirality])Renumbers atoms to match
model
(must have isomorphic bond graph).Returns the RMS distance (in Angstroms) between every pair of atoms  a quick, easytocalculate proxy for minimizing steric clashes.
rotate_molecule
(axis, degrees)Rotates the whole molecule around the given axis.
set_angle
([atom1, atom2, atom3, angle, …])Adjusts the
atom1
–atom2
–atom3
bond angle to be a fixed value by movingatom3
.set_dihedral
([atom1, atom2, atom3, atom4, …])Adjusts the
atom1
–atom2
–atom3
–atom4
dihedral angle to be a fixed value by moving atom 4.set_distance
([atom1, atom2, distance, move, …])Adjusts the
atom1
–atom2
bond length to be a fixed distance by moving atom2.swap_atom_numbers
(atom1, atom2)Interchanges the numbers of
atom1
andatom2
.Save the current molecule as a string, for subsequent loading.
translate_molecule
(vector)Translates the whole molecule by the given vector.
volume
([pts_per_angstrom, qhull])Returns volume calculated using the Gavezotti algorithm (JACS, 1983, 105, 5220).

add_atom
(symbol, coordinates)[source]¶ Add an atom with symbol
symbol
at positioncoordinates
. Parameters
symbol (str) – symbol of the atom (e.g. “Cl”, “Ar”, “C”)
coordinates (list) – the coordinates to add
 Returns
the Molecule object

add_atom_at_centroid
(symbol, atom_numbers, weighted=False)[source]¶ Adds atom with symbol
symbol
at the centroid of the atoms inatom_numbers
.If
weighted
isTrue
, then the centroid calculation will take into account the atomic numbers of the atoms in question (placing the atom closer to more massive atoms).Otherwise, the average is unweighted.
 Parameters
symbol (str) – the atomic symbol of the atom to be added
atom_numbers (list) – which atoms to put the new atom between
weighted (Bool) – if the centroid calculation should be weighted (see above)
 Returns
the Molecule object

add_bond
(atom1, atom2, bond_order=1, check=True)[source]¶ Adds a new bond to the bond graph, or updates the existing bond order. Will not throw an error if the bond already exists.
 Parameters
atom1 (int) – the number of the first atom
atom2 (int) – the number of the second atom
bond_order (int) – bond order of bond between atom1 and atom2

assign_connectivity
(cutoff=0.2, periodic_boundary_conditions=None)[source]¶ Automatically recalculates bonds based on covalent radii. If two atoms are closer than the sum of their covalent radii +
cutoff
Angstroms, then they are considered bonded. Parameters
cutoff (float) – the threshold (in Angstroms) for how close two covalent radii must be to be considered bonded
 Returns
self

atom_string
(atom)[source]¶ Returns the elemental symbol and the atom number for a given atom.
For example,
methane.atom_string(1)
might return “C1”. Parameters
atom (int) – number of the atom
 Returns
the aforementioned atom string

atoms_moving_in_imaginary
(max_num=5, percent_cutoff=0.03, return_string=False)[source]¶ Returns atoms moving in imaginary, ranked by how much they’re moving.
 Parameters
max_num (int) – how many atoms max to return
percent_cutoff (float) – threshold for what percent of total TS movement qualifies as “movement”
return_string (bool) – whether or not to return a formatted string report
 Returns
list of atomic numbers or string

calculate_mass_spectrum
(**kwargs)[source]¶ Generates list of m/z values.
Final weights rounded to one decimal point (because of lowres MS).

center_periodic
(center, side_length)[source]¶ Adjusts a molecule to be in the center of a cube, moving all other molecules accordingly. Bonded subgroups will be moved as a unit.
For analysis of MD files with periodic boundary conditions.
 Parameters
center (int) – atomic number to center
side_length (float) – length of side, in Å

check_for_conflicts
(min_buffer=1, group1=None, group2=None)[source]¶ Automatically checks for conflicts based on covalent radii. If two atoms are closer than the sum of their covalent radii + buffer, then they are considered clashing. If group1 and group2 are selected, then conflicts will only be evaluated between these two groups of atoms.
 Parameters
min_buffer (float) – the threshold (in Angstroms) for how close two covalent radii must be to be considered clashing. 1.0 A is default, empirically.
group1 (list) – atoms to evaluate against group2 (if None, defaults to all atoms)
group2 (list) – atoms to evaluate against group1 (if None, defaults to all atoms)
 Returns
True if there are no conflicts ValueError if there is a conflict

classmethod
combine_molecules
(molecule1, molecule2)[source]¶ Combine two molecules into one final molecule.
Bonding information is not currently preserved.

compute_energy
(nprocs=1)[source]¶ Compute energy of molecule at the GFN2xtb level of theory.
 Parameters
nprocs (int) – number of processors to use

coulomb_analysis
(atoms1, atoms2, charges)[source]¶ Computes the net Coulomb forces between atoms
atoms1
and atomsatoms2
.

csearch
(nprocs=1, constraints=[], logfile=None, noncovalent=False, use_tempdir=True)[source]¶ Optimize molecule at the GFN2xtb level of theory.
 Parameters
nprocs (int) – number of processors to use
constraints (list) – atoms numbers to freeze
noncovalent (bool) – whether or not to use noncovalent settings
logfile (str) – file to write ongoing
crest
output touse_tempdir (bool) – write intermediate files to hidden directory (as opposed to current directory)
 Returns
ConformationalEnsemble

epimerize
(center_atom, substituent1, substituent2)[source]¶ Epimerizes
center_atom
by exchanging the groups corresponding tosubstituent1
andsubstituent2
. Both substituents must be bonded to the center atom! Parameters
center_atom (int) – number of middle atom
substituent1 (int) – number of 1st atom
substituent1 – number of 2nd atom
 Returns
new
Molecule
object (does not modify inplace)

classmethod
equal
(mol1, mol2)[source]¶ Atomic numbers, geometry, charge, and multiplicity all must match. Name is irrelevant.

formula
(return_dict=False)[source]¶ Returns the atomic formula.
If
return_dict
isTrue
, then returns adictionary
with keys elemental symbols and values the number of occurrences.For instance,
water.formula()
would return{'O': 1, 'H': 2}
.If
return_dict
isFalse
, then returns a stringified version of the formula according to standard rules.For instance,
water.formula()
would returnH2O
. Parameters
return_dict (Bool) – if the method should return a string or a dictionary
 Returns
a dictionary or string representing the molecule’s formula

fragment
()[source]¶ Returns list of
cctk.Molecule
objects based on the bondconnected components ofself
.

classmethod
from_string
(string, check_version=True)[source]¶ Loads a
cctk.Molecule
object from a string. Parameters
string (str) – stringified version of the molecule
check_version (bool) – whether version consistency should be enforced

get_adjacent_atoms
(atom)[source]¶ Returns a list of the neighbors of
atom
. Ifatom
has no neighbors, an empty list will be returned.

get_angle
(atom1=None, atom2=None, atom3=None, check=True, _angle=<function compute_angle_between>, atoms=None)[source]¶ Wrapper to compute angle between three atoms.
This function is relatively slow (ratelimiting for certain applications), so performance boosts have been implemented (e.g. preloading
_angle
). Parameters
atom1 (int) – number of the first atom
atom2 (int) – number of the second atom
atom3 (int) – number of the third atom
check (Bool) – whether to validate input data (can be overridden to prevent slow doublechecking)
_angle (function) – function usd to compute angle
atoms (list) – list of atom numbers
 Returns
the angle, in degrees

get_atomic_number
(atom)[source]¶ Get the atomic number for a given atom.
 Parameters
atom (int) – number of the first atom
 Returns
the atomic number of that atom
 Return type
atomic_number (int)

get_atomic_symbol
(atom)[source]¶ Get the atomic symbol for a given atom.
 Parameters
atom (int) – number of the first atom
 Returns
the atomic symbol of that atom
 Return type
atomic_symbol (str)

get_atomic_symbols
()[source]¶ Get a list of atomic symbols for this Molecule.
 Returns
the atomic symbols
 Return type
atomic_symbols (cctk.OneIndexedArray)

get_atoms_by_symbol
(symbol)[source]¶ Returns all the numbers of atoms of type
symbol
in the molecule.

get_bond_order
(atom1, atom2)[source]¶ Wrapper to get bond order between two atoms.
 Parameters
atom1 (int) – number of the first atom
atom2 (int) – number of the second atom
 Returns
the bond order

get_dihedral
(atom1=None, atom2=None, atom3=None, atom4=None, check=True, _dihedral=<function compute_dihedral_between>, atoms=None)[source]¶ Wrapper to compute dihedral angle between four atoms.
This function is relatively slow (ratelimiting for certain applications), so performance boosts have been implemented (e.g. preloading
_dihedral
). Parameters
atom1 (int) – number of the first atom
atom2 (int) – number of the second atom
atom3 (int) – number of the third atom
atom4 (int) – number of the fourth atom
check (Bool) – whether to validate input data (can be overridden to prevent slow doublechecking)
_dihedral (function) – function used to compute dihedral
atoms (list) – list of atom numbers
 Returns
the dihedral angle, in degrees

get_distance
(atom1=None, atom2=None, check=True, _dist=<function compute_distance_between>, atoms=None)[source]¶ Wrapper to compute distance between two atoms.
This function is relatively slow (ratelimiting for certain applications), so performance boosts have been implemented (e.g. preloading
_dist
). Parameters
atom1 (int) – number of the first atom
atom2 (int) – number of the second atom
check (Bool) – whether to validate input data (can be overridden to prevent slow doublechecking)
_dist (function) – function usd to compute distance
atoms (list) – list of atomic numbers
 Returns
the distance, in Angstroms

get_heavy_atoms
()[source]¶ Returns a list of all the heavy atoms in the molecule (i.e., not hydrogen), for array indexing.

get_n_atoms
()[source]¶ Determine how many atoms are in this Molecule.
 Returns
n_atoms (int): the number of atoms

get_sq_distance
(atom1, atom2, check=True)[source]¶ Wrapper to compute squared distance between two atoms – optimized for speed!
 Parameters
atom1 (int) – number of the first atom
atom2 (int) – number of the second atom
check (Bool) – whether to validate input data (can be overridden to prevent slow doublechecking)
 Returns
the squared distance

get_symmetric_atoms
()[source]¶ Returns lists of symmetric atoms, as defined in
cctk.load_group
.Useful for NMR spectroscopy, etc.

get_vector
(atom, atom2=None, check=True)[source]¶ Get the geometry vector for a given atom. If two atoms are specified, gives the vector connecting them (from
atom2
toatom
).mol.get_vector(atom)
is thus equivalent tomol.get_vector(atom, origin)
. Parameters
atom1 (int) – number of the first atom
atom2 (int) – number of the second atom (optional)
check (Bool) – whether to validate input data (can be overridden to prevent slow doublechecking)
 Returns
a Numpy array

limit_solvent_shell
(solute=0, num_atoms=0, num_solvents=10, distance_from_atom=None, return_idxs=False)[source]¶ Automatically detects solvent molecules and removes them until you have a set number of solvents or atoms.
The “distance” between molecules is the minimum of the pairwise atomic distances, to emphasize innersphere interactions.
 Parameters
solute (int) – which fragment is the solute, 0indexed
num_atoms (int) – remove atoms until there are this number (modulo the size of a solvent molecule)
num_solvents (int) – remove solvent molecules until there are this number
distance_from_atom (int) – if you want to find molecules closest to a given atom in the solute, specify the atom number here. if this atom is not in the solute fragment, an exception will be raised.
return_idxs (bool) – if True, indices of atoms that would be in the new molecule are returned. no change is made to
self
.
 Returns
new
Molecule
object

optimize
(inplace=True, nprocs=1, return_energy=False)[source]¶ Optimize molecule at the GFN2xtb level of theory.
 Parameters
inplace (Bool) – whether or not to return a new molecule or simply modify
self.geometry
nprocs (int) – number of processors to use
return_energy (Bool) – whether to return energy or not

optimize_dihedral
(atom1, atom2, atom3, atom4, step=10)[source]¶ Minimizes the value of
self.rms_distance_between_atoms
for the given dihedral, in onedegree increments. A cheap alternative to geometry optimization using ab initio methods or density functional theory. Parameters
atom1 (int) – atom number of the first atom in the dihedral
atom2 (int) – atom number of the second atom in the dihedral
atom3 (int) – atom number of the third atom in the dihedral
atom4 (int) – atom number of the fourth atom in the dihedral
step (float) – explore angles from 0 to 360 with this stepsize in degrees
 Returns
the final value of the angle

perturb
(size=0.005)[source]¶ This function can be used to generate a slightly different molecule in cases where numerical (or geometric) converge is problematic.
It adds a random variable (sampled from a normal distribution, centered at 0 with stddev
size`) to every number in ``self.geometry
. Parameters
size (float) – stddev of the normal distribution
 Returns
the Molecule object

remove_atom
(number)[source]¶ Remove the atom with number
number
. Parameters
number (int) – number of the atom
 Returns
the Molecule object

remove_bond
(atom1, atom2)[source]¶ Alias for
self.add_bond(atom1, atom2, bond_order=0)
– more intuitive nomenclature.

renumber_to_match
(model, check_chirality='all')[source]¶ Renumbers atoms to match
model
(must have isomorphic bond graph). Returns a copy ofself
with renumbered atoms. Parameters
model (cctk.Molecule) – isomorphic molecule to renumber by
check_chirality (list of atomic numbers) – atomic numbers to check, to prevent inversion due to graph isomorphism. Alternatively
None
will prevent any checking and “all” will usecctk.topology.get_exchangable_centers()
.
 Returns
new
Molecule
object

rms_distance_between_atoms
()[source]¶ Returns the RMS distance (in Angstroms) between every pair of atoms  a quick, easytocalculate proxy for minimizing steric clashes.

rotate_molecule
(axis, degrees)[source]¶ Rotates the whole molecule around the given axis.
 Parameters
axis (vector) – the vector to rotate about
theta (float) – how much to rotate (in degrees)
 Returns
the Molecule object

set_angle
(atom1=None, atom2=None, atom3=None, angle=None, move='group', atoms=None)[source]¶ Adjusts the
atom1
–atom2
–atom3
bond angle to be a fixed value by movingatom3
.If move is set to “group”, then all atoms bonded to
atom3
will also be moved.If move is set to “atom”, then only
atom3
will be moved. Parameters
atom1 (int) – the number of the first atom
atom2 (int) – the number of the second atom
atom3 (int) – the number of the third atom
angle (float) – final value in degrees of the
atom1
–atom2
–atom3
anglemove (str) – determines how fragment moving is handled
atoms (list) – 3element list of atom numbers
 Returns
the Molecule object

set_dihedral
(atom1=None, atom2=None, atom3=None, atom4=None, dihedral=None, move='group34', check_result=True, atoms=None)[source]¶ Adjusts the
atom1
–atom2
–atom3
–atom4
dihedral angle to be a fixed value by moving atom 4.If
move
is set to “atom”, then onlyatom4
will be moved.If
move
is set to “group4”, then all atoms bonded toatom4
will also be moved.If
move
is set to “group34”, then all atoms bonded toatom3
andatom4
will also be moved. Parameters
atom1 (int) – the number of the first atom
atom2 (int) – the number of the second atom
atom3 (int) – the number of the third atom
atom4 (int) – the number of the fourth atom
dihedral (float) – final value in degrees of the
atom1
–atom2
–atom3
–atom4
anglemove (str) – determines how fragment moving is handled
check_result (Bool) – whether the final answer should be checked for correctness
atoms (list) – 4element list of atomic numbers
 Returns
the Molecule object

set_distance
(atom1=None, atom2=None, distance=None, move='group', atoms=None)[source]¶ Adjusts the
atom1
–atom2
bond length to be a fixed distance by moving atom2.If
move
is set to “group”, then all atoms bonded toatom2
will also be moved.If
move
is set to “atom”, then only atom2 will be moved. Parameters
atom1 (int) – the number of the first atom
atom2 (int) – the number of the second atom
distance (float) – distance in Angstroms of the final bond
move (str) – determines how fragment moving is handled
atoms (list) – 2element list of atom numbers
 Returns
the Molecule object

swap_atom_numbers
(atom1, atom2)[source]¶ Interchanges the numbers of
atom1
andatom2
. Parameters
atom1 (int) – number of 1st atom
atom2 (int) – number of 2nd atom
 Returns
new
Molecule
object (does not modify inplace)

to_string
()[source]¶ Save the current molecule as a string, for subsequent loading. Not humanreadable.
Vibrational modes are currently not saved.

translate_molecule
(vector)[source]¶ Translates the whole molecule by the given vector.
 Parameters
vector (vector) – the vector to translate by
 Returns
the Molecule object

volume
(pts_per_angstrom=10, qhull=False)[source]¶ Returns volume calculated using the Gavezotti algorithm (JACS, 1983, 105, 5220). Relatively slow. If MemoryError, defaults to a qhullbased approach (accurate in the limit as number of atoms goes to infinity)
 Parameters
pts_per_angstrom (int) – how many grid points to use per Å  time scales as O(n**3) so be careful!
qhull (bool) – use faster QHull algorithm
 Returns
volume in Å**3
