Molecules
This page describes how molecular models are handled in HARP.
atomcollection
class harp.molecule.atomcollection (atomid, resid, resname, atomname, chain, element, conf, xyz, occupancy, bfactor, hetatom, modelnum, authresid=None)
An atomcollection object stores all information about a molecular model, together with the methods to manipulate it in HARP.
Attributes
The attributes are named following the PDB naming convention. Every per-atom array has one entry per atom, in the same order.
Variable | Type | Description |
---|---|---|
self.natoms | int | Number of atoms in the molecule. |
self.atomid | numpy.ndarray | A 1-D int array of size natoms, containing a numerical id for each individual atom in the molecule. |
self.ind | numpy.ndarray | A 1-D int array of size natoms, containing the index of each individual atom in the molecule. Equal to numpy.arange(natoms). Used internally in HARP for indexing. |
self.resid | numpy.ndarray | A 1-D int array of size natoms, containing the numerical id of the residue for each individual atom in the molecule. |
self.resname | numpy.ndarray | A 1-D string array of size natoms, containing the identity of the residue for each individual atom in the molecule (e.g., Gly, Ala, etc.). |
self.element | numpy.ndarray | A 1-D string array of size natoms, containing the element identity of each individual atom in the molecule (e.g., C, H, O, N, etc.). |
self.atomname | numpy.ndarray | A 1-D string array of size natoms, containing the identity of each individual atom within its residue (e.g., CA, CB, N, etc.). |
self.chain | numpy.ndarray | A 1-D string array of size natoms, containing the identity of the chain for each individual atom in the molecule. |
self.xyz | numpy.ndarray | A 2-D double array of size [natoms, 3], containing the Cartesian co-ordinates of each individual atom in the molecule. Note: This must be a double array! |
self.occupancy | numpy.ndarray | A 1-D double array of size natoms, containing the occupancy of each individual atom in the molecule. |
self.bfactor | numpy.ndarray | A 1-D double array of size natoms, containing the B-factor of each individual atom in the molecule. |
self.hetatom | numpy.ndarray | A 1-D bool array of size natoms, indicating whether each atom is a heteroatom entry. |
self.authresid | numpy.ndarray | A 1-D int array of size natoms, containing the author-provided id of the residue for each individual atom in the molecule. Defaults to resid if not provided. |
self.conf | numpy.ndarray | A 1-D string array of size natoms, containing the conformation of each individual atom in the molecule. |
self.modelnum | numpy.ndarray | A 1-D int array of size natoms, containing the model number of each individual atom in the molecule. |
self.unique_residues | numpy.ndarray | A 1-D int array of the unique resid values in the molecule. |
self.unique_chains | numpy.ndarray | A 1-D string array of the unique chain values in the molecule. |
self.unique_confs | numpy.ndarray | A 1-D string array of the unique conf values in the molecule. |
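To make the array layout concrete, the following sketch builds plain NumPy arrays with the documented shapes and dtypes for a hypothetical three-atom glycine fragment. It does not call HARP: the variable names only mirror the attribute names, all values are illustrative, and computing the unique arrays with numpy.unique is an assumption about how such values could be obtained.

```python
import numpy as np

# Hypothetical fragment (N, CA, C of one glycine); values are illustrative only.
atomid = np.array([1, 2, 3], dtype=int)
resid = np.array([10, 10, 10], dtype=int)
resname = np.array(["GLY", "GLY", "GLY"])
atomname = np.array(["N", "CA", "C"])
chain = np.array(["A", "A", "A"])
element = np.array(["N", "C", "C"])
xyz = np.array(
    [[0.0, 0.0, 0.0],
     [1.5, 0.0, 0.0],
     [2.0, 1.4, 0.0]],
    dtype=np.float64,  # the co-ordinates must be double precision
)
occupancy = np.ones(3, dtype=np.float64)
bfactor = np.full(3, 20.0, dtype=np.float64)
hetatom = np.zeros(3, dtype=bool)

natoms = atomid.size
ind = np.arange(natoms)  # documented as equal to numpy.arange(natoms)

# Assumption: unique values taken with numpy.unique, which also sorts them.
unique_residues = np.unique(resid)
unique_chains = np.unique(chain)

# Every per-atom array shares the same leading dimension.
assert xyz.shape == (natoms, 3) and xyz.dtype == np.float64
```

Because every array is indexed by atom, a single integer or boolean index selects the same atom across all of them.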
Functions
The functions associated with atomcollection are:
Function | Description |
---|---|
self.get_residue | Returns a particular residue. Input: int number (the specific resid to be returned). Output: atomcollection. |
self.get_chain | Returns a particular chain. Input: string chain (the specific chain to be returned). Output: atomcollection. |
self.get_chains | Returns multiple chains. Input: numpy.ndarray/list chains (the array/list of chain values to be returned). Output: atomcollection. |
self.get_atomname | Returns a collection of specific atoms based on their identity within residues. Input: string atomname (the specific atomname to be returned). Output: atomcollection. |
self.get_atomids | Returns a collection of specific atoms. Input: numpy.ndarray/list atomids (the array/list of atomid values to be returned). Output: atomcollection. |
self.get_conformation | Returns a specific conformation of the model. Input: string conf (the specific conf to be returned). Output: atomcollection. |
self.dehydrogen | Removes hydrogens from self. Input: None. Output: atomcollection. |
self.remove_hetatoms | Removes heteroatoms from self. Input: None. Output: atomcollection. |
self.split_residue | Splits an atomcollection of a single residue into its component parts. For a nucleotide, returns a list of phosphate, sugar, and base. For an amino acid, returns backbone and sidechain. Input: None (Note: self must be a single residue!). Output: A list of atomcollection. |
self.com | Returns the Cartesian co-ordinate center-of-mass (centroid) of an atomcollection, i.e. the mean of self.xyz. Input: None. Output: A numpy.ndarray of size 3. |
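The selection and centroid functions can be pictured as boolean masking and averaging over the per-atom arrays. The sketch below is not HARP's implementation: it operates on bare NumPy arrays, and the helper names select_chain, drop_hydrogens, and centroid are hypothetical stand-ins for what get_chain, dehydrogen, and com are documented to return. Identifying hydrogens by the element label "H" is likewise an assumption.

```python
import numpy as np

def select_chain(chain, xyz, name):
    """Keep the co-ordinates of atoms whose chain label equals `name`."""
    keep = chain == name  # boolean mask, one entry per atom
    return xyz[keep]

def drop_hydrogens(element, xyz):
    """Keep the co-ordinates of atoms whose element label is not 'H' (assumed label)."""
    keep = element != "H"
    return xyz[keep]

def centroid(xyz):
    """Unweighted mean of the co-ordinates, matching the description of com."""
    return xyz.mean(axis=0)

# Four illustrative atoms spread over two chains.
chain = np.array(["A", "A", "B", "B"])
element = np.array(["C", "H", "N", "O"])
xyz = np.array(
    [[0.0, 0.0, 0.0],
     [2.0, 0.0, 0.0],
     [0.0, 4.0, 0.0],
     [0.0, 0.0, 6.0]],
    dtype=np.float64,
)

chain_a = select_chain(chain, xyz, "A")  # the two atoms of chain A
heavy = drop_hydrogens(element, xyz)     # the three non-hydrogen atoms
center = centroid(xyz)                   # array of size 3
```

In this picture, a function that returns an atomcollection would apply the same mask to every per-atom attribute, not only to xyz.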
A separate function loads an atomcollection from a file.
load
harp.molecule.load (fname, only_polymers = False, firstmodel = True, authid = False)
Arguments
Variable | Type | Description |
---|---|---|
fname | string | The filename of the mmCIF file to load. Note: This function currently only handles the file suffixes .mmcif, .cif, or .cif.gz. |
only_polymers | bool | A flag for whether to only use entities labeled as polymers (i.e. not water, ions, metals, ligands, etc.). |
firstmodel | bool | A flag for whether to take only the first model. Useful for an ensemble of models (e.g., as in NMR). |
authid | bool | A flag for whether to use the author-provided residue id as atomcollection.resid. Useful when people populate the wrong column (e.g., during model building). |
Output
Variable | Type | Description |
---|---|---|
atomcollection | atomcollection | Contains the molecular model stored in the mmCIF file. |
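A call might look like the following sketch. The suffix check is a hypothetical helper based only on the suffixes listed above; whether HARP matches them case-sensitively is not stated, so the lowercasing here is an assumption. The load call uses the documented signature and requires HARP to be installed, and the file name in the usage comment is a placeholder.

```python
SUPPORTED_SUFFIXES = (".mmcif", ".cif", ".cif.gz")

def has_supported_suffix(fname):
    """Hypothetical pre-check against the documented file suffixes."""
    return fname.lower().endswith(SUPPORTED_SUFFIXES)

def load_polymers(fname):
    """Load only polymer entities from the first model of an mmCIF file."""
    if not has_supported_suffix(fname):
        raise ValueError(f"unsupported file suffix: {fname}")
    import harp.molecule  # imported lazily; requires HARP to be installed
    return harp.molecule.load(
        fname, only_polymers=True, firstmodel=True, authid=False
    )

# Usage (placeholder file name):
# mol = load_polymers("model.cif.gz")
# print(mol.natoms, mol.unique_chains)
```

Setting only_polymers=True drops waters, ions, and ligands at load time; the same model loaded with the default only_polymers=False can instead be filtered later with remove_hetatoms.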