- class openff.pablo.ccd.CcdCache(library_paths: Iterable[Path | str], cache_path: Path | str = Path(xdg_base_dir.save_cache_path('openff-pablo'), 'ccd_cache'), preload: list[str] = [], patches: Iterable[Mapping[str, Callable[[ResidueDefinition], list[ResidueDefinition]]]] = {}, extra_definitions: Mapping[str, Iterable[ResidueDefinition]] = {})
Bases: Mapping[str, tuple[ResidueDefinition, …]]

Caches, patches, and presents the CCD as a Python Mapping.

This class is a wrapper around a dict that stores residue definitions. When a residue is requested via the indexing syntax (for example, my_ccd_cache["ALA"]) or the in operator, this dictionary is checked first. If the residue is not present, the CCD is then checked. If the residue cannot be retrieved from the inner dict or the CCD, a KeyError is raised or False is returned as appropriate.

Iterating over the mapping, checking its length, or otherwise treating the CcdCache as a mapping other than with the indexing syntax or in operator works only on the inner dict. As a result, accessing a residue via indexing may return a value even if these other methods suggest it won’t.

CcdCache can apply patches to the entries it downloads from the CCD. This is used to work around known errors, deficiencies, and inconsistencies in the CCD definitions. Patches are specified as functions that take a single residue definition and return a list of them.

The extra_definitions argument and the with_() and with_replaced() methods allow custom definitions to be added to a CcdCache. These custom definitions are not patched. Since they are stored alongside cached entries in the inner dictionary, custom definitions supersede any that have not already been downloaded from the CCD.

When a CCD entry is downloaded, the corresponding CIF file is stored in the cache_path. This means that each entry will be downloaded only once, even across multiple Python invocations. All entries in the cache are loaded (and patches applied) when a new CcdCache is created. CcdCache assumes that the files in the cache path were downloaded from the CCD and may do unexpected things if they are edited by hand.

Users may provide additional CCD entries by specifying library paths. By default, this is used to ship commonly used residues with Pablo. At the moment, patches are applied to files in library paths, but it is likely that in the future they won’t be and residues shipped with Pablo will be shipped pre-patched to speed up load times. As with the cache, files from library paths are loaded when a CcdCache is created.

Accessing the CCD requires internet access. Without internet access, entries from the cache or library paths can still be loaded, as can any entries added to an instance of this class.
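The difference between indexing and iterating can be illustrated with a small toy mapping. This is only a sketch of the lookup order described above, not Pablo's implementation: FallbackMapping, FAKE_CCD, and the string "definitions" are hypothetical stand-ins, and storing a fetched entry during an in check is an assumption of the sketch.

```python
from collections.abc import Callable, Iterator, Mapping


class FallbackMapping(Mapping[str, tuple[str, ...]]):
    """Toy stand-in for a cache backed by a slower lookup (hypothetical)."""

    def __init__(self, fetch: Callable[[str], tuple[str, ...]]) -> None:
        self._inner: dict[str, tuple[str, ...]] = {}
        self._fetch = fetch  # Plays the role of a CCD download

    def __getitem__(self, key: str) -> tuple[str, ...]:
        # The inner dict is checked first; the fallback is used only on a miss.
        if key not in self._inner:
            # A KeyError raised by the fallback propagates to the caller.
            self._inner[key] = self._fetch(key)
        return self._inner[key]

    def __contains__(self, key: object) -> bool:
        # Containment goes through the same two-step lookup as indexing.
        try:
            self[key]  # type: ignore[index]
        except KeyError:
            return False
        return True

    def __iter__(self) -> Iterator[str]:
        # Iteration sees only what is already in the inner dict.
        return iter(self._inner)

    def __len__(self) -> int:
        return len(self._inner)


FAKE_CCD = {"ALA": ("alanine",), "GLY": ("glycine",)}  # Made-up data

cache = FallbackMapping(lambda name: FAKE_CCD[name])

names_before = list(cache)  # Nothing has been fetched yet
has_ala = "ALA" in cache    # Falls back to the fake CCD
has_xxx = "XXX" in cache    # Absent from both the inner dict and the fake CCD
names_after = list(cache)   # Only the fetched entry is visible
```

In this sketch, "GLY" is retrievable by indexing even though it never appears when iterating, which mirrors the caveat about iteration and length above.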
- Parameters:
library_paths – Paths to search for user-provided or packaged CCD entries. All paths are searched.
cache_path – The path to which to download CCD entries. This path is searched in addition to library_paths.
preload – A list of residue names to download when initializing the class.
patches – Functions to call on ResidueDefinition objects downloaded from the CCD before they are returned or added to the inner dict. An iterable of maps, each from residue names to a single callable. The maps are applied, in the order they are iterated over, to residues with the given name. Any patch under the key "*" is applied to all residues before the more specific patches in its map.
extra_definitions – Additional residue definitions to add to the cache. Note that patches are not applied to these definitions.
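The ordering rule for patches can be sketched in a few lines. This is an illustration under stated assumptions rather than the library's code: apply_patches is a hypothetical helper, and plain strings stand in for ResidueDefinition objects.

```python
from collections.abc import Callable, Iterable, Mapping

Definition = str  # Stand-in for ResidueDefinition
Patch = Callable[[Definition], list[Definition]]


def apply_patches(
    name: str,
    definitions: list[Definition],
    patches: Iterable[Mapping[str, Patch]],
) -> list[Definition]:
    """Apply each patch map in iteration order to the definitions for `name`."""
    for patch_map in patches:
        # Within one map, the wildcard patch runs before the name-specific one.
        for key in ("*", name):
            patch = patch_map.get(key)
            if patch is None:
                continue
            # Each patch returns a list; the results are concatenated.
            definitions = [new for old in definitions for new in patch(old)]
    return definitions


patches = [
    {"*": lambda d: [d + "+all"], "ALA": lambda d: [d + "+ala"]},
    {"ALA": lambda d: [d, d + "+split"]},  # Keeps the original, adds a variant
]

result_ala = apply_patches("ALA", ["ALA"], patches)
result_gly = apply_patches("GLY", ["GLY"], patches)
```

Here the "ALA" entry passes through the wildcard patch, then the first map's specific patch, then the second map's patch, while "GLY" is touched only by the wildcard.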
Instance and Static Methods
with_ – Get a copy of this CcdCache with additional definitions added.
with_patch – Add a patch to the residues loaded via a copy of this CcdCache.
with_replaced – Get a copy of this CcdCache with some definitions replaced.
with_varied_protonation – Get a copy of self with all combinations of some protonation states.
with_virtual_sites – Copy self, adding new residue definitions requiring some virtual sites.
with_vsite_water – Copy self, adding new definitions for common multisite water models.
without – Get a copy of this CcdCache lacking any definitions with some names.

- with_(definitions: Mapping[str, Sequence[ResidueDefinition]] | Sequence[ResidueDefinition]) -> Self
Get a copy of this CcdCache with additional definitions added.

Definitions may be supplied as a mapping from residue names to sequences of residue definitions, or as a sequence of residue definitions. In the latter case, the residue names are taken from the residue definitions themselves.

Note that patches are not applied to the new definitions.

Examples

Add a custom definition to the STD_CCD_CACHE. We use a 4-letter residue code as they are supported by Pablo’s PDB reader and do not clash with the CCD’s definitions.

>>> from openff.pablo import STD_CCD_CACHE, ResidueDefinition
>>> my_ccd_cache = STD_CCD_CACHE.with_([
...     ResidueDefinition.from_smiles(
...         "[H:1][O:2][O:3][H:4]",
...         {1: "H1", 2: "O1", 3: "O2", 4: "H2"},
...         "HOOH",
...     )
... ])

Add protonation variants of a residue by specifying acidic and basic atoms.

>>> from openff.pablo import STD_CCD_CACHE, ResidueDefinition
>>>
>>> # Get the GABA (γ-amino butanoic acid) residue definition from CCD
>>> gaba_resdef = STD_CCD_CACHE["ABU"][0]
>>>
>>> # Generate the variants and add them to a new cache
>>> my_ccd_cache = STD_CCD_CACHE.with_({
...     "ABU": gaba_resdef.vary_protonation(
...         acidic=["HXT"],  # Atom name of abstractable proton
...         basic=[("N", "H3")],  # Atom to protonate, name of new proton
...     )[1:],  # Skip the first entry, which is already in the cache
... })
>>> # Should have added three variants - positive, negative, zwitterion
>>> len(my_ccd_cache["ABU"]) - len(STD_CCD_CACHE["ABU"])
3

- with_patch(residue_name: str, patch: Callable[[ResidueDefinition], list[ResidueDefinition]]) -> Self
Add a patch to the residues loaded via a copy of this CcdCache.

The patch is added to a copy of the CcdCache, and the copy is returned. The original CcdCache is left unmodified.

The patch function is called on each residue definition stored under the given residue name. The returned residue definitions are concatenated and replace the originals. Patches can therefore add, modify, split, or replace residue definitions depending on whether they include the original definition in the output.

The patch is applied to all definitions in the cache when this method is called, as well as to any definitions downloaded from the CCD in the future. It is not applied to definitions added by the other CcdCache.with_*() methods.

- with_replaced(definitions: Mapping[str, Sequence[ResidueDefinition]] | Sequence[ResidueDefinition]) -> Self
Get a copy of this CcdCache with some definitions replaced.

Similar to with_, but does not retain existing definitions for the specified residue names. All residue names that are keys of a definitions mapping or are residue names in a definitions sequence are removed from the new CcdCache before adding the new definitions.

Note that patches are not applied to the new definitions.
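The contrast between adding and replacing definitions can be sketched with plain dictionaries. This is a toy model, not the library's code: with_added and with_replaced_names are hypothetical helpers, strings stand in for residue definitions, and appending new definitions after existing ones is an assumption the source does not confirm.

```python
Definitions = dict[str, list[str]]  # Residue name -> stand-in definitions


def with_added(existing: Definitions, new: Definitions) -> Definitions:
    """Like with_: keep existing definitions and add the new ones."""
    out = {name: list(defs) for name, defs in existing.items()}
    for name, defs in new.items():
        # Assumed ordering: new definitions follow the existing ones.
        out.setdefault(name, []).extend(defs)
    return out


def with_replaced_names(existing: Definitions, new: Definitions) -> Definitions:
    """Like with_replaced: drop existing definitions for the given names first."""
    out = {name: list(defs) for name, defs in existing.items() if name not in new}
    for name, defs in new.items():
        out[name] = list(defs)
    return out


base = {"ABU": ["neutral"], "HOH": ["water"]}
new = {"ABU": ["zwitterion"]}

added = with_added(base, new)
swapped = with_replaced_names(base, new)
```

Both helpers return a new dictionary and leave base untouched, in the same spirit as the methods returning a copy of the cache.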

- with_varied_protonation(residue_name: str, *, acidic: Iterable[str] = (), basic: Iterable[tuple[str, str]] = ()) -> Self
Get a copy of self with all combinations of some protonation states.

Note that all combinations of protonations and deprotonations are generated; this means that if acidic has length n and basic has length m, 2**(n+m) variants will be generated for each existing variant.

If no variants at all are generated, PabloError is raised. Otherwise, whatever variants make sense are created for each existing variant.

This method will download the given residue name from the CCD if it is not already in the cache.

- Parameters:
residue_name – The name of the residue to generate alternate protonation states for.
acidic – Existing hydrogen atoms that can be removed to form a new protonation state. Each element specifies an atom name to remove, decrementing the formal charge on the neighbouring heavy atom. Multiply bonded, unbonded, missing, or non-hydrogen atoms are skipped unless no variants at all are generated.
basic – Existing non-hydrogen atoms that can be protonated to form a new protonation state, as well as the canonical name of the new hydrogen. Each tuple specifies an atom name to protonate (increment the formal charge and form a bond) and the name of the added proton. Unknown heavy atoms and new atom names that clash with existing names are skipped unless no variants at all are generated.
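The 2**(n+m) count follows from treating each acidic or basic site as an independent on/off choice. The sketch below only enumerates those choices; protonation_combinations is a hypothetical helper and does not build residue definitions or skip invalid sites as the real method does.

```python
from itertools import product


def protonation_combinations(
    acidic: list[str],
    basic: list[tuple[str, str]],
) -> list[list[tuple[str, object]]]:
    """Enumerate every subset of the requested (de)protonation sites."""
    sites: list[tuple[str, object]] = [("remove", h) for h in acidic]
    sites += [("add", pair) for pair in basic]
    return [
        # Keep the sites whose flag is switched on in this combination.
        [site for site, chosen in zip(sites, mask) if chosen]
        for mask in product([False, True], repeat=len(sites))
    ]


# One acidic hydrogen and one basic site: n = 1, m = 1
combos = protonation_combinations(["HXT"], [("N", "H3")])
n_combos = len(combos)

# The all-off combination changes nothing, so it is the original variant
unchanged = combos[0]
```

With one acidic and one basic site there are four combinations, one of which is the unmodified residue; that leaves three new variants, matching the GABA example in the with_ documentation.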

- with_virtual_sites(residue_name: str, virtual_sites: Iterable[str]) -> Self
Copy self, adding new residue definitions requiring some virtual sites.

The new definition is added to a copy of the CcdCache, and the copy is returned. The original CcdCache is left unmodified.

A new residue definition is added for each definition currently stored in the cache under the given name. The new definition requires that all the given virtual site names be present in order for it to match, and it discards the corresponding ATOM/HETATM records.

This method works by adding a patch. It will affect any residue definition already added to the cache under the given name, or any definition downloaded in the future, but not any definition added in the future by the with_ or with_replaced methods.

- with_vsite_water() -> Self
Copy self, adding new definitions for common multisite water models.

The new definitions are added to a copy of the CcdCache, and the copy is returned. The original CcdCache is left unmodified.

The new definitions require that all the virtual site names be present in order for them to match, and they discard the corresponding ATOM/HETATM records. The virtual site of the 4-point model is named EPW, and those of the 5-point model are named EP1 and EP2.

This method works by adding a patch. It will affect any 3-atom residue definitions already added to the cache under the names HOH, WAT, or SOL, or any so-named definition downloaded in the future, but not any definition added in the future by the with_ or with_replaced methods.
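The matching rule for virtual sites (all named sites must be present, and their records are then discarded) can be sketched as follows. This is a simplified illustration, not Pablo's matching logic: match_atoms is a hypothetical helper, and the water atom names O, H1, and H2 are assumed for the example.

```python
from typing import Optional


def match_atoms(
    record_names: list[str],
    atom_names: list[str],
    virtual_sites: list[str],
) -> Optional[list[str]]:
    """Return the real atom names if the records match, else None."""
    required = set(virtual_sites)
    # Every virtual site must appear among the ATOM/HETATM record names.
    if not required <= set(record_names):
        return None
    # Virtual-site records are discarded before comparing the real atoms.
    real = [name for name in record_names if name not in required]
    if sorted(real) != sorted(atom_names):
        return None
    return real


water_atoms = ["O", "H1", "H2"]  # Assumed atom names for a 3-atom water

four_point = match_atoms(["O", "H1", "H2", "EPW"], water_atoms, ["EPW"])
five_point = match_atoms(["O", "H1", "H2", "EP1", "EP2"], water_atoms, ["EP1", "EP2"])
three_point = match_atoms(["O", "H1", "H2"], water_atoms, ["EPW"])
```

A plain 3-atom water does not match a definition that requires EPW, which is why the virtual-site definitions are added alongside the existing ones instead of replacing them.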

- without(residue_names: Iterable[str]) -> Self
Get a copy of this CcdCache lacking any definitions with some names.

No definitions for any of the given residue names will be present in the new cache. Note that residues that are in the CCD will still be returned when they are requested, as long as they can be re-downloaded or found in the cache.
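The caveat about CCD residues reappearing can be shown with a toy lookup. This is a sketch under stated assumptions: FAKE_CCD, lookup, and without_names are hypothetical stand-ins for the CCD, the two-step lookup, and the without method.

```python
FAKE_CCD = {"ALA": ["ccd alanine"]}  # Made-up stand-in for the remote CCD


def lookup(inner: dict[str, list[str]], name: str) -> list[str]:
    """Check the inner dict first, then fall back to the fake CCD."""
    if name in inner:
        return inner[name]
    return FAKE_CCD[name]  # Raises KeyError if the name is unknown


def without_names(inner: dict[str, list[str]], names: list[str]) -> dict[str, list[str]]:
    """Return a copy of the inner dict lacking the given residue names."""
    dropped = set(names)
    return {name: defs for name, defs in inner.items() if name not in dropped}


inner = {"ALA": ["custom alanine"], "HOOH": ["custom peroxide"]}
trimmed = without_names(inner, ["ALA", "HOOH"])

# ALA is still retrievable because the fallback can supply it again
ala = lookup(trimmed, "ALA")

# HOOH existed only as a custom definition, so it is gone for good
try:
    lookup(trimmed, "HOOH")
    hooh_found = True
except KeyError:
    hooh_found = False
```

In this toy model, removing a name only guarantees its absence when the fallback cannot supply it either.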