rxn.reaction_preprocessing.molecule_standardizer.MoleculeStandardizer
- class rxn.reaction_preprocessing.molecule_standardizer.MoleculeStandardizer(annotations=None, discard_missing_annotations=False, canonicalize=True)
Bases: object
Class to standardize standalone molecules (reactions are standardized with the Standardizer class).
Note that the standardization of one molecule may lead to a combination of molecules, hence the functions return lists of strings.
- Parameters
annotations (Optional[List[MoleculeAnnotation]], default: None)
discard_missing_annotations (bool, default: False)
canonicalize (bool, default: True)
- __init__(annotations=None, discard_missing_annotations=False, canonicalize=True)
- Parameters
annotations (Optional[List[MoleculeAnnotation]], default: None) – a list of MoleculeAnnotation objects used to perform the substitutions/rejections. Defaults to an empty list.
discard_missing_annotations (bool, default: False) – whether reactions containing molecules that should be annotated but are not must be rejected.
canonicalize (bool, default: True) – whether to canonicalize the compounds.
Methods
__init__([annotations, ...])
standardize(smiles) – Standardize a molecule.
standardize_in_equation(reaction) – Do the molecule-wise standardization for a reaction equation.
standardize_in_equation_with_errors(reaction) – Do the molecule-wise standardization for a reaction equation, and get the reasons for potential failures.
- standardize(smiles)
Standardize a molecule.
The returned value is a list, because in some cases standardization returns two independent molecules.
- Parameters
smiles (str) – SMILES string to standardize. Use dots for fragment bonds!
- Raises
SanitizationError or one of its subclasses – error in sanitization.
InvalidSmiles – invalid SMILES.
ValueError – “~” being used for fragment bonds.
- Return type
List[str]
- Returns
List of standardized SMILES strings.
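Because standardize() returns a list for each input, results for several molecules usually need to be flattened. The following is a minimal sketch, not part of the library: standardize_many is a hypothetical helper that accepts any object exposing standardize(smiles), such as a MoleculeStandardizer instance. This reference does not say where SanitizationError and InvalidSmiles are imported from, so the sketch catches exceptions broadly instead of naming them.

```python
from typing import Iterable, List, Tuple


def standardize_many(standardizer, smiles_list: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Standardize several SMILES strings and flatten the per-molecule lists.

    `standardizer` is any object with a `standardize(smiles) -> List[str]`
    method (hypothetical helper; duck-typed on purpose).
    Returns (standardized SMILES, inputs that raised an error).
    """
    standardized: List[str] = []
    failed: List[str] = []
    for smiles in smiles_list:
        try:
            # One input may yield several molecules, hence extend() and not append().
            standardized.extend(standardizer.standardize(smiles))
        except Exception:
            # Documented errors: SanitizationError (or a subclass), InvalidSmiles,
            # ValueError. Caught broadly here because their import paths are
            # not given in this reference.
            failed.append(smiles)
    return standardized, failed
```

With the documented defaults, a call would presumably look like standardize_many(MoleculeStandardizer(), smiles_list). Narrowing the except clause to the three documented exception types is preferable once their import locations are known.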
- standardize_in_equation(reaction)
Do the molecule-wise standardization for a reaction equation.
Relies on standardize_in_equation_with_errors() for modularity, and propagates the exceptions raised in that function.
- Parameters
reaction (ReactionEquation) – reaction to standardize.
- Return type
ReactionEquation
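The relationship between standardize_in_equation() and standardize_in_equation_with_errors() follows a common pattern: one function holds the logic and either collects failures or re-raises them, and a thin wrapper calls it in re-raising mode. The sketch below illustrates that pattern in general terms only. It is not the library's code, and process_with_errors, process, and transform are hypothetical names.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_with_errors(
    items: Iterable[T],
    transform: Callable[[T], R],
    propagate_exceptions: bool = False,
) -> Tuple[List[R], List[T]]:
    """Apply `transform` to each item; collect failing items or re-raise."""
    results: List[R] = []
    failures: List[T] = []
    for item in items:
        try:
            results.append(transform(item))
        except Exception:
            if propagate_exceptions:
                # Stop immediately and let the caller see the original error.
                raise
            failures.append(item)
    return results, failures


def process(items: Iterable[T], transform: Callable[[T], R]) -> List[R]:
    """Thin wrapper: same logic, but any error is raised instead of collected."""
    results, _ = process_with_errors(items, transform, propagate_exceptions=True)
    return results
```

The flag keeps a single copy of the loop, at the cost of one function having two error-handling modes.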
- standardize_in_equation_with_errors(reaction, propagate_exceptions=False)
Do the molecule-wise standardization for a reaction equation, and get the reasons for potential failures.
This function was originally implemented in Standardizer, and then moved here for more modularity.
- Parameters
reaction (ReactionEquation) – reaction to standardize.
propagate_exceptions (bool, default: False) – if True, stops execution and raises directly instead of collecting the SMILES leading to the failure. Not ideal, but probably the only way to avoid duplicating code in standardize_in_equation().
- Returns
the standardized reaction equation (or an empty one if there was a failure).
list of invalid SMILES in the reaction.
list of rejected SMILES in the reaction.
list of missing annotations in the reaction.
- Return type
Tuple
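The four returned values are easier to handle when given names. The following sketch wraps the tuple in a NamedTuple. It assumes the tuple elements come in the order listed under Returns and that the three lists contain strings; StandardizationReport, has_issues, and standardize_with_report are hypothetical names, not part of the library.

```python
from typing import Any, List, NamedTuple


class StandardizationReport(NamedTuple):
    """Named view of the 4-tuple (field order assumed from the Returns list)."""

    reaction: Any  # a ReactionEquation; empty if there was a failure
    invalid_smiles: List[str]
    rejected_smiles: List[str]
    missing_annotations: List[str]

    @property
    def has_issues(self) -> bool:
        # True if any of the three lists is non-empty.
        return bool(
            self.invalid_smiles or self.rejected_smiles or self.missing_annotations
        )


def standardize_with_report(standardizer, reaction) -> StandardizationReport:
    """Call standardize_in_equation_with_errors() and name the results.

    `standardizer` is any object exposing that method, such as a
    MoleculeStandardizer instance.
    """
    result = standardizer.standardize_in_equation_with_errors(reaction)
    return StandardizationReport(*result)
```

A caller can then check report.has_issues and inspect report.invalid_smiles, report.rejected_smiles, or report.missing_annotations to see why a reaction was not standardized.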