rxn.chemutils.smiles_standardization.standardize_molecules
- rxn.chemutils.smiles_standardization.standardize_molecules(molecules, canonicalize=True, sanitize=True, inchify=False, fragment_bond='~', ordered_precursors=True, molecule_token_delimiter=None, is_enzymatic=False, enzyme_separator='|')
Ensure that a set of molecules represented by a string follows a desired standard.
- Parameters
molecules (str) – molecules SMILES. Molecules can be separated via a '.'. Fragments are supported with a custom fragment_bond.
canonicalize (bool, default: True) – canonicalize SMILES. Defaults to True.
sanitize (bool, default: True) – sanitize SMILES. Defaults to True.
inchify (bool, default: False) – inchify the SMILES. Defaults to False.
fragment_bond (str, default: '~') – fragment bond. Defaults to '~'.
ordered_precursors (bool, default: True) – order precursors. Defaults to True.
molecule_token_delimiter (Optional[str], default: None) – delimiter for big molecule tokens. Defaults to None.
is_enzymatic (bool, default: False) – the molecules are representing an enzymatic reaction. Defaults to False.
enzyme_separator (str, default: '|') – separator for molecules and the enzyme. Defaults to '|'.
- Return type
str
- Returns
standardized molecules.
Examples
Standardize multiple molecules:
>>> standardize_molecules('CCO.CC')
'CC.CCO'
Standardize multiple molecules including fragment information:
>>> standardize_molecules('CCO.CC~C')
'CCO.C~CC'
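The string conventions used in the examples above (molecules separated by '.', fragments of one molecule joined by the fragment bond, precursors ordered) can be illustrated without any chemistry toolkit. The following is a minimal sketch, not the library's implementation: sketch_standardize is a hypothetical helper, it performs no canonicalization, sanitization, or InChI conversion, and the assumption that fragments and molecules are ordered by plain string sorting is inferred only from the two documented outputs.

```python
def sketch_standardize(
    molecules: str,
    fragment_bond: str = "~",
    ordered_precursors: bool = True,
) -> str:
    """String-level illustration only (hypothetical helper, no chemistry)."""
    standardized = []
    for molecule in molecules.split("."):
        # Assumption: fragments of one molecule are put in sorted order.
        fragments = sorted(molecule.split(fragment_bond))
        standardized.append(fragment_bond.join(fragments))
    if ordered_precursors:
        # Assumption: precursors are ordered by plain string comparison.
        standardized.sort()
    return ".".join(standardized)


print(sketch_standardize("CCO.CC"))    # CC.CCO
print(sketch_standardize("CCO.CC~C"))  # CCO.C~CC
```

The sketch reproduces both documented outputs because the inputs are already canonical SMILES; for arbitrary input the real function also rewrites each SMILES, which this helper does not attempt.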