rxn.chemutils.smiles_standardization.standardize_molecules

rxn.chemutils.smiles_standardization.standardize_molecules(molecules, canonicalize=True, sanitize=True, inchify=False, fragment_bond='~', ordered_precursors=True, molecule_token_delimiter=None, is_enzymatic=False, enzyme_separator='|')

Ensure that a set of molecules represented by a string follows a desired standard.

Parameters
  • molecules (str) – SMILES string of the molecules. Multiple molecules can be separated with a '.'. Fragments of one molecule are supported through a custom fragment_bond.

  • canonicalize (bool, default: True) – whether to canonicalize the SMILES.

  • sanitize (bool, default: True) – whether to sanitize the SMILES.

  • inchify (bool, default: False) – whether to inchify the SMILES.

  • fragment_bond (str, default: '~') – character used as the fragment bond.

  • ordered_precursors (bool, default: True) – whether to order the precursors.

  • molecule_token_delimiter (Optional[str], default: None) – delimiter for big molecule tokens.

  • is_enzymatic (bool, default: False) – whether the molecules represent an enzymatic reaction.

  • enzyme_separator (str, default: '|') – separator between the molecules and the enzyme.

Return type

str

Returns

standardized molecules.

Examples

Standardize multiple molecules:

>>> standardize_molecules('CCO.CC')
'CC.CCO'

Standardize multiple molecules including fragment information:

>>> standardize_molecules('CCO.CC~C')
'CCO.C~CC'
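To make the roles of the '.' separator, fragment_bond, and ordered_precursors more concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the function name sketch_standardize is hypothetical, no canonicalization, sanitization, or InChI handling is performed, and the plain string sorting is an assumption made only for illustration.

```python
def sketch_standardize(
    molecules: str,
    fragment_bond: str = "~",
    ordered_precursors: bool = True,
) -> str:
    """Toy illustration of the string handling only (hypothetical helper).

    Assumption: each fragment is already a canonical SMILES string, so
    nothing chemistry-aware happens here.
    """
    standardized = []
    # Each "."-separated entry is one molecule.
    for entry in molecules.split("."):
        # A molecule may consist of several fragments joined by the fragment bond.
        fragments = entry.split(fragment_bond)
        # Assumption: fragments are ordered by plain string comparison.
        standardized.append(fragment_bond.join(sorted(fragments)))
    if ordered_precursors:
        # Assumption: molecules are ordered by plain string comparison as well.
        standardized.sort()
    return ".".join(standardized)


print(sketch_standardize("CCO.CC"))
print(sketch_standardize("CCO.CC~C"))
```

For these two already-canonical inputs the sketch prints 'CC.CCO' and 'CCO.C~CC'. The real function additionally relies on chemistry-aware canonicalization, which a string-only sketch cannot reproduce for arbitrary SMILES.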