rxn.reaction_preprocessing.config.StandardizeConfig

class rxn.reaction_preprocessing.config.StandardizeConfig(input_file_path='${rxn_import.output_csv}', annotation_file_paths=<factory>, discard_unannotated_metals=False, output_file_path='${data.proc_dir}/${data.name}.standardized.csv', fragment_bond='${common.fragment_bond}', reaction_column_name='${common.reaction_column_name}', remove_stereo_if_not_defined_in_precursors=False, keep_intermediate_columns='${common.keep_intermediate_columns}')

Bases: object

Configuration for the standardization transformation step.

Fields:

  • input_file_path: The input CSV (one SMILES per line).

  • output_file_path: The output file path containing the result after standardization.

  • annotation_file_paths: The files to load the annotated molecules from.

  • discard_unannotated_metals: Whether reactions containing unannotated molecules with transition metals must be rejected.

  • fragment_bond: Token used to denote a fragment bond in the reaction SMILES.

  • reaction_column_name: Name of the reaction column in the data file.

  • remove_stereo_if_not_defined_in_precursors: Whether to remove chiral centers from the product when they are not defined in the precursors.

  • keep_intermediate_columns: Whether the columns generated during preprocessing should be kept.
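The field list above can be pictured as a plain dataclass. The sketch below is a hypothetical stand-in named StandardizeConfigSketch, not the library class: it copies the documented field names and defaults, assumes an empty list for the `<factory>` default (the real factory may return something else), and uses a plain string where the real field is typed FragmentBond. All file paths are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StandardizeConfigSketch:
    """Hypothetical stand-in mirroring the documented fields; not the library class."""

    input_file_path: str = "${rxn_import.output_csv}"
    # "<factory>" in the signature means the default is built per instance.
    # An empty list is assumed here; the real factory may differ.
    annotation_file_paths: List[str] = field(default_factory=list)
    discard_unannotated_metals: bool = False
    output_file_path: str = "${data.proc_dir}/${data.name}.standardized.csv"
    # The real field is typed FragmentBond; a plain string is assumed here.
    fragment_bond: str = "${common.fragment_bond}"
    reaction_column_name: str = "${common.reaction_column_name}"
    remove_stereo_if_not_defined_in_precursors: bool = False
    # Documented as bool, but the default is an unresolved placeholder string.
    keep_intermediate_columns: object = "${common.keep_intermediate_columns}"


# Override only the fields that differ from the defaults (paths are made up).
config = StandardizeConfigSketch(
    input_file_path="data/reactions.csv",
    output_file_path="data/reactions.standardized.csv",
    annotation_file_paths=["annotations/metals.json"],
    discard_unannotated_metals=True,
)

# An instance built without arguments keeps the placeholder defaults.
other = StandardizeConfigSketch()
```

Because the list default comes from a factory, two instances never share the same annotation_file_paths object, which is presumably why the signature shows `<factory>` rather than a literal list.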

Parameters
  • input_file_path (str, default: '${rxn_import.output_csv}')

  • annotation_file_paths (List[str], default: <factory>)

  • discard_unannotated_metals (bool, default: False)

  • output_file_path (str, default: '${data.proc_dir}/${data.name}.standardized.csv')

  • fragment_bond (FragmentBond, default: '${common.fragment_bond}')

  • reaction_column_name (str, default: '${common.reaction_column_name}')

  • remove_stereo_if_not_defined_in_precursors (bool, default: False)

  • keep_intermediate_columns (bool, default: '${common.keep_intermediate_columns}')
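Several defaults above are `${...}` placeholders rather than concrete values, which is why a field typed bool can carry a string default. They look like OmegaConf-style interpolations that are filled in from the surrounding configuration, but that is an assumption here. The sketch below is a simplified standard-library illustration of the idea, not the library's resolver; the helper names lookup and resolve and the settings values are hypothetical.

```python
import re
from typing import Any, Dict

# Matches one placeholder such as ${data.name} and captures the dotted key.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(tree: Dict[str, Any], dotted_key: str) -> Any:
    """Walk a nested dict following a dotted key such as 'data.name'."""
    node: Any = tree
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(value: str, tree: Dict[str, Any]) -> Any:
    """Replace every ${a.b} placeholder in value with the entry found in tree."""
    whole = PLACEHOLDER.fullmatch(value)
    if whole:
        # A value that is exactly one placeholder keeps the referenced type.
        return lookup(tree, whole.group(1))
    # Otherwise substitute each placeholder into the surrounding string.
    return PLACEHOLDER.sub(lambda m: str(lookup(tree, m.group(1))), value)


# Hypothetical surrounding configuration; the values are invented.
settings = {
    "data": {"proc_dir": "/tmp/processed", "name": "uspto"},
    "common": {"keep_intermediate_columns": False},
}

output_file_path = resolve("${data.proc_dir}/${data.name}.standardized.csv", settings)
keep_columns = resolve("${common.keep_intermediate_columns}", settings)
```

With these invented settings, output_file_path becomes a concrete path string and keep_columns becomes a real boolean, matching the declared type of keep_intermediate_columns.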

__init__(input_file_path='${rxn_import.output_csv}', annotation_file_paths=<factory>, discard_unannotated_metals=False, output_file_path='${data.proc_dir}/${data.name}.standardized.csv', fragment_bond='${common.fragment_bond}', reaction_column_name='${common.reaction_column_name}', remove_stereo_if_not_defined_in_precursors=False, keep_intermediate_columns='${common.keep_intermediate_columns}')

Return type

None

Methods

__init__([input_file_path, ...])

Attributes

discard_unannotated_metals

fragment_bond

input_file_path

keep_intermediate_columns

output_file_path

reaction_column_name

remove_stereo_if_not_defined_in_precursors

annotation_file_paths