rxn.reaction_preprocessing.config.AugmentConfig
- class rxn.reaction_preprocessing.config.AugmentConfig(input_file_path='${preprocess.output_file_path}', output_file_path='${data.proc_dir}/${data.name}.augmented.csv', tokenize=True, random_type=RandomType.unrestricted, permutations=1, reaction_column_name='${common.reaction_column_name}', rxn_section_to_augment=ReactionSection.precursors, fragment_bond='${common.fragment_bond}', keep_intermediate_columns='${common.keep_intermediate_columns}')
Bases: object
Configuration for the augmentation transformation step.
- Fields:
input_file_path: The input file path (one SMILES per line).
output_file_path: The output file path.
tokenize: Whether tokenization is to be performed.
random_type: The randomization type to be applied.
permutations: Number of random permutations for each input SMILES.
reaction_column_name: Name of the reaction column for the data file.
rxn_section_to_augment: The section of the reaction SMILES to augment: “precursors” for augmenting only the precursors, “products” for augmenting only the products.
fragment_bond: Token used to denote a fragment bond in the reaction SMILES.
keep_intermediate_columns: Whether the columns generated during preprocessing should be kept.
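The effect of rxn_section_to_augment can be illustrated with a minimal sketch. It does not use the library: the helper names augment_section and reverse_molecule_order are hypothetical, the reaction SMILES is assumed to have the form "precursors>>products", and the transformation is a simple stand-in rather than the SMILES randomization selected by random_type.

```python
from typing import Callable


def augment_section(rxn_smiles: str, section: str, transform: Callable[[str], str]) -> str:
    """Apply `transform` to one side of a 'precursors>>products' reaction SMILES.

    Hypothetical helper for illustration only; not part of the library.
    """
    precursors, products = rxn_smiles.split(">>")
    if section == "precursors":
        precursors = transform(precursors)
    elif section == "products":
        products = transform(products)
    else:
        raise ValueError(f"unknown section: {section!r}")
    return f"{precursors}>>{products}"


def reverse_molecule_order(smiles: str) -> str:
    # Stand-in transformation: reverses the order of the '.'-separated
    # molecules. Real augmentation would randomize the SMILES instead.
    return ".".join(reversed(smiles.split(".")))


rxn = "CCO.CC(=O)O>>CC(=O)OCC"
# Only the precursor side changes; the product side is left untouched.
print(augment_section(rxn, "precursors", reverse_molecule_order))
```

With section set to "products", the same call would leave the precursors unchanged and transform only the right-hand side.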
- Parameters
input_file_path (str, default: '${preprocess.output_file_path}')
output_file_path (str, default: '${data.proc_dir}/${data.name}.augmented.csv')
tokenize (bool, default: True)
random_type (RandomType, default: <RandomType.unrestricted: 2>)
permutations (int, default: 1)
reaction_column_name (str, default: '${common.reaction_column_name}')
rxn_section_to_augment (ReactionSection, default: <ReactionSection.precursors: 1>)
fragment_bond (FragmentBond, default: '${common.fragment_bond}')
keep_intermediate_columns (bool, default: '${common.keep_intermediate_columns}')
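Several defaults are `${...}` placeholders that refer to other configuration keys, such as data.proc_dir or common.fragment_bond. The sketch below shows, under stated assumptions, how such dotted placeholders could be resolved against a nested configuration. It uses only the standard library; the functions lookup and resolve and all values in the example config are hypothetical, and the library's actual resolution mechanism may differ.

```python
import re
from typing import Any, Mapping

# Matches placeholders of the form ${some.dotted.key}.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(config: Mapping[str, Any], dotted_key: str) -> Any:
    """Follow a dotted key such as 'data.proc_dir' through nested mappings."""
    node: Any = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(value: str, config: Mapping[str, Any]) -> str:
    """Substitute placeholders repeatedly until the string stops changing.

    Illustrative only: there is no cycle detection.
    """
    previous = None
    while previous != value:
        previous = value
        value = _PLACEHOLDER.sub(lambda m: str(lookup(config, m.group(1))), value)
    return value


# Hypothetical configuration values chosen for the example.
config = {
    "data": {"proc_dir": "/tmp/rxn", "name": "demo"},
    "common": {"reaction_column_name": "rxn"},
    "preprocess": {"output_file_path": "${data.proc_dir}/${data.name}.processed.csv"},
}

print(resolve("${data.proc_dir}/${data.name}.augmented.csv", config))
print(resolve("${preprocess.output_file_path}", config))
```

The second call illustrates chaining: the input path of the augmentation step points at the output path of the preprocessing step, which is itself built from other keys.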
- __init__(input_file_path='${preprocess.output_file_path}', output_file_path='${data.proc_dir}/${data.name}.augmented.csv', tokenize=True, random_type=RandomType.unrestricted, permutations=1, reaction_column_name='${common.reaction_column_name}', rxn_section_to_augment=ReactionSection.precursors, fragment_bond='${common.fragment_bond}', keep_intermediate_columns='${common.keep_intermediate_columns}')
- Parameters
input_file_path (str, default: '${preprocess.output_file_path}')
output_file_path (str, default: '${data.proc_dir}/${data.name}.augmented.csv')
tokenize (bool, default: True)
random_type (RandomType, default: <RandomType.unrestricted: 2>)
permutations (int, default: 1)
reaction_column_name (str, default: '${common.reaction_column_name}')
rxn_section_to_augment (ReactionSection, default: <ReactionSection.precursors: 1>)
fragment_bond (FragmentBond, default: '${common.fragment_bond}')
keep_intermediate_columns (bool, default: '${common.keep_intermediate_columns}')
- Return type
None
Methods
__init__([input_file_path, ...])
Attributes
fragment_bond
input_file_path
keep_intermediate_columns
output_file_path
permutations
random_type
reaction_column_name
rxn_section_to_augment
tokenize