rxn.chemutils.tokenization

Functions

copy_as_detokenized(src, dest)

Copy a source file to a destination, while making sure that it is not tokenized.

detokenize_file(input_file, output_file)

Detokenize a file containing tokenized SMILES strings.

detokenize_smiles(tokenized_smiles)

Detokenize a tokenized SMILES string (that contains spaces between the characters).
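Detokenization amounts to removing the spaces that separate the tokens. The following is a minimal sketch of that idea. The helper name `detokenize_sketch` is hypothetical and stands in for the library function, whose implementation is not shown on this page.

```python
def detokenize_sketch(tokenized_smiles: str) -> str:
    """Remove the spaces between tokens (hypothetical helper, not the library code)."""
    return tokenized_smiles.replace(" ", "")


# A tokenized acetic acid SMILES is collapsed back into its compact form.
compact = detokenize_sketch("C C ( = O ) O")
print(compact)
```

A string without spaces passes through unchanged, so the operation is safe to apply to input that is already detokenized.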

ensure_tokenized_file(file[, postfix, ...])

Ensure that a file is tokenized: do nothing if the file is already tokenized, create a tokenized copy otherwise.

file_is_tokenized(filepath)

Whether a file contains tokenized SMILES or not.

string_is_tokenized(smiles_line)

Whether a string is a tokenized SMILES or not.

to_tokens(smiles)

Tokenize a SMILES molecule or reaction into a list of tokens.

tokenize_file(input_file, output_file[, ...])

Tokenize a file containing SMILES strings.

tokenize_smiles(smiles[, fallback_value])

Tokenize a SMILES molecule or reaction, and join the tokens with spaces.
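To illustrate what tokenizing and joining with spaces can look like, here is a self-contained sketch built on a regular expression that is commonly used for SMILES tokenization. Whether this module uses exactly this pattern is an assumption, and the names `to_tokens_sketch` and `tokenize_smiles_sketch` are hypothetical. The handling of `fallback_value` (returned instead of raising when tokenization fails) is also an assumed reading of the signature above.

```python
import re
from typing import List, Optional

# Regex commonly used for SMILES tokenization (assumed, not taken from this module).
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+"
    r"|\\|\/|:|~|@|\?|>>?|\*|\$|\%[0-9]{2}|[0-9])"
)


def to_tokens_sketch(smiles: str) -> List[str]:
    """Split a SMILES string into tokens (hypothetical helper)."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # If the joined tokens differ from the input, some character was not matched.
    if "".join(tokens) != smiles:
        raise ValueError(f"Cannot tokenize: {smiles!r}")
    return tokens


def tokenize_smiles_sketch(smiles: str, fallback_value: Optional[str] = None) -> str:
    """Join the tokens with spaces; return fallback_value on failure if one is given."""
    try:
        return " ".join(to_tokens_sketch(smiles))
    except ValueError:
        if fallback_value is not None:
            return fallback_value
        raise


print(tokenize_smiles_sketch("CC(=O)O"))
print(tokenize_smiles_sketch("CCO>>CC=O"))
```

Bracket atoms such as `[Na+]` and two-letter elements such as `Cl` and `Br` are kept as single tokens, which is why a plain character-by-character split is not sufficient.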

Exceptions

TokenizationError(title, detail)

Exception raised when a SMILES string cannot be tokenized.
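The sketch below shows how an exception carrying a `title` and a `detail` might be defined and handled. `TokenizationErrorSketch` is a hypothetical stand-in: its base class and its attribute names are assumptions inferred from the signature above, not the library's definition.

```python
class TokenizationErrorSketch(ValueError):
    """Hypothetical stand-in for an error carrying a title and a detail."""

    def __init__(self, title: str, detail: str):
        super().__init__(f"{title}: {detail}")
        # Attribute names are assumed to mirror the constructor arguments.
        self.title = title
        self.detail = detail


def describe_failure(error: TokenizationErrorSketch) -> str:
    """Format the two fields into a single log-friendly message."""
    return f"{error.title} ({error.detail})"


try:
    raise TokenizationErrorSketch("TokenizationError", "unrecognized character in 'CCX'")
except TokenizationErrorSketch as error:
    message = describe_failure(error)

print(message)
```

Keeping the title and detail as separate attributes lets calling code log or report them independently instead of parsing the exception message.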