rxn.chemutils.tokenization.ensure_tokenized_file
- rxn.chemutils.tokenization.ensure_tokenized_file(file, postfix='.tokenized', fallback_value='')
Ensure that a file is tokenized: do nothing if the file is already tokenized, create a tokenized copy otherwise.
- Parameters
file (Union[str, PathLike]) – path to the file that we want to ensure is tokenized.
postfix (str, default: '.tokenized') – postfix to add to the tokenized copy (if applicable).
fallback_value (str, default: '') – placeholder for strings that cannot be tokenized (if applicable).
- Return type
str
- Returns
The path to the tokenized file (original path, or path to new file).
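The behaviour described above can be illustrated with a minimal, standard-library-only sketch. It is not the library's implementation: the names `sketch_tokenize`, `sketch_is_tokenized` and `sketch_ensure_tokenized_file`, the simplified token pattern, the "already tokenized" check, and appending the postfix to the full file name are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical, simplified SMILES token pattern (assumption: the real
# tokenizer in rxn.chemutils may recognise tokens differently).
_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+()/\\.:~@?>*$]")


def sketch_tokenize(smiles: str) -> str:
    """Return space-separated tokens; raise ValueError if characters are left over."""
    tokens = _TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"cannot tokenize: {smiles!r}")
    return " ".join(tokens)


def sketch_is_tokenized(line: str) -> bool:
    """Assumed check: re-tokenizing the space-stripped line reproduces the line."""
    try:
        return sketch_tokenize(line.replace(" ", "")) == line
    except ValueError:
        return False


def sketch_ensure_tokenized_file(file, postfix=".tokenized", fallback_value=""):
    """Return the original path if every line is tokenized, else write a tokenized copy."""
    path = Path(file)
    lines = path.read_text().splitlines()
    if all(sketch_is_tokenized(line) for line in lines):
        return str(path)  # nothing to do: original path is returned

    # Assumption: the postfix is appended to the full file name.
    tokenized_path = Path(str(path) + postfix)
    with tokenized_path.open("w") as out:
        for line in lines:
            try:
                out.write(sketch_tokenize(line) + "\n")
            except ValueError:
                # Lines that cannot be tokenized are replaced by the placeholder.
                out.write(fallback_value + "\n")
    return str(tokenized_path)


# Usage: the first call creates a copy, the second finds nothing left to do.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "smiles.txt"
    source.write_text("CCO\nc1ccccc1Br\nC!C\n")
    first = sketch_ensure_tokenized_file(source)
    tokenized_text = Path(first).read_text()
    second = sketch_ensure_tokenized_file(first)
```

In this sketch the third input line contains a character the pattern does not cover, so it is written as the fallback value (an empty line by default). Calling the function again on the returned path yields that same path, which mirrors the "do nothing if the file is already tokenized" contract.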