rxn.utilities.csv.streaming_csv_editor.StreamingCsvEditor
- class rxn.utilities.csv.streaming_csv_editor.StreamingCsvEditor(columns_in, columns_out, transformation, line_terminator='\n')
Bases: object
Edit the content of a CSV with a specified transformation, line-by-line.
This class avoids loading the whole file into memory as would be done with a pandas DataFrame.
- Parameters
columns_in (List[str])
columns_out (List[str])
transformation (Callable[..., Any])
line_terminator (str, default: '\n')
- __init__(columns_in, columns_out, transformation, line_terminator='\n')
- Parameters
columns_in (List[str]) – names of the columns acting as input for the transformation.
columns_out (List[str]) – names of the columns to which the result of the transformation is written.
transformation (Callable[..., Any]) – function to call on the values from the input columns, with the results being written to the output columns. The function should be annotated, and the following are admissible:
- For the parameters:
one or several strings
a list of strings (with one or more elements)
a tuple of strings (with one or more elements)
- For the return type:
one string
a list of strings (with one or more elements)
a tuple of strings (with one or more elements)
line_terminator (str, default: '\n') – line terminator to use for writing the CSV.
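The admissible signatures listed above can be illustrated with a few annotated functions. This is a minimal sketch: the function names are hypothetical, and the final lines only show that the annotations are readable at runtime with `typing.get_type_hints`. How StreamingCsvEditor itself uses the annotations is not stated in this page.

```python
from typing import List, Tuple, get_type_hints


def to_upper(value: str) -> str:
    """One string in, one string out."""
    return value.upper()


def join_names(first: str, last: str) -> str:
    """Several strings in, one string out."""
    return f"{first} {last}"


def split_name(values: List[str]) -> Tuple[str, str]:
    """A list of strings in (here with one element), a tuple of strings out."""
    first, _, last = values[0].partition(" ")
    return first, last


# The annotations can be inspected at runtime, which is one reason a caller
# may require them (assumption: the library does something along these lines).
hints = get_type_hints(split_name)
```

Under these assumptions, `to_upper` would pair with one input and one output column, `join_names` with two input columns and one output column, and `split_name` with one input column and two output columns.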
Methods
__init__(columns_in, columns_out, transformation)
process(csv_iterator): Process and edit a CSV file.
process_paths(path_in, path_out[, verbose]): Process and edit a CSV file.
- process(csv_iterator)
Process and edit a CSV file.
- Parameters
csv_iterator (CsvIterator) – input CSV iterator.
- Return type
CsvIterator
- Returns
an edited instance of a CsvIterator.
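The iterator-in, iterator-out idea behind process can be sketched with plain Python generators. This is not the library's implementation: `edit_rows` is a hypothetical helper, rows are modelled as lists of strings, and appending the output column when it does not yet exist is an assumption.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def edit_rows(
    columns: List[str],
    rows: Iterable[List[str]],
    column_in: str,
    column_out: str,
    transformation: Callable[[str], str],
) -> Tuple[List[str], Iterator[List[str]]]:
    """Return the new header and a lazy iterator over the edited rows."""
    out_columns = list(columns)
    if column_out not in out_columns:
        # Assumption: a missing output column is appended at the end.
        out_columns.append(column_out)
    idx_in = columns.index(column_in)
    idx_out = out_columns.index(column_out)

    def generate() -> Iterator[List[str]]:
        for row in rows:
            # Pad the row so that the output column has a slot to write to.
            new_row = list(row) + [""] * (len(out_columns) - len(row))
            new_row[idx_out] = transformation(row[idx_in])
            yield new_row

    return out_columns, generate()


# Rows are only transformed when the returned iterator is consumed.
new_columns, new_rows = edit_rows(
    ["name"], iter([["ada"], ["alan"]]), "name", "name_upper", str.upper
)
edited = list(new_rows)
```

Because the edited rows are produced by a generator, only one row is held in memory at a time, however large the input is.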
- process_paths(path_in, path_out, verbose=False)
Process and edit a CSV file.
- Parameters
path_in (Union[str, PathLike]) – path to the existing CSV.
path_out (Union[str, PathLike]) – path to the edited CSV (to be saved).
verbose (bool, default: False) – whether to write the progress with tqdm.
- Return type
None
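For comparison, a file-to-file edit of the kind process_paths performs can be approximated with the standard library alone. The sketch below is a stand-in, not the class's actual code: `edit_csv_paths` is a hypothetical name, it handles a single input and output column, and it omits the tqdm progress reporting.

```python
import csv
import tempfile
from pathlib import Path
from typing import Callable, Union


def edit_csv_paths(
    path_in: Union[str, Path],
    path_out: Union[str, Path],
    column_in: str,
    column_out: str,
    transformation: Callable[[str], str],
    line_terminator: str = "\n",
) -> None:
    """Stream a CSV from path_in to path_out, editing one row at a time."""
    with open(path_in, newline="") as f_in, open(path_out, "w", newline="") as f_out:
        reader = csv.DictReader(f_in)
        fieldnames = list(reader.fieldnames or [])
        if column_out not in fieldnames:
            # Assumption: a missing output column is appended at the end.
            fieldnames.append(column_out)
        writer = csv.DictWriter(
            f_out, fieldnames=fieldnames, lineterminator=line_terminator
        )
        writer.writeheader()
        for row in reader:
            row[column_out] = transformation(row[column_in])
            writer.writerow(row)


# Small demonstration on temporary files.
with tempfile.TemporaryDirectory() as tmp:
    demo_in = Path(tmp) / "in.csv"
    demo_out = Path(tmp) / "out.csv"
    demo_in.write_text("name,city\nada,london\nalan,wilmslow\n")
    edit_csv_paths(demo_in, demo_out, "name", "name_upper", str.upper)
    result = demo_out.read_text()
```

The input file is never read in full: each row is parsed, transformed, and written before the next one is loaded.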