Oxen provides a powerful data frame library that allows you to interact with tabular data. Use the df command for all CLI actions involving data frames. For example, oxen df <FILENAME> displays the contents of tabular data files.
--full
flag.
You can also use oxen df options to view your data with modifications. These changes won't be written anywhere unless you use the --write or --output flags.
oxen df --write. Any modifications you make with this flag set will be written back to the original file and register as "modified" in your Oxen repository.
oxen df provides several options that can help with this.
For these examples, we'll use our CatDogBBox repository.
When you use oxen df with --output, the resulting data frame will be written to disk as a new file of the specified type.
Some formats like parquet and arrow are more efficient for different tasks, but are not human readable like tsv or csv. These are tradeoffs you'll have to decide on for your application. Oxen currently supports the following file extensions: csv, tsv, parquet, arrow, json, jsonl.
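As a rough sketch of the conversion described above, the commands below write a small CSV and ask oxen df to save it as parquet. The oxen_df_demo directory and its contents are made up for illustration, and the sketch assumes the output format is picked from the extension of the path given to --output. The oxen call is skipped when the CLI is not installed.

```shell
# Hypothetical sample file; any CSV in your repository would work the same way.
mkdir -p oxen_df_demo
printf 'file,label\nimages/000.jpg,cat\nimages/001.jpg,dog\n' > oxen_df_demo/train.csv

# Only call oxen when the CLI is available, so the sketch is safe to paste anywhere.
if command -v oxen >/dev/null 2>&1; then
  # Assumption: the target format is inferred from the .parquet extension.
  oxen df oxen_df_demo/train.csv --output oxen_df_demo/train.parquet \
    || echo "oxen df exited with an error" >&2
else
  echo "oxen is not installed; skipping the conversion"
fi
```

Swapping the extension for .tsv, .arrow, .json, or .jsonl should select one of the other supported formats in the same way.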
--columns
.
--take
--unique
option.
The --vstack option takes a variable-length list of files you'd like to concatenate.
--add-col 'col:val:dtype'
The --add-row option takes in a comma-separated list of values and automatically parses the correct dtypes.
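The options above might be combined with a file as in the following sketch. The sample files, the is_cute column, and the str dtype name are illustrative assumptions, as is the argument order (the base file first, then the option). Neither command is expected to change the files on disk, since --write and --output are not passed.

```shell
# Hypothetical sample files standing in for real annotation data.
mkdir -p oxen_df_demo
printf 'file,label\nimages/000.jpg,cat\n' > oxen_df_demo/train.csv
printf 'file,label\nimages/100.jpg,dog\n' > oxen_df_demo/test.csv

if command -v oxen >/dev/null 2>&1; then
  # Stack the rows of test.csv beneath those of train.csv (assumed argument order).
  oxen df oxen_df_demo/train.csv --vstack oxen_df_demo/test.csv \
    || echo "oxen df --vstack exited with an error" >&2

  # Add a column in 'col:val:dtype' form; the column name and dtype are made up.
  oxen df oxen_df_demo/train.csv --add-col 'is_cute:unknown:str' \
    || echo "oxen df --add-col exited with an error" >&2

  # Append one row; values are assumed to follow the column order of the file.
  oxen df oxen_df_demo/train.csv --add-row 'images/001.jpg,dog' \
    || echo "oxen df --add-row exited with an error" >&2
else
  echo "oxen is not installed; skipping the examples"
fi
```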
--randomize
flag.
sort
flag. You can sort the data by the values of any column in your data frame.
--sort sorts in ascending order, but this can be switched with the --reverse flag.
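A minimal sketch of sorting, assuming --sort takes the name of the column to sort by. The sample file and its label column are hypothetical.

```shell
# Hypothetical sample file with a label column to sort on.
mkdir -p oxen_df_demo
printf 'file,label\nimages/000.jpg,dog\nimages/001.jpg,cat\n' > oxen_df_demo/train.csv

if command -v oxen >/dev/null 2>&1; then
  # Ascending order by the label column.
  oxen df oxen_df_demo/train.csv --sort label \
    || echo "oxen df --sort exited with an error" >&2

  # Same sort with the order switched by --reverse.
  oxen df oxen_df_demo/train.csv --sort label --reverse \
    || echo "oxen df --sort --reverse exited with an error" >&2
else
  echo "oxen is not installed; skipping the sort examples"
fi
```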