Oxen makes versioning your datasets as easy as versioning your code. You can install it with Homebrew or pip, or from our releases page.

Clone Dataset

Clone your first Oxen repository from the OxenHub.
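As a minimal sketch, cloning could look like the following. The repository URL is an assumed example of a public OxenHub repository and is not named in this guide; substitute the one you want.

```shell
# Assumed example repository URL; replace it with the repository you want.
REPO_URL="https://hub.oxen.ai/ox/CatDogBBox"

# Skip the command when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  oxen clone "$REPO_URL"
fi
```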

Initialize User

Each change you make will be associated with a name and email. Set them before you get started so you know who changed what. The user data is saved by default in ~/.config/oxen/user_config.toml.
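A sketch of setting the user with oxen config. The name and email are placeholders.

```shell
# Placeholder identity; replace with your own name and email.
OXEN_NAME="Your Name"
OXEN_EMAIL="you@example.com"

# Skip the command when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  oxen config --name "$OXEN_NAME" --email "$OXEN_EMAIL"
fi
```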

Create Repository

Initialize your first Oxen repository, and commit the first version of your data.
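The steps could be sketched as below. The directory name, the CSV file, and the commit message are hypothetical and exist only to give the commands something to work on.

```shell
# Hypothetical project directory and data file, for illustration only.
mkdir -p my_dataset
cd my_dataset
printf 'file,label\nimage_0.jpg,cat\n' > annotations.csv

# Skip the Oxen steps when the CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  oxen init                                 # create the repository in this directory
  oxen add annotations.csv                  # stage the file
  oxen commit -m "Add initial annotations"  # record the first version
fi
```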

Version Your Data

Once your data has been committed, you can always return to that version.

Confidently overwrite, move, or delete the file; it doesn't matter. Oxen will always have a copy of the data as it was at the previous commit.

Create Branch

It is good practice to create a new branch for changes you make to your data. This will allow you to easily compare the parallel versions of your data over time.
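For example, a branch can be created and checked out in one step. The branch name is a hypothetical choice.

```shell
# Hypothetical branch name; pick one that describes your change.
BRANCH="add-labels"

# Skip the command when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  # -b creates the branch and switches to it.
  oxen checkout -b "$BRANCH"
fi
```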

Delete Branch

Once finished with a branch, you can delete it.
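A sketch of deleting a branch, using a hypothetical branch name. It assumes you have already switched to a different branch, such as main, before deleting.

```shell
# Hypothetical name of the branch to remove.
BRANCH="add-labels"

# Skip the command when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  # Run this from another branch, not from the one being deleted.
  oxen branch -d "$BRANCH"
fi
```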

Diff Changes

View your changes with the oxen diff command, which shows what has changed in your data since the last commit.

CLI
oxen diff image_classification_data.csv
Column changes:
   + label (str)

Row changes:
   Δ 1 (modified)
   + 3 (added)
   - 2 (removed)

shape: (6, 7)
+-------------+-----+-----+-------+--------+-------------+-------------------+
| file        | x   | y   | width | height | label.right | .oxen.diff.status |
| ---         | --- | --- | ---   | ---    | ---         | ---               |
| str         | i64 | i64 | i64   | i64    | str         | str               |
+-------------+-----+-----+-------+--------+-------------+-------------------+
| image_0.jpg | 0   | 0   | 10    | 10     | cat         | modified          |
| image_1.jpg | 1   | 2   | 10    | 20     | null        | removed           |
| image_1.jpg | 200 | 100 | 10    | 20     | dog         | added             |
| image_2.jpg | 4   | 10  | 20    | 20     | null        | removed           |
| image_3.jpg | 4   | 10  | 20    | 20     | dog         | added             |
| image_4.jpg | 10  | 10  | 10    | 10     | dog         | added             |
+-------------+-----+-----+-------+--------+-------------+-------------------+

Once you push your changes to OxenHub, you can view them in your commit history.

The diff command line tool is more powerful than it looks on the surface. Oxen can diff files of many formats, and in tabular diffs you can specify keys and targets to make it easier to see what changed.

For advanced usage, check out the full diff documentation.

Restore Changes

If you are not happy with the changes you made to your data, you can restore them to the previous commit with the oxen restore command.
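As a sketch, restoring the CSV file from the diff example above might look like this. Any tracked path can be used in its place.

```shell
# File from the diff example; replace with the path you want to restore.
DATA_FILE="image_classification_data.csv"

# Skip the command when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  # Discard uncommitted changes and return the file to the last commit.
  oxen restore "$DATA_FILE"
fi
```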

Commit Changes

Once you are happy with the changes you have made to your data, you can commit them to the repository with a new message.
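A sketch of staging and committing the modified file. The commit message is a hypothetical example based on the label column added in the diff above.

```shell
# File from the diff example and a hypothetical commit message.
DATA_FILE="image_classification_data.csv"
MESSAGE="Add label column"

# Skip the commands when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  oxen add "$DATA_FILE"        # stage the changed file
  oxen commit -m "$MESSAGE"    # record it with a new message
fi
```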

View History

To see the commit history of your repository, you can use the oxen log command.

Checkout Main Branch

Once you are done making changes to your data, you can return to the main branch with the oxen checkout command.

Never fear: the file has now been reverted to the initial commit, but your changes are still saved in the branch you created.
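Switching back could be sketched as follows, assuming the default branch is named main as in the heading above.

```shell
# Name of the branch to return to.
MAIN_BRANCH="main"

# Skip the command when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  oxen checkout "$MAIN_BRANCH"
fi
```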

List Branches

To see the branches in your repository, you can use the oxen branch command.

Push Data

Once your data has been committed locally, you can sync it to the OxenHub.

OxenHub is a free service that allows you to collaborate on your data in the cloud. You can create a free account at https://oxen.ai.
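A minimal sketch of pointing a local repository at OxenHub and pushing. The remote URL is a placeholder for a repository you have created under your own account, and the remote name origin and branch main are assumptions.

```shell
# Placeholder remote URL; replace the namespace and repository name with yours.
REMOTE_URL="https://hub.oxen.ai/your-namespace/your-repo"

# Skip the commands when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  oxen config --set-remote origin "$REMOTE_URL"  # register the remote as "origin"
  oxen push origin main                          # upload the main branch
fi
```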

Clone Data

Clone your data faster than ever before. Oxen has been optimized to the core to make pulling large datasets as fast as possible.

Pull Changes

Only pull the changes you need. Oxen will only pull the files that have changed since the last time you pulled.
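For instance, pulling the latest changes might look like this. The remote name origin and branch main are assumed.

```shell
# Assumed remote and branch names.
REMOTE="origin"
BRANCH="main"

# Skip the command when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  oxen pull "$REMOTE" "$BRANCH"
fi
```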

Download Individual Files

With Oxen you do not need to download the entire dataset to your local machine. You can download only the subset of files or directories you need.
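A sketch of fetching a single file with oxen download. The repository identifier and the file path are hypothetical examples, and the argument order (repository, then path) is an assumption; check it against oxen download --help for your installed version.

```shell
# Hypothetical repository identifier and remote file path.
REPO_ID="ox/CatDogBBox"
REMOTE_FILE="annotations/train.csv"

# Skip the command when the oxen CLI is not on the PATH.
if command -v oxen >/dev/null 2>&1; then
  # Download only this file instead of cloning the whole repository.
  oxen download "$REPO_ID" "$REMOTE_FILE"
fi
```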