Skip to main content
When using models on Oxen.ai, by default we store the model inputs, outputs, and metadata in a repository. Every piece of data is versioned so you can trace the provenance of your data and models. In order to version data at scale, we built an open source version control system that can scale to monorepos with millions of files and terabytes of data.

Key Concepts

  • Repository: A collection of files and folders that is versioned together.
  • Commit: A snapshot of a repository at a given time.
  • Branch: A named pointer to a commit.
  • Dataset: A tabular file within an oxen repository that can be indexed and searched. ie csv, jsonl, parquet, etc.
  • Workspace: The equivalent of a working directory on the remote server where files can be added in an uncommitted state.
Follow along with the Version Control guide to learn how to version your data.