Documentation Index
Fetch the complete documentation index at: https://docs.oxen.ai/llms.txt
Use this file to discover all available pages before exploring further.
oxen.oxen_fs
OxenFS Objects
class OxenFS(fsspec.AbstractFileSystem)
OxenFS is a filesystem interface for Oxen repositories that implements the
fsspec protocol. This
allows you to interact with Oxen repositories using familiar filesystem
operations and integrate with other compatible libraries like Pandas.
Basic Usage
Creating a Filesystem Instance
import oxen
# For Oxen Hub repositories
fs = oxen.OxenFS("ox", "Flowers")
# For local oxen-server
fs = oxen.OxenFS("ox", "test-repo", host="localhost:3000", scheme="http")
Reading Files
with fs.open("data/train.csv") as f:
content = f.read()
Writing Files
You must have write access to the repository to write files. See:
https://docs.oxen.ai/getting-started/python#private-repositories
OxenFS will automatically commit the file to the repository when the
context is exited (or the file is closed some other way). New
directories are automatically created as needed.
# Write with custom commit message
with fs.open("data/test.txt", mode="wb", commit_message="Added test.txt") as f:
f.write("Hello, world!")
# You can also set/update the commit message inside the context
with fs.open("data/test.txt", mode="wb") as f:
f.commit_message = "Updated test.txt"
f.write("Hello, world again!")
Writing file objects
If you’re integrating Oxen in a situation where you already have a file object,
you can save it to your repo by using shutil.copyfileobj like this:
import shutil
file_object_from_somewhere = open("data.csv")
with fs.open("train/data.csv", mode="wb") as output_file:
output_file.commit_message = "Copy from a file object"
shutil.copyfileobj(file_object_from_somewhere, output_file)
Integration with Third Party Libraries (Pandas, etc.)
OxenFS works seamlessly with Pandas and other fsspec-compatible libraries using
the URL format: oxen://namespace:repo@revision/path/to/file
Reading Data
These will work with Pandas {to,from}_{csv,parquet,json,etc.} functions.
import pandas as pd
# Read parquet directly from Oxen repository
df = pd.read_parquet("oxen://openai:gsm8k@main/gsm8k_test.parquet")
Writing Data
# Write DataFrame directly to Oxen repository
df.to_csv("oxen://ox:my-repo@main/data/test.csv", index=False)
Notes
- Only binary read (“rb”) and write (“wb”) modes are currently supported
- But writing will automatically encode strings to bytes
- Does not yet support streaming files. All operations use temporary local files.
__init__
def __init__(namespace: str,
repo: str,
host: str = "hub.oxen.ai",
revision: str = "main",
scheme: str = "https",
**kwargs)
Initialize the OxenFS instance.
Arguments:
namespace - str
The namespace of the repository.
repo - str
The name of the repository.
host - str
The host to connect to. Defaults to ‘hub.oxen.ai’
revision - str
The branch name or commit id to checkout. Defaults to ‘main’
scheme - str
The scheme to use for the remote url. Default: ‘https’
def ls(path: str = "", detail: bool = False)
List the contents of a directory.
Arguments:
path - str
The path to list the contents of.
detail - bool
If True, return a list of dictionaries with detailed metadata.
Otherwise, return a list of strings with the filenames.
OxenFSFileWriter Objects
A file writer for the OxenFS backend.
This is normally called through OxenFS.open() or fsspec.open().
write
def write(data: str | bytes)
Write string or binary data to the file.
flush
Flush the file to disk.
tell
Return the current position of the file.
seek
def seek(offset: int, whence: int = os.SEEK_SET)
Seek to a specific position in the file.
commit
def commit(commit_message: Optional[str] = None)
Commit the file to the remote repo.
close
Close the file writer. This will commit the file to the remote repo.