Data frame
oxen.data_frame
DataFrame Objects
The DataFrame class allows you to perform CRUD operations on a remote data frame.
If you pass in a Workspace or a RemoteRepo, the data is indexed into DuckDB on an oxen-server without being downloaded locally.
Examples
CRUD Operations
Index a data frame in a workspace.
__init__
Initialize the DataFrame class. The data frame is indexed into DuckDB on initialization.
Throws an error if the data frame does not exist.
Arguments:
remote - str, RemoteRepo, or Workspace: The workspace or remote repo the data frame is in.
path - str: The path of the data frame file in the repository.
host - str: The host of the oxen-server. Defaults to “hub.oxen.ai”.
branch - str: The branch of the remote repo. Defaults to “main”.
scheme - str: The scheme of the remote repo. Defaults to “https”.
size
Get the size of the data frame. Returns a tuple of (rows, columns).
page_size
Get the page size used when paginating the data frame with list_page().
Returns:
The page size of the data frame.
total_pages
Get the total number of pages in the data frame when paginating with list_page().
Returns:
The total number of pages in the data frame.
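The relationship between size, page_size, and total_pages is presumably the usual ceiling division. A minimal sketch of that arithmetic follows, assuming a partially filled last page still counts as a page. The helper name expected_total_pages is hypothetical and not part of oxen, and the server may report a different value for an empty data frame.

```python
def expected_total_pages(num_rows: int, page_size: int) -> int:
    """Number of pages needed to hold num_rows rows at page_size rows per page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if num_rows < 0:
        raise ValueError("num_rows must not be negative")
    # Integer ceiling division: a partially filled last page counts as a page.
    return (num_rows + page_size - 1) // page_size
```

For example, 250 rows at a page size of 100 need three pages under this assumption.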
list_page
List the rows within the data frame.
Arguments:
page_num - int: The page number of the data frame to list. The page size currently defaults to 100.
Returns:
A list of rows from the data frame.
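To read every row, one approach is to walk the pages in order. The sketch below assumes that pages are numbered from 1 and that total_pages and list_page are called as methods; neither detail is stated in this reference. The helper iter_rows and the in-memory FakePagedFrame stand-in are hypothetical and exist only so the loop can run without a server.

```python
from typing import Any, Iterator


def iter_rows(df: Any, first_page: int = 1) -> Iterator[Any]:
    """Yield every row of a paginated data frame, one page at a time.

    Assumes df exposes total_pages() and list_page(page_num), and that
    page numbering starts at first_page (1 by default, an assumption).
    """
    last_page = first_page + df.total_pages() - 1
    for page_num in range(first_page, last_page + 1):
        yield from df.list_page(page_num)


class FakePagedFrame:
    """Hypothetical in-memory stand-in; this is not the oxen DataFrame."""

    def __init__(self, rows, page_size=100):
        self._rows = list(rows)
        self._page_size = page_size

    def total_pages(self):
        # Ceiling division so a partial last page is counted.
        return (len(self._rows) + self._page_size - 1) // self._page_size

    def list_page(self, page_num):
        # Pages are 1-based in this stand-in.
        start = (page_num - 1) * self._page_size
        return self._rows[start:start + self._page_size]
```

With a real data frame the same loop would read for row in iter_rows(data_frame), provided the numbering assumption holds.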
insert_row
Insert a single row of data into the data frame.
Arguments:
data - dict: A dictionary representing a single row of data. The keys must match a subset of the columns in the data frame. If a column is not present in the dictionary, it will be set to an empty value.
Returns:
The id of the row that was inserted.
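The column rule for insert_row can be illustrated locally. This is a sketch of the described behavior, not the server's code: the helper normalize_row is hypothetical, None stands in for whatever empty value the server actually stores, and raising on unknown keys is one possible way to enforce the subset rule.

```python
from typing import Any, Dict, Iterable


def normalize_row(columns: Iterable[str], data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a full row for the given columns, filling gaps with None.

    Raises KeyError if data contains a key that is not a known column,
    reflecting the rule that keys must be a subset of the columns.
    """
    columns = list(columns)
    unknown = set(data) - set(columns)
    if unknown:
        raise KeyError(f"unknown columns: {sorted(unknown)}")
    # Missing columns get None, an assumed placeholder for "empty".
    return {col: data.get(col) for col in columns}
```

Running such a check before calling insert_row would catch misspelled column names on the client side.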
get_row_by_id
Get a single row of data by id.
Arguments:
id - str: The id of the row to get.
Returns:
A dictionary representing the row.
update_row
Update a single row of data by id.
Arguments:
id - str: The id of the row to update.
data - dict: A dictionary representing a single row of data. The keys must match a subset of the columns in the data frame. If a column is not present in the dictionary, it will be set to an empty value.
Returns:
The updated row as a dictionary.
delete_row
Delete a single row of data by id.
Arguments:
id - str: The id of the row to delete.
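The row-level methods compose into an insert, read, update, delete round trip. The sketch below runs that sequence against a hypothetical in-memory stand-in (InMemoryFrame) that only mirrors the method names documented here, so it executes without an oxen-server. The column names category and message are made up for illustration, and the stand-in follows the documented wording that columns missing from an update are set to an empty value (None here, as an assumption).

```python
import uuid
from typing import Any, Dict, List


class InMemoryFrame:
    """Hypothetical stand-in with the same row-level method names.

    Rows live in a dict keyed by a generated id. This is not the oxen
    DataFrame and does not contact a server.
    """

    def __init__(self, columns: List[str]):
        self._columns = list(columns)
        self._rows: Dict[str, Dict[str, Any]] = {}

    def _full_row(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Columns absent from data become None (assumed "empty" value).
        return {col: data.get(col) for col in self._columns}

    def insert_row(self, data: Dict[str, Any]) -> str:
        row_id = uuid.uuid4().hex
        self._rows[row_id] = self._full_row(data)
        return row_id

    def get_row_by_id(self, id: str) -> Dict[str, Any]:
        return dict(self._rows[id])

    def update_row(self, id: str, data: Dict[str, Any]) -> Dict[str, Any]:
        if id not in self._rows:
            raise KeyError(id)
        self._rows[id] = self._full_row(data)
        return dict(self._rows[id])

    def delete_row(self, id: str) -> None:
        del self._rows[id]


def crud_round_trip(df: Any) -> Dict[str, Any]:
    """Insert a row, read it back, update it, then delete it."""
    row_id = df.insert_row({"category": "spam", "message": "Limited offer"})
    inserted = df.get_row_by_id(row_id)
    # Every column is supplied so the result does not depend on how
    # missing columns are treated during an update.
    updated = df.update_row(row_id, {"category": "ham", "message": "Limited offer"})
    df.delete_row(row_id)
    return {"id": row_id, "inserted": inserted, "updated": updated}
```

Because crud_round_trip only relies on the four method names, the same function could in principle be pointed at a real DataFrame whose columns match.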
restore
Unstage any changes to the schema or contents of the data frame.
commit
Commit the current changes to the data frame.
Arguments:
message - str: The commit message describing the changes.
branch - str: The branch to commit the changes to. Defaults to the current branch.
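Since restore unstages pending changes and commit records them, one possible pattern is to stage a batch of inserts and fall back to restore if any of them fails. The sketch below is an assumption about how the two calls might be combined, not a documented workflow; insert_and_commit and the RecordingFrame stand-in are hypothetical names used so the example runs without a server.

```python
from typing import Any, Dict, Iterable, List


def insert_and_commit(df: Any, rows: Iterable[Dict[str, Any]], message: str) -> List[str]:
    """Insert rows and commit them; unstage everything if an insert fails."""
    try:
        row_ids = [df.insert_row(row) for row in rows]
    except Exception:
        # Drop the partially staged edits before re-raising the error.
        df.restore()
        raise
    df.commit(message)
    return row_ids


class RecordingFrame:
    """Hypothetical stand-in that records calls instead of contacting a server."""

    def __init__(self, fail_on=None):
        self.calls = []
        self._fail_on = fail_on

    def insert_row(self, data):
        if self._fail_on is not None and data == self._fail_on:
            raise ValueError("rejected row")
        self.calls.append(("insert_row", data))
        return f"row-{len(self.calls)}"

    def restore(self):
        self.calls.append(("restore",))

    def commit(self, message, branch=None):
        self.calls.append(("commit", message))
```

The design choice here is all-or-nothing: a failed insert leaves no staged changes behind, at the cost of discarding the rows that did succeed.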