🗺️ Explore, Process, and Version Data
Explore, process, and version data with your favorite tools
Example Notebook: Explore, Process, and Version Data
Downloading Data
The RemoteRepo class allows you to download arbitrary files from a remote repository. To see more options check out the Python Docs.
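A download step might look like the minimal sketch below. It assumes `repo` is an `oxen.RemoteRepo` instance and that its `download` method accepts a remote file path; the helper name `download_all`, the repository ID, and the file path are hypothetical, so check the exact signature in the Python Docs before relying on it.

```python
def download_all(repo, paths):
    """Download each remote path through repo.download and return the paths fetched.

    `repo` is assumed to be an oxen.RemoteRepo. This helper relies only on a
    `download(path)` method, which is an assumption to verify in the docs.
    """
    fetched = []
    for path in paths:
        repo.download(path)  # assumed to save the remote file locally
        fetched.append(path)
    return fetched


# Hypothetical usage (requires the oxen package and network access):
#   from oxen import RemoteRepo
#   repo = RemoteRepo("my-namespace/my-repo")   # placeholder repository ID
#   download_all(repo, ["data/responses.parquet"])
```

Passing the repository object in, rather than constructing it inside the helper, keeps the sketch easy to exercise with a stand-in object when no network is available.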
Or you can load a file directly into a pandas DataFrame, either over HTTP or through an FSSpec URL:
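As a rough sketch of the direct-load path: pandas readers accept URLs, so a file can be read without downloading it first. The `oxen://namespace:repo@revision/path` layout built below is an assumption about the FSSpec URL format, and the helper names and example identifiers are hypothetical. Reading an `oxen://` URL is expected to require the oxen package to be installed.

```python
def oxen_fsspec_url(namespace, repo, path, revision="main"):
    """Build an fsspec-style URL (assumed layout: oxen://namespace:repo@revision/path)."""
    return f"oxen://{namespace}:{repo}@{revision}/{path.lstrip('/')}"


def load_dataframe(url):
    """Read a CSV or Parquet file from an HTTP(S) or fsspec URL into a DataFrame."""
    import pandas as pd  # imported lazily so the URL helper works without pandas

    if url.endswith(".parquet"):
        return pd.read_parquet(url)
    if url.endswith(".csv"):
        return pd.read_csv(url)
    raise ValueError(f"unsupported file type: {url}")


# Hypothetical usage (placeholder namespace, repository, and file):
#   df = load_dataframe(oxen_fsspec_url("my-namespace", "my-repo", "data/responses.parquet"))
```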
Exploring Data
Use whatever tools you want to explore your data. For example, you can use Matplotlib to plot the distribution of the model column:
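One way to sketch this plot is to count the rows per model and draw a bar chart. The helper names below are hypothetical, and the only thing assumed about the data is that it has a `model` column, as the text states.

```python
from collections import Counter


def count_models(models):
    """Return (model, row_count) pairs, most frequent first."""
    return Counter(models).most_common()


def plot_model_distribution(models, ax=None):
    """Draw a bar chart of rows per model and return the Matplotlib axes."""
    import matplotlib.pyplot as plt  # imported lazily so counting works without Matplotlib

    counts = count_models(models)
    if ax is None:
        _, ax = plt.subplots()
    ax.bar([name for name, _ in counts], [n for _, n in counts])
    ax.set_xlabel("model")
    ax.set_ylabel("rows")
    ax.set_title("Rows per model")
    return ax


# Hypothetical usage with a DataFrame `df` loaded earlier:
#   plot_model_distribution(df["model"])
#   plt.show()
```

With pandas alone, `df["model"].value_counts().plot(kind="bar")` produces a similar chart in one line.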
Cleaning Data
Use pandas to clean or process the data. For example, you can remove the Internal Thoughts from the response column:
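The cleaning step could be sketched as follows. How the internal thoughts are marked inside each response is not shown here, so the `<internal_thoughts>…</internal_thoughts>` delimiter below is purely a hypothetical stand-in; adjust the pattern to whatever the real data uses.

```python
import re

# Hypothetical delimiter: the real dataset may mark internal thoughts differently.
THOUGHTS_PATTERN = re.compile(
    r"<internal_thoughts>.*?</internal_thoughts>\s*", re.DOTALL
)


def strip_internal_thoughts(text):
    """Remove every delimited internal-thoughts span and trim surrounding whitespace."""
    return THOUGHTS_PATTERN.sub("", text).strip()


def clean_responses(df):
    """Return a copy of a pandas DataFrame with a cleaned `response` column."""
    cleaned = df.copy()
    # na_action="ignore" leaves missing responses untouched instead of failing on them.
    cleaned["response"] = cleaned["response"].map(
        strip_internal_thoughts, na_action="ignore"
    )
    return cleaned
```

Working on a copy keeps the original DataFrame available for comparing responses before and after cleaning.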
Versioning Data
You can then write the data back either directly with pandas or with the RemoteRepo class.
With pandas (auto commit message):
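A minimal sketch of the pandas route, assuming the oxen package registers an `oxen://` FSSpec protocol so that a pandas writer can target the repository directly; the helper name `write_back` and the example URL are hypothetical. As the heading notes, the commit message is generated automatically on this route.

```python
def write_back(df, oxen_url):
    """Write a DataFrame to a repository file through an oxen:// URL (assumed scheme)."""
    if not oxen_url.startswith("oxen://"):
        raise ValueError("expected an oxen:// URL")
    # Writing through the FSSpec URL is assumed to create the commit automatically.
    df.to_parquet(oxen_url, index=False)


# Hypothetical usage (placeholder namespace, repository, and file):
#   write_back(cleaned_df, "oxen://my-namespace:my-repo@main/data/cleaned.parquet")
```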
With RemoteRepo and a commit message:
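A sketch of the explicit route, assuming `repo` is an `oxen.RemoteRepo` that exposes `add(path, dst=...)` to stage a local file and `commit(message)` to commit it. Both signatures are assumptions to confirm in the Python Docs, and the helper name and paths are hypothetical.

```python
def add_and_commit(repo, local_path, message, dst=""):
    """Stage a local file on the remote repository and commit it with a message."""
    if not message.strip():
        raise ValueError("commit message must not be empty")
    repo.add(local_path, dst=dst)  # assumed signature: add(path, dst=...)
    repo.commit(message)           # assumed signature: commit(message)


# Hypothetical usage (requires the oxen package and network access):
#   from oxen import RemoteRepo
#   repo = RemoteRepo("my-namespace/my-repo")   # placeholder repository ID
#   cleaned_df.to_parquet("cleaned.parquet", index=False)
#   add_and_commit(repo, "cleaned.parquet", "Remove internal thoughts from responses", dst="data")
```

Writing the commit message yourself records why the data changed, which the automatic message on the pandas route cannot capture.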