# Video Generation

> Fine-tune video generation models for custom content

## Overview

Fine-tune video generation models to create videos in your specific style. Fine-tuning works for both text-to-video and image-to-video generation.

## Your Data

### Text-to-Video

Your dataset should have:

* **Video column** - Paths to your training videos, relative to the repo root
* **Caption column** - Text descriptions of each video

Example `videos.parquet`:

| video         | caption                          |
| ------------- | -------------------------------- |
| clips/001.mp4 | person walking in cyberpunk city |
| clips/002.mp4 | car driving through neon streets |

### Image-to-Video

Your dataset should have:

* **Video column** - Output video paths
* **Image column** - First frame/reference image
* **Caption column** - Description of the motion/action

Example `img2vid.parquet`:

| image          | video         | caption                  |
| -------------- | ------------- | ------------------------ |
| frames/001.jpg | clips/001.mp4 | zoom into the building   |
| frames/002.jpg | clips/002.mp4 | camera pan left to right |
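
Before uploading, it can help to confirm that every row fills in the columns its script type needs. The sketch below assumes the column names from the example tables (`image`, `video`, `caption`). `REQUIRED_COLUMNS` and `missing_columns` are hypothetical helpers, not part of the Oxen API.

```python
# Column names follow the example tables; adjust them if your dataset differs.
REQUIRED_COLUMNS = {
    "text_to_video": {"video", "caption"},
    "image_to_video": {"image", "video", "caption"},
}


def missing_columns(rows, script_type):
    """Return {row_index: [missing or empty column names]} for incomplete rows."""
    required = REQUIRED_COLUMNS[script_type]
    problems = {}
    for index, row in enumerate(rows):
        missing = sorted(col for col in required if not row.get(col))
        if missing:
            problems[index] = missing
    return problems


rows = [
    {"image": "frames/001.jpg", "video": "clips/001.mp4", "caption": "zoom into the building"},
    {"image": "", "video": "clips/002.mp4", "caption": "camera pan left to right"},
]
print(missing_columns(rows, "image_to_video"))  # → {1: ['image']}
```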

## Minimal Example: Text-to-Video

<CodeGroup>
  ```python Python theme={null}
  import requests

  url = "https://hub.oxen.ai/api/repos/YOUR_NAMESPACE/YOUR_REPO/fine_tunes"
  headers = {
      "Authorization": "Bearer YOUR_API_KEY",
      "Content-Type": "application/json"
  }

  # Create fine-tune
  data = {
      "resource": "main/videos.parquet",
      "base_model": "YOUR_VIDEO_MODEL",  # e.g., a video generation model
      "script_type": "text_to_video",
      "training_params": {
          "video_column": "video",
          "caption_column": "caption",
          "steps": 2000
      }
  }

  response = requests.post(url, headers=headers, json=data)
  response.raise_for_status()  # Stop here if the create request failed
  fine_tune_id = response.json()["fine_tune"]["id"]

  # Start training
  run_url = f"{url}/{fine_tune_id}/actions/run"
  requests.post(run_url, headers=headers).raise_for_status()

  print(f"Fine-tune started: {fine_tune_id}")
  ```

  ```bash cURL theme={null}
  # Create fine-tune
  curl -X POST https://hub.oxen.ai/api/repos/YOUR_NAMESPACE/YOUR_REPO/fine_tunes \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "resource": "main/videos.parquet",
      "base_model": "YOUR_VIDEO_MODEL",
      "script_type": "text_to_video",
      "training_params": {
        "video_column": "video",
        "caption_column": "caption",
        "steps": 2000
      }
    }'

  # Start training (replace FINE_TUNE_ID with the "id" returned by the create call)
  curl -X POST https://hub.oxen.ai/api/repos/YOUR_NAMESPACE/YOUR_REPO/fine_tunes/FINE_TUNE_ID/actions/run \
    -H "Authorization: Bearer YOUR_API_KEY"
  ```
</CodeGroup>

## Minimal Example: Image-to-Video

<CodeGroup>
  ```python Python theme={null}
  import requests

  url = "https://hub.oxen.ai/api/repos/YOUR_NAMESPACE/YOUR_REPO/fine_tunes"
  headers = {
      "Authorization": "Bearer YOUR_API_KEY",
      "Content-Type": "application/json"
  }

  # Create fine-tune
  data = {
      "resource": "main/img2vid.parquet",
      "base_model": "YOUR_VIDEO_MODEL",
      "script_type": "image_to_video",
      "training_params": {
          "image_column": "image",          # First frame/reference
          "video_column": "video",          # Output video
          "caption_column": "caption",      # Motion description
          "steps": 2000
      }
  }

  response = requests.post(url, headers=headers, json=data)
  response.raise_for_status()  # Stop here if the create request failed
  fine_tune_id = response.json()["fine_tune"]["id"]

  # Start training
  run_url = f"{url}/{fine_tune_id}/actions/run"
  requests.post(run_url, headers=headers).raise_for_status()

  print(f"Fine-tune started: {fine_tune_id}")
  ```

  ```bash cURL theme={null}
  curl -X POST https://hub.oxen.ai/api/repos/YOUR_NAMESPACE/YOUR_REPO/fine_tunes \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "resource": "main/img2vid.parquet",
      "base_model": "YOUR_VIDEO_MODEL",
      "script_type": "image_to_video",
      "training_params": {
        "image_column": "image",
        "video_column": "video",
        "caption_column": "caption",
        "steps": 2000
      }
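    }'

  # Start training (replace FINE_TUNE_ID with the "id" returned by the create call)
  curl -X POST https://hub.oxen.ai/api/repos/YOUR_NAMESPACE/YOUR_REPO/fine_tunes/FINE_TUNE_ID/actions/run \
    -H "Authorization: Bearer YOUR_API_KEY"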
  ```
</CodeGroup>

## Key Parameters

**Text-to-Video:**

| Parameter        | Description        | Example                 |
| ---------------- | ------------------ | ----------------------- |
| `video_column`   | Video file column  | `"video"`, `"clip"`     |
| `caption_column` | Description column | `"caption"`, `"prompt"` |
| `steps`          | Training steps     | `2000`                  |

**Image-to-Video:**

| Parameter        | Description                 | Example                 |
| ---------------- | --------------------------- | ----------------------- |
| `image_column`   | First frame/reference image | `"image"`, `"frame"`    |
| `video_column`   | Output video column         | `"video"`, `"clip"`     |
| `caption_column` | Motion description          | `"caption"`, `"motion"` |
| `steps`          | Training steps              | `2000`                  |
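
The two parameter tables can be folded into a single request-body builder, sketched below. `build_fine_tune_payload` is a hypothetical helper, and it only checks the parameters listed in the tables above. The reference pages may document more.

```python
import json


def build_fine_tune_payload(resource, base_model, script_type, columns, steps=2000):
    """Build the JSON body for the fine_tunes endpoint (hypothetical helper)."""
    if script_type == "text_to_video":
        expected = {"video_column", "caption_column"}
    elif script_type == "image_to_video":
        expected = {"image_column", "video_column", "caption_column"}
    else:
        raise ValueError(f"Unsupported script_type: {script_type}")

    missing = expected - set(columns)
    if missing:
        raise ValueError(f"Missing training params: {sorted(missing)}")

    return {
        "resource": resource,
        "base_model": base_model,
        "script_type": script_type,
        "training_params": {**columns, "steps": steps},
    }


payload = build_fine_tune_payload(
    "main/img2vid.parquet",
    "YOUR_VIDEO_MODEL",
    "image_to_video",
    {"image_column": "frame", "video_column": "clip", "caption_column": "motion"},
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed as `json=payload` to `requests.post`, as in the minimal examples.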

## Data Requirements

Video fine-tuning is resource-intensive, so a small, consistent dataset works best:

* **Quantity**: At least 50 videos; 50-200 is a reasonable starting range
* **Quality**: Consistent resolution, frame rate, duration
* **Length**: 2-10 seconds per clip (shorter is better)
* **Format**: MP4, WebM, or other common formats
* **Captions**: Describe motion, camera movement, and key actions

<Warning>
  Video fine-tuning requires significant compute resources and storage. Expect longer training times compared to image or text models.
</Warning>
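
The quantity and length guidelines above can be checked locally before you start a run. The sketch below assumes clip durations have already been measured with a tool of your choice. `check_clips` is a hypothetical helper, and its thresholds simply restate the guidelines in this section.

```python
def check_clips(durations_seconds, min_clips=50, min_len=2.0, max_len=10.0):
    """Return a list of warnings for a {video_path: duration_in_seconds} mapping."""
    warnings = []
    if len(durations_seconds) < min_clips:
        warnings.append(
            f"only {len(durations_seconds)} clips; guideline is at least {min_clips}"
        )
    for path, seconds in sorted(durations_seconds.items()):
        if not min_len <= seconds <= max_len:
            warnings.append(f"{path}: {seconds:.1f}s is outside {min_len:g}-{max_len:g}s")
    return warnings


for warning in check_clips({"clips/001.mp4": 4.0, "clips/002.mp4": 14.5}):
    print(warning)
```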

## Monitor Progress

```python theme={null}
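import requests

# Reuses `headers` and `fine_tune_id` from the examples above
status_url = f"https://hub.oxen.ai/api/repos/YOUR_NAMESPACE/YOUR_REPO/fine_tunes/{fine_tune_id}"
response = requests.get(status_url, headers=headers)
response.raise_for_status()  # Fail fast on HTTP errors
fine_tune = response.json()["fine_tune"]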

print(f"Status: {fine_tune['status']}")
print(f"Current step: {fine_tune.get('current_step', 0)}")
```
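
Because training can run for a long time, the status check is usually repeated on a timer. The sketch below shows one way to do that. `wait_for_fine_tune` is a hypothetical helper, and the status strings `"running"`, `"completed"`, and `"failed"` are assumptions, so check the values your API responses actually contain. The demo uses canned responses in place of real HTTP calls.

```python
import time


def wait_for_fine_tune(fetch, done_statuses, interval_seconds=60, max_polls=1000,
                       sleep=time.sleep):
    """Call `fetch()` repeatedly until its status is in `done_statuses`."""
    for _ in range(max_polls):
        fine_tune = fetch()
        print(f"{fine_tune['status']} (step {fine_tune.get('current_step', 0)})")
        if fine_tune["status"] in done_statuses:
            return fine_tune
        sleep(interval_seconds)
    raise TimeoutError("fine-tune did not reach a final status in time")


# Canned responses stand in for the API; the status names are assumed, not documented here.
canned = iter([
    {"status": "running", "current_step": 500},
    {"status": "running", "current_step": 1500},
    {"status": "completed", "current_step": 2000},
])
result = wait_for_fine_tune(
    lambda: next(canned),
    done_statuses={"completed", "failed"},
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
```

In real use, `fetch` would wrap the GET request from the snippet above and return `response.json()["fine_tune"]`.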

## Next Steps

* [Text-to-Video Reference](/fine-tuning-api/reference/text_to_video) - All parameters
* [Image-to-Video Reference](/fine-tuning-api/reference/image_to_video) - All parameters
* [Deploy your model](/getting-started/inference) - Generate videos with your fine-tuned model

## Common Issues

<AccordionGroup>
  <Accordion title="Videos not loading">
    Ensure your videos are committed to your Oxen repository. Check that the file paths in your dataset are correct and relative to the repo root.
  </Accordion>

  <Accordion title="Out of memory error">
    Video models need significant GPU memory. Reduce `batch_size` to 1 and consider shorter video clips.
  </Accordion>

  <Accordion title="Training very slow">
    Video fine-tuning takes hours to days. Start with 1000 steps for testing. Use shorter videos (2-5 seconds) for faster iteration.
  </Accordion>

  <Accordion title="Low quality output">
    Ensure training videos have consistent quality, resolution, and frame rate. Increase training steps to 3000-5000.
  </Accordion>
</AccordionGroup>
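
For the "Videos not loading" case, a quick local check can catch paths that are not relative to the repo root. This is a minimal sketch: `is_repo_relative` is a hypothetical helper that only inspects the path string (POSIX-style separators assumed) and does not verify that the file is committed.

```python
from pathlib import PurePosixPath


def is_repo_relative(path):
    """True if `path` is a non-empty relative path that stays inside the repo root."""
    parsed = PurePosixPath(path)
    return bool(parsed.parts) and not parsed.is_absolute() and ".." not in parsed.parts


for candidate in ["clips/001.mp4", "/home/me/clips/001.mp4", "../clips/001.mp4"]:
    print(candidate, is_repo_relative(candidate))
```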
