Overview
This schema is used for fine-tuning models with image-to-text capabilities.
Schema Type
When creating a fine-tune with this schema, use:
- script_type: image_to_text (the fine-tune type)
- base_model: One of the supported model canonical names below
Supported Models
- Qwen3 VL 8B - Instruct (Qwen/Qwen3-VL-8B-Instruct)
- Qwen3 VL 2B - Instruct (Qwen/Qwen3-VL-2B-Instruct)
- Qwen3 VL 4B - Instruct (Qwen/Qwen3-VL-4B-Instruct)
Request Schema
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| answer_column | string | Yes | Response Column (DataFrame column name) |
| batch_size | integer | No | (default: 1) (min: 1) |
| enable_thinking | boolean | No | (default: false) |
| epochs | integer | No | (default: 1) (min: 1) |
| grad_accum | integer | No | (default: 1) (min: 1) |
| image_columns | array | Yes | Image Columns (array of string) (Multiple DataFrame column names) |
| learning_rate | number | No | (default: 0.0001) (min: 0) |
| logging_steps | integer | No | (default: 10) (min: 1) |
| lora_alpha | integer | No | (default: 16) (min: 1) |
| lora_rank | integer | No | (default: 16) (min: 1) |
| neftune_noise_alpha | number | No | (default: 0) (min: 0) |
| question_column | string | Yes | Prompt Column (DataFrame column name) |
| save_steps_ratio | number | No | (default: 0.25) |
| save_strategy | string | No | (default: "epoch") |
| seq_length | integer | No | (default: 4096) (min: 1) |
| use_lora | boolean | No | Use LoRA (default: true) |
Example Request
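A minimal sketch of a request body, assembled in Python from the fields in the table above. The helper name `build_request`, the column names (`image_path`, `question`, `answer`), and the flat layout that places `script_type` and `base_model` next to the other fields are assumptions made for illustration; the actual request envelope and endpoint are not specified on this page.

```python
import json


def build_request(base_model, image_columns, question_column, answer_column, **overrides):
    """Assemble an image_to_text fine-tune request body.

    The flat layout is an assumption, not a documented envelope.
    """
    body = {
        "script_type": "image_to_text",
        "base_model": base_model,
        # Required fields
        "image_columns": list(image_columns),
        "question_column": question_column,
        "answer_column": answer_column,
        # A few optional fields, set to their documented defaults
        "epochs": 1,
        "batch_size": 1,
        "grad_accum": 1,
        "learning_rate": 0.0001,
        "use_lora": True,
        "lora_rank": 16,
        "lora_alpha": 16,
        "seq_length": 4096,
    }
    # Caller-supplied values replace the defaults above
    body.update(overrides)
    return body


request_body = build_request(
    base_model="Qwen/Qwen3-VL-8B-Instruct",
    image_columns=["image_path"],  # hypothetical column name
    question_column="question",    # hypothetical column name
    answer_column="answer",        # hypothetical column name
    epochs=3,
)
print(json.dumps(request_body, indent=2))
```

Optional fields that are left out should fall back to the defaults listed under Field Details, so a request only needs the three column fields plus any values being changed.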
Field Details
answer_column
Response Column
Type: string
Column containing the captions or responses
batch_size
Type: integer
Default: 1
Minimum: 1
enable_thinking
Type: boolean
Default: false
epochs
Type: integer
Default: 1
Minimum: 1
grad_accum
Type: integer
Default: 1
Minimum: 1
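This page does not define how batch_size and grad_accum interact. Assuming grad_accum is the usual count of gradient accumulation steps, the number of samples that contribute to one optimizer step on a single device is their product. The sketch below only illustrates that arithmetic; `effective_batch_size` is a hypothetical helper, not part of the API.

```python
def effective_batch_size(batch_size: int = 1, grad_accum: int = 1) -> int:
    """Samples per optimizer step on one device, assuming standard gradient accumulation."""
    if batch_size < 1 or grad_accum < 1:
        # Both fields have a documented minimum of 1
        raise ValueError("batch_size and grad_accum must both be >= 1")
    return batch_size * grad_accum


print(effective_batch_size(batch_size=2, grad_accum=8))  # 16
```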
image_columns
Image Columns
Type: array
Columns containing image file paths
Default: []
learning_rate
Type: number
Default: 0.0001
Minimum: 0
logging_steps
Type: integer
Default: 10
Minimum: 1
lora_alpha
Type: integer
Default: 16
Minimum: 1
lora_rank
Type: integer
Default: 16
Minimum: 1
neftune_noise_alpha
Type: number
Default: 0
Minimum: 0
question_column
Prompt Column
Type: string
Column containing the prompts or questions for each image
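The three column fields (image_columns, question_column, answer_column) each name a column in the training dataset. As a rough sketch, the check below confirms that the configured names exist in a dataset header before a fine-tune is submitted. The CSV content, the column names, and the helper `missing_columns` are all hypothetical; the dataset format the service actually expects is not described here.

```python
import csv
import io

# Hypothetical dataset: column names are illustrative, not mandated by the schema.
SAMPLE_CSV = """image_path,question,answer
images/cat.jpg,What animal is shown?,A cat.
images/bus.jpg,What color is the bus?,Red.
"""


def missing_columns(header, image_columns, question_column, answer_column):
    """Return the configured column names that are absent from the dataset header."""
    wanted = [*image_columns, question_column, answer_column]
    return [name for name in wanted if name not in header]


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
header = reader.fieldnames or []

missing = missing_columns(header, ["image_path"], "question", "answer")
print(missing)  # []
```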
save_steps_ratio
Type: number
Default: 0.25
save_strategy
Type: string
Default: "epoch"
seq_length
Type: integer
Default: 4096
Minimum: 1
use_lora
Use LoRA
Type: boolean
Enable LoRA for faster fine-tuning and lower memory use
Default: true
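The types, defaults, and minimums above can be checked on the client side before sending a request. The following is a sketch of such a check, with the constraints transcribed from this page. The names `validate_params`, `OPTIONAL_FIELDS`, and `REQUIRED_FIELDS` are hypothetical, and treating an empty image_columns list as missing is an assumption; the server's own validation may differ.

```python
# Constraints transcribed from the field details: name -> (type, default, minimum)
OPTIONAL_FIELDS = {
    "batch_size": (int, 1, 1),
    "enable_thinking": (bool, False, None),
    "epochs": (int, 1, 1),
    "grad_accum": (int, 1, 1),
    "learning_rate": (float, 0.0001, 0),
    "logging_steps": (int, 10, 1),
    "lora_alpha": (int, 16, 1),
    "lora_rank": (int, 16, 1),
    "neftune_noise_alpha": (float, 0, 0),
    "save_steps_ratio": (float, 0.25, None),
    "save_strategy": (str, "epoch", None),
    "seq_length": (int, 4096, 1),
    "use_lora": (bool, True, None),
}
REQUIRED_FIELDS = ("answer_column", "image_columns", "question_column")


def validate_params(params):
    """Return params with defaults filled in, or raise ValueError listing every problem."""
    errors = []
    for name in REQUIRED_FIELDS:
        # Assumption: an empty string or empty list counts as missing
        if not params.get(name):
            errors.append(f"{name} is required")

    resolved = dict(params)
    for name, (kind, default, minimum) in OPTIONAL_FIELDS.items():
        value = resolved.setdefault(name, default)
        # bool is a subclass of int in Python, so reject it for numeric fields
        is_bool = isinstance(value, bool)
        if kind is bool:
            type_ok = is_bool
        elif kind is float:
            type_ok = isinstance(value, (int, float)) and not is_bool
        else:
            type_ok = isinstance(value, kind) and not is_bool
        if not type_ok:
            errors.append(f"{name} must be of type {kind.__name__}")
        elif minimum is not None and value < minimum:
            errors.append(f"{name} must be >= {minimum}")

    if errors:
        raise ValueError("; ".join(errors))
    return resolved


resolved = validate_params({
    "answer_column": "answer",        # hypothetical column name
    "image_columns": ["image_path"],  # hypothetical column name
    "question_column": "question",    # hypothetical column name
    "epochs": 3,
})
print(resolved["epochs"], resolved["save_strategy"])
```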