Toolsets
Toolsets define the tools available to the model during training. Define each toolset once at the dataset level, then reference it by toolset_id from the defaults, from individual files, or from individual records.
Structure
{
  "toolsets": [
    {
      "id": "my-tools",
      "tools": [
        {
          "name": "tool_name",
          "description": "What the tool does",
          "input_schema": { ... },
          "output_schema": { ... }
        }
      ]
    }
  ]
}
Tool Definition
| Field | Required | Description |
|---|---|---|
| name | Yes | Tool function name |
| description | No | Human-readable description |
| input_schema | No | JSON Schema for inputs |
| output_schema | No | JSON Schema for outputs |
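The field rules in the table can be checked before a dataset is used. The following is a minimal sketch in Python, assuming a tool definition has already been loaded as a plain dict. `validate_tool` is a hypothetical helper, not part of any documented API, and it only checks what the table states: name is required, and the other three fields are optional.

```python
from typing import Any


def validate_tool(tool: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the tool looks valid."""
    problems = []
    # name is the only required field.
    name = tool.get("name")
    if not isinstance(name, str) or not name:
        problems.append("name is required and must be a non-empty string")
    # description is optional, but should be text when present.
    if "description" in tool and not isinstance(tool["description"], str):
        problems.append("description must be a string")
    # Both schemas are optional; a JSON Schema is assumed to be a JSON object.
    for key in ("input_schema", "output_schema"):
        if key in tool and not isinstance(tool[key], dict):
            problems.append(f"{key} must be a JSON Schema object")
    return problems


print(validate_tool({"name": "calculator"}))
print(validate_tool({"description": "No name given"}))
```

The first call prints an empty list, since a tool with only a name satisfies the table. The second reports the missing name.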
Example Toolset
{
  "toolsets": [
    {
      "id": "math-tools",
      "tools": [
        {
          "name": "calculator",
          "description": "Evaluate a mathematical expression",
          "input_schema": {
            "type": "object",
            "properties": {
              "expression": {
                "type": "string",
                "description": "Math expression to evaluate"
              }
            },
            "required": ["expression"]
          },
          "output_schema": {
            "type": "object",
            "properties": {
              "value": { "type": "number" }
            },
            "required": ["value"]
          }
        },
        {
          "name": "unit_converter",
          "description": "Convert between units",
          "input_schema": {
            "type": "object",
            "properties": {
              "value": { "type": "number" },
              "from_unit": { "type": "string" },
              "to_unit": { "type": "string" }
            },
            "required": ["value", "from_unit", "to_unit"]
          }
        }
      ]
    }
  ]
}
Using Toolsets
Default Toolset
Set a default toolset in defaults:
{
  "defaults": {
    "toolset_id": "math-tools"
  }
}
File-Level Toolset
Override for specific files:
{
  "files": [
    {
      "split": "train",
      "objective": "sft",
      "toolset_id": "advanced-tools",
      "shards": [...]
    }
  ]
}
Record-Level Toolset
Override for specific records:
{
  "id": "record-001",
  "toolset_id": "special-tools",
  "messages": [...]
}
Priority Order
Toolset resolution follows this priority:
- Record toolset_id (highest)
- File toolset_id
- Default toolset_id (lowest)
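The priority order above can be sketched as a small lookup. This is an illustration rather than the loader's actual implementation: `resolve_toolset_id` is a hypothetical function, and returning None when no level sets a toolset_id is an assumption, since the text does not say what happens in that case.

```python
from typing import Any, Optional


def resolve_toolset_id(
    record: dict[str, Any],
    file_entry: dict[str, Any],
    defaults: dict[str, Any],
) -> Optional[str]:
    """Pick the toolset_id using record > file > default priority."""
    for source in (record, file_entry, defaults):
        toolset_id = source.get("toolset_id")
        if toolset_id is not None:
            return toolset_id
    # Assumption: no toolset_id at any level means no toolset applies.
    return None


defaults = {"toolset_id": "math-tools"}
file_entry = {"split": "train", "objective": "sft", "toolset_id": "advanced-tools"}

# Record-level value wins over the file and the default.
print(resolve_toolset_id({"id": "record-001", "toolset_id": "special-tools"}, file_entry, defaults))
# No record-level value, so the file-level value is used.
print(resolve_toolset_id({"id": "record-002"}, file_entry, defaults))
# Neither record nor file sets one, so the default applies.
print(resolve_toolset_id({"id": "record-003"}, {"split": "train"}, defaults))
```

The three calls print special-tools, advanced-tools, and math-tools in that order.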
Multiple Toolsets
Define multiple toolsets for different scenarios:
{
  "toolsets": [
    {
      "id": "basic-tools",
      "tools": [
        { "name": "calculator", ... }
      ]
    },
    {
      "id": "web-tools",
      "tools": [
        { "name": "web_search", ... },
        { "name": "fetch_url", ... }
      ]
    },
    {
      "id": "all-tools",
      "tools": [
        { "name": "calculator", ... },
        { "name": "web_search", ... },
        { "name": "fetch_url", ... }
      ]
    }
  ]
}
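Once several toolsets exist, every toolset_id reference has to match a defined id. The sketch below builds a lookup by id and lists references that do not resolve. It rests on assumptions not stated in the text: that toolsets, defaults, and files sit in the same top-level object, and that ids are unique. Both function names are hypothetical.

```python
from typing import Any


def index_toolsets(dataset: dict[str, Any]) -> dict[str, list[str]]:
    """Map each toolset id to the names of its tools."""
    index: dict[str, list[str]] = {}
    for toolset in dataset.get("toolsets", []):
        toolset_id = toolset["id"]
        # Assumption: ids are meant to be unique within a dataset.
        if toolset_id in index:
            raise ValueError(f"duplicate toolset id: {toolset_id}")
        index[toolset_id] = [tool["name"] for tool in toolset.get("tools", [])]
    return index


def unknown_references(dataset: dict[str, Any]) -> list[str]:
    """Return toolset_id values in defaults and files that are not defined."""
    index = index_toolsets(dataset)
    referenced = []
    default_id = dataset.get("defaults", {}).get("toolset_id")
    if default_id is not None:
        referenced.append(default_id)
    for file_entry in dataset.get("files", []):
        if "toolset_id" in file_entry:
            referenced.append(file_entry["toolset_id"])
    return [ref for ref in referenced if ref not in index]


dataset = {
    "toolsets": [
        {"id": "basic-tools", "tools": [{"name": "calculator"}]},
        {"id": "web-tools", "tools": [{"name": "web_search"}, {"name": "fetch_url"}]},
    ],
    "defaults": {"toolset_id": "basic-tools"},
    # "all-tools" is referenced here but deliberately left undefined.
    "files": [{"split": "train", "objective": "sft", "toolset_id": "all-tools"}],
}

print(index_toolsets(dataset)["web-tools"])
print(unknown_references(dataset))
```

Here the second call reports all-tools, because the file entry references a toolset that this sample dataset never defines.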
Best Practices
- Clear descriptions - Help the model understand when to use each tool
- Strict schemas - Use additionalProperties: false for cleaner data
- Consistent naming - Use snake_case for tool names
- Required fields - Mark required inputs in the schema
- Examples - Include example calls in descriptions if helpful
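Several of these practices can be checked mechanically. The following is a rough sketch of such a check, not an official linter: `lint_tool` and the `SNAKE_CASE` pattern are hypothetical, and the warnings are advisory only. The strict example also shows one way to include an example call in a description.

```python
import re
from typing import Any

# Lowercase words separated by single underscores, e.g. unit_converter.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def lint_tool(tool: dict[str, Any]) -> list[str]:
    """Return advisory warnings for one tool definition."""
    warnings = []
    name = tool.get("name", "")
    if not SNAKE_CASE.match(name):
        warnings.append(f"{name!r}: name is not snake_case")
    if not tool.get("description"):
        warnings.append(f"{name!r}: missing description")
    schema = tool.get("input_schema")
    if isinstance(schema, dict):
        # JSON Schema permits extra properties unless this is set to false.
        if schema.get("additionalProperties") is not False:
            warnings.append(f"{name!r}: input_schema allows additional properties")
        if schema.get("properties") and not schema.get("required"):
            warnings.append(f"{name!r}: input_schema marks no required fields")
    return warnings


strict_tool = {
    "name": "calculator",
    "description": "Evaluate a mathematical expression, e.g. expression='2 * (3 + 4)'",
    "input_schema": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
        "additionalProperties": False,
    },
}
loose_tool = {
    "name": "fetchURL",
    "input_schema": {"type": "object", "properties": {"url": {"type": "string"}}},
}

print(lint_tool(strict_tool))
for warning in lint_tool(loose_tool):
    print(warning)
```

The strict definition produces no warnings, while the loose one is flagged for its name, its missing description, and both schema issues.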