Tool Use Datasets
Tool Use Datasets
This guide covers creating training data for teaching models to use tools (function calling).
Defining Tools
Define tools in toolsets:
{
"toolsets": [
{
"id": "web-tools",
"tools": [
{
"name": "web_search",
"description": "Search the web for information",
"input_schema": {
"type": "object",
"properties": {
"query": { "type": "string", "description": "Search query" }
},
"required": ["query"]
}
},
{
"name": "get_weather",
"description": "Get current weather for a location",
"input_schema": {
"type": "object",
"properties": {
"location": { "type": "string" },
"units": { "type": "string", "enum": ["celsius", "fahrenheit"] }
},
"required": ["location"]
}
}
]
}
],
"defaults": {
"toolset_id": "web-tools"
}
}
Basic Tool Call
{
"id": "tool-001",
"messages": [
{
"role": "user",
"content": [{ "type": "text", "text": "What's the weather in Tokyo?" }]
},
{
"role": "assistant",
"content": [
{
"type": "tool_call",
"name": "get_weather",
"call_id": "weather-1",
"arguments": { "location": "Tokyo", "units": "celsius" }
}
]
},
{
"role": "tool",
"content": [
{
"type": "tool_result",
"name": "get_weather",
"call_id": "weather-1",
"result": { "temperature": 22, "condition": "sunny", "humidity": 65 }
}
]
},
{
"role": "assistant",
"content": [{ "type": "text", "text": "It's currently 22°C and sunny in Tokyo with 65% humidity." }]
}
]
}
Multiple Tool Calls
{
"id": "tool-002",
"messages": [
{
"role": "user",
"content": [{ "type": "text", "text": "Compare the weather in Tokyo and New York" }]
},
{
"role": "assistant",
"content": [
{
"type": "tool_call",
"name": "get_weather",
"call_id": "weather-tokyo",
"arguments": { "location": "Tokyo" }
},
{
"type": "tool_call",
"name": "get_weather",
"call_id": "weather-nyc",
"arguments": { "location": "New York" }
}
]
},
{
"role": "tool",
"content": [
{
"type": "tool_result",
"name": "get_weather",
"call_id": "weather-tokyo",
"result": { "temperature": 22, "condition": "sunny" }
},
{
"type": "tool_result",
"name": "get_weather",
"call_id": "weather-nyc",
"result": { "temperature": 15, "condition": "cloudy" }
}
]
},
{
"role": "assistant",
"content": [{ "type": "text", "text": "Tokyo is warmer at 22°C and sunny, while New York is 15°C and cloudy." }]
}
]
}
Sequential Tool Use
{
"id": "tool-003",
"messages": [
{
"role": "user",
"content": [{ "type": "text", "text": "Find recent news about AI and summarize the top story" }]
},
{
"role": "assistant",
"content": [
{
"type": "tool_call",
"name": "web_search",
"call_id": "search-1",
"arguments": { "query": "AI news today" }
}
]
},
{
"role": "tool",
"content": [
{
"type": "tool_result",
"name": "web_search",
"call_id": "search-1",
"result": {
"results": [
{ "title": "New AI Model Released", "url": "https://..." },
{ "title": "AI in Healthcare", "url": "https://..." }
]
}
}
]
},
{
"role": "assistant",
"content": [
{
"type": "tool_call",
"name": "fetch_url",
"call_id": "fetch-1",
"arguments": { "url": "https://..." }
}
]
},
{
"role": "tool",
"content": [
{
"type": "tool_result",
"name": "fetch_url",
"call_id": "fetch-1",
"result": { "content": "A new AI model was released today..." }
}
]
},
{
"role": "assistant",
"content": [{ "type": "text", "text": "The top AI news today is about a new model release..." }]
}
]
}
When NOT to Use Tools
Include examples where tools aren’t needed:
{
"id": "no-tool-001",
"messages": [
{
"role": "user",
"content": [{ "type": "text", "text": "What is photosynthesis?" }]
},
{
"role": "assistant",
"content": [{ "type": "text", "text": "Photosynthesis is the process by which plants convert sunlight into energy..." }]
}
]
}
Tool Errors
Include examples of handling tool errors:
{
"id": "tool-error-001",
"messages": [
{
"role": "user",
"content": [{ "type": "text", "text": "Get weather for Atlantis" }]
},
{
"role": "assistant",
"content": [
{
"type": "tool_call",
"name": "get_weather",
"call_id": "weather-err",
"arguments": { "location": "Atlantis" }
}
]
},
{
"role": "tool",
"content": [
{
"type": "tool_result",
"name": "get_weather",
"call_id": "weather-err",
"result": { "error": "Location not found" }
}
]
},
{
"role": "assistant",
"content": [{ "type": "text", "text": "I couldn't find weather data for Atlantis. This might be a fictional or unrecognized location. Could you provide a real city name?" }]
}
]
}
Best Practices
- Unique call_ids - Each tool call needs a unique identifier
- Match call_ids - Tool results must reference the correct call_id
- Vary complexity - Include simple and complex tool interactions
- Include failures - Teach graceful error handling
- Balance with non-tool examples - Not every query needs tools
- Realistic arguments - Use plausible input values