Content Parts
Message content is an array of content parts. Each part has a type and type-specific fields.
Content Part Types
| Type | Description | Required Fields |
|---|---|---|
| text | Plain text | text |
| reasoning | Chain-of-thought | text |
| json | Structured data | data |
| image | Image reference | ref |
| audio | Audio reference | ref |
| video | Video reference | ref |
| document | Document reference | ref |
| tool_call | Tool invocation | name, call_id, arguments |
| tool_result | Tool output | name, call_id, result |
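The required-fields column lends itself to a simple lookup-based check. The following is a minimal sketch, not part of the format itself: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and rejecting unknown types is an assumption about how a validator might behave.

```python
# Required fields per content part type, copied from the table above.
REQUIRED_FIELDS = {
    "text": ["text"],
    "reasoning": ["text"],
    "json": ["data"],
    "image": ["ref"],
    "audio": ["ref"],
    "video": ["ref"],
    "document": ["ref"],
    "tool_call": ["name", "call_id", "arguments"],
    "tool_result": ["name", "call_id", "result"],
}


def missing_fields(part):
    """Return the required fields that `part` lacks (hypothetical helper)."""
    part_type = part.get("type")
    if part_type not in REQUIRED_FIELDS:
        # Assumption: an unrecognized type is treated as an error.
        raise ValueError(f"unknown content part type: {part_type!r}")
    return [name for name in REQUIRED_FIELDS[part_type] if name not in part]


print(missing_fields({"type": "text", "text": "Hello, world!"}))    # []
print(missing_fields({"type": "tool_call", "name": "calculator"}))  # ['call_id', 'arguments']
```

Returning the list of missing names, rather than a boolean, makes it easier to report exactly what a malformed part lacks.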
Text
Plain text content:
{ "type": "text", "text": "Hello, world!" }
Reasoning
Chain-of-thought or scratchpad content. To control how reasoning parts are handled during training, set reasoning_policy in defaults. Example:
{
  "type": "reasoning",
  "text": "Let me think through this step by step. First, I need to..."
}
JSON
Structured data output:
{
  "type": "json",
  "data": {
    "name": "John",
    "age": 30,
    "city": "New York"
  }
}
Media Types
Images, audio, video, and documents reference external files:
{
  "type": "image",
  "ref": { "asset_id": "img-001" },
  "mime_type": "image/jpeg"
}
Or with a direct URI:
{
  "type": "audio",
  "ref": { "uri": "s3://bucket/audio/clip.wav" },
  "mime_type": "audio/wav",
  "sha256": "abc123...",
  "bytes": 123456
}
Reference Options
| Field | Description |
|---|---|
| ref.asset_id | Reference to assets.jsonl entry (preferred) |
| ref.uri | Direct URI to the file |
| mime_type | MIME type of the content |
| sha256 | SHA-256 hash for verification |
| bytes | File size in bytes |
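As an illustration of how the sha256 and bytes fields could be used for verification, here is a minimal sketch. It assumes that sha256 holds a lowercase hex digest and that a ref carries at least one of asset_id or uri; neither detail is stated above, and `check_media_part` is a hypothetical helper.

```python
import hashlib


def check_media_part(part, payload):
    """Compare raw file bytes against a media part's optional integrity fields."""
    problems = []
    ref = part.get("ref", {})
    # Assumption: a usable ref names the file by asset_id or by uri.
    if "asset_id" not in ref and "uri" not in ref:
        problems.append("ref needs asset_id or uri")
    if "bytes" in part and part["bytes"] != len(payload):
        problems.append("size mismatch")
    # Assumption: sha256 is stored as a lowercase hex digest.
    if "sha256" in part and part["sha256"] != hashlib.sha256(payload).hexdigest():
        problems.append("sha256 mismatch")
    return problems


payload = b"example audio bytes"
part = {
    "type": "audio",
    "ref": {"uri": "s3://bucket/audio/clip.wav"},
    "mime_type": "audio/wav",
    "sha256": hashlib.sha256(payload).hexdigest(),
    "bytes": len(payload),
}
print(check_media_part(part, payload))      # []
print(check_media_part(part, b"tampered"))  # ['size mismatch', 'sha256 mismatch']
```

Fetching the payload (from assets.jsonl or the URI) is left out on purpose, since the retrieval mechanism is not described here.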
Tool Call
Invoke a tool:
{
  "type": "tool_call",
  "name": "calculator",
  "call_id": "calc-001",
  "arguments": { "expression": "2 + 2" }
}
| Field | Required | Description |
|---|---|---|
| name | Yes | Tool name (must match toolset) |
| call_id | Yes | Unique identifier for this call |
| arguments | Yes | Tool input arguments |
Tool Result
Return the output of a tool execution:
{
  "type": "tool_result",
  "name": "calculator",
  "call_id": "calc-001",
  "result": { "value": 4 }
}
| Field | Required | Description |
|---|---|---|
| name | Yes | Tool name |
| call_id | Yes | Matches the tool_call |
| result | Yes | Tool output |
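Because a tool_result is tied to its tool_call through call_id, a consumer can detect calls that never received a result. The sketch below assumes all parts of a conversation have been collected into one flat list; `unmatched_call_ids` and the `calc-002` call are hypothetical, added only for illustration.

```python
def unmatched_call_ids(parts):
    """Return call_ids of tool_call parts that have no tool_result with the same call_id."""
    result_ids = {p["call_id"] for p in parts if p.get("type") == "tool_result"}
    return [
        p["call_id"]
        for p in parts
        if p.get("type") == "tool_call" and p["call_id"] not in result_ids
    ]


parts = [
    {"type": "tool_call", "name": "calculator", "call_id": "calc-001",
     "arguments": {"expression": "2 + 2"}},
    {"type": "tool_result", "name": "calculator", "call_id": "calc-001",
     "result": {"value": 4}},
    # Hypothetical second call with no result yet.
    {"type": "tool_call", "name": "calculator", "call_id": "calc-002",
     "arguments": {"expression": "3 * 3"}},
]
print(unmatched_call_ids(parts))  # ['calc-002']
```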
Mixed Content
Messages can contain multiple content parts:
{
  "role": "user",
  "content": [
    { "type": "text", "text": "Describe this image:" },
    { "type": "image", "ref": { "asset_id": "img-001" } }
  ]
}
{
  "role": "assistant",
  "content": [
    { "type": "reasoning", "text": "I should analyze the visual elements..." },
    { "type": "text", "text": "The image shows a sunset over the ocean." }
  ]
}
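When a message mixes part types, a consumer usually filters by type. As a minimal sketch, the hypothetical helper below collects only the text parts of a message; treating reasoning parts as not user-visible is an assumption made for this example, not a rule stated by the format.

```python
def visible_text(message):
    """Join the text of all `text` parts, skipping reasoning, media, and tool parts."""
    return "\n".join(
        part["text"] for part in message["content"] if part.get("type") == "text"
    )


assistant_message = {
    "role": "assistant",
    "content": [
        {"type": "reasoning", "text": "I should analyze the visual elements..."},
        {"type": "text", "text": "The image shows a sunset over the ocean."},
    ],
}
print(visible_text(assistant_message))  # The image shows a sunset over the ocean.
```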
Metadata
Any content part can include metadata:
{
  "type": "text",
  "text": "Hello!",
  "metadata": {
    "source": "human",
    "confidence": 0.95
  }
}