Preference Training Example
Preference Training Example
This example shows a preference training dataset for DPO or RLHF.
File Structure
preference-dataset/
metadata.json
data/
pref-train.jsonl
metadata.json
{
"schema_version": "llm-training-data/v1",
"dataset": {
"name": "preference-training-example",
"description": "Example dataset for DPO/preference training"
},
"defaults": {
"system": [
{ "type": "text", "text": "You are a helpful, harmless, and honest assistant." }
]
},
"files": [
{
"split": "train",
"objective": "preference",
"shards": [
{ "path": "data/pref-train.jsonl" }
]
}
]
}
data/pref-train.jsonl
{"id":"pref-001","messages":[{"role":"user","content":[{"type":"text","text":"How do I make a good first impression?"}]}],"candidates":[{"label":"chosen","messages":[{"role":"assistant","content":[{"type":"text","text":"Here are some tips for making a good first impression:\n\n1. **Smile and make eye contact** - Shows confidence and friendliness\n2. **Listen actively** - Show genuine interest in others\n3. **Use a firm handshake** - Conveys confidence\n4. **Remember names** - People appreciate being remembered\n5. **Be positive** - Avoid complaining or negativity\n\nFirst impressions form quickly, so be mindful of your body language and tone."}]}]},{"label":"rejected","messages":[{"role":"assistant","content":[{"type":"text","text":"Just be yourself I guess. Smile or whatever."}]}]}]}
{"id":"pref-002","messages":[{"role":"user","content":[{"type":"text","text":"Explain blockchain technology."}]}],"candidates":[{"label":"chosen","messages":[{"role":"assistant","content":[{"type":"text","text":"Blockchain is a distributed digital ledger technology. Here's how it works:\n\n**Core concepts:**\n- **Blocks**: Groups of transactions bundled together\n- **Chain**: Blocks linked cryptographically in sequence\n- **Distributed**: Copies exist across many computers\n- **Immutable**: Once recorded, data is very hard to change\n\n**Key benefits:**\n- Transparency - Anyone can verify transactions\n- Security - Cryptographic protection\n- Decentralization - No single point of failure\n\nBlockchain powers cryptocurrencies like Bitcoin, but is also used for supply chain tracking, voting systems, and more."}]}]},{"label":"rejected","messages":[{"role":"assistant","content":[{"type":"text","text":"Blockchain is what Bitcoin uses. It's like a database but different. Very technical stuff."}]}]}]}
Pretty-Printed Record
{
"id": "pref-001",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "How do I make a good first impression?" }
]
}
],
"candidates": [
{
"label": "chosen",
"messages": [
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "Here are some tips for making a good first impression:\n\n1. **Smile and make eye contact**..."
}
]
}
]
},
{
"label": "rejected",
"messages": [
{
"role": "assistant",
"content": [
{ "type": "text", "text": "Just be yourself I guess. Smile or whatever." }
]
}
]
}
]
}
Multi-Turn Context
Preference with conversation history:
{
"id": "pref-003",
"messages": [
{
"role": "user",
"content": [{ "type": "text", "text": "I'm planning a trip to Japan." }]
},
{
"role": "assistant",
"content": [{ "type": "text", "text": "That sounds exciting! When are you planning to go?" }]
},
{
"role": "user",
"content": [{ "type": "text", "text": "In April. What should I see?" }]
}
],
"candidates": [
{
"label": "chosen",
"messages": [
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "April is perfect for cherry blossom season! Here are must-see destinations:\n\n**Tokyo**: Ueno Park, Shinjuku Gyoen for sakura viewing\n**Kyoto**: Philosopher's Path, Maruyama Park\n**Osaka**: Osaka Castle Park\n\nBook accommodations early as it's peak season. Would you like specific recommendations for any city?"
}
]
}
]
},
{
"label": "rejected",
"messages": [
{
"role": "assistant",
"content": [{ "type": "text", "text": "Tokyo and Kyoto are popular." }]
}
]
}
]
}
Ranked Preferences
With multiple candidates and rankings:
{
"id": "pref-ranked",
"messages": [
{
"role": "user",
"content": [{ "type": "text", "text": "Write a short poem about the ocean." }]
}
],
"candidates": [
{
"label": "chosen",
"rank": 1,
"score": 0.95,
"messages": [
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "Waves dance beneath the silver moon,\nSalt air carries an ancient tune,\nDepths hold secrets yet untold,\nStories of sailors, brave and bold."
}
]
}
]
},
{
"label": "neutral",
"rank": 2,
"score": 0.70,
"messages": [
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "The ocean is blue and wide,\nWith many fish inside,\nWaves crash on the shore,\nI'd like to see some more."
}
]
}
]
},
{
"label": "rejected",
"rank": 3,
"score": 0.30,
"messages": [
{
"role": "assistant",
"content": [{ "type": "text", "text": "Ocean poem: water, fish, waves, blue. The end." }]
}
]
}
]
}
Requirements
- At least one candidate with
label: "chosen" - At least one candidate with
label: "rejected" - Minimum 2 candidates total