Sample Packages

These are small but realistic training-data packages located at:

public/samples/llm-training-data/

They are served on the site under:

/samples/llm-training-data/

Each sample includes a metadata.json plus one or more JSONL shard files under data/. Copy a folder, edit the metadata, and replace the JSONL records with your own.

Sample Index

Telco Support SFT

Path: public/samples/llm-training-data/telco-support-sft/

Public files:

metadata.json

{
  "schema_version": "llm-training-data/v1",
  "dataset": {
    "name": "telco-support-sft",
    "dataset_id": "telco-support-sft-v1",
    "description": "Anonymized broadband troubleshooting conversations",
    "license": "Proprietary",
    "contact": "data@telco.meetkai.ai",
    "tags": ["telecom", "customer-support", "broadband"]
  },
  "defaults": {
    "system": [{ "type": "text", "text": "You are a helpful telecom support assistant." }]
  },
  "files": [
    {
      "split": "train",
      "objective": "sft",
      "shards": [
        { "path": "data/train-00000.jsonl", "record_count": 1 },
        { "path": "data/train-00001.jsonl", "record_count": 1 }
      ]
    }
  ]
}

data/train-00000.jsonl

{"id":"telco-sft-0001","messages":[{"role":"user","content":[{"type":"text","text":"My modem shows a red LOS light. What should I do?"}]},{"role":"assistant","content":[{"type":"text","text":"A red LOS light usually means loss of signal. Check that the fiber or coax cable is firmly connected, then power-cycle the modem for 30 seconds. If the light stays red, there may be an outage and support can run a line test."}]}]}

data/train-00001.jsonl

{"id":"telco-sft-0002","messages":[{"role":"user","content":[{"type":"text","text":"Wi-Fi drops every evening around 8 PM. Any quick checks?"}]},{"role":"assistant","content":[{"type":"text","text":"Evening drops can be congestion or interference. Restart the router, place it higher and away from appliances, and run a speed test at 8 PM. If speeds are low only at peak hours, support can check for area congestion or upgrade options."}]}]}

Preference Ranking

Path: public/samples/llm-training-data/preference-ranking/

Public files:

metadata.json

{
  "schema_version": "llm-training-data/v1",
  "dataset": {
    "name": "support-preference-ranking",
    "dataset_id": "support-preference-ranking-v1",
    "description": "Preference pairs for broadband support responses",
    "license": "Proprietary",
    "contact": "data@meetkai.ai",
    "tags": ["telecom", "preference", "customer-support"]
  },
  "files": [
    {
      "split": "train",
      "objective": "preference",
      "shards": [{ "path": "data/train.jsonl", "record_count": 2 }]
    }
  ]
}

data/train.jsonl

{"id":"pref-0001","messages":[{"role":"user","content":[{"type":"text","text":"My speed test is 20 Mbps but I pay for 100 Mbps. What should I check?"}]}],"candidates":[{"candidate_id":"pref-0001-a","label":"chosen","messages":[{"role":"assistant","content":[{"type":"text","text":"Try these steps: 1) connect via Ethernet to rule out Wi-Fi, 2) reboot the modem and router, 3) run a speed test during off-peak hours. If wired speeds are still low, support can check the line and provisioned profile."}]}]},{"candidate_id":"pref-0001-b","label":"rejected","messages":[{"role":"assistant","content":[{"type":"text","text":"Speeds vary, there is nothing you can do."}]}]}]}
{"id":"pref-0002","messages":[{"role":"user","content":[{"type":"text","text":"Why did I get a late payment fee on my bill?"}]}],"candidates":[{"candidate_id":"pref-0002-a","label":"chosen","messages":[{"role":"assistant","content":[{"type":"text","text":"Late fees apply when payment is received after the due date listed on your bill. I can help you review the due date and payment history, and we can check if a courtesy waiver is available."}]}]},{"candidate_id":"pref-0002-b","label":"rejected","messages":[{"role":"assistant","content":[{"type":"text","text":"Because you paid late. Check your bill."}]}]}]}

CPT Knowledge Base

Path: public/samples/llm-training-data/cpt-knowledgebase/

Public files:

metadata.json

{
  "schema_version": "llm-training-data/v1",
  "dataset": {
    "name": "telco-kb-cpt",
    "dataset_id": "telco-kb-cpt-v1",
    "description": "Knowledge base and runbook articles for network operations",
    "license": "Proprietary",
    "contact": "data@meetkai.ai",
    "tags": ["telecom", "knowledge-base", "cpt"]
  },
  "files": [
    {
      "split": "train",
      "objective": "cpt",
      "shards": [{ "path": "data/train.jsonl", "record_count": 2 }]
    }
  ]
}

data/train.jsonl

{"id":"cpt-0001","messages":[{"role":"document","content":[{"type":"text","text":"ONT Installation Checklist\\n\\n1. Verify optical signal levels are within spec.\\n2. Label the drop fiber and ONT port.\\n3. Confirm power and link lights are stable for 5 minutes.\\n4. Run a provisioning sync and record the device serial."}]}]}
{"id":"cpt-0002","messages":[{"role":"document","content":[{"type":"text","text":"Outage Triage Playbook\\n\\n- Confirm the scope: single account vs. neighborhood.\\n- Check upstream alarms and planned maintenance windows.\\n- Validate recent config changes and rollback if needed.\\n- Notify customers if the estimated time to restore exceeds 30 minutes."}]}]}

Tool Use Support

Path: public/samples/llm-training-data/tool-use-support/

Public files:

metadata.json

{
  "schema_version": "llm-training-data/v1",
  "dataset": {
    "name": "tool-use-support-sft",
    "dataset_id": "tool-use-support-sft-v1",
    "description": "Support conversations with tool calls for outage and plan lookup",
    "license": "Proprietary",
    "contact": "data@meetkai.ai",
    "tags": ["telecom", "tool-use", "customer-support"]
  },
  "toolsets": [
    {
      "id": "support-tools",
      "tools": [
        {
          "name": "lookup_outage",
          "description": "Check outage status for a customer account",
          "input_schema": {
            "type": "object",
            "properties": { "account_id": { "type": "string" } },
            "required": ["account_id"]
          }
        },
        {
          "name": "get_plan_details",
          "description": "Fetch plan details by plan identifier",
          "input_schema": {
            "type": "object",
            "properties": { "plan_id": { "type": "string" } },
            "required": ["plan_id"]
          }
        }
      ]
    }
  ],
  "defaults": {
    "toolset_id": "support-tools"
  },
  "files": [
    {
      "split": "train",
      "objective": "sft",
      "shards": [{ "path": "data/train.jsonl", "record_count": 2 }]
    }
  ]
}

data/train.jsonl

{"id":"tool-0001","messages":[{"role":"user","content":[{"type":"text","text":"Is there an outage on my line? My account is 51234."}]},{"role":"assistant","content":[{"type":"tool_call","name":"lookup_outage","call_id":"outage-1","arguments":{"account_id":"51234"}}]},{"role":"tool","content":[{"type":"tool_result","name":"lookup_outage","call_id":"outage-1","result":{"status":"none","last_checked":"2025-01-10T18:42:00Z"}}]},{"role":"assistant","content":[{"type":"text","text":"I do not see an active outage on your line. If you are still offline, we can run a line test or schedule a technician."}]}]}
{"id":"tool-0002","messages":[{"role":"user","content":[{"type":"text","text":"What upload speed does the Ultra 300 plan include?"}]},{"role":"assistant","content":[{"type":"tool_call","name":"get_plan_details","call_id":"plan-1","arguments":{"plan_id":"ultra-300"}}]},{"role":"tool","content":[{"type":"tool_result","name":"get_plan_details","call_id":"plan-1","result":{"plan_id":"ultra-300","download_mbps":300,"upload_mbps":20}}]},{"role":"assistant","content":[{"type":"text","text":"The Ultra 300 plan includes up to 300 Mbps download and 20 Mbps upload."}]}]}

Multimodal SFT

Path: public/samples/llm-training-data/multimodal-sft/

Public files:

assets.jsonl

{"asset_id":"img-router-001","uri":"https://datapass.meetkai.ai/assets/router-front.jpg","mime_type":"image/jpeg"}
{"asset_id":"img-cable-001","uri":"https://datapass.meetkai.ai/assets/ethernet-port.jpg","mime_type":"image/jpeg"}

data/train.jsonl

{"id":"mm-0001","messages":[{"role":"user","content":[{"type":"text","text":"Which port should the Ethernet cable go into on this router?"},{"type":"image","ref":{"asset_id":"img-router-001"}}]},{"role":"assistant","content":[{"type":"text","text":"Use the yellow LAN port labeled 1. Plug the Ethernet cable firmly until it clicks."},{"type":"image","ref":{"asset_id":"img-cable-001"}}]}]}

RFT Grading

Path: public/samples/llm-training-data/rft-grading/

Public files:

metadata.json

{
  "schema_version": "llm-training-data/v1",
  "dataset": {
    "name": "network-troubleshooting-rft",
    "dataset_id": "network-troubleshooting-rft-v1",
    "description": "RFT records with grading rubrics for troubleshooting steps",
    "license": "Proprietary",
    "contact": "data@meetkai.ai",
    "tags": ["telecom", "rft", "troubleshooting"]
  },
  "files": [
    {
      "split": "train",
      "objective": "rft",
      "shards": [{ "path": "data/train.jsonl", "record_count": 2 }]
    }
  ]
}

data/train.jsonl

{"id":"rft-0001","messages":[{"role":"user","content":[{"type":"text","text":"What port is used for HTTPS traffic?"}]},{"role":"assistant","content":[{"type":"text","text":"HTTPS uses port 443."}]}],"reference":{"grading":{"type":"exact_match"},"answer":"443"}}
{"id":"rft-0002","messages":[{"role":"user","content":[{"type":"text","text":"List three steps to isolate Wi-Fi interference in an apartment."}]},{"role":"assistant","content":[{"type":"text","text":"1) Move the router away from microwaves and thick walls. 2) Switch to 5 GHz or change channels to reduce overlap. 3) Run a site survey or Wi-Fi analyzer to identify crowded channels."}]}],"reference":{"grading":{"type":"rubric"},"rubric":{"must_have":["change channels or band","move router placement","check interference"],"nice_to_have":["use analyzer"],"must_not_have":["disable security"]}}}

Data Card References

If you want to pair training data with a Data Card, these registry examples are simple and easy to follow:

If you are new to the format, start with the Quickstart.