Minimal Example
This example demonstrates the minimum required fields for a valid LLM Data Card. It’s ideal for simple datasets that don’t need extensive metadata.
When to Use This Pattern
Use this minimal structure when:
- Your dataset is straightforward and publicly available
- You want to quickly publish a data card
- Your dataset has no personal data or complex licensing
Complete Example
{
  "schema_version": "llm-datacard/v1.0",
  "core": {
    "id": "example-minimal",
    "version": "1.0.0",
    "title": "Example Minimal Dataset",
    "summary": "A minimal example dataset demonstrating the required fields of the LLM Data Card schema.",
    "maintainer": "DataPass Team",
    "contact": "data@meetkai.ai"
  },
  "data": {
    "kind": "real",
    "modalities": ["text"],
    "languages": ["en"],
    "size": {
      "examples": 1000
    },
    "domains": ["general"],
    "record_format": "json-structured"
  },
  "rights": {
    "license": "CC0-1.0",
    "allows_commercial_use": true,
    "contains_personal_data": "none"
  },
  "provenance": {
    "source_types": ["official-open-data"]
  },
  "access": {
    "availability": "public-download",
    "url": "https://datapass.meetkai.ai/registry/example-minimal/1.0.0"
  }
}
Section Breakdown
Core Section
Every data card must include these core identifiers:
| Field | Purpose | Example Value |
|---|---|---|
| id | Unique dataset identifier | example-minimal |
| version | Semantic version | 1.0.0 |
| title | Human-readable name | Example Minimal Dataset |
| summary | Brief description | 1-3 sentences |
| maintainer | Organization or person | DataPass Team |
| contact | Email for inquiries | data@meetkai.ai |
Data Section
Describes what the dataset contains:
| Field | Purpose | This Example |
|---|---|---|
| kind | Real, synthetic, or hybrid | real |
| modalities | Types of content | ["text"] |
| languages | BCP-47 language codes | ["en"] |
| size.examples | Number of records | 1000 |
| domains | Subject areas | ["general"] |
| record_format | Record format | json-structured |
Rights Section
Specifies licensing and data privacy:
| Field | Purpose | This Example |
|---|---|---|
| license | SPDX license identifier | CC0-1.0 |
| allows_commercial_use | Commercial usage allowed | true |
| contains_personal_data | Privacy level | none |
Provenance Section
Documents data origins:
| Field | Purpose | This Example |
|---|---|---|
| source_types | How data was collected | ["official-open-data"] |
Access Section
Tells users how to get the data:
| Field | Purpose | This Example |
|---|---|---|
| availability | Access method | public-download |
| url | Download location | Direct URL |
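
As a rough illustration, the fields listed across the five sections above can be turned into a simple presence check. The sketch below is plain Python with no dependencies. `REQUIRED_FIELDS` and `missing_required` are hypothetical names, and treating the fields on this page as the complete required set is an assumption, not something taken from the schema itself. It checks that fields are present only, not their types or allowed values.

```python
# Field names are copied from the section tables on this page.
# Assumption: this set is the full list of required fields.
REQUIRED_FIELDS = {
    "core": ["id", "version", "title", "summary", "maintainer", "contact"],
    "data": ["kind", "modalities", "languages", "size", "domains", "record_format"],
    "rights": ["license", "allows_commercial_use", "contains_personal_data"],
    "provenance": ["source_types"],
    "access": ["availability", "url"],
}


def missing_required(card: dict) -> list:
    """Return dotted paths of listed fields that are absent from the card."""
    missing = [] if "schema_version" in card else ["schema_version"]

    for section, fields in REQUIRED_FIELDS.items():
        body = card.get(section)
        if not isinstance(body, dict):
            # The whole section is missing or is not an object.
            missing.append(section)
            continue
        for name in fields:
            if name not in body:
                missing.append(f"{section}.{name}")

    # size.examples is the one nested field the tables list.
    data = card.get("data")
    size = data.get("size") if isinstance(data, dict) else None
    if isinstance(size, dict) and "examples" not in size:
        missing.append("data.size.examples")

    return missing
```

Passing the complete example from this page (loaded with `json.load`) should yield an empty list. Removing a field such as `rights.license` should make that path appear in the result. This is not a substitute for validating against the published schema.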
Optional Enhancements
Consider adding these fields as your dataset matures:
- core.doi - Digital Object Identifier for citation
- core.preferred_citation - How to cite the dataset
- data.size.tokens - Token count for text datasets
- provenance.collection_start_date / collection_end_date - Collection timeframe
- use.intended_uses - Recommended applications
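
One way to track which of these enhancements a card has not adopted yet is to look up each dotted path in the card. The following Python sketch does that. `OPTIONAL_FIELDS`, `has_path`, and `unused_optional_fields` are hypothetical names, and reading `collection_end_date` as a sibling of `collection_start_date` under `provenance` is an assumption based on how the list is written.

```python
# Dotted paths taken from the list of optional enhancements.
OPTIONAL_FIELDS = [
    "core.doi",
    "core.preferred_citation",
    "data.size.tokens",
    "provenance.collection_start_date",
    "provenance.collection_end_date",  # assumed to sit under provenance
    "use.intended_uses",
]


def has_path(card: dict, dotted_path: str) -> bool:
    """Return True if every key along the dotted path exists in the card."""
    node = card
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def unused_optional_fields(card: dict) -> list:
    """Return the optional paths that the card does not define yet."""
    return [path for path in OPTIONAL_FIELDS if not has_path(card, path)]
```

For the minimal example on this page, every path in `OPTIONAL_FIELDS` should be reported, since the card defines none of them.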
Try It
- Open in Builder - Create your own minimal card
- Validate this example - Check schema compliance
- View in Registry - See the full registry entry