Documentation

Quickstart

Create your first data card in minutes. Learn the required fields and basic structure.

Field Reference

Detailed documentation for every field in the schema, organized by section.

Validation Rules

Understand the conditional validation rules and how to satisfy them.

Examples

Real-world examples of data cards for different types of datasets.

Schema Overview

The LLM Data Card v1.0 schema has 5 required sections and 11 optional sections:

Required Sections

  • core - Dataset identity and maintainer info
  • data - Data contents, modalities, languages, size
  • rights - Licensing and personal data status
  • provenance - Where the data came from
  • access - How to obtain the dataset
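
As a minimal sketch of how the required sections could be checked, the snippet below assumes a data card has been loaded into a Python dictionary whose top-level keys are the section names listed above. That layout, the helper name missing_required_sections, and the sample draft_card are illustrative assumptions, not part of the schema; the fields inside each section are left empty because they are documented in the field reference, not here.

```python
# Required section names, taken from the list above.
# The dict-based card layout is an assumption made for illustration.
REQUIRED_SECTIONS = ("core", "data", "rights", "provenance", "access")


def missing_required_sections(card: dict) -> list[str]:
    """Return the required section names absent from a card, in list order."""
    return [name for name in REQUIRED_SECTIONS if name not in card]


# Hypothetical, deliberately incomplete card. Section bodies are empty
# placeholders; real cards carry fields inside each section.
draft_card = {"core": {}, "data": {}, "rights": {}}

print(missing_required_sections(draft_card))  # ['provenance', 'access']
```

A check like this only confirms that the sections exist. It says nothing about the fields inside them or about the conditional validation rules.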

Optional Sections

  • artifacts - File pointers with checksums for reproducibility
  • processing - Normalization, filtering, deduplication steps
  • quality - Quality measurements and known issues
  • synthetic - Details about synthetic data generation
  • use - Intended and out-of-scope uses
  • governance - Review status and documentation
  • safety - Content risk assessment
  • community - Local/community involvement
  • sources - Per-source breakdown
  • stats - Numeric statistics
  • extensions - Custom fields
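
To get a quick overview of which sections a card uses, the top-level keys can be sorted into required, optional, and unrecognized groups. This is a rough sketch under the same assumption that a card is a dictionary keyed by section name. The function classify_sections and the sample card are hypothetical, and whether the schema itself rejects unrecognized top-level keys is not stated here.

```python
# Section names taken from the two lists above.
REQUIRED = {"core", "data", "rights", "provenance", "access"}
OPTIONAL = {
    "artifacts", "processing", "quality", "synthetic", "use", "governance",
    "safety", "community", "sources", "stats", "extensions",
}


def classify_sections(card: dict) -> dict[str, list[str]]:
    """Group a card's top-level keys by the kind of section they name."""
    keys = set(card)
    return {
        "required_present": sorted(keys & REQUIRED),
        "optional_present": sorted(keys & OPTIONAL),
        "unrecognized": sorted(keys - REQUIRED - OPTIONAL),
    }


# Hypothetical card with empty section bodies and one misspelled key.
sample_card = {"core": {}, "data": {}, "quality": {}, "provenence": {}}

summary = classify_sections(sample_card)
print(summary["unrecognized"])  # ['provenence']
```

Flagging unrecognized keys is mainly useful for catching misspellings such as the one in the sample; custom fields belong in the extensions section.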