Restricted Access Dataset
This example demonstrates how to document a dataset with restricted access, including request instructions and terms of use.
When to Use This Pattern
Use this structure when:
- Your dataset requires approval before access
- You have specific terms or agreements users must accept
- The data contains sensitive information requiring controlled distribution
Complete Example
{
  "schema_version": "llm-datacard/v1.0",
  "core": {
    "id": "medical-imaging-corpus",
    "version": "2.0.0",
    "title": "Medical Imaging Research Corpus",
    "summary": "A curated collection of anonymized medical imaging data for AI research, including CT scans and MRI images with diagnostic annotations from board-certified radiologists.",
    "maintainer": "Healthcare AI Institute",
    "contact": "data-requests@healthcareai.org",
    "doi": "10.5281/zenodo.example123"
  },
  "data": {
    "kind": "real",
    "modalities": ["image"],
    "languages": ["en"],
    "size": {
      "examples": 50000,
      "images": 50000,
      "bytes": 524288000000
    },
    "domains": ["medical"],
    "structures": ["classification-examples"],
    "task_types": ["supervised-finetuning"],
    "record_format": "other",
    "record_format_notes": "DICOM format with accompanying JSON metadata files",
    "has_human_annotations": true,
    "label_types": ["classification-labels", "bounding-boxes"]
  },
  "rights": {
    "license": "custom",
    "license_url": "https://healthcareai.org/data-license",
    "attribution_required": true,
    "allows_commercial_use": false,
    "contains_personal_data": "pseudonymous",
    "consent_mechanism": "All data collected under IRB-approved protocols with patient informed consent. Data has been de-identified per HIPAA Safe Harbor guidelines.",
    "restricted_uses": [
      "Re-identification of patients",
      "Commercial diagnostic applications without regulatory approval",
      "Training models for clinical deployment without additional validation"
    ]
  },
  "provenance": {
    "source_types": ["partner-license"],
    "geography": { "scope": "multi-regional", "regions": ["North America", "Europe"] },
    "collection_start_date": "2018-01-01",
    "collection_end_date": "2023-12-31",
    "collection_notes": "Images collected from partner hospitals under data sharing agreements. All annotations performed by board-certified radiologists with minimum 5 years experience."
  },
  "access": {
    "availability": "on-request",
    "terms_url": "https://healthcareai.org/data-use-agreement",
    "request_instructions": "1. Complete the online application at healthcareai.org/apply\n2. Provide institutional affiliation and IRB approval (if applicable)\n3. Sign the Data Use Agreement\n4. Allow 2-4 weeks for review\n\nApproved researchers receive secure download credentials valid for 90 days."
  },
  "use": {
    "intended_uses": [
      "Medical imaging AI research",
      "Algorithm development and benchmarking",
      "Educational purposes in radiology training"
    ],
    "out_of_scope_uses": [
      "Direct clinical diagnosis without physician oversight",
      "Commercial products without separate licensing",
      "Any attempt to re-identify patients"
    ]
  },
  "governance": {
    "review_status": "audited",
    "last_reviewed": "2024-06-15",
    "documentation_url": "https://healthcareai.org/corpus-documentation"
  },
  "safety": {
    "content_risk_level": "medium",
    "known_risky_categories": ["personal-information"],
    "mitigations": "All images de-identified using HIPAA Safe Harbor method. Facial features removed from head imaging. Metadata scrubbed of identifying information. Annual re-audit of de-identification procedures."
  }
}
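Because approved researchers get time-limited download credentials, it can help to sanity-check the size fields before publishing the card. The following is a minimal sketch of that arithmetic in Python, assuming that "bytes" is the total size of the dataset and that "images" counts individual image records; neither assumption is stated by the schema excerpt shown here.

```python
# Values copied from the "size" block of the example card.
size = {"examples": 50000, "images": 50000, "bytes": 524288000000}

total_gb = size["bytes"] / 10**9    # decimal gigabytes
total_gib = size["bytes"] / 2**30   # binary gibibytes
avg_mib = size["bytes"] / size["images"] / 2**20  # average per image, in MiB

print(f"total: {total_gb:.1f} GB ({total_gib:.1f} GiB)")  # total: 524.3 GB (488.3 GiB)
print(f"average image: {avg_mib:.1f} MiB")                # average image: 10.0 MiB
```

Under those assumptions the example card describes roughly half a terabyte, averaging 10 MiB per image.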
Access Configuration
On-Request Availability
For datasets requiring approval:
"access": {
  "availability": "on-request",
  "terms_url": "https://datapass.meetkai.ai/legal/terms",
  "request_instructions": "Detailed instructions..."
}
Access Types Compared
| Availability | Use Case | Required Fields |
|---|---|---|
| public-download | Open data | url |
| restricted | Approved users only | request_instructions or url |
| on-request | Case-by-case approval | request_instructions or url |
| not-available | Cannot be accessed | not_available_reason |
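The table above can be read as a small set of rules. The sketch below encodes them in Python, assuming that the fields named in the "Required Fields" column live inside the access block. The helper check_access is hypothetical and written for illustration only; it is not the schema's official validator.

```python
# Required fields per availability value, transcribed from the table above.
# Each inner tuple lists alternatives: at least one must be present and non-empty.
REQUIRED_ACCESS_FIELDS = {
    "public-download": [("url",)],
    "restricted": [("request_instructions", "url")],
    "on-request": [("request_instructions", "url")],
    "not-available": [("not_available_reason",)],
}


def check_access(access: dict) -> list[str]:
    """Return a list of problems found in an access block (empty if none)."""
    availability = access.get("availability")
    if availability not in REQUIRED_ACCESS_FIELDS:
        return [f"unknown availability: {availability!r}"]

    problems = []
    for alternatives in REQUIRED_ACCESS_FIELDS[availability]:
        if not any(access.get(field) for field in alternatives):
            problems.append(
                f"{availability} requires one of: {', '.join(alternatives)}"
            )
    return problems


if __name__ == "__main__":
    access = {
        "availability": "on-request",
        "terms_url": "https://healthcareai.org/data-use-agreement",
        "request_instructions": "1. Complete the online application...",
    }
    print(check_access(access))  # []
    print(check_access({"availability": "not-available"}))
```

The first call prints an empty list because request_instructions is present; the second reports that not_available_reason is missing.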
Request Instructions
Provide clear, step-by-step instructions:
"request_instructions": "1. Complete the online application at healthcareai.org/apply\n2. Provide institutional affiliation and IRB approval (if applicable)\n3. Sign the Data Use Agreement\n4. Allow 2-4 weeks for review\n\nApproved researchers receive secure download credentials valid for 90 days."
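Writing the \n escapes by hand is error-prone. One possible approach, sketched below in Python, is to keep the steps as a list and let json.dumps produce the escaped string. The helper build_request_instructions is a hypothetical name used only for this illustration.

```python
import json

# Steps copied from the example above.
steps = [
    "Complete the online application at healthcareai.org/apply",
    "Provide institutional affiliation and IRB approval (if applicable)",
    "Sign the Data Use Agreement",
    "Allow 2-4 weeks for review",
]
closing_note = (
    "Approved researchers receive secure download credentials valid for 90 days."
)


def build_request_instructions(steps: list[str], note: str = "") -> str:
    """Number the steps, one per line, and append an optional closing note."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"{numbered}\n\n{note}" if note else numbered


instructions = build_request_instructions(steps, closing_note)

# json.dumps escapes each newline as \n, producing a valid JSON string value.
print(json.dumps({"request_instructions": instructions}, indent=2))
```

The printed value matches the request_instructions string shown above, so it can be pasted directly into the access block.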
Personal Data Handling
This dataset contains pseudonymous personal data:
"contains_personal_data": "pseudonymous",
"consent_mechanism": "All data collected under IRB-approved protocols..."
Personal Data Levels
| Level | Description | Requires Consent? |
|---|---|---|
| none | No personal data | No |
| de_minimis | Minimal, incidental | No |
| pseudonymous | Direct identifiers removed or replaced; re-identification may still be possible with additional information | Yes |
| direct | Directly identifying | Yes |
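The "Requires Consent?" column can be turned into a simple pre-publication check. The sketch below assumes that the requirement is met by a non-empty consent_mechanism field in the rights block; that mapping is an interpretation of the table, and missing_consent is a hypothetical helper, not part of any official tooling.

```python
# Whether each level calls for a documented consent mechanism,
# transcribed from the "Requires Consent?" column above.
CONSENT_REQUIRED = {
    "none": False,
    "de_minimis": False,
    "pseudonymous": True,
    "direct": True,
}


def missing_consent(rights: dict) -> bool:
    """True if the level requires consent but consent_mechanism is empty or absent."""
    level = rights.get("contains_personal_data")
    if level not in CONSENT_REQUIRED:
        raise ValueError(f"unknown personal data level: {level!r}")
    return CONSENT_REQUIRED[level] and not rights.get("consent_mechanism")


if __name__ == "__main__":
    rights = {
        "contains_personal_data": "pseudonymous",
        "consent_mechanism": "All data collected under IRB-approved protocols...",
    }
    print(missing_consent(rights))                                    # False
    print(missing_consent({"contains_personal_data": "direct"}))      # True
```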
Custom File Formats
For non-standard formats, set record_format to "other" and describe the format in record_format_notes:
"record_format": "other",
"record_format_notes": "DICOM format with accompanying JSON metadata files"
Safety Considerations
For sensitive datasets, document risks and mitigations:
"safety": {
  "content_risk_level": "medium",
  "known_risky_categories": ["personal-information"],
  "mitigations": "All images de-identified using HIPAA Safe Harbor method..."
}
Related Guides
- Personal Data & Consent Guide - Handling PII requirements
- Access Reference - All access fields
- Rights Reference - Licensing and restrictions