Dataset Registry
Browse and discover datasets with standardized LLM Data Cards. 3 datasets.
Multilingual QA Dataset
hybrid
A hybrid question-answering dataset combining real-world questions with synthetic answers, covering 10 languages with emphasis on factual knowledge and reading comprehension.
Examples: 250,000
License: CC-BY-NC-SA-4.0
en es fr +7 text
restricted
Example Minimal Dataset
real
A minimal example dataset demonstrating the required fields of the LLM Data Card schema.
Examples: 1,000
License: CC0-1.0
en text
public download
Hausa News Corpus
real
A curated collection of news articles in the Hausa language, covering politics, sports, entertainment, and local news from Nigerian media outlets.
Examples: 50,000
License: CC-BY-4.0
ha text
public download
Submit Your Dataset
Have a dataset to share? Use our Builder tool to create a Data Card, then submit via GitHub Pull Request.