Kinyarwanda Automatic Speech Recognition Hackathon at Digital Umuganda

Hackathon Dates

Launch: June 1, 2025
Submission Deadline: June 30, 2025
Review & Validation: July 1–11, 2025
Winners Announced: July 14, 2025

Objective

Build robust Automatic Speech Recognition (ASR) models for Kinyarwanda that perform well across critical sectors such as healthcare, education, agriculture, financial services, and government.

Tracks

Track	Transcribed Speech	Unlabeled Speech	Dataset Use
Track A – Small	540 hours	None	Only use Track A data
Track B – Medium	1180 hours	None	Only use Track B data
Track C – Large	1180 hours	1170 hours	Can use additional open-source data (e.g., Common Voice)

Dataset
Collected through crowdsourcing: contributors recorded voice responses (10–30 seconds each) to image prompts.

Includes audio files, metadata (speaker ID, age, gender, location), and partial transcripts.

Covers five key domains: Health, Education, Agriculture, Government, Finance.

Prizes
Track	1st Place Prize
Track A	$1,000 USD
Track B	$1,000 USD
Track C	$1,000 USD

Eligibility
Open to researchers, students, startups, and hobbyists

Team size: Up to 5 members

Must use only permitted data and submit open-source code
Submission Requirements
submission.zip containing:

Transcripts of the test set

Public GitHub repository link with code, training scripts, models

Technical report or blog post (PDF or link)

Track-specific experiment logs (e.g., Weights & Biases, TensorBoard)

Data sources declaration file (for Track C)
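A pre-submission check can catch a missing item before upload. The sketch below is only illustrative: the announcement names the archive `submission.zip` but does not specify file names or layout inside it, so every entry in `REQUIRED` and `TRACK_C_EXTRA`, and both function names, are hypothetical placeholders to adapt to the organizers' instructions.

```python
import zipfile
from pathlib import Path

# Hypothetical item names: the announcement lists what to include,
# not what the files must be called.
REQUIRED = [
    "transcripts.txt",      # transcripts of the test set
    "repository_link.txt",  # public GitHub repository link
    "report.pdf",           # technical report (or a file holding the blog link)
    "experiment_logs",      # exported Weights & Biases / TensorBoard logs
]
TRACK_C_EXTRA = ["data_sources.md"]  # data sources declaration (Track C only)


def missing_items(staging_dir, track):
    """Return the expected items that are absent from the staging directory."""
    required = REQUIRED + (TRACK_C_EXTRA if track == "C" else [])
    root = Path(staging_dir)
    return [name for name in required if not (root / name).exists()]


def build_submission(staging_dir, track, out_path="submission.zip"):
    """Zip the staging directory after checking that nothing is missing."""
    missing = missing_items(staging_dir, track)
    if missing:
        raise FileNotFoundError(f"missing submission items: {missing}")
    root = Path(staging_dir)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the staging directory.
                zf.write(path, path.relative_to(root).as_posix())
    return out_path
```

Calling `build_submission("staging", "C")` would raise an error listing any absent items, including the declaration file that only Track C requires. Keep the output path outside the staging directory so the archive does not include itself.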

Evaluation Metrics
Word Error Rate (WER) and Character Error Rate (CER)

Combined Score formula:
Score = (1 - (0.4 × WER + 0.6 × CER)) × 100

Automated leaderboard based on test data
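To estimate a score locally before submitting, the combined formula can be sketched in a few lines of Python. This is not the official leaderboard scorer. It assumes WER and CER are fractions (0.25 rather than 25), that errors are summed over the whole test set before dividing by the reference length, that no text normalization is applied, and that spaces count as characters in CER. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Total edit errors divided by total reference length (corpus level)."""
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        ref_tokens = tokenize(ref)
        errors += edit_distance(ref_tokens, tokenize(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("references contain no tokens")
    return errors / total


def combined_score(refs, hyps):
    """Score = (1 - (0.4 * WER + 0.6 * CER)) * 100, as stated in the announcement."""
    wer = error_rate(refs, hyps, str.split)  # word-level tokens
    cer = error_rate(refs, hyps, list)       # character-level tokens
    return (1 - (0.4 * wer + 0.6 * cer)) * 100
```

As a worked case, a reference "a b" transcribed as "a c" has WER = 1/2 and CER = 1/3, so the score is (1 - (0.2 + 0.2)) × 100 = 60. Because CER carries the larger weight, near-miss spellings cost less than the word-level error alone would suggest.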

Rules & Guidelines
General Rules (All Tracks)
No manual transcription or human correction on test data

Code must be open-sourced under Apache-2.0, MIT, BSD-3-Clause, or MPL-2.0

Only one leaderboard account per team

Submissions must be reproducible (logs, scripts, configs required)

Track-Specific Notes
Track A – Small
Use only the 540-hour transcribed dataset

No external or unlabeled Kinyarwanda data allowed

Cross-lingual pre-trained models allowed; fine-tuning must only use Track A data

Track B – Medium
Use only the 1180-hour transcribed dataset

Same rules as Track A, with a larger dataset

Recommended GPU budget: at most 300 GPU-hours

Track C – Large
Use 1180-hour labeled + 1170-hour unlabeled dataset

Semi/self-supervised learning is allowed and encouraged

Additional public speech data (e.g., Common Voice) permitted

Must disclose all extra datasets and compute usage
Recommended Background
Required
Experience with ASR model development

Familiarity with Python, PyTorch or TensorFlow

Understanding of model training and evaluation

Good to Have
Experience with models like wav2vec 2.0, Whisper

Knowledge of data augmentation for speech

Experience training on large datasets

Background in Kinyarwanda linguistics or related NLP fields
License
Dataset: CC BY 4.0

Participant code: Must use one of the approved open-source licenses (Apache-2.0, MIT, BSD-3-Clause, or MPL-2.0)
Track Registration Links
Track A – Join Track A

Track B – Join Track B

Track C – Join Track C

