AI Systems Lack Training Data in Most African Languages
Most AI models are trained primarily in English and other major languages, leaving thousands of African languages underrepresented. This gap affects health care communication in regions with high disease burdens and limited medical staff.
bbc.co.uk
Most AI systems are trained primarily in English, major European languages and Chinese, while African languages remain severely underrepresented, according to a report published in Nature. 9 doctors per 10,000 people.
A 25-year-old woman identified as Falmata, whose first language is Shuwa, spoken by just 4% of families in northeastern Nigeria, could not communicate her child's symptoms to Hausa-speaking medical staff at a displacement camp clinic. The child received rehydration salts without instructions on using boiled water, and a health crisis followed after a neighbor omitted a key safety detail.
Language barriers have contributed to delays in HIV diagnosis, treatment errors for malaria, a disease that caused an estimated 608,000 deaths in 2022, and reduced adherence to tuberculosis regimens. Initiatives such as African Next Voices are creating multilingual health datasets for languages including isiZulu, Hausa, Yoruba and Dholuo, while Lesan AI targets communication needs in the Horn of Africa.
Africa hosts less than 1% of global data centre capacity, and only 5% of African AI researchers have access to sufficient computing power for advanced model training.
Story Timeline
- 2022
Malaria caused an estimated 608,000 deaths, 95% in Africa. (Source: Nature)
- 2024
Lacuna Fund released new open datasets for health and language. (Source: Nature)
- 2025
African Declaration on AI adopted by the African Union. (Source: Nature)
Potential Impact
- 01
Patients may receive incorrect medication instructions due to language gaps in AI tools.
- 02
Health programs could see lower adherence rates for TB and malaria treatment.