By Juliet Nanfuka |
During a multistakeholder consultation held at the Forum on Internet Freedom in Africa (2025) in Windhoek, Namibia, participants called attention to the urgent need to elevate African languages and indigenous knowledge systems within global internet governance. The consultation, hosted by UNESCO and the Collaboration on International ICT Policy for East and Southern Africa (CIPESA), highlighted the need for the digital ecosystem to be more representative of, and responsive to, the realities of African users. Held on September 26, 2025, it brought together experts from academia, artificial intelligence (AI), civil society and the media.
One of the strongest concerns raised related to the ways in which big tech companies classify African languages. It was noted that current language identification models are often inaccurate and frequently misclassify African language datasets, which has often resulted in weak or unusable models and contributed to content moderation systems that are inadequately built to address the information disorder in African digital spaces.

Opening the session, John Okande, Programme Coordinator at UNESCO, highlighted the UN International Decade of Indigenous Languages (2022-2032), which provides a global mandate to protect and promote linguistic diversity. He noted that this initiative aligns with the principles of UNESCO’s Guidelines for the Governance of Digital Platforms and the UN Global Principles on Information Integrity, which both call for multi-stakeholder action to ensure technology serves all communities equitably. Okande emphasised that these global frameworks “require deliberate adaptation to Africa’s unique linguistic and cultural contexts.” Various initiatives by UNESCO to promote multilingualism in cyberspace demonstrate the value of localised interventions that safeguard freedom of expression while building community resilience. Among these is the Social Media 4 Peace (SM4P) global initiative, which aims to build societies’ resilience to harmful online content, disinformation and hate speech, while safeguarding freedom of expression and fostering peace through social media.
The consultation also laid bare how AI and Large Language Models (LLMs) can amplify harm. LLMs sometimes provide harmful or dangerous responses because the data they are trained on is low-quality or biased. In many cases, outsourced data trainers lack supervision, and there are limited regulatory frameworks to ensure ethical or safe training processes.
Many LLMs lack basic safety guardrails for African languages, in contrast to English, where harmful queries are often flagged and blocked. This disparity illustrates the persistent data inequalities in the AI ecosystem.
Tajuddeen Gwadabe, Programs and MEL Lead at Masakhane African Languages Hub, noted that while languages like Hausa have tens of millions of speakers, only one dialect, often the standardised, formal variant, gets represented online. Entire linguistic communities, such as speakers of the Sokoto dialect, are rendered invisible in digital datasets.
Participants shared similar concerns as they noted that the broader online representations of African languages tend to reflect how language is used when written, and not how languages are spoken. They noted that code-mixing, slang, tonal nuance, gestures, and layered cultural meaning are nearly impossible for AI to capture without intentional investment.
Despite African languages having a large number of speakers, digital spaces often represent only one variant or standardised dialect. For instance, in Hausa, only the standard writing from Kano is represented, while dialects from Sokoto “are hardly ever present.”
The consultation highlighted concerns about African intellectual infrastructure, which serves as the basis for knowledge creation and dissemination and facilitates downstream productive activities such as information production, innovation, product development, education, community building and interaction, democratic participation, socialisation, and many other socially valuable activities.
Dr. Phathiswa Magopeni, Executive Director of the South Africa Press Council, noted the urgent need to build African intellectual infrastructure alongside efforts to elevate African languages in the digital society. She highlighted the dominance of the English language, including in policy and regulatory documents across many African countries, and argued that this serves to protect English at the cost of indigenous languages.
She noted, “We are often willing to compromise the essence of our own languages in the belief that doing so will grant us access to spaces dominated by English. Meanwhile, the speakers of English continue to protect their language.” Dr. Magopeni emphasised that many African languages lack foundational datasets across academic, scientific, legal, and technical fields that are essential for the long-term strengthening of African intellectual infrastructure.
The consultation went on to examine the state of the current ecosystem, including the extent to which African identity gets lost online as Africans adjust their identity to suit the limitations of digital platforms. Further, there was debate on the extent to which platforms should be compelled to adapt to African contexts, with consensus reached on the fact that political will is necessary to advance African languages in digital spaces. It was noted that unless policymakers prioritise local languages, including in Parliament, service delivery and publicly accessible data, there will be limited improvement.
Digital rights researcher and political analyst Dércio Tsandzana illustrated the case of Mozambique, noting that some members of parliament do not participate effectively throughout their mandate because they are unable to speak Portuguese, which is the national language. “If we don’t have politicians or policy makers that want to change first in their countries we will not see any change (by platforms),” Tsandzana noted.
Ultimately, gaps in African languages online will remain a sore point for disinformation and content moderation due to the deep-seated issues concerning data quality, the nature of language use, and the limitations of AI technology.
The consensus from the consultation was that there is a need for more collaboration between stakeholders and an ecosystem-wide approach to African AI development. It was noted that universities, particularly African language departments, hold extensive expertise on standardised linguistic forms. Meanwhile, other stakeholders, from governments, which hold immense amounts of public data, through to community institutions such as local radio stations, which reflect how languages are used today, all have a role to play in shaping how African languages are integrated in AI. Thus, big tech companies need to work more cohesively with a broader spectrum of stakeholders.
Further, there was agreement on the urgency of populating the internet with more African content, including stories, proverbs, folklore, and history. As AI continues to learn using whatever data is available, African content must be present and accurate. Thus, there is a need to invest in indigenous language content development, strengthen African intellectual infrastructure, and demand accountability from global platforms. These efforts require the development of practical and context-specific action plans for policymakers and tech platforms to realise African indigenous language and knowledge systems in the digital ecosystem.








