Google has launched WAXAL, a large-scale artificial intelligence speech dataset designed to support 21 African languages, including Igbo and Yoruba, spoken by more than 100 million people across Sub-Saharan Africa.
The company said the dataset was developed in collaboration with a consortium of African research institutions, which led data collection and curation efforts across multiple countries.
Google said the project aims to address the long-standing exclusion of African languages from global voice-enabled technologies due to the lack of high-quality speech data.
According to Aisha Walcott-Bryant, Head of Google Research Africa, the initiative began over three years ago after researchers identified a significant imbalance in global speech datasets that overwhelmingly favour Western and widely spoken languages.
She said WAXAL is intended to empower African researchers, developers, and entrepreneurs to build technology in their own languages and contexts.
“The dataset provides a critical foundation for building speech technologies that can reach more than 100 million people across the continent,” Walcott-Bryant said.
The WAXAL dataset includes approximately 1,250 hours of transcribed natural speech and more than 20 hours of studio-quality recordings for high-fidelity synthetic voice development.
Languages covered include Hausa, Yoruba, Igbo, Swahili, Luganda, Fulani, Kikuyu, Lingala, Shona, Malagasy, and others.
Data collection was led by African universities and community organisations, including Makerere University (Uganda), the University of Ghana, and Digital Umuganda (Rwanda). Partner institutions retain full ownership of the data, with Google providing technical support.
Until now, the lack of speech data has forced African innovators to build datasets from scratch, increasing costs and slowing innovation.
By lowering barriers to AI development, WAXAL is expected to accelerate locally relevant applications in education, healthcare, agriculture, and digital public services, while supporting broader efforts to build indigenous AI capacity across Africa.

