If digital technologies are inclusive, why would 45 million Yoruba speakers be excluded from machine translation services, asks Dr Emmanuel Ngue Um, associate professor of linguistics and digital humanities at the University of Yaoundé. Be wary of the ‘big tech behemoths’, he warns.
In the context of Africa, perhaps as much as elsewhere, the emerging discipline of digital humanities has not only inspired new forms of knowledge production, it has also provoked anti-colonialist and anti-imperialist responses. Against this backdrop, there seems to be a principled view among many African humanists – particularly linguists – and social scientists that technology should accommodate itself to society and not the reverse.
Such a view is premised on values and beliefs in the precedence of human concerns and claims for equal recognition and inclusion, as well as a resistance to forms of neoliberalism with which digital technologies are often associated. Sometimes, however, the values and ideals by which a given human group lives at a given time may be anchored in a social habitus shaped by a technology that was once alien to their culture but has, over time, entered their minds and practices. For many language groups in Africa, technologies of writing fit into this pattern of newly adopted social habits stemming from Western colonisation.
In charting writing technologies for African languages that best reflect their structural properties, and in an attempt to depart from the colonial legacy of writing, African linguists have developed new orthographies in which symbols index, as far as possible, the language as it is spoken. This is realised, for example, by using diacritics – marks or ‘glyphs’ typically placed above or below a letter – to represent the melodic contour of syllables in tone languages.
However, such innovations are achieved at the expense of machine readability and processability, thereby increasing the marginalisation of many African languages in the digital space. A glaring illustration of this is the Yoruba language and its 45 million speakers. Despite the standard orthography of Yoruba making use of diacritics, these are eschewed in the machine translation service available for Yoruba in Google Translate. Obviously, this cannot be a matter of neutral choice by Google.
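To make the stakes concrete, here is a minimal sketch in Python of what happens when diacritics are stripped from Yoruba text. It is an illustration only, not a description of how any particular translation service processes its input. The function name `strip_diacritics` is hypothetical, and the four words and their English glosses are a commonly cited textbook set chosen for illustration.

```python
import unicodedata

# Four Yoruba words that differ only in their diacritics.
# The glosses are illustrative assumptions, not taken from the article.
WORDS = {
    "husband": "\u1ecdk\u1ecd",              # ọkọ
    "hoe": "\u1ecdk\u1ecd\u0301",            # ọkọ́
    "vehicle": "\u1ecdk\u1ecd\u0300",        # ọkọ̀
    "spear": "\u1ecd\u0300k\u1ecd\u0300",    # ọ̀kọ̀
}


def strip_diacritics(text: str) -> str:
    """Remove every combining mark (tone marks and underdots alike)."""
    # NFD splits each letter into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


for gloss, word in WORDS.items():
    print(f"{gloss:8} {word} -> {strip_diacritics(word)}")

# All four distinct words collapse into one undifferentiated string.
print(len(set(WORDS.values())), "distinct words before stripping")
print(len({strip_diacritics(w) for w in WORDS.values()}), "after stripping")
```

Once the marks are gone, a machine has no way to tell these words apart except by guessing from context, which is the kind of loss that the standard orthography was designed to prevent.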
As much as digital technologies should be adopted cautiously in the global south when engineered out of the box by big tech behemoths, those in the global south who adopt or reject these technologies wholesale should also seek to critically understand how they work. This requires retooling humanities scholarship to overcome disciplinary silos that have prevented the necessary epistemological hybridisation between the so-called hard sciences and the humanities and social sciences, often at the expense of the social good.
Dr Emmanuel Ngue Um is associate professor of linguistics and digital humanities at the University of Yaoundé and head of the Department of Cameroonian Languages and Cultures at the Higher Teacher Training College of Bertoua in Cameroon. He is a member of the African Academy of Languages and serves on the governance committees of the Endangered Languages Project and Humanistica, the French Association of Digital Humanities. His current research focuses on machine translation and voice recognition technologies for African languages, with recent projects on the Ewondo and Basaa languages.