-
InkubaLM: A small language model for low-resource African languages
Authors:
Atnafu Lambebo Tonja,
Bonaventure F. P. Dossou,
Jessica Ojo,
Jenalea Rajab,
Fadel Thior,
Eric Peter Wairagala,
Anuoluwapo Aremu,
Pelonomi Moiloa,
Jade Abbott,
Vukosi Marivate,
Benjamin Rosman
Abstract:
High-resource language models often fall short in the African context, where there is a critical need for models that are efficient, accessible, and locally relevant, even amidst significant computing and data constraints. This paper introduces InkubaLM, a small language model with 0.4 billion parameters, which achieves performance comparable to models with significantly larger parameter counts and more extensive training data on tasks such as machine translation, question-answering, AfriMMLU, and the AfriXnli task. Notably, InkubaLM outperforms many larger models in sentiment analysis and demonstrates remarkable consistency across multiple languages. This work represents a pivotal advancement in challenging the conventional paradigm that effective language models must rely on substantial resources. Our model and datasets are publicly available at https://huggingface.co/lelapa to encourage research and development on low-resource languages.
Submitted 3 September, 2024; v1 submitted 30 August, 2024;
originally announced August 2024.
-
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models
Authors:
David Ifeoluwa Adelani,
Jessica Ojo,
Israel Abebe Azime,
Jian Yun Zhuang,
Jesujoba O. Alabi,
Xuanli He,
Millicent Ochieng,
Sara Hooker,
Andiswa Bukula,
En-Shiun Annie Lee,
Chiamaka Chukwuneke,
Happy Buzaaba,
Blessing Sibanda,
Godson Kalipe,
Jonathan Mukiibi,
Salomon Kabongo,
Foutse Yuehgoh,
Mmasibidi Setaka,
Lolwethu Ndolela,
Nkiruka Odu,
Rooweither Mabuya,
Shamsuddeen Hassan Muhammad,
Salomey Osei,
Sokhar Samb,
Tadesse Kebede Guge
, et al. (1 additional author not shown)
Abstract:
Despite the widespread adoption of large language models (LLMs), their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g. African languages) are often evaluated only on basic text classification tasks due to the lack of appropriate or comprehensive benchmarks outside of high-resource languages. In this paper, we introduce IrokoBench, a human-translated benchmark dataset for 16 typologically diverse low-resource African languages covering three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM), and multi-choice knowledge-based QA (AfriMMLU). We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings (where test sets are translated into English) across 10 open and four proprietary LLMs. Our evaluation reveals a significant performance gap between high-resource languages (such as English and French) and low-resource African languages. We observe a significant performance gap between open and proprietary models, with the highest-performing open model, Aya-101, reaching only 58% of the performance of the best-performing proprietary model, GPT-4o. Machine translating the test set to English before evaluation helped to close the gap for larger models that are English-centric, like LLaMa 3 70B. These findings suggest that more efforts are needed to develop and adapt LLMs for African languages.
Submitted 5 June, 2024;
originally announced June 2024.
-
Application of Principal Component Analysis and Artificial Neural Networks for the Prediction of QoS in FSO Links over South Africa
Authors:
S. O Adebusola,
P. A Owolawi,
J. S Ojo,
P. S Maswikaneng,
A. O Ayo
Abstract:
Free-space optical (FSO) communication offers more bandwidth, operates license-free, and has a lower startup cost than radio frequency (RF) communication. Nonetheless, its vulnerability to variations in atmospheric and meteorological conditions is a concern. The purpose of this study is to use Principal Component Analysis (PCA) with Artificial Neural Networks (ANN) to design a QoS prediction model for a terrestrial FSO communication link. To this end, meteorological data such as visibility, wind speed, and altitude were collected from the South African Weather Service (SAWS) archive over a ten-year period at five locations: George, Johannesburg, Kimberley, Bloemfontein, and Polokwane. The eigenvalues of the first Principal Component (PC1) and the second Principal Component (PC2) in the PCA for the stations Bloemfontein, Johannesburg, Kimberley, George, and Polokwane are 7.624 and 1.020; 7.234 and 0.984; 6.204 and 1.723; 7.354 and 0.876; and 7.104 and 0.865, respectively. These components are retained as QoS variables to train the ANN model because they provide the most compelling interpretation of the original variable data. The RMSE values of the proposed models across the study locations are 0.1437, 0.2131, 0.2329, 0.1101, and 0.1977, respectively. Based on the RMSE, the proposed model performed best at George. A realistic and accurate predictive model is developed for each of the study locations. Thus, the developed model will serve as a valuable tool for maintaining good QoS in FSO network services and improving telecom businesses in South Africa.
Submitted 19 March, 2024;
originally announced March 2024.
-
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
Authors:
Jiayi Wang,
David Ifeoluwa Adelani,
Sweta Agrawal,
Marek Masiak,
Ricardo Rei,
Eleftheria Briakou,
Marine Carpuat,
Xuanli He,
Sofia Bourhim,
Andiswa Bukula,
Muhidin Mohamed,
Temitayo Olatoye,
Tosin Adewumi,
Hamam Mokayed,
Christine Mwase,
Wangui Kimotho,
Foutse Yuehgoh,
Anuoluwapo Aremu,
Jessica Ojo,
Shamsuddeen Hassan Muhammad,
Salomey Osei,
Abdul-Hakeem Omotayo,
Chiamaka Chukwuneke,
Perez Ogayo,
Oumaima Hourrane
, et al. (33 additional authors not shown)
Abstract:
Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed on n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments. Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R) to create the state-of-the-art MT evaluation metrics for African languages with respect to Spearman-rank correlation with human judgments (0.441).
Submitted 23 April, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
How good are Large Language Models on African Languages?
Authors:
Jessica Ojo,
Kelechi Ogueji,
Pontus Stenetorp,
David Ifeoluwa Adelani
Abstract:
Recent advancements in natural language processing have led to the proliferation of large language models (LLMs). These models have been shown to yield good performance, using in-context learning, even on tasks and languages they are not trained on. However, their performance on African languages is largely understudied relative to high-resource languages. We present an analysis of four popular large language models (mT0, Aya, LLaMa 2, and GPT-4) on six tasks (topic classification, sentiment classification, machine translation, summarization, question answering, and named entity recognition) across 60 African languages, spanning different language families and geographical regions. Our results suggest that all LLMs produce lower performance for African languages, and there is a large gap in performance compared to high-resource languages (such as English) for most tasks. We find that GPT-4 has average to good performance on classification tasks, yet its performance on generative tasks such as machine translation and summarization is significantly lacking. Surprisingly, we find that mT0 had the best overall performance for cross-lingual QA, better than the state-of-the-art supervised model (i.e. fine-tuned mT5) and GPT-4 on African languages. Similarly, we find the recent Aya model to have comparable results to mT0 in almost all tasks except for topic classification, where it outperforms mT0. Overall, LLaMa 2 showed the worst performance, which we believe is due to its English and code-centric (around 98%) pre-training corpus. Our findings confirm that performance on African languages remains a hurdle for current LLMs, underscoring the need for additional efforts to close this gap.
Submitted 30 April, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
How Good are Commercial Large Language Models on African Languages?
Authors:
Jessica Ojo,
Kelechi Ogueji
Abstract:
Recent advancements in Natural Language Processing (NLP) have led to the proliferation of large pretrained language models. These models have been shown to yield good performance, using in-context learning, even on unseen tasks and languages. They have also been exposed as commercial APIs as a form of language-model-as-a-service, with great adoption. However, their performance on African languages is largely unknown. We present a preliminary analysis of commercial large language models on two tasks (machine translation and text classification) across eight African languages, spanning different language families and geographical areas. Our results suggest that commercial language models produce below-par performance on African languages. We also find that they perform better on text classification than machine translation. In general, our findings present a call to action to ensure African languages are well represented in commercial large language models, given their growing popularity.
Submitted 10 May, 2023;
originally announced May 2023.
-
MasakhaNEWS: News Topic Classification for African languages
Authors:
David Ifeoluwa Adelani,
Marek Masiak,
Israel Abebe Azime,
Jesujoba Alabi,
Atnafu Lambebo Tonja,
Christine Mwase,
Odunayo Ogundepo,
Bonaventure F. P. Dossou,
Akintunde Oladipo,
Doreen Nixdorf,
Chris Chinenye Emezue,
sana al-azzawi,
Blessing Sibanda,
Davis David,
Lolwethu Ndolela,
Jonathan Mukiibi,
Tunde Ajayi,
Tatiana Moteu,
Brian Odhiambo,
Abraham Owodunni,
Nnaemeka Obiefuna,
Muhidin Mohamed,
Shamsuddeen Hassan Muhammad,
Teshome Mulugeta Ababu,
Saheed Abdullahi Salahudeen
, et al. (40 additional authors not shown)
Abstract:
African languages are severely under-represented in NLP research due to a lack of datasets covering several NLP tasks. While there are individual language-specific datasets that are being expanded to different tasks, only a handful of NLP tasks (e.g. named entity recognition and machine translation) have standardized benchmark datasets covering several geographically and typologically diverse African languages. In this paper, we develop MasakhaNEWS, a new benchmark dataset for news topic classification covering 16 languages widely spoken in Africa. We provide an evaluation of baseline models by training classical machine learning models and fine-tuning several language models. Furthermore, we explore several alternatives to full fine-tuning of language models that are better suited for zero-shot and few-shot learning, such as cross-lingual parameter-efficient fine-tuning (like MAD-X), pattern exploiting training (PET), prompting language models (like ChatGPT), and prompt-free sentence transformer fine-tuning (SetFit and Cohere Embedding API). Our evaluation in the zero-shot setting shows the potential of prompting ChatGPT for news topic classification in low-resource African languages, achieving an average performance of 70 F1 points without leveraging additional supervision like MAD-X. In the few-shot setting, we show that with as few as 10 examples per label, we achieved more than 90% (i.e. 86.0 F1 points) of the performance of full supervised training (92.6 F1 points) leveraging the PET approach.
Submitted 20 September, 2023; v1 submitted 19 April, 2023;
originally announced April 2023.
-
Harmonic-Balance Based Power Flow and ZVS Analysis of a Quad-Active Bridge DC-DC Converter
Authors:
Ezekiel Olayiwola Arogunjo,
Olivia Nnadi,
Joseph Olorunfemi Ojo
Abstract:
The power flow control of multi-active bridge converters requires a comprehensive steady-state analysis of the converter and the determination of the conditions for zero voltage switching of all switches in the converter, which results in minimum switching loss. This paper aims to model and carry out the power flow and Zero Voltage Switching (ZVS) analyses of the Quad-Active-Bridge (QAB) dc-dc converter. The dynamic as well as the steady-state analyses of the converter were carried out, thereby determining the phase shifts required to meet commanded load powers. The full equivalent circuit model of the converter, which includes winding resistances and magnetizing inductances, is used rather than the popular lossless star-equivalent circuit model, which may introduce significant error in the converter's analysis. The conditions that ensure the converter operates in ZVS mode are determined and experimentally verified.
Submitted 24 July, 2022;
originally announced July 2022.
-
An Insight into the Dynamics of a Dual Active Bridge
Authors:
Ezekiel Olayiwola Arogunjo,
Joseph Olorunfemi Ojo
Abstract:
This paper aims at analyzing the effect of the zero dynamics of the Dual Active Bridge Isolated Bidirectional dc-dc converter (DAB) on the dynamics of the complete DAB system. It also explains its influence on controller design for the DAB system. In carrying out these analyses, the state space model of the DAB, as well as the first harmonic approximation (FHA) of the model are derived. The ZVS and the stability analysis of the system are undertaken based on the FHA model of the system. The system is shown to be stable for a constant output voltage operation for the entire power range while it is unstable for the constant power load (CPL) operation for load demands close to the system maximum power. It is also shown that the transformer winding currents are part of the zero dynamic states and are always stable regardless of the operating conditions of the system.
Submitted 2 August, 2022; v1 submitted 24 July, 2022;
originally announced July 2022.