346 research outputs found
An overview of Sesotho BLARK content
This article gives an overview of the digital language resources available for Sesotho, an official language of South Africa. The South African Centre for Digital Language Resources (SADiLaR) repository is used as a reference because it is the official host of various language resources for South African languages. A total of 18 written resources and a further 16 spoken resources are identified from the repository, along with 45 applications and modules. Findings indicate that the majority of applications and modules available for Sesotho are in fact general resources aimed at all eleven official South African languages. Furthermore, the available resources indicate an inclination toward the development of entry-level, basic language resources and an absence of middle- and higher-level resources with functionalities such as semantic analysis for written resources and prosody prediction for spoken resources. The study is hindered by the dearth of resource-specific evaluations and related research, and this is exacerbated by the absence of some of the resources from the repository.
Resource Repositories and linking resources: An exploratory study
In this article the existence, use and importance of repositories are explored. An introduction to language resources (LRs) is given, as well as a discussion of two platforms for the distribution of language resources, namely the repository of the South African Centre for Digital Language Resources (SADiLaR) and Lanfrica, a site that links resources. Types of repositories, such as institutional and language resource repositories, are distinguished and compared. Language preservation is proposed as an important aspect which can be strengthened by the presence and use of repositories. The view expressed in this article is that the availability of language resources and repositories is pivotal for the development, preservation and advancement of languages.
Having a host site that links available resources and a repository where resources can be uploaded is a positive attribute of the platforms mentioned. However, as will be discussed, the fact that information is available online is no guarantee that the resources are or will be used by researchers or other interested persons, especially if they are not aware of the resources' existence.
The article concludes with suggestions for future work, for example measuring the influence of inaccurate language resource metadata on linguistic research.
Corpus-based Lexicography for Sesotho
For centuries, dictionaries were compiled based upon the knowledge of the lexicographer and information retrieved from manually consulted sources, mainly through a process of reading and marking. This approach meant that much of the information used in the dictionary relied upon the knowledge of the lexicographer. It is vital to rely on the lexicographer’s knowledge of the language but this has its shortcomings, since there is no single individual who knows all the words or terms, their meanings and usage, the words they combine with, and so on, in a specific language. The utilization of this method left room for errors and omissions because the lexicographer could easily overlook some words due to factors like time, fatigue, limited knowledge of the lexicographer, etc. Important words, for example words likely to be looked for by the target users of the dictionary, could accidentally be omitted. In the 1980s, the corpus era was born and the lexicography field changed forever. Collins COBUILD in Birmingham spearheaded this era with the publication of the first corpus-based dictionary, the Collins COBUILD Dictionary in 1987. Since the corpus era began, lexicographers no longer rely solely on their knowledge of the language, intuition, or the limited information gathered from available written sources, which are very limited for African languages. The corpus allows the lexicographer to have access to huge volumes of authentic data from written texts and transcribed oral data. This research will therefore critically discuss dictionary compilation for Sesotho and spearhead the use of corpora in the compilation of Sesotho dictionaries, so that lexicographers do not compile dictionaries as if they are compiling the first dictionary for the language. In addition, they should take into account tasks like lexicographic planning, amongst other factors required to compile a good user-friendly dictionary.
Key words
Corpora, collocations, concordances, lexicography, lexicographical planning, microstructure, macrostructure, lemmatisation. Dissertation (MA), University of Pretoria, 2018. African Languages.
Emerging market analysis of passive and active investing under bear and bull market conditions
Purpose – Prompted by the scant regard for market phases in portfolio performance assessments, the current paper investigates active versus passive investment strategies under bull and bear market conditions in emerging markets, focusing on South Africa as a case study. Design/methodology/approach – Jensen's alpha and the Treynor index are applied to the monthly returns of 20 funds from January 2010 to June 2022. Findings – The results contradict developed market evidence but are consistent with emerging market trends. Actively managed funds outperform the market benchmark and the passive investing style under bear and normal market conditions. The passive investment strategy outperforms both the market benchmark and the active investing style under bull market conditions. Practical implications – In the face of improved market efficiency, increased liquidity and recent technological impact, the findings of this study have practical application. The study outcomes should inform and update global investors, especially asset managers interested in emerging markets; however, the limitations of the study should also be considered. Originality/value – While a limited number of studies consider market conditions when comparing the performance of passive versus active investing, such consideration is lacking in emerging markets. The current study corrects this literature imbalance.
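The two measures named in this abstract can be illustrated with a minimal sketch using their standard textbook definitions. It is not the paper's own code: the function names and the return series below are hypothetical, and the numbers are made up purely so the example runs.

```python
from statistics import mean


def excess(returns, risk_free):
    """Per-period returns in excess of the risk-free rate."""
    return [r - rf for r, rf in zip(returns, risk_free)]


def beta(fund_excess, market_excess):
    """Slope of fund excess returns regressed on market excess returns."""
    mf, mm = mean(fund_excess), mean(market_excess)
    cov = sum((f - mf) * (m - mm) for f, m in zip(fund_excess, market_excess))
    var = sum((m - mm) ** 2 for m in market_excess)
    return cov / var


def jensens_alpha(fund, market, risk_free):
    """Mean excess return not explained by exposure to the market."""
    fe, me = excess(fund, risk_free), excess(market, risk_free)
    return mean(fe) - beta(fe, me) * mean(me)


def treynor_index(fund, market, risk_free):
    """Mean excess return earned per unit of market risk (beta)."""
    fe, me = excess(fund, risk_free), excess(market, risk_free)
    return mean(fe) / beta(fe, me)


# Hypothetical monthly returns, constructed so that the fund has
# alpha = 0.001 and beta = 1.5 relative to the market.
market = [0.010, -0.020, 0.030, 0.000]
risk_free = [0.002] * 4
fund = [0.002 + 0.001 + 1.5 * (m - 0.002) for m in market]

print(jensens_alpha(fund, market, risk_free))
print(treynor_index(fund, market, risk_free))
```

To compare market phases, the same functions could be applied separately to the months classified as bull and as bear; how those phases are identified is not stated in the abstract and is left out of this sketch.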
The Influence of Chemical Modification on Linker Rotational Dynamics in Metal–Organic Frameworks
The robust synthetic flexibility of metal–organic frameworks (MOFs) offers a promising class of tailorable materials, for which the ability to tune specific physicochemical properties is highly desired. This is achievable only through a thorough description of the consequences of chemical manipulations for both structure and dynamics. Magic angle spinning solid‐state NMR spectroscopy offers many modalities in this pursuit, particularly for dynamic studies. Herein, we employ a separated‐local‐field NMR approach to show how specific intraframework chemical modifications to MOF UiO‐66 heavily modulate the dynamic evolution of the organic ring moiety over several orders of magnitude. Ring rotations in MOFs were detected in solid‐state NMR experiments under magic angle spinning by dipolar dephasing over the rotor period. Information on the dynamics in metal–organic frameworks is important because the rate of the linker's rotational motion influences the sorption and separation properties of MOFs. Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/144665/1/ange201805004_am.pdf https://deepblue.lib.umich.edu/bitstream/2027.42/144665/2/ange201805004-sup-0001-misc_information.pdf https://deepblue.lib.umich.edu/bitstream/2027.42/144665/3/ange201805004.pd
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models
Despite the widespread adoption of large language models (LLMs), their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g. African languages) are often evaluated only on basic text classification tasks due to the lack of appropriate or comprehensive benchmarks outside of high-resource languages. In this paper, we introduce IrokoBench, a human-translated benchmark dataset for 16 typologically diverse low-resource African languages covering three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM), and multi-choice knowledge-based QA (AfriMMLU). We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings (where test sets are translated into English) across 10 open and 4 proprietary LLMs. Our evaluation reveals a significant performance gap between high-resource languages (such as English and French) and low-resource African languages. We also observe a significant performance gap between open and proprietary models, with the highest-performing open model, Aya-101, reaching only 58% of the performance of the best-performing proprietary model, GPT-4o. Machine translating the test set to English before evaluation helped to close the gap for larger models that are English-centric, like LLaMa 3 70B. These findings suggest that more efforts are needed to develop and adapt LLMs for African languages.
Ultra-Fast Molecular Rotors within Porous Organic Cages
These data include all the FIDs and raw NMR data published in the paper “Ultra-Fast Molecular Rotors within Porous Organic Cages”, Ashlea R. Hughes, et al., Chemistry – A European Journal, 2017. All NMR data were acquired with TopSpin 3.2. Please see the manuscript for more details.
ChemInform Abstract: Influence of the Composition in the Vapor Phase on the Morphology of SiC Single Crystals
