Maninkakan Lexicon

Item Name: Maninkakan Lexicon
Authors: Moussa Bamba
LDC Catalog No.: LDC2013L01
ISBN: 1-58563-662-2
Release Date: Dec 16, 2012
Data Type: lexicon
Data Source(s): dictionaries
Application(s): instruction, language teaching, machine translation, sociolinguistics
Language(s): Eastern Maninkakan, English, French
Language ID(s): emk, eng, fra
Distribution: Web Download
Member Fee: $0 for 2013 members
Non-member Fee: US $800.00
Reduced-License Fee: N/A
Extra-Copy Fee: US $
Non-member License: yes
Online documentation: yes
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Moussa Bamba
Maninkakan Lexicon
Linguistic Data Consortium, Philadelphia


Maninkakan Lexicon was developed by LDC and contains 5,834 entries of the Maninkakan language presented as a Maninkakan-English lexicon and a Maninkakan-French lexicon. It is the second publication in an ongoing LDC project to build an electronic dictionary of four Mandekan languages: Mawukakan, Maninkakan, Bambara and Jula. These are Eastern Manding languages in the Mande Group of the Niger-Congo language family. LDC released a Mawukakan Lexicon (LDC2005L01) in 2005.

There are approximately 3.5 million Maninkakan speakers in West Africa, mostly in Guinea and Mali, and also in Liberia, Senegal, Sierra Leone and Côte d'Ivoire. The word Maninkakan is composed of three lexemes: (1) Mande or Manden, the name of the territory occupied by the people who speak the language, (2) the suffix -ka, which derives the name of an inhabitant of Mande or Manden, and (3) kan, which means language. Thus Maninkakan is the language of the people who live in Mande/Manden. Mandekan, Mandenkan, Maninka and Malinke are all used to refer to the language of the inhabitants of Mande/Manden.

More information about the work of LDC in the languages of West Africa and the challenges those languages present for language resource development can be found here.


Maninkakan is written using Latin script, Arabic script and the N'Ko alphabet. This lexicon is presented using a Latin-based transcription system because the Latin alphabet is familiar to the majority of Mandekan language speakers and because it is expected to facilitate the work of researchers interested in this resource.

The dictionary is provided in two formats, Toolbox and XML. Toolbox is a version of the widely used SIL Shoebox program adapted to display Unicode. Toolbox can be downloaded for free from this link. The Toolbox files are provided in two fonts, Arial and Doulos SIL. The Arial files should display using the Arial font, which is standard on most operating systems. Doulos SIL, available as a free download, is a robust font that should display all characters without issue. Users should launch Toolbox using the *.prj files in the Arial or Doulous_SIL folders.

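The lexicon is presented in Unicode Normalization Form D (NFD), canonical decomposition. This means that every precomposed character is split into its base character followed by separate combining marks, so a letter and its diacritic are stored as distinct code points. Text from other sources is often in composed form (NFC) and should be normalized to NFD before it is compared with the lexicon. See the following link for more information on Unicode normalization forms.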
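As an illustration of what canonical decomposition means in practice, the following minimal Python sketch normalizes a string to NFD with the standard library and lists the resulting code points. The helper names `to_nfd` and `code_points` are chosen here for illustration, and the sample character is a generic accented letter, not an entry taken from the lexicon.

```python
import unicodedata


def to_nfd(text: str) -> str:
    """Return text in Unicode Normalization Form D (canonical decomposition)."""
    return unicodedata.normalize("NFD", text)


def code_points(text: str) -> list[str]:
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Illustrative input only: a precomposed "e with acute" (one code point).
composed = "\u00e9"
decomposed = to_nfd(composed)

print(code_points(composed))    # one code point in composed form
print(code_points(decomposed))  # base letter followed by a combining mark
```

Applying `to_nfd` to a search string before looking it up is one way to avoid mismatches between composed input and the decomposed text of the lexicon.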

The XML formatted lexicon was generated by Toolbox and a DTD is included.
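A rough sketch of how the XML version might be loaded into a lookup table is shown below. The element names (`database`, `lxGroup`, `lx`, `ge`) and the inline sample are assumptions made for illustration only; the actual structure is defined by the DTD shipped with the lexicon, and the tag arguments should be adjusted to match it.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical sample; the real element names come from the included DTD.
SAMPLE_XML = """<database>
  <lxGroup><lx>headword-1</lx><ge>gloss one</ge></lxGroup>
  <lxGroup><lx>headword-2</lx><ge>gloss two</ge></lxGroup>
</database>"""


def load_entries(xml_text, entry_tag="lxGroup", headword_tag="lx", gloss_tag="ge"):
    """Map each NFD-normalized headword to its gloss text."""
    root = ET.fromstring(xml_text)
    entries = {}
    for entry in root.iter(entry_tag):
        headword = entry.findtext(headword_tag)
        if headword is None:
            continue  # skip entries without a headword element
        key = unicodedata.normalize("NFD", headword.strip())
        entries[key] = (entry.findtext(gloss_tag) or "").strip()
    return entries


entries = load_entries(SAMPLE_XML)
print(len(entries))
```

Keying the table on NFD-normalized headwords keeps lookups consistent with the normalization form used by the lexicon.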


Please view this XML sample.


None at this time.

Content Copyright

Portions © 2013 Trustees of the University of Pennsylvania