Mawukakan Lexicon

Item Name: Mawukakan Lexicon
Author(s): Moussa Bamba
LDC Catalog No.: LDC2005L01
ISBN: 1-58563-337-2
ISLRN: 592-356-503-307-6
Release Date: April 15, 2005
Member Year(s): 2005
DCMI Type(s): Text
Data Source(s): dictionaries
Application(s): instruction, language teaching, machine translation, sociolinguistics
Language(s): Mahou, French, English
Language ID(s): mxx, fra, eng
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2005L01 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Bamba, Moussa. Mawukakan Lexicon LDC2005L01. Web Download. Philadelphia: Linguistic Data Consortium, 2005.


Mawukakan Lexicon was developed by the Linguistic Data Consortium (LDC). It is a dictionary of the Mawukakan language with English and French glosses and contains 4,467 lexical entries.

The Mawukakan Lexicon is the first publication of an ongoing project at LDC that aims to build electronic dictionaries of three Mandekan languages, which are Eastern Manding languages of the Mande Group of the Niger-Congo family. The other two variants of Mandekan are Bambara or Bamanankan, from Mali, and Maninka or Maninkakan, from Guinea. LDC released Maninkakan Lexicon (LDC2013L01) in 2013 and Bamanankan Lexicon (LDC2016L01) in 2016.

Mawukakan has fewer than half a million speakers, mostly in the francophone area of West Africa. For that reason, the lexicon is offered with both English and French translations. The language lacks a written tradition, which makes a dictionary project of this kind especially important. For this dictionary, the language has been transcribed in the International Phonetic Alphabet (IPA). IPA is the least costly system for transcribing African languages that lack a writing tradition and for which an alphabet was defined only at the end of the 1950s, the period in which most of the colonies in West Africa became independent.


The dictionary is trilingual: the target language is Mawukakan, and English and French are used as glossing languages. It is provided in two formats, Toolbox and XML. The Toolbox program is a version of the widely used SIL Shoebox program adapted to display Unicode. Toolbox is available as a free download from SIL.
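As a rough illustration of working with the Toolbox version, the sketch below splits backslash-marked "standard format" text into records. It assumes that each entry begins with an \lx marker, as is conventional in Toolbox dictionaries; the markers actually used in this lexicon are not stated here, and the sample text is an invented placeholder, not data from the lexicon.

```python
def parse_sfm(text, record_marker="lx"):
    """Split Toolbox-style standard-format text into records.

    Each record is a list of (marker, value) pairs. A new record starts
    at every line whose marker equals record_marker (assumed to be "lx").
    Lines before the first record (e.g. a file header) are skipped.
    """
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        if line.startswith("\\"):
            parts = line[1:].split(None, 1)
            marker = parts[0] if parts else ""
            value = parts[1].strip() if len(parts) > 1 else ""
            if marker == record_marker:
                current = []
                records.append(current)
            if current is not None:
                current.append((marker, value))
        elif current:
            # Continuation line: fold it into the previous field's value.
            marker, value = current[-1]
            current[-1] = (marker, (value + " " + line.strip()).strip())
    return records


# Invented placeholder text; markers and values are hypothetical.
sample = (
    "\\_sh v3.0  400  MDF\n"
    "\\lx aaa\n"
    "\\ps n\n"
    "\\ge first gloss\n"
    "\n"
    "\\lx bbb\n"
    "\\ge second\n"
    "gloss\n"
)
records = parse_sfm(sample)
```

Keeping each record as an ordered list of pairs, rather than a dictionary, preserves repeated markers and field order, both of which can be meaningful in Toolbox files.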

Both the Toolbox and the XML versions of this dictionary use the Unicode (UTF-8) encoding. The Doulos SIL Unicode font works well, as do a number of other fonts that cover ASCII, Latin-1 Supplement (U+00A0 through U+00FF), Latin Extended-A (U+0100 through U+017F), Latin Extended-B (U+0180 through U+024F), IPA Extensions (U+0250 through U+02AF), and Combining Diacritical Marks (U+0300 through U+036F).
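To check whether a given piece of text stays within the ranges listed above, something like the following sketch could be used. The function names are illustrative, and the ranges are copied from the paragraph above (with ASCII taken as U+0000 through U+007F); it makes no claim about which characters the lexicon actually contains.

```python
# Code point ranges as listed in the description above.
BLOCKS = [
    ("Basic Latin (ASCII)", 0x0000, 0x007F),
    ("Latin-1 Supplement", 0x00A0, 0x00FF),
    ("Latin Extended-A", 0x0100, 0x017F),
    ("Latin Extended-B", 0x0180, 0x024F),
    ("IPA Extensions", 0x0250, 0x02AF),
    ("Combining Diacritical Marks", 0x0300, 0x036F),
]


def block_of(ch):
    """Return the name of the listed range containing ch, or None."""
    cp = ord(ch)
    for name, lo, hi in BLOCKS:
        if lo <= cp <= hi:
            return name
    return None


def uncovered(text):
    """Return the sorted distinct characters of text outside all listed ranges."""
    return sorted({ch for ch in text if block_of(ch) is None})


# U+0254 (open o) and U+0301 (combining acute accent) fall inside the ranges.
print(block_of("\u0254"))
print(block_of("\u0301"))
```

Running `uncovered` over the decoded text of a file gives a quick list of characters that a font limited to these ranges might not display.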


Meghan Glenn served as an editor for the French and English parts of this lexicon.


For an example of the data in this lexicon, please view this sample (XML).


Updates: None at this time.
