Bamanankan Lexicon

Item Name: Bamanankan Lexicon
Author(s): Moussa Bamba
LDC Catalog No.: LDC2016L01
ISBN: 1-58563-781-5
ISLRN: 830-816-122-814-4
Release Date: December 15, 2016
Member Year(s): 2016
DCMI Type(s): Text
Data Source(s): dictionaries
Application(s): instruction, language teaching, machine translation, sociolinguistics
Language(s): Bambara, English, French
Language ID(s): bam, eng, fra
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2016L01 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Bamba, Moussa. Bamanankan Lexicon LDC2016L01. Web Download. Philadelphia: Linguistic Data Consortium, 2016.


Bamanankan Lexicon was developed by the Linguistic Data Consortium (LDC) and contains 5,978 entries of the Bamanankan language presented as a Bamanankan-English lexicon and a Bamanankan-French lexicon. It is the third publication in an LDC project to build an electronic dictionary of three Mandekan languages: Mawukakan, Maninkakan and Bamanankan. These are Eastern Manding languages in the Mande Group of the Niger-Congo language family. LDC released a Mawukakan Lexicon (LDC2005L01) in 2005 and a Maninkakan Lexicon (LDC2013L01) in 2013.

There are approximately 15 million Bamanankan speakers (four million in Mali, and ten million in the West African region who speak Bamanankan as a second language). The number of speakers of the different Mandekan dialects is approximately 30 to 40 million, mainly in Mali, Burkina Faso, Senegal, Guinea Bissau, Guinea, Liberia, Ivory Coast, Sierra Leone and Gambia. Bamanankan is the most studied of the Mandekan languages because it is spoken as a first or second language by at least 80% of the population of Mali and widely as a second or third language in most of West Africa.

More information about LDC's work in the languages of West Africa and the challenges those languages present for language resource development can be found here.


This lexicon is presented in a Latin-based transcription system because the Latin alphabet is familiar to the majority of Mandekan language speakers and is expected to facilitate the work of researchers interested in this resource.

The dictionary is provided in two formats, Toolbox and XML. Toolbox is a version of the widely used SIL Shoebox program adapted to display Unicode, and it is available as a free download from SIL. The Toolbox files are provided in two fonts, Arial and Doulos SIL. The Arial files should display using the Arial font, which is standard on most operating systems. Doulos SIL, available as a free download, is a robust font that should display all characters without issue. Users should launch Toolbox using the *.prj files in the Arial or Doulos_SIL folders.
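
For readers who want to read the Toolbox files without launching Toolbox, the following is a minimal Python sketch. It assumes the files follow the usual Toolbox standard-format layout, in which each field is a line beginning with a backslash marker and each record begins with a record marker. The marker names (lx, ps, ge), the header line, and the two sample entries are assumptions made for illustration and are not taken from this lexicon; check the distributed files for the markers actually used.

```python
def parse_toolbox(text, record_marker="lx"):
    """Split Toolbox standard-format text into records.

    Each record is a list of (marker, value) pairs. The record marker
    "lx" is an assumption; pass the one used in the actual files.
    """
    records = []
    current = None  # fields of the record being read, or None before the first record
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        if not line.startswith("\\"):
            # Blank lines are skipped; other lines continue the previous field.
            if current and line.strip():
                marker, value = current[-1]
                current[-1] = (marker, (value + " " + line.strip()).strip())
            continue
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = []
            records.append(current)
        if current is not None:
            # Header fields that appear before the first record are ignored.
            current.append((marker, value.strip()))
    return records


# Hypothetical sample text, for illustration only (not copied from the lexicon).
sample = r"""\_sh v3.0  400  MDF

\lx ji
\ps n
\ge water

\lx so
\ps n
\ge house
"""

records = parse_toolbox(sample)
for fields in records:
    print(dict(fields))
```

Keeping each record as a list of pairs rather than a dictionary preserves field order and repeated markers, which matters if an entry carries several glosses.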

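The lexicon is presented in Unicode Normalization Form D (NFD), canonical decomposition. This means that every precomposed character is split into its base character followed by separate combining marks; for example, an accented vowel is stored as the plain vowel plus a combining accent. See Unicode Standard Annex #15 for more information on Unicode normalization forms.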
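
One practical consequence of canonical decomposition is that a search string typed in precomposed form will not match the lexicon text byte for byte unless it is normalized first. The sketch below uses only Python's standard unicodedata module to show the decomposition and a normalization-aware comparison. The helper names (to_nfd, describe, same_text) are hypothetical and introduced only for this example.

```python
import unicodedata


def to_nfd(text):
    """Return text in Unicode Normalization Form D (canonical decomposition)."""
    return unicodedata.normalize("NFD", text)


def describe(text):
    """List each code point in text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in text]


def same_text(a, b):
    """Compare two strings after normalizing both to NFD."""
    return to_nfd(a) == to_nfd(b)


precomposed = "\u00e0"           # a single code point: a with grave accent
decomposed = to_nfd(precomposed)  # base letter followed by a combining accent

print(describe(precomposed))
print(describe(decomposed))
print(precomposed == decomposed)           # raw comparison of the two forms
print(same_text(precomposed, decomposed))  # comparison after normalization
```

Normalizing both the query and the lexicon text to the same form before comparing avoids missed matches caused solely by differing encodings of the same character.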

The XML-formatted lexicon was generated by Toolbox, and a DTD is included.


Meghan Glenn served as an editor for the French and English parts of this Lexicon.


Please view this XML sample and Toolbox sample.


Updates: None at this time.