Ethnobotanical Research and Language Documentation of Nahuatl
Item Name: | Ethnobotanical Research and Language Documentation of Nahuatl |
Author(s): | Jonathan D. Amith, Amelia Domínguez Alcántara, Hermelindo Salazar Osollo, Ceferino Salgado Castañeda, Eleuterio Gorostiza Salgado |
LDC Catalog No.: | LDC2021S06 |
ISBN: | 1-58563-968-0 |
ISLRN: | 239-876-617-783-3 |
DOI: | https://doi.org/10.35111/9djs-6v63 |
Release Date: | July 15, 2021 |
Member Year(s): | 2021 |
DCMI Type(s): | MovingImage, Sound, StillImage, Text |
Sample Type: | pcm |
Sample Rate: | 48000 |
Data Source(s): | field recordings |
Application(s): | language documentation, speech recognition |
Language(s): | Highland Puebla Nahuatl, Zacatlán-Ahuacatlán-Tepetzintla Nahuatl, Spanish |
Language ID(s): | azz, nhi, spa |
License(s): | Ethnobotanical Research and Language Documentation of Nahuatl Agreement |
Online Documentation: | LDC2021S06 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Amith, Jonathan D., et al. Ethnobotanical Research and Language Documentation of Nahuatl LDC2021S06. Web Download. Philadelphia: Linguistic Data Consortium, 2021. |
Introduction
Ethnobotanical Research and Language Documentation of Nahuatl consists of approximately 190 hours of field recordings collected in the Sierra Nororiental and Sierra Norte regions of Puebla, Mexico. The corpus contains audio and video recordings of native Nahuatl speakers during the collection of particular plants; partial transcripts (Nahuatl and Spanish); a Highland Puebla Nahuat dictionary; botanical and ethnobotanical data; and speaker metadata.
Nahuatl is one of the most widely spoken indigenous languages in the Americas, with approximately 1.5 million speakers in Mexico. Many distinct and sometimes mutually unintelligible varieties have been recognized. The recordings in this release were collected between 2008 and 2019 in two municipalities: Cuetzalan del Progreso and Tepetzintla. Speech from Cuetzalan represents Highland Puebla Nahuat, and speech from Tepetzintla represents Zacatlán-Ahuacatlán-Tepetzintla Nahuatl.
Data
Each recording features a speaker talking about a plant's nomenclature, classification, and use. Audio files are primarily single-channel, 48 kHz, 16-bit WAV. Some audio is also provided as MP3. Video files are provided as MP4.
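Because the audio is described as primarily single-channel, 48 kHz, 16-bit WAV, it can be useful to confirm that a given file matches before using it for speech recognition. The following is a minimal sketch using only Python's standard `wave` module; the function names `describe_wav` and `matches_expected` are hypothetical and are not part of the corpus documentation.

```python
import wave

# Format stated in the corpus description: mono, 48 kHz, 16-bit (2 bytes per sample).
EXPECTED = {"channels": 1, "sample_rate": 48000, "sample_width_bytes": 2}


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic properties."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate if rate else 0.0,
        }


def matches_expected(info):
    """Return True if the file has the format the corpus description gives."""
    return all(info[key] == value for key, value in EXPECTED.items())
```

Since the description says "primarily", files that fail this check are not necessarily defective; they may simply need resampling or channel selection before use.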
Transcripts are included for the Cuetzalan recordings in Transcriber format. These transcripts have been partially translated into Spanish using ELAN.
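Transcriber files are XML, so the time-aligned segments can be read with a standard XML parser. The sketch below assumes the usual Transcriber layout, in which `Turn` elements carry `speaker` and `startTime` attributes and each `Sync` element marks the start time of the text that follows it. The exact structure of the files in this release should be checked against the included documentation, and the function name `read_trs_segments` is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_trs_segments(source):
    """Return a list of (start_time, speaker_id, text) from a Transcriber file.

    `source` may be a file path or an open binary file object.
    Assumes the common <Turn>/<Sync time="..."/> layout (an assumption,
    not something confirmed by the corpus description).
    """
    root = ET.parse(source).getroot()
    segments = []
    for turn in root.iter("Turn"):
        speaker = turn.get("speaker", "")
        start = float(turn.get("startTime", "0"))
        # Text appearing before the first child element belongs to the turn start.
        pieces = [(start, (turn.text or "").strip())]
        for child in turn:
            tail = (child.tail or "").strip()
            if child.tag == "Sync":
                # A Sync tag opens a new segment at the given time.
                start = float(child.get("time", start))
                pieces.append((start, tail))
            else:
                # Other inline tags (events, comments): keep their trailing
                # text attached to the current segment.
                time, text = pieces[-1]
                pieces[-1] = (time, (text + " " + tail).strip())
        segments.extend((t, speaker, text) for t, text in pieces if text)
    return segments
```

Returning plain tuples keeps the sketch independent of any particular downstream toolkit; the segments can then be paired with the audio by start time.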
A Highland Puebla Nahuatl dictionary is included in both text and Toolbox XML formats. Botanical and ethnobotanical information is presented as a collection of PDFs, and images as JPEGs.
Further information about the corpus is available in the included documentation.
Note that some folders are empty; they are reserved for future work.
Samples
Please view the following samples:
- Audio Sample (WAV)
- Transcript Sample (TXT)
- Translation Sample (XML)
- Dictionary Sample (TXT)
- Botanical Sample (JPG)
Updates
None at this time.