KHATT: Handwritten Arabic Text
Item Name: | KHATT: Handwritten Arabic Text |
Author(s): | Sabri A. Mahmoud, Irfan Ahmad, Wasfi G. Al-Khatib, Mohammad Alshayeb, Mohammad Tanvir Parvez, Volker Märgner, Gernot A. Fink |
LDC Catalog No.: | LDC2015T23 |
ISBN: | 1-58563-736-X |
ISLRN: | 866-063-772-506-2 |
DOI: | https://doi.org/10.35111/vc52-tm53 |
Release Date: | November 16, 2015 |
Member Year(s): | 2015 |
DCMI Type(s): | StillImage, Text |
Data Source(s): | essays |
Application(s): | handwriting recognition |
Language(s): | Arabic |
Language ID(s): | ara |
License(s): | LDC User Agreement for Non-Members |
Online Documentation: | LDC2015T23 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Mahmoud, Sabri A., et al. KHATT: Handwritten Arabic Text LDC2015T23. USB Flash Drive. Philadelphia: Linguistic Data Consortium, 2015. |
Introduction
KHATT: Handwritten Arabic Text was developed by King Fahd University of Petroleum & Minerals, Technical University of Dortmund and Braunschweig University of Technology. It consists of scanned Arabic handwriting from 1,000 distinct male and female writers representing diverse countries, age groups, handedness and education levels. Participants produced text on a topic of their choice in an unrestricted style. KHATT was designed to promote research in areas such as text recognition and writer identification.
Data
The majority of participants were natives of Saudi Arabia; the next largest group came from other countries in the region (Egypt, Jordan, Kuwait, Morocco, Palestine, Tunisia and Yemen). Most writers were between 16 and 25 years of age with high school or university qualifications.
Scanned text is presented as TIFF images at 200, 300 and 600 DPI (dots per inch). The source images are four-page TIFFs consisting of metadata about the writer, fixed paragraphs and free writing. Image files of isolated paragraphs or lines are also included. Ground-truth files are presented as plain-text Unicode. Data is divided into training, validation and test sets.
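As a rough illustration of how the line images and ground-truth files might be used together, the sketch below pairs each image with its transcription. The directory layout, the `.tif` and `.txt` extensions, the matching of files by shared name, and the UTF-8 encoding are all assumptions made for the example and are not taken from the corpus documentation; the function name `pair_lines_with_truth` is hypothetical.

```python
from pathlib import Path


def pair_lines_with_truth(image_dir, truth_dir, encoding="utf-8"):
    """Pair each line image with its ground-truth text.

    Assumes (hypothetically) that an image and its transcription share
    the same file stem, e.g. sample01.tif and sample01.txt.
    """
    # Index ground-truth files by stem for quick lookup.
    truth_by_stem = {p.stem: p for p in Path(truth_dir).glob("*.txt")}

    pairs = []
    for image_path in sorted(Path(image_dir).glob("*.tif")):
        truth_path = truth_by_stem.get(image_path.stem)
        if truth_path is None:
            continue  # no transcription found for this image; skip it
        text = truth_path.read_text(encoding=encoding).strip()
        pairs.append((image_path, text))
    return pairs
```

Calling the function once per split directory (training, validation, test) would keep the sets separate. If the ground-truth files use a different encoding, pass it through the `encoding` parameter.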
Samples
Please view this image sample and this text sample.
Updates
None at this time.