FORM1 Kinematic Gesture

Item Name: FORM1 Kinematic Gesture
Author(s): Craig Martell, Chris Osborn, Lisa Britt, Kari Myers
LDC Catalog No.: LDC2004V01
ISBN: 1-58563-299-6
ISLRN: 787-443-746-101-0
Release Date: September 01, 2004
Member Year(s): 2004
DCMI Type(s): MovingImage, Text
Data Source(s): video
Project(s): Talkbank
Application(s): gesture recognition, gesture synthesis, information extraction, natural language processing
Language(s): English
Language ID(s): eng
License(s): LDC User Agreement for Non-Members
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Martell, Craig, et al. FORM1 Kinematic Gesture LDC2004V01. Web Download. Philadelphia: Linguistic Data Consortium, 2004.


FORM1 Kinematic Gesture was produced by the Linguistic Data Consortium (LDC) and contains 30 minutes of audio/video recordings with associated gesture annotations.

FORM is a gesture annotation scheme designed to capture the kinematic information in gesture from videos of speakers. This publication is a detailed database of gesture-annotated videos stored in the Anvil and FORM file formats. FORM encodes the "phonetics" of gesture by giving geometric descriptions of the location and movement of the right and left arms. Other kinematic information, such as effort and shape, is also recorded.

FORM gesture data has applications in statistical natural language processing, gesture recognition and generation, information extraction from video, and human-computer interaction.

FORM2 Kinematic Gesture (LDC2003V01) was released in 2003 by LDC and encoded much of the same data provided here using the more recent FORM 2.0 tag set.


This publication contains gesture annotations created using the FORM 1.0 tag set. The Anvil annotation files used in their creation are also included, as are 29.5 minutes of the original audio/video recordings excerpted from a lecture given by Brian MacWhinney on January 24, 2000 at Carnegie Mellon University. A second data set, 5.5 minutes of Paul Howard telling a story in conversation while being motion captured, is also supplied. These video recordings were chosen because they are part of the NSF-funded TalkBank project.

There are 69 data files in total: 21 movie (.mov) files, 24 Anvil (.anvil) files, and 24 FORM (.form1) files. The annotation files outnumber the movie files because one of the movie files has four separate annotation files.

The movie files are in QuickTime format with the following specs:

Size: 360 x 240 pixels
Compression: H.261
Video rate: 29.97 fps
Audio rate: 48 kHz
Audio format: 8-bit/16-bit stereo

The Anvil files can be opened using the Anvil video annotation tool, which is freely available from Michael Kipp. The .form1 file format is an intermediate data format that contains only the FORM1 values from each .anvil file in a comma-delimited, frame-by-frame listing with the following form:

frame,upper_arm_lift,forearm_orientation,handshape,wrist_up_down,wrist_side_side,effort,tension
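As an illustration of the comma-delimited layout described above, the following minimal Python sketch reads .form1 text into one dictionary per frame. The field order is taken from the listing above; everything else is an assumption made for illustration: the function names are hypothetical, values are kept as uninterpreted strings because their encoding is not specified here, a header row is skipped only if one happens to be present, and the frame-to-time conversion assumes frame numbers count from zero at the 29.97 fps video rate.

```python
from typing import Dict, List

# Field order as given in the corpus description.
FORM1_FIELDS = [
    "frame", "upper_arm_lift", "forearm_orientation", "handshape",
    "wrist_up_down", "wrist_side_side", "effort", "tension",
]

# Video rate from the movie specs; used only for the time conversion below.
FPS = 29.97


def parse_form1_line(line: str) -> Dict[str, str]:
    """Split one comma-delimited line into a field-name -> value mapping.

    Values stay as strings: how each field is encoded is an open
    assumption, so no numeric conversion is attempted here.
    """
    values = [v.strip() for v in line.strip().split(",")]
    if len(values) != len(FORM1_FIELDS):
        raise ValueError(
            f"expected {len(FORM1_FIELDS)} fields, got {len(values)}: {line!r}"
        )
    return dict(zip(FORM1_FIELDS, values))


def parse_form1(text: str) -> List[Dict[str, str]]:
    """Parse the full text of a .form1 file, skipping blank lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        row = parse_form1_line(line)
        # Assumption: a file may or may not begin with a header row.
        if row["frame"] == "frame":
            continue
        rows.append(row)
    return rows


def frame_to_seconds(frame: int, fps: float = FPS) -> float:
    """Convert a frame number to seconds, assuming zero-based frames."""
    return frame / fps
```

A caller would read a .form1 file's contents as text, pass it to `parse_form1`, and look up fields by name, for example `rows[0]["handshape"]`. Keeping values as strings avoids guessing at the value encoding until the actual files have been inspected.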


This research was conducted using funding from the following grant sources:

ISLE - 9910603
NSF: TalkBank (via subcontract from Carnegie Mellon University) - BCS-998009 and BCS-9978056
NSF: Discourse and Gesture with Joshi, Liberman, and Martell - EIA98-09209

