Nautilus Speaker Characterization

Item Name: Nautilus Speaker Characterization
Author(s): Laura Fernández Gallardo
LDC Catalog No.: LDC2018S17
ISBN: 1-58563-868-4
ISLRN: 157-037-166-491-1
DOI: https://doi.org/10.35111/chqa-vd56
Release Date: December 17, 2018
Member Year(s): 2018
DCMI Type(s): Sound
Sample Type: pcm
Sample Rate: 48000
Data Source(s): microphone conversation, microphone speech, question-answers
Application(s): machine learning, spoken dialogue modeling, spoken dialogue systems, subjectivity analysis, prosody, speaker identification
Language(s): German
Language ID(s): deu
License(s): Nautilus Speaker Characterization Agreement
Online Documentation: LDC2018S17 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Gallardo, Laura Fernández. Nautilus Speaker Characterization LDC2018S17. Web Download. Philadelphia: Linguistic Data Consortium, 2018.

Introduction

Nautilus Speaker Characterization was developed at the Technical University of Berlin and comprises approximately 155 hours of conversational speech from 300 German speakers aged 18 to 35 years (126 males and 174 females) with no marked dialect or accent, recorded in an acoustically isolated room. The corpus was designed to support research on the detection of speaker social characteristics, such as personality, charisma and voice attractiveness.

Four scripted and four semi-spontaneous dialogs simulating telephone call inquiries were elicited from the speakers. Additionally, spontaneous neutral and emotional speech utterances (predominantly excitement or frustration) and questions were produced.

Speech corresponding to one of the semi-spontaneous dialogs was evaluated with respect to 34 continuous numeric labels of perceived interpersonal speaker characteristics (such as likable, attractive, competent, childish). For a set of 20 selected "extreme" speakers evaluated for their warmth-attractiveness, 34 naive voice descriptions (such as bright, creaky, articulate, melodious) were also evaluated.

Data

Interactions between speakers and their interlocutor (a recording assistant) are provided in separate mono files, accompanied by timestamps and tags that mark the speaker's turns. Nearly all speech is sampled at 48 kHz, with some at 44.1 kHz. All speech files are 16-bit, single-channel, FLAC-compressed WAV.
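
Because the sample rate is not uniform across the release, it may be useful to check each file before processing. The following is a minimal sketch, not part of the corpus distribution, that reads the sample rate, channel count and bit depth from the STREAMINFO block defined by the FLAC format. The function names read_flac_streaminfo and flac_info_from_file are hypothetical, and the sketch assumes each file begins directly with the "fLaC" marker (no prepended tag data).

```python
def read_flac_streaminfo(data: bytes) -> dict:
    """Parse the mandatory STREAMINFO block at the start of a FLAC stream.

    `data` must hold at least the first 42 bytes of the file:
    4-byte marker + 4-byte block header + 34-byte STREAMINFO body.
    """
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    block_type = data[4] & 0x7F                 # low 7 bits: block type
    block_len = int.from_bytes(data[5:8], "big")
    if block_type != 0 or block_len != 34:
        raise ValueError("first metadata block is not STREAMINFO")
    body = data[8:42]
    if len(body) < 34:
        raise ValueError("truncated STREAMINFO block")
    # Bytes 10..17 of the body pack four fields into 64 bits:
    # 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    packed = int.from_bytes(body[10:18], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


def flac_info_from_file(path: str) -> dict:
    """Read only the header bytes of a FLAC file and parse them."""
    with open(path, "rb") as handle:
        return read_flac_streaminfo(handle.read(42))
```

Calling flac_info_from_file on each speech file and grouping by the returned "sample_rate" value would separate the 48 kHz files from the 44.1 kHz ones, so the latter can be resampled or handled separately as needed.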

This release contains all labels, together with the speech recordings and the speakers' metadata (e.g., age, gender, place of birth, chronological places of residence and duration of stay, parents' place of birth, self-assessed personality).

Samples

Please view this female sample and male sample.

Updates

None at this time.
