Nautilus Speaker Characterization
|Item Name:||Nautilus Speaker Characterization|
|Author(s):||Laura Fernández Gallardo|
|LDC Catalog No.:||LDC2018S17|
|Release Date:||December 17, 2018|
|Data Source(s):||microphone conversation, microphone speech, question-answers|
|Application(s):||machine learning, spoken dialogue modeling, spoken dialogue systems, subjectivity analysis, prosody, speaker identification|
|License(s):||Nautilus Speaker Characterization Agreement|
|Online Documentation:||LDC2018S17 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Gallardo, Laura Fernández. Nautilus Speaker Characterization LDC2018S17. Web Download. Philadelphia: Linguistic Data Consortium, 2018.|
Nautilus Speaker Characterization was developed at the Technical University of Berlin and comprises approximately 155 hours of conversational speech from 300 German speakers aged 18 to 35 years (126 males and 174 females) with no marked dialect or accent, recorded in an acoustically isolated room. The corpus was designed to support research on the detection of speaker social characteristics, such as personality, charisma and voice attractiveness.
Four scripted and four semi-spontaneous dialogs simulating telephone call inquiries were elicited from the speakers. Additionally, spontaneous neutral and emotional speech utterances (predominantly excitement or frustration) and questions were produced.
Speech corresponding to one of the semi-spontaneous dialogs was evaluated with respect to 34 continuous numeric labels of perceived interpersonal speaker characteristics (such as likable, attractive, competent, childish). For a set of 20 selected "extreme" speakers evaluated for their warmth-attractiveness, 34 naive voice descriptions (such as bright, creaky, articulate, melodious) were also evaluated.
Interactions between speakers and their interlocutor (a recording assistant) are provided in separate mono files, accompanied by timestamps and tags that define the speaker's turns. Nearly all speech is sampled at 48 kHz, with some at 44.1 kHz. All speech files are 16-bit, single-channel, FLAC-compressed WAV.
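Because the corpus mixes 48 kHz and 44.1 kHz recordings, it can help to check each file's sample rate before feature extraction. The following is a minimal sketch, not part of the corpus distribution: it reads the STREAMINFO block that the FLAC format places at the start of every stream, using only the Python standard library. The names `FlacInfo`, `parse_streaminfo`, `read_flac_info` and `tally_sample_rates` are hypothetical, and the corpus directory passed to `tally_sample_rates` is an assumption, since the release's folder layout is not described here.

```python
from collections import Counter
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FlacInfo:
    sample_rate: int      # Hz
    channels: int
    bits_per_sample: int
    total_samples: int    # per channel; 0 means "unknown" in FLAC

    @property
    def duration_s(self) -> float:
        """Duration in seconds, or 0.0 if the sample rate is missing."""
        return self.total_samples / self.sample_rate if self.sample_rate else 0.0


def parse_streaminfo(header: bytes) -> FlacInfo:
    """Parse the first 42 bytes of a FLAC stream (marker + STREAMINFO)."""
    if len(header) < 42 or header[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    # Byte 4: 1-bit "last block" flag followed by a 7-bit block type.
    if header[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # Bytes 8-17 hold block and frame size limits, which are skipped here.
    # Bytes 18-25 pack: 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    packed = int.from_bytes(header[18:26], "big")
    return FlacInfo(
        sample_rate=packed >> 44,
        channels=((packed >> 41) & 0x7) + 1,
        bits_per_sample=((packed >> 36) & 0x1F) + 1,
        total_samples=packed & ((1 << 36) - 1),
    )


def read_flac_info(path) -> FlacInfo:
    """Read only the header bytes of one file and parse them."""
    with open(path, "rb") as handle:
        return parse_streaminfo(handle.read(42))


def tally_sample_rates(root) -> Counter:
    """Count files per sample rate under a (hypothetical) corpus directory."""
    return Counter(read_flac_info(p).sample_rate for p in Path(root).rglob("*.flac"))
```

Calling `tally_sample_rates` on the unpacked corpus directory would show how many files fall into each sample-rate group, which indicates whether resampling to a common rate is needed. Decoding the audio itself requires a FLAC-capable library, which this sketch deliberately leaves out.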
This release contains all labels, together with the speech recordings and the speakers' metadata (e.g., age, gender, place of birth, chronological places of residence and duration of stay, parents' place of birth, self-assessed personality).
Updates: None at this time.