German Speech Corpus aligned with CTC segmentation
Alignments on Librivox and Spoken Wikipedia Corpus (SWC) with CTC segmentation:
Dataset | Length | Speakers | Utterances |
---|---|---|---|
SWC | 210h | 363 | 78214 |
Librivox | 804h | 251 | 368532 |
The pre-processed text and alignments can be found on https://github.com/lumaku/german-corpus-aligned
Source of the audio files:
- SWC: German Spoken Wikipedia Corpus
- Librivox: from the IDs in the metadata file `books-German.json`. Audio files can be retrieved automatically via `id` using the LibriVox API, e.g. https://librivox.org/api/feed/audiobooks/?id=82&format=json, and then downloading the URL. See the Downloads section for the collected mp3 files. (Convert to wav with `ffmpeg -i path/to/audio.mp3 -ac 1 -ar 16000 path/to/audio.wav`.)
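The retrieval and conversion steps above can be sketched in Python. This is a minimal sketch, not part of the original tooling: the function names are hypothetical, only the API URL pattern and the ffmpeg flags come from the text above, and the layout of the JSON response is deliberately not assumed (inspect it to find the download URL).

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the example URL above.
API_ENDPOINT = "https://librivox.org/api/feed/audiobooks/"


def audiobook_query_url(book_id):
    """Build the LibriVox API URL for one audiobook ID (hypothetical helper)."""
    query = urllib.parse.urlencode({"id": book_id, "format": "json"})
    return f"{API_ENDPOINT}?{query}"


def fetch_audiobook_metadata(book_id):
    """Fetch the JSON metadata for one audiobook. Requires network access.

    The structure of the returned object is not assumed here.
    """
    with urllib.request.urlopen(audiobook_query_url(book_id)) as response:
        return json.load(response)


def ffmpeg_to_wav_command(mp3_path, wav_path):
    """Return the ffmpeg call from above: mono (-ac 1), 16 kHz (-ar 16000)."""
    return ["ffmpeg", "-i", mp3_path, "-ac", "1", "-ar", "16000", wav_path]
```

With ffmpeg installed, the conversion could then be run as `subprocess.run(ffmpeg_to_wav_command("path/to/audio.mp3", "path/to/audio.wav"), check=True)`.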
See the Downloads section for a pre-trained model.
Downloads
- Librivox + SWC alignments
- Librivox MP3 bundle in `librivox-de-mp3.tar.zst`. The full file has sha1sum fff13810471def3c342ee500dc7d2d6b78c9b64a
- Pre-trained German ESPnet 1 Transformer model `german.transformer.v1.tar.gz`; sha1sum a2f2f9a25ca27e4b9b968f175d9d87c305f4b155
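After downloading, the checksums listed above can be verified, for example with the following sketch. The helper names are hypothetical and the local file paths are an assumption; only the file names and SHA-1 values are taken from the list above.

```python
import hashlib

# File names and SHA-1 checksums as listed in the Downloads section.
EXPECTED_SHA1 = {
    "librivox-de-mp3.tar.zst": "fff13810471def3c342ee500dc7d2d6b78c9b64a",
    "german.transformer.v1.tar.gz": "a2f2f9a25ca27e4b9b968f175d9d87c305f4b155",
}


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, filename):
    """Compare a local file against the checksum listed for `filename`."""
    return sha1_of_file(path) == EXPECTED_SHA1[filename]
```

For example, `matches_listed_checksum("librivox-de-mp3.tar.zst", "librivox-de-mp3.tar.zst")` should return `True` for an intact download stored in the current directory. Reading in chunks keeps memory use small even for the large MP3 bundle.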
Reference
The full paper is available as a preprint at https://arxiv.org/abs/2007.09127 and in its published version at https://doi.org/10.1007/978-3-030-60276-5_27.
To cite this work:
@InProceedings{ctcsegmentation,
author="K{\"u}rzinger, Ludwig and Winkelbauer, Dominik and Li, Lujun and Watzel, Tobias and Rigoll, Gerhard",
editor="Karpov, Alexey
and Potapova, Rodmonga",
title="CTC-Segmentation of Large Corpora for German End-to-End Speech Recognition",
booktitle="Speech and Computer",
year="2020",
publisher="Springer International Publishing",
address="Cham",
pages="267--278",
abstract="Recent end-to-end Automatic Speech Recognition (ASR) systems demonstrated the ability to outperform conventional hybrid DNN/HMM ASR. Aside from architectural improvements in those systems, those models grew in terms of depth, parameters and model capacity. However, these models also require more training data to achieve comparable performance.",
isbn="978-3-030-60276-5"
}