This repository contains the dataset described in the article "HeightCeleb - an enrichment of VoxCeleb dataset with speaker height information", which was presented at the SLT 2024 conference in Macau, China.
The dataset is an extension of the VoxCeleb dataset and contains speaker height information that was scraped from the Internet.
A demo system is deployed on Hugging Face Spaces.
The HeightCeleb dataset is an extension of the VoxCeleb dataset:
A. Nagrani, J. S. Chung, A. Zisserman, "VoxCeleb: a large-scale speaker identification dataset", INTERSPEECH, 2017
- Original dataset: https://www.robots.ox.ac.uk/~vgg/data/voxceleb/
- License: https://www.robots.ox.ac.uk/~vgg/data/voxmovies/files/license.txt
This dataset contains the VoxCeleb1 ID, sex, and split information taken from the VoxCeleb dataset.
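As a minimal sketch of how these fields might be used together with the height values, the snippet below groups speakers by split. It assumes a CSV layout with the hypothetical column names `VoxCeleb1_ID`, `sex`, `split`, and `height`; the actual file name, column names, and height unit in this repository may differ, and the rows shown are placeholders rather than real entries.

```python
import csv
import io

# Placeholder rows for illustration only: these IDs and values are NOT real
# dataset entries, and the column names are assumptions about the layout.
SAMPLE_CSV = """VoxCeleb1_ID,sex,split,height
id00001,m,dev,180
id00002,f,dev,165
id00003,f,test,170
"""


def load_heights(handle):
    """Return {split: {speaker_id: height}} from a CSV file-like object."""
    by_split = {}
    for row in csv.DictReader(handle):
        speakers = by_split.setdefault(row["split"], {})
        speakers[row["VoxCeleb1_ID"]] = float(row["height"])
    return by_split


# Replace io.StringIO(SAMPLE_CSV) with open(<path to the dataset file>)
# once the real file name and columns are known.
heights = load_heights(io.StringIO(SAMPLE_CSV))
for split, speakers in heights.items():
    mean = sum(speakers.values()) / len(speakers)
    print(f"{split}: {len(speakers)} speakers, mean height {mean:.1f}")
```

Keying the result by split keeps the partition inherited from VoxCeleb explicit, so heights from one split are not accidentally mixed into statistics for another.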
The HeightCeleb dataset is distributed under the CC BY 4.0 license.
If you use HeightCeleb in your research, please cite it using the following BibTeX entry:
@INPROCEEDINGS{10832224,
author={Kacprzak, Stanisław and Kowalczyk, Konrad},
booktitle={2024 IEEE Spoken Language Technology Workshop (SLT)},
title={Heightceleb - An Enrichment of Voxceleb Dataset With Speaker Height Information},
year={2024},
volume={},
number={},
pages={857-862},
doi={10.1109/SLT61566.2024.10832224}}