(Translated by https://www.hiragana.jp/)
Search | arXiv e-print repository
Skip to main content

Showing 1–3 of 3 results for author: Sach, M

Searching in archive eess. Search in all archives.
.
  1. arXiv:2406.04660  [pdf, other

    eess.AS cs.SD

    URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

    Authors: Wangyou Zhang, Robin Scheibler, Kohei Saijo, Samuele Cornell, Chenda Li, Zhaoheng Ni, Anurag Kumar, Jan Pirklbauer, Marvin Sach, Shinji Watanabe, Tim Fingscheidt, Yanmin Qian

    Abstract: The last decade has witnessed significant advancements in deep learning-based speech enhancement (SE). However, most existing SE research has limitations on the coverage of SE sub-tasks, data diversity and amount, and evaluation metrics. To fill this gap and promote research toward universal SE, we establish a new SE challenge, named URGENT, to focus on the universality, robustness, and generaliza… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 6 pages, 3 figures, 3 tables. Accepted by Interspeech 2024. An extended version of the accepted manuscript with appendix

  2. arXiv:2309.02432  [pdf, other

    eess.AS cs.SD

    Employing Real Training Data for Deep Noise Suppression

    Authors: Ziyi Xu, Marvin Sach, Jan Pirklbauer, Tim Fingscheidt

    Abstract: Most deep noise suppression (DNS) models are trained with reference-based losses requiring access to clean speech. However, sometimes an additive microphone model is insufficient for real-world applications. Accordingly, ways to use real training data in supervised learning for DNS models promise to reduce a potential training/inference mismatch. Employing real data for DNS training requires eithe… ▽ More

    Submitted 5 September, 2023; originally announced September 2023.

  3. arXiv:2306.02778  [pdf, other

    eess.AS

    EffCRN: An Efficient Convolutional Recurrent Network for High-Performance Speech Enhancement

    Authors: Marvin Sach, Jan Franzen, Bruno Defraene, Kristoff Fluyt, Maximilian Strake, Wouter Tirry, Tim Fingscheidt

    Abstract: Fully convolutional recurrent neural networks (FCRNs) have shown state-of-the-art performance in single-channel speech enhancement. However, the number of parameters and the FLOPs/second of the original FCRN are restrictively high. A further important class of efficient networks is the CRUSE topology, serving as reference in our work. By applying a number of topological changes at once, we propose… ▽ More

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 5 pages, 5 figures, accepted for Interspeech 2023