Search | arXiv e-print repository

Showing 1–4 of 4 results for author: Korshunov, P

Searching in archive eess.
  1. arXiv:2311.17655  [pdf, other]

    cs.CV cs.AI cs.MM cs.SD eess.AS

    Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes

    Authors: Pavel Korshunov, Haolin Chen, Philip N. Garner, Sebastien Marcel

    Abstract: The task of deepfakes detection is far from being solved by speech or vision researchers. Several publicly available databases of fake synthetic video and speech were built to aid the development of detection methods. However, existing databases typically focus on visual or voice modalities and provide no proof that their deepfakes can in fact impersonate any real person. In this paper, we present…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 10 pages, 3 figures, 3 tables

    ACM Class: I.4.3; I.2.10; H.5.1

  2. arXiv:2009.03155  [pdf, other]

    cs.CV cs.MM eess.IV

    Deepfake detection: humans vs. machines

    Authors: Pavel Korshunov, Sébastien Marcel

    Abstract: Deepfake videos, where a person's face is automatically swapped with a face of someone else, are becoming easier to generate with more realistic results. In response to the threat such manipulations can pose to our trust in video evidence, several large datasets of deepfake videos and many methods to detect them were proposed recently. However, it is still unclear how realistic deepfake videos are…

    Submitted 7 September, 2020; originally announced September 2020.

  3. arXiv:1911.02388  [pdf, other]

    eess.AS cs.LG cs.SD

    The Speed Submission to DIHARD II: Contributions & Lessons Learned

    Authors: Md Sahidullah, Jose Patino, Samuele Cornell, Ruiqing Yin, Sunit Sivasankaran, Hervé Bredin, Pavel Korshunov, Alessio Brutti, Romain Serizel, Emmanuel Vincent, Nicholas Evans, Sébastien Marcel, Stefano Squartini, Claude Barras

    Abstract: This paper describes the speaker diarization systems developed for the Second DIHARD Speech Diarization Challenge (DIHARD II) by the Speed team. Besides describing the system, which considerably outperformed the challenge baselines, we also focus on the lessons learned from numerous approaches that we tried for single and multi-channel systems. We present several components of our diarization syst…

    Submitted 6 November, 2019; originally announced November 2019.

  4. arXiv:1911.01255  [pdf, other]

    eess.AS cs.SD

    pyannote.audio: neural building blocks for speaker diarization

    Authors: Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill

    Abstract: We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection,…

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: Submitted to ICASSP 2020