Biodenoising

Introduction

We present Biodenoising, a new method for animal vocalization denoising that does not require access to clean data. There are two core ideas behind Biodenoising:

There is an eloquent video about how these audio patterns work for whales and birds.

The paper has been accepted at ICASSP 2025. The pre-print is available on arXiv.

Marius Miron, Sara Keen, Jen-Yu Liu, Benjamin Hoffman, Masato Hagiwara, Olivier Pietquin, Felix Effenberger, Maddie Cusimano, "Biodenoising: animal vocalization denoising without access to clean data", arXiv:2410.03427, 2024.

Along with the pre-print, we publish the pip-installable Python libraries biodenoising, biodenoising-inference, and biodenoising-datasets, which can be used to denoise animal vocalizations and download the datasets.

GitHub | GitHub (inference) | GitHub (datasets) | Colab

We base our work on the speech enhancement models demucs dns 48 and CleanUNet because they are small and fast to train. Demucs worked particularly well. Performance may improve further by training newer architectures.

Abstract

Animal vocalization denoising is a task similar to human speech enhancement, a well-studied field of research. In contrast to the latter, it is applied to a higher diversity of sound production mechanisms and recording environments, and this higher diversity is a challenge for existing models. Adding to the challenge and in contrast to speech, we lack large and diverse datasets comprising clean vocalizations. As a solution, we use as training data pseudo-clean targets, i.e. pre-denoised vocalizations, and segments of background noise without a vocalization. We propose a training set derived from bioacoustics datasets and repositories representing diverse species, acoustic environments, and geographic regions. Additionally, we introduce a non-overlapping benchmark set comprising clean vocalizations from different taxa and noise samples. We show that denoising models (demucs, CleanUNet) trained on pseudo-clean targets obtained with speech enhancement models achieve competitive results on the benchmarking set.

Benchmarking dataset

We introduce a benchmarking dataset for animal vocalization denoising, Biodenoising_validation. It contains 62 pairs of clean animal vocalizations and noise excerpts. We list some audio demos from this dataset below. Details about the training data can be found at the end of this page.
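Because each benchmark item pairs a clean vocalization with a noise excerpt, a noisy input can be synthesized and a denoised output scored against the clean reference. The sketch below shows one way to do that scoring. It assumes SI-SDR (scale-invariant signal-to-distortion ratio), a metric commonly used in speech enhancement; the metrics actually reported are described in the paper. The tone, the Gaussian noise, and the "partly denoised" signal are toy stand-ins, not benchmark data, and plain Python lists are used only to keep the example dependency-free.

```python
import math
import random


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    n = min(len(estimate), len(reference))
    est, ref = estimate[:n], reference[:n]
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(est) / n
    ref_mean = sum(ref) / n
    est = [e - est_mean for e in est]
    ref = [r - ref_mean for r in ref]
    # Project the estimate onto the reference to find the optimal scaling.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(d * d for d in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


# Toy stand-ins for one benchmark pair (assumptions, not real data):
# a 440 Hz tone as the "clean vocalization", seeded Gaussian noise as the "noise excerpt".
rng = random.Random(0)
sr = 8000
clean = [math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
noise = [rng.gauss(0.0, 0.5) for _ in range(sr)]
noisy = [c + v for c, v in zip(clean, noise)]
# Pretend a denoiser removed 90% of the noise amplitude.
partly_denoised = [c + 0.1 * v for c, v in zip(clean, noise)]

improvement = si_sdr(partly_denoised, clean) - si_sdr(noisy, clean)
print(f"SI-SDR improvement: {improvement:.1f} dB")
```

Reporting the improvement over the noisy input, rather than the raw score, makes results comparable across items whose starting noise levels differ.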

Audio demos

Here we look at the zero-shot performance of the methods on the benchmarking dataset, i.e. generalization to unseen taxa and noise. None of the methods has seen or been adapted to the tested datasets, so performance may improve with self-training on those data. We are currently working on such a method.

First, we compare the original noisy file with our denoising model trained on pseudo-clean targets (biodenoising) and two state-of-the-art methods: noisereduce and noisy target training.

Original Biodenoising Noisereduce Noisy target

How well does it do on longer recordings?

Original Biodenoising Noisereduce

Recording animals in the lab does not always yield clean vocalizations. These zebra finches, recorded with a close microphone, are noisy: you can hear the fan, the wings, and the hopping. Noisereduce works well on the fan noise but struggles with the wings and hopping.

Original Biodenoising Noisereduce

The most difficult condition is denoising biologger recordings, like this one of a carrion crow. Again, the wind and the self-noise are very loud.

Original Biodenoising Noisereduce

Underwater conditions tend to be noisier than terrestrial conditions. These models were not trained to operate below -5 dB SNR, but they can still perform reasonably well. Here you can find recordings of orcas from Orcasound and of South-Alaska humpback whales recorded by Michelle Fournet.
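To put the -5 dB figure in perspective, SNR in decibels is ten times the base-10 logarithm of the ratio of signal power to noise power, so at -5 dB the noise carries about 3.16 times the power of the vocalization. The short sketch below works through that arithmetic; the function names are illustrative and not part of any of the libraries mentioned on this page.

```python
import math


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(power(signal) / power(noise))


def power_ratio(snr_in_db):
    """Signal power divided by noise power for a given SNR in dB."""
    return 10.0 ** (snr_in_db / 10.0)


ratio = power_ratio(-5.0)
print(f"At -5 dB SNR the noise carries {1.0 / ratio:.2f}x the signal power")
```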

Original Biodenoising Noisereduce

My favorite recording is the one of a bowhead whale from the Watkins Marine Mammals dataset. Note that, in contrast to the examples above, this noisy recording was pre-cleaned using speech enhancement models and then used in training. This recording motivated me to start this project.

Original Biodenoising Noisereduce

Training dataset description

| Noisy datasets | Hours | Medium | Private | Direct | Link | Type |
|---|---|---|---|---|---|---|
| Dolphin signature whistles | 0.23 | underwater | yes | no | link | dolphins |
| Hanaian Gibbons | 1.11 | terrestrial | no | yes | link | gibbons |
| Geladas | 2.23 | terrestrial | yes | no | link | geladas |
| Orcasound Aldev | 0.25 | underwater | no | yes | link | orcas |
| Thyolo | 0.61 | terrestrial | no | yes | link | birds |
| Anuran | 1.13 | terrestrial | no | no | link | frogs |
| South-Alaska humpback whale | 14.13 | underwater | yes | no | link | cetaceans |
| Orcasound SKRW | 2.41 | underwater | no | yes | link | orcas |
| Black and white ruffed lemur | 1.06 | terrestrial | no | yes | link | lemurs |
| Orcasound humpback whale | 0.8 | underwater | no | yes | link | orcas |
| Orchive | 0.03 | underwater | no | yes | link | orcas |
| Whydah | 0.57 | terrestrial | no | yes | link | birds |
| Sabiod NIPS4B | 0.55 | underwater | no | yes | link | cetaceans |
| Xeno canto labeled subset | 6.82 | terrestrial | no | yes | link | birds |
| ASA Berlin | 4.69 | terrestrial | no | no | link | various |
| Watkins | 5.33 | underwater | no | yes | link | various |
| Macaques coo calls | 0.7 | terrestrial | no | yes | link | macaques |

| Noise datasets | Hours | Medium | Private | Direct | Link | Type |
|---|---|---|---|---|---|---|
| FSD50k subset | 26.34 | terrestrial | no | yes | link | various |
| IDMT Traffic | 9.72 | terrestrial | no | yes | link | streets |
| ShipsEar | 3.55 | underwater | yes | no | link | ships |
| DeepShip subset | 1.78 | underwater | no | yes | link | ships |
| Orcasound ship noise | 7.23 | underwater | no | yes | link | ships |
| TUT 2016 subset | 0.33 | terrestrial | no | yes | link | home |

| Extracted noise | Hours | Medium | Private | Direct | Link | Type |
|---|---|---|---|---|---|---|
| MARS MBARI | 0.5 | underwater | no | yes | link | various |
| NOAA Sanctsound | 47.48 | underwater | no | yes | link | various |
| Orcasound best os | 1.6 | underwater | no | yes | link | various |
| South-Alaska humpback whale | 114.85 | underwater | yes | no | link | various |
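
The datasets listed above supply the two ingredients named in the abstract: noisy vocalizations, which become pseudo-clean targets after pre-denoising, and noise segments without a vocalization. One plausible way to turn them into a training pair is to mix a pseudo-clean target with a noise segment at a chosen SNR. The sketch below illustrates that step only; it is not the released training code. The synthetic signals are stand-ins for real audio, the function names are hypothetical, and the upper end of the SNR range is an assumption (the lower end of -5 dB comes from the text above).

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(target, noise, snr_db):
    """Scale `noise` so the target-to-noise ratio equals `snr_db`, then add it."""
    n = min(len(target), len(noise))
    target, noise = target[:n], noise[:n]
    # Gain that brings the noise to the RMS implied by the requested SNR.
    gain = rms(target) / (rms(noise) * 10.0 ** (snr_db / 20.0))
    scaled_noise = [gain * v for v in noise]
    mixture = [t + v for t, v in zip(target, scaled_noise)]
    return mixture, scaled_noise


rng = random.Random(0)
sr = 8000
# Stand-in for a pseudo-clean target (a vocalization already pre-denoised).
pseudo_clean = [math.sin(2 * math.pi * 300 * t / sr) for t in range(sr)]
# Stand-in for a noise segment that contains no vocalization.
noise_segment = [rng.gauss(0.0, 1.0) for _ in range(sr)]

# Assumed SNR range: -5 dB lower bound from the text, 10 dB upper bound is illustrative.
snr = rng.uniform(-5.0, 10.0)
model_input, scaled_noise = mix_at_snr(pseudo_clean, noise_segment, snr)

# The model sees the mixture and is trained to recover the pseudo-clean target.
training_pair = (model_input, pseudo_clean)
```

Drawing the SNR at random for every pair exposes the model to a range of noise levels instead of a single fixed one.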

Bibtex

@misc{miron2024biodenoisinganimalvocalizationdenoising,
      title={Biodenoising: animal vocalization denoising without access to clean data}, 
      author={Marius Miron and Sara Keen and Jen-Yu Liu and Benjamin Hoffman and Masato Hagiwara and Olivier Pietquin and Felix Effenberger and Maddie Cusimano},
      year={2024},
      eprint={2410.03427},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2410.03427}, 
}