Examples

Examples from a range of bioacoustics tasks and datasets. Each example pairs an input audio clip with the prompt used, the model’s prediction, the gold label, and the source dataset.

Species Classification

Prompt: What is the common name for the focal species in the audio?

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| Humpback Whale | Humpback Whale | Watkins |
| Walrus | Walrus | Watkins |
| Spinner Dolphin | Pantropical Spotted Dolphin | Watkins |
| Greater Yellowlegs | Greater Yellowlegs | CBI |
| Wood Duck | Blue-winged Teal | CBI |
| culex pipiens complex | culex pipiens complex | HumbugDB |
| others | non-mosquito | HumbugDB |
| non-mosquito | an dirus | HumbugDB |
| Spectacled Tetraka | Spectacled Tetraka | Unseen (zero-shot) |
| Dusky White-eye | Dusky White-eye | Unseen (zero-shot) |
| Pacific Robin | Fire-tailed Sunbird | Unseen (zero-shot) |

Prompt: What is the scientific name for the focal species in the audio?

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| tauraco fischeri | tauraco fischeri | Unseen (zero-shot) |
| larvivora cyane | larvivora cyane | Unseen (zero-shot) |
| Nisaetus kelaarti | Nisaetus philippensis | Unseen (zero-shot) |
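
One simple way to compare a prediction with its gold label is a case-insensitive exact match. The sketch below is an illustration only: the helper names `normalize` and `exact_match` are hypothetical, and nothing here is claimed to be the scoring used to evaluate the model. The pairs are copied from the examples above.

```python
def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so casing and spacing do not matter."""
    return " ".join(label.lower().split())


def exact_match(prediction: str, gold: str) -> bool:
    """True when prediction and gold label are identical after normalization."""
    return normalize(prediction) == normalize(gold)


# (prediction, gold label) pairs taken from the species classification examples.
rows = [
    ("Humpback Whale", "Humpback Whale"),
    ("Spinner Dolphin", "Pantropical Spotted Dolphin"),
    ("others", "non-mosquito"),
    ("tauraco fischeri", "tauraco fischeri"),
    ("Nisaetus kelaarti", "Nisaetus philippensis"),
]

matches = [exact_match(pred, gold) for pred, gold in rows]
for (pred, gold), ok in zip(rows, matches):
    print(f"{pred!r} vs {gold!r}: {'match' if ok else 'miss'}")
```

Exact matching is strict: the `others` / `non-mosquito` pair counts as a miss regardless of whether the two labels are intended to be equivalent, so a label-mapping step may be needed before scoring.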

Detection

Prompt: What are the common names for the species in the audio, if any?

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| Gray-cheeked Thrush | Gray-cheeked Thrush | DCASE |
| None | None | DCASE |
| None | Meerkat close call | DCASE |
| Black-throated Green Warbler | Black-throated Green Warbler, Eastern Towhee | ENABirds |
| None | Kirtland's Warbler, American Crow | ENABirds |
| Black-and-white Warbler | None | ENABirds |
| Red-legged thrush | Red-legged thrush | RFCX |
| Puerto Rican bullfinch | Puerto Rican bullfinch | RFCX |
| Common coqui | None | RFCX |
| Minke whale | Minke whale | HICEAS |
| None | None | HICEAS |
| Minke whale | None | HICEAS |
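
Detection answers can name several species or none at all, so a set-based comparison is a natural fit. The following is a minimal sketch, assuming answers are comma-separated and that the literal `None` means an empty set. The helpers `parse_labels` and `precision_recall` are hypothetical, and the handling of empty sets is a convention chosen for this example, not something stated in the source.

```python
from __future__ import annotations


def parse_labels(text: str) -> frozenset[str]:
    """Split a comma-separated answer into a set of lowercase labels.

    The literal answer 'None' (any casing) is treated as the empty set.
    """
    parts = [part.strip().lower() for part in text.split(",")]
    return frozenset(p for p in parts if p and p != "none")


def precision_recall(prediction: str, gold: str) -> tuple[float, float]:
    """Per-example precision and recall over label sets.

    Assumed convention: both sets empty counts as (1.0, 1.0); an otherwise
    undefined ratio (empty denominator) is reported as 0.0.
    """
    pred, ref = parse_labels(prediction), parse_labels(gold)
    if not pred and not ref:
        return 1.0, 1.0
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


# Pairs taken from the detection examples above.
partial = precision_recall(
    "Black-throated Green Warbler",
    "Black-throated Green Warbler, Eastern Towhee",
)
silent = precision_recall("None", "None")
false_alarm = precision_recall("Black-and-white Warbler", "None")
print(partial, silent, false_alarm)
```

Under these conventions, the Black-throated Green Warbler example is fully precise but recovers only one of the two gold species, while a prediction on a clip whose gold label is `None` scores zero on both measures.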

Call Type

Prompt: Which of these, if any, are present? Single pulse gibbon call, Multiple pulse gibbon call, Gibbon duet, None.

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| Multiple pulse gibbon call | Multiple pulse gibbon call | Hainan Gibbons |
| Single pulse gibbon call | Multiple pulse gibbon call | Hainan Gibbons |
| Gibbon duet | None | Hainan Gibbons |

Prompt: What type of vocalization is heard from the focal species in the audio? Answer with 'call' or 'song'.

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| call | call | -- |
| song | song | -- |
| call | song | -- |

Life Stage

Prompt: What is the life stage of the focal species in the audio?

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| juvenile | juvenile | -- |
| adult | adult | -- |
| juvenile | nestling | -- |

Audio Captioning

Prompt: Caption the audio, using the common name for any animal species.

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| Call of a new zealand bellbird with background sounds from new zealand falcon. | The common evening song of a Mainland New Zealand Bellbird. | -- |
| Cajun Chorus Frog | The sound of Squirrel Treefrog after a rain. | -- |

Counting

Prompt: How many birds are in the audio? Choose between 1, 2, 3 or 4.

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| 1 | 1 | ZF-NBirds |
| 4 | 4 | ZF-NBirds |
| 2 | 3 | ZF-NBirds |
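
Because the counting prompt restricts answers to 1, 2, 3 or 4, a free-text answer can be validated before it is compared with the gold count. This is a hedged sketch with a hypothetical helper `parse_count`; the absolute errors it computes cover only the three examples shown above and are not a reported result.

```python
def parse_count(answer: str, allowed=(1, 2, 3, 4)):
    """Return the answer as an int if it is one of the allowed choices, else None."""
    try:
        value = int(answer.strip())
    except ValueError:
        return None  # not a number at all, e.g. "two"
    return value if value in allowed else None


# (prediction, gold label) pairs taken from the counting examples above.
rows = [("1", "1"), ("4", "4"), ("2", "3")]

# Absolute difference between predicted and gold counts for each example.
errors = [abs(parse_count(pred) - parse_count(gold)) for pred, gold in rows]
print(errors)
```

Unlike exact matching, the absolute error distinguishes a near miss (predicting 2 when the gold count is 3) from a larger one.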

General Sound Classification

Prompt: Classify the sound into one of the following categories: dog, rooster, pig, cow, frog, cat, hen, insects, sheep, crow, rain, sea_waves, crackling_fire, crickets, chirping_birds, water_drops, wind, pouring_water, toilet_flush, thunderstorm, crying_baby, sneezing, clapping, breathing, coughing, footsteps, laughing, brushing_teeth, snoring, drinking_sipping, door_wood_knock, mouse_click, keyboard_typing, door_wood_creaks, can_opening, washing_machine, vacuum_cleaner, clock_alarm, clock_tick, glass_breaking, helicopter, chainsaw, siren, car_horn, engine, train, church_bells, airplane, fireworks, hand_saw

| Prediction | Gold Label | Dataset |
| --- | --- | --- |
| dog | dog | ESC-50 |
| crying_baby | cat | ESC-50 |