Examples
This page presents audio examples across a range of bioacoustics tasks and datasets. Each example shows the input audio, the prompt used, the model’s prediction, and the gold label.
Species Classification
What is the common name for the focal species in the audio?
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | Humpback Whale | Humpback Whale | Watkins |
| (audio clip) | Walrus | Walrus | Watkins |
| (audio clip) | Spinner Dolphin | Pantropical Spotted Dolphin | Watkins |
| (audio clip) | Greater Yellowlegs | Greater Yellowlegs | CBI |
| (audio clip) | Wood Duck | Blue-winged Teal | CBI |
| (audio clip) | culex pipiens complex | culex pipiens complex | HumbugDB |
| (audio clip) | others | non-mosquito | HumbugDB |
| (audio clip) | non-mosquito | an dirus | HumbugDB |
| (audio clip) | Spectacled Tetraka | Spectacled Tetraka | Unseen (zero-shot) |
| (audio clip) | Dusky White-eye | Dusky White-eye | Unseen (zero-shot) |
| (audio clip) | Pacific Robin | Fire-tailed Sunbird | Unseen (zero-shot) |
What is the scientific name for the focal species in the audio?
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | tauraco fischeri | tauraco fischeri | Unseen (zero-shot) |
| (audio clip) | larvivora cyane | larvivora cyane | Unseen (zero-shot) |
| (audio clip) | Nisaetus kelaarti | Nisaetus philippensis | Unseen (zero-shot) |
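
The species tables above pair each prediction with a gold label, and the labels vary in capitalization (for example `Humpback Whale` versus `tauraco fischeri`). The following is a minimal sketch of how such pairs could be compared with a case-insensitive exact match. The helper names `normalize` and `is_match` are hypothetical, and this is not necessarily the scoring used to produce the examples on this page.

```python
def normalize(label: str) -> str:
    """Lowercase a label and collapse runs of whitespace (an assumed normalization)."""
    return " ".join(label.strip().lower().split())


def is_match(prediction: str, gold: str) -> bool:
    """Return True when prediction and gold label agree after normalization."""
    return normalize(prediction) == normalize(gold)


# A few (prediction, gold label) pairs copied from the tables above.
rows = [
    ("Humpback Whale", "Humpback Whale"),
    ("Spinner Dolphin", "Pantropical Spotted Dolphin"),
    ("tauraco fischeri", "tauraco fischeri"),
    ("Nisaetus kelaarti", "Nisaetus philippensis"),
]

matches = [is_match(prediction, gold) for prediction, gold in rows]
for (prediction, gold), ok in zip(rows, matches):
    print(f"{prediction!r} vs {gold!r}: {'match' if ok else 'mismatch'}")
```

Exact matching treats a near miss within the same genus (such as `Nisaetus kelaarti` against `Nisaetus philippensis`) the same as any other error; a taxonomy-aware comparison would need additional data that this page does not provide.
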
Detection
What are the common names for the species in the audio, if any?
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | Gray-cheeked Thrush | Gray-cheeked Thrush | DCASE |
| (audio clip) | None | None | DCASE |
| (audio clip) | None | Meerkat close call | DCASE |
| (audio clip) | Black-throated Green Warbler | Black-throated Green Warbler, Eastern Towhee | ENABirds |
| (audio clip) | None | Kirtland's Warbler, American Crow | ENABirds |
| (audio clip) | Black-and-white Warbler | None | ENABirds |
| (audio clip) | Red-legged thrush | Red-legged thrush | RFCX |
| (audio clip) | Puerto Rican bullfinch | Puerto Rican bullfinch | RFCX |
| (audio clip) | Common coqui | None | RFCX |
| (audio clip) | Minke whale | Minke whale | HICEAS |
| (audio clip) | None | None | HICEAS |
| (audio clip) | Minke whale | None | HICEAS |
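
Detection answers can list several species separated by commas, or `None` when nothing is present, so a prediction can be partly right. The sketch below shows one way to break a row into hits, false alarms, and misses using set operations. The helpers `parse_labels` and `compare` are hypothetical names, the comma-splitting rule is an assumption based on how the labels appear in the table above, and this is not presented as the evaluation protocol behind these examples.

```python
from __future__ import annotations


def parse_labels(text: str) -> set[str]:
    """Split a comma-separated answer into a set of lowercase labels.

    "None" (any capitalization) is treated as the empty set.
    """
    labels = {part.strip().lower() for part in text.split(",")}
    labels.discard("")
    labels.discard("none")
    return labels


def compare(prediction: str, gold: str) -> dict[str, set[str]]:
    """Return which labels were hit, wrongly predicted, or missed."""
    predicted, reference = parse_labels(prediction), parse_labels(gold)
    return {
        "hits": predicted & reference,
        "false_alarms": predicted - reference,
        "misses": reference - predicted,
    }


# Rows copied from the detection table above.
partial = compare(
    "Black-throated Green Warbler",
    "Black-throated Green Warbler, Eastern Towhee",
)
empty = compare("None", "None")
spurious = compare("Minke whale", "None")

print(partial)
print(empty)
print(spurious)
```

Under this breakdown, the ENABirds row with two gold species counts as one hit and one miss rather than a flat error, and a correct `None` produces three empty sets.
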
Call Type
Which of these, if any, are present? Single pulse gibbon call, Multiple pulse gibbon call, Gibbon duet, None.
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | Multiple pulse gibbon call | Multiple pulse gibbon call | Hainan Gibbons |
| (audio clip) | Single pulse gibbon call | Multiple pulse gibbon call | Hainan Gibbons |
| (audio clip) | Gibbon duet | None | Hainan Gibbons |
What type of vocalization is heard from the focal species in the audio? Answer with 'call' or 'song'.
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | call | call | -- |
| (audio clip) | song | song | -- |
| (audio clip) | call | song | -- |
Life Stage
What is the life stage of the focal species in the audio?
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | juvenile | juvenile | -- |
| (audio clip) | adult | adult | -- |
| (audio clip) | juvenile | nestling | -- |
Audio Captioning
Caption the audio, using the common name for any animal species.
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | Call of a new zealand bellbird with background sounds from new zealand falcon. | The common evening song of a Mainland New Zealand Bellbird. | -- |
| (audio clip) | Cajun Chorus Frog | The sound of Squirrel Treefrog after a rain. | -- |
Counting
How many birds are in the audio? Choose between 1, 2, 3 or 4.
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | 1 | 1 | ZF-NBirds |
| (audio clip) | 4 | 4 | ZF-NBirds |
| (audio clip) | 2 | 3 | ZF-NBirds |
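
Because the counting prompt restricts answers to 1, 2, 3, or 4, a prediction can be checked for validity and then compared numerically rather than as a string. The sketch below is one possible way to do that. `parse_count` and `count_error` are hypothetical helpers, and absolute error is an illustrative choice, not a metric stated on this page.

```python
ALLOWED_COUNTS = {1, 2, 3, 4}  # the choices offered in the prompt above


def parse_count(answer: str) -> int:
    """Convert an answer string to an int and check it is an allowed choice."""
    value = int(answer.strip())  # raises ValueError for non-numeric answers
    if value not in ALLOWED_COUNTS:
        raise ValueError(f"count {value} is not one of {sorted(ALLOWED_COUNTS)}")
    return value


def count_error(prediction: str, gold: str) -> int:
    """Absolute difference between the predicted and gold counts."""
    return abs(parse_count(prediction) - parse_count(gold))


# (prediction, gold label) pairs copied from the counting table above.
rows = [("1", "1"), ("4", "4"), ("2", "3")]
errors = [count_error(prediction, gold) for prediction, gold in rows]
print(errors)
```

A numeric error distinguishes an answer that is off by one bird from one that is off by three, which a plain correct/incorrect comparison would not.
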
General Sound Classification
Classify the sound into one of the following categories: dog, rooster, pig, cow, frog, cat, hen, insects, sheep, crow, rain, sea_waves, crackling_fire, crickets, chirping_birds, water_drops, wind, pouring_water, toilet_flush, thunderstorm, crying_baby, sneezing, clapping, breathing, coughing, footsteps, laughing, brushing_teeth, snoring, drinking_sipping, door_wood_knock, mouse_click, keyboard_typing, door_wood_creaks, can_opening, washing_machine, vacuum_cleaner, clock_alarm, clock_tick, glass_breaking, helicopter, chainsaw, siren, car_horn, engine, train, church_bells, airplane, fireworks, hand_saw
| Audio | Prediction | Gold Label | Dataset |
|---|---|---|---|
| (audio clip) | dog | dog | ESC-50 |
| (audio clip) | crying_baby | cat | ESC-50 |
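
The prompt for this task embeds the full list of allowed categories after a fixed prefix. As a rough sketch, the code below assembles a prompt of that shape from a category list, recovers the list from the prompt, and checks that a prediction is one of the offered options. The function names are hypothetical, and the category list is shortened to four entries for brevity; the actual prompt above lists all fifty.

```python
PREFIX = "Classify the sound into one of the following categories: "


def build_prompt(categories: list) -> str:
    """Join category names into a prompt with the same shape as the one above."""
    return PREFIX + ", ".join(categories)


def parse_categories(prompt: str) -> list:
    """Recover the category list from a prompt built by build_prompt."""
    if not prompt.startswith(PREFIX):
        raise ValueError("prompt does not start with the expected prefix")
    options = prompt[len(PREFIX):].split(",")
    return [option.strip() for option in options if option.strip()]


def is_valid_prediction(prediction: str, categories: list) -> bool:
    """Check that the prediction is exactly one of the offered categories."""
    return prediction.strip() in categories


# Shortened category list for illustration only.
categories = ["dog", "rooster", "cat", "crying_baby"]
prompt = build_prompt(categories)
print(prompt)
print(is_valid_prediction("crying_baby", parse_categories(prompt)))
```

Validating against the option list separates two kinds of failure: an answer outside the vocabulary, and an in-vocabulary answer that is wrong, such as `crying_baby` for a `cat` clip in the table above.
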