The warbleR function query_xc() searches the open-access online repository Xeno-Canto for avian vocalization recordings. It can return recording metadata or download the associated sound files.
Get recording metadata for green hermits (Phaethornis guy):
library(warbleR)
pg <- query_xc(qword = 'Phaethornis guy', download = FALSE)
Keep only song vocalizations of high quality:
song_pg <- pg[grepl("song", ignore.case = TRUE, pg$Vocalization_type) & pg$Quality == "A", ]
# remove 1 site from Colombia to have a few samples per country
song_pg <- song_pg[song_pg$Locality != "Suaita, Santander", ]
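Before dropping a locality, it can help to tally how many recordings each country contributes, so the effect of the filter is visible. The following is a minimal sketch on a made-up data frame: the object names `toy_md` and `toy_sub` and all values are hypothetical, and only the column names `Country` and `Locality` follow the Xeno-Canto metadata used above.

```r
# toy stand-in for the filtered metadata (all values are invented)
toy_md <- data.frame(
  Country = c("Colombia", "Colombia", "Colombia", "Costa Rica", "Panama"),
  Locality = c("Suaita, Santander", "Suaita, Santander", "Site B", "Site C", "Site D"),
  stringsAsFactors = FALSE
)

# recordings per country before filtering
table(toy_md$Country)

# drop one locality and count again
toy_sub <- toy_md[toy_md$Locality != "Suaita, Santander", ]
counts <- table(toy_sub$Country)
counts
```

The same two `table()` calls can be run on `song_pg` itself to confirm that the sample is reasonably balanced across countries.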
Map locations using map_xc():
map_xc(song_pg, leaflet.map = TRUE)
Once you are satisfied with the subset, you can download the sound files and save the metadata as a .csv file:
query_xc(X = song_pg, path = "./examples/p_guy", parallel = 3)
write.csv(song_pg, file = "./examples/p_guy/metadata_p_guy_XC.csv", row.names = FALSE)
Now convert all files to .wav format (mp3_2_wav) and homogenize sampling rate and bit depth (fix_wavs):
mp3_2_wav(samp.rate = 22.05, path = "./examples/p_guy")
fix_wavs(path = "./examples/p_guy", samp.rate = 44.1, bit.depth = 16)
Now the songs should be manually annotated, and all the selections in the resulting .txt files should be pooled into a single spreadsheet.
Once that is done, we can read the spreadsheet with the package ‘readxl’ as follows:
# install.packages("readxl") # install if needed
# load package
library(readxl)
# read data
annotations <- read_excel(path = "./examples/p_guy/annotations_p_guy.xlsx")
# check data
head(annotations)
selec | Channel | start | end | bottom.freq | top.freq | selec.file |
---|---|---|---|---|---|---|
1 | 1 | 0.7737 | 0.9939384 | 2.0962 | 7.7252 | Phaethornis-guy-2022.Table.1.selections.txt |
2 | 1 | 1.6837 | 1.9068363 | 2.0726 | 7.6074 | Phaethornis-guy-2022.Table.1.selections.txt |
3 | 1 | 10.1657 | 10.3917342 | 1.8371 | 8.0078 | Phaethornis-guy-2022.Table.1.selections.txt |
4 | 1 | 16.3237 | 16.5468363 | 2.0726 | 7.3248 | Phaethornis-guy-2022.Table.1.selections.txt |
5 | 1 | 1.6069 | 1.7517937 | 1.7193 | 8.7615 | Phaethornis-guy-2022.Table.1.selections.txt |
6 | 1 | 1.0129 | 1.1548958 | 1.7193 | 8.9264 | Phaethornis-guy-2022.Table.1.selections.txt |
Note that the column names should be: “start”, “end”, “bottom.freq”, “top.freq” and “sound.files”. In addition, the frequency columns (“bottom.freq” and “top.freq”) must be in kHz, not in Hz. We can check whether the annotations are in the right format using warbleR’s check_sels():
sound_file_path <- "./examples/p_guy/converted_sound_files/"
cs <- check_sels(annotations, path = sound_file_path)
## all selections are OK
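If the pooled spreadsheet uses other column names or stores frequencies in Hz, it can be brought into the layout described above with base R before running the check. This is only a sketch: the input column names (`file`, `begin`, `finish`, `low_hz`, `high_hz`) and all values are hypothetical and will differ depending on the annotation software.

```r
# toy annotation table with hypothetical column names and frequencies in Hz
raw <- data.frame(
  file = c("a.wav", "a.wav"),
  selec = 1:2,
  begin = c(0.77, 1.68),
  finish = c(0.99, 1.91),
  low_hz = c(2096, 2073),
  high_hz = c(7725, 7607),
  stringsAsFactors = FALSE
)

# rename to the column names listed above (order must match the columns of raw)
names(raw) <- c("sound.files", "selec", "start", "end", "bottom.freq", "top.freq")

# convert the frequency columns from Hz to kHz
freq_cols <- c("bottom.freq", "top.freq")
raw[freq_cols] <- raw[freq_cols] / 1000

# basic sanity checks: required columns present, times and frequencies ordered
required <- c("sound.files", "start", "end", "bottom.freq", "top.freq")
stopifnot(all(required %in% names(raw)))
stopifnot(all(raw$end > raw$start), all(raw$top.freq > raw$bottom.freq))
```

These checks are much coarser than those in check_sels(), which also looks at the sound files themselves, so they complement it rather than replace it.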
We can measure several parameters of acoustic structure with the warbleR function spectro_analysis():
sp <- spectro_analysis(X = annotations, path = sound_file_path)
Then we summarize those parameters with a Principal Component Analysis (PCA):
# run excluding sound file and selec columns
pca <- prcomp(sp[, -c(1, 2)])
# add first 2 PCs to sound file and selec columns
pca_data <- cbind(sp[, c(1, 2)], pca$x[, 1:2])
At this point you should get something like this:
head(pca_data)
sound.files | selec | PC1 | PC2 |
---|---|---|---|
Phaethornis-guy-227574.wav | 1 | -22.6069606 | -13.127152 |
Phaethornis-guy-227574.wav | 2 | 0.0586673 | -17.321796 |
Phaethornis-guy-227574.wav | 3 | 5.9795115 | 5.601346 |
Phaethornis-guy-227574.wav | 4 | -6.8159094 | 4.462788 |
Phaethornis-guy-238804.wav | 5 | 11.2315003 | 6.895327 |
Phaethornis-guy-238804.wav | 6 | 4.6828306 | 7.918963 |
‘PC1’ and ‘PC2’ are the 2 new dimensions that will be used to represent the acoustic space.
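How much of the acoustic variation these two dimensions capture can be checked from the standard deviations returned by prcomp(). Note also that prcomp() centers but does not scale the variables by default, so parameters measured on larger numeric scales weigh more heavily in the components. The sketch below uses the built-in `iris` data purely as a stand-in for the acoustic parameters; whether scaling is appropriate for a given data set is a judgment call, and `scale. = TRUE` fails if any column is constant.

```r
# built-in data set used only as a stand-in for the acoustic parameters
x <- iris[, 1:4]

# center each variable and scale it to unit variance before the PCA
pca_scaled <- prcomp(x, center = TRUE, scale. = TRUE)

# proportion of variance explained by each component
var_expl <- pca_scaled$sdev^2 / sum(pca_scaled$sdev^2)
round(var_expl, 3)

# cumulative proportion explained by the first two components
sum(var_expl[1:2])
```

Calling `summary()` on the prcomp object prints the same proportions in a table.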
Now we just need to add any metadata we consider important for explaining the acoustic similarities shown in the acoustic space scatterplot:
# read XC metadata
song_pg <- read.csv("./examples/p_guy/metadata_p_guy_XC.csv")
# create a column with the file name in the metadata
song_pg$sound.files <- paste0(song_pg$Genus, "-", song_pg$Specific_epithet, "-", song_pg$Recording_ID, ".wav")
# and merge based on sound files and any metadata column we need
pca_data_md <- merge(pca_data, song_pg[, c("sound.files", "Country", "Latitude", "Longitude")])
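By default merge() keeps only the rows whose key appears in both tables, so a sound file whose name is not reproduced exactly in the metadata drops out silently. A small sketch of how to detect this, using invented toy tables (`pca_toy`, `md_toy` and their values are hypothetical):

```r
# toy tables (all values are invented)
pca_toy <- data.frame(
  sound.files = c("sp-x-1.wav", "sp-x-1.wav", "sp-x-2.wav", "sp-x-3.wav"),
  PC1 = c(-1.2, 0.4, 2.1, 0.3),
  stringsAsFactors = FALSE
)
md_toy <- data.frame(
  sound.files = c("sp-x-1.wav", "sp-x-2.wav"),
  Country = c("Costa Rica", "Panama"),
  stringsAsFactors = FALSE
)

# sound files that have no match in the metadata
unmatched <- setdiff(pca_toy$sound.files, md_toy$sound.files)
unmatched

# the default merge keeps only matching rows
merged <- merge(pca_toy, md_toy, by = "sound.files")
nrow(merged)

# all.x = TRUE keeps every selection; missing metadata become NA
merged_all <- merge(pca_toy, md_toy, by = "sound.files", all.x = TRUE)
```

Comparing `nrow(pca_data)` with `nrow(pca_data_md)` after the merge above gives the same information in one line.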
We are ready to plot the acoustic space scatterplot. For this we will use the package ‘ggplot2’:
# install.packages("ggplot2")
library(ggplot2)
# install.packages("viridis")
library(viridis)
## Loading required package: viridisLite
# plot
ggplot(data = pca_data_md, aes(x = PC1, y = PC2, color = Country, shape = Country)) +
  geom_point(size = 3) +
  scale_color_viridis_d()
You can also add the geographic location of the recordings (in this case longitude) to the plot as follows:
# plot
ggplot(data = pca_data_md, aes(x = PC1, y = PC2, color = Longitude, shape = Country)) +
  geom_point(size = 3) +
  scale_color_viridis_c()
We can even test whether geographic distance is associated with acoustic distance (i.e. whether individuals that are geographically closer produce more similar songs) using a Mantel test (mantel function from the package vegan):
# create geographic and acoustic distance matrices
geo_dist <- dist(pca_data_md[, c("Latitude", "Longitude")])
acoust_dist <- dist(pca_data_md[, c("PC1", "PC2")])
# install.packages("vegan")
library(vegan)
# run test
mantel(geo_dist, acoust_dist)
##
## Mantel statistic based on Pearson's product-moment correlation
##
## Call:
## mantel(xdis = geo_dist, ydis = acoust_dist)
##
## Mantel statistic r: 0.02928
## Significance: 0.247
##
## Upper quantiles of permutations (null model):
## 90% 95% 97.5% 99%
## 0.0742 0.1098 0.1414 0.1870
## Permutation: free
## Number of permutations: 999
In this example, no association between geographic and acoustic distance was detected (Mantel r = 0.029, p = 0.247).
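The geographic distance matrix above is a Euclidean distance on latitude and longitude in degrees, which only approximates real distances on the ground. One possible refinement, shown here as a sketch rather than as part of the original analysis, is a great-circle (haversine) distance in kilometers. The helper names `haversine_km` and `geo_dist_km`, the spherical-Earth radius of 6371 km, and the toy coordinates are all assumptions introduced for illustration.

```r
# great-circle (haversine) distance in km between points in decimal degrees
# assumes a spherical Earth with a radius of 6371 km
haversine_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius * asin(pmin(1, sqrt(a)))
}

# pairwise distance matrix, returned as a "dist" object
geo_dist_km <- function(lat, lon) {
  idx <- seq_along(lat)
  m <- outer(idx, idx, function(i, j) haversine_km(lat[i], lon[i], lat[j], lon[j]))
  as.dist(m)
}

# toy coordinates (invented)
lat <- c(10.0, 9.5, 4.6)
lon <- c(-84.0, -83.7, -74.1)
gd <- geo_dist_km(lat, lon)
gd
```

Because the result is a "dist" object, it could be passed to mantel() in place of `geo_dist`, for example as `geo_dist_km(pca_data_md$Latitude, pca_data_md$Longitude)`.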
Session information
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=es_CR.UTF-8 LC_COLLATE=es_ES.UTF-8
## [5] LC_MONETARY=es_CR.UTF-8 LC_MESSAGES=es_ES.UTF-8
## [7] LC_PAPER=es_CR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=es_CR.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] vegan_2.5-7 lattice_0.20-44 permute_0.9-5 viridis_0.6.2
## [5] viridisLite_0.4.0 ggplot2_3.3.5 readxl_1.3.1 warbleR_1.1.27
## [9] NatureSounds_1.0.4 knitr_1.37 seewave_2.2.0 tuneR_1.3.3.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.8 fftw_1.0-6.1 assertthat_0.2.1 digest_0.6.29
## [5] utf8_1.2.2 R6_2.5.1 cellranger_1.1.0 signal_0.7-7
## [9] evaluate_0.15 highr_0.9 pillar_1.7.0 rlang_1.0.2
## [13] rstudioapi_0.13 jquerylib_0.1.4 Matrix_1.3-4 rmarkdown_2.10
## [17] splines_4.1.1 labeling_0.4.2 stringr_1.4.0 htmlwidgets_1.5.3
## [21] RCurl_1.98-1.6 munsell_0.5.0 proxy_0.4-26 compiler_4.1.1
## [25] xfun_0.30 pkgconfig_2.0.3 mgcv_1.8-36 htmltools_0.5.2
## [29] tidyselect_1.1.1 gridExtra_2.3 tibble_3.1.6 dtw_1.22-3
## [33] fansi_1.0.2 crayon_1.5.0 dplyr_1.0.7 withr_2.5.0
## [37] shinyBS_0.61 MASS_7.3-54 bitops_1.0-7 grid_4.1.1
## [41] nlme_3.1-152 jsonlite_1.7.2 gtable_0.3.0 lifecycle_1.0.1
## [45] DBI_1.1.1 magrittr_2.0.2 scales_1.1.1 cli_3.2.0
## [49] stringi_1.7.6 pbapply_1.5-0 farver_2.1.0 leaflet_2.0.4.1
## [53] bslib_0.2.5.1 ellipsis_0.3.2 vctrs_0.3.8 generics_0.1.0
## [57] rjson_0.2.21 tools_4.1.1 glue_1.6.2 purrr_0.3.4
## [61] maps_3.3.0 crosstalk_1.1.1 parallel_4.1.1 fastmap_1.1.0
## [65] yaml_2.3.5 colorspace_2.0-3 cluster_2.1.2 soundgen_2.2.0
## [69] sass_0.4.0