Working with higher structural levels in vocal signals

Animal vocalizations can be hierarchically structured: elements group together into syllables, syllables into songs, songs into bouts and so on. Many important biological patterns of vocal variation are better described at higher structural levels, so we are often interested in characterizing vocalizations at those levels. There are several tools in warbleR to explore and measure features above the element level. For simplicity, any level above ‘elements’ will be referred to as ‘songs’ in this post as well as in the warbleR functions described here.

We will work on a recording from a Scale-throated Hermit (hummingbird) (Phaethornis eurynome):

It has a very simple song with a couple of elements. The code for selecting elements and adding labels (creating the ‘pe_st’ selection table) is found at the end of the post. For now it’s enough to say that the only extra feature in ‘pe_st’ is a couple of columns containing the element and song labels.

We can make spectrograms of the full recording with boxes on the elements and orange lines above the elements highlighting those that belong to the same song. This can be done using the ‘song’ argument in lspec. The argument simply takes the name of the column with the song labels:

# load warbleR 
library(warbleR)

# set warbleR global parameters
warbleR_options(flim = c(2.5, 14), wl = 200, ovlp = 90)

# create spectrogram of the whole recording
lspec(pe_st, sxrow = 2.5, rows = 7, fast.spec = TRUE, 
      horizontal = TRUE, song = "song")

We can plot a single spectrogram for each song using the ‘by.song’ argument in specreator. The function will label each element using the ‘selec’ column label:

specreator(pe_st, by.song = "song")

The function then makes a single spectrogram per song instead of one per element as is the case when no song column is declared.

We can also use our own labels on the elements. In this case the column ‘elm’ has the labels I used to classify the 2 elements in the song:

specreator(pe_st, by.song = "song", sel.labels = "elm")

Song features can be measured using song_param. The function calculates several descriptive features of songs, including start and end time, top and bottom frequency (the lowest bottom and highest top frequency of all elements), mean element duration, song duration, number of elements, frequency range, song rate (elements per second) and gap duration:

song_feat <- song_param(pe_st, song_colm = "song")

head(song_feat)
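
To make the list of features above more concrete, here is a minimal base R sketch of how song-level summaries can be derived from an element-level table. This is not the song_param source code: the toy values, the output column names and the exact definitions (for example, song rate as number of elements divided by song duration) are my assumptions for illustration.

```r
# toy element-level table (hypothetical values, not from the recording)
elms <- data.frame(
  sound.files = "toy.wav",
  selec = 1:4,
  start = c(0.10, 0.30, 1.00, 1.25),
  end = c(0.28, 0.48, 1.20, 1.45),
  bottom.freq = c(4.1, 3.0, 4.3, 3.0),
  top.freq = c(8.5, 9.2, 6.9, 10.0),
  song = c(1, 1, 2, 2)
)

# summarize one song at a time and bind the results in a single data frame
song_summary <- do.call(rbind, lapply(split(elms, elms$song), function(x) {
  x <- x[order(x$start), ]
  dur <- max(x$end) - min(x$start)
  # silent gaps between consecutive elements
  gaps <- x$start[-1] - x$end[-nrow(x)]
  data.frame(
    song = x$song[1],
    start = min(x$start),
    end = max(x$end),
    bottom.freq = min(x$bottom.freq), # lowest bottom frequency
    top.freq = max(x$top.freq),       # highest top frequency
    num.elms = nrow(x),
    elm.duration = mean(x$end - x$start),
    song.duration = dur,
    freq.range = max(x$top.freq) - min(x$bottom.freq),
    song.rate = nrow(x) / dur,        # elements per second (assumed definition)
    gap.duration = mean(gaps)
  )
}))

song_summary
```

Each row of the result describes one song, which is the same shape of output that song_param gives us.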

If the element label column is supplied the function will also return the number of unique element types (‘elm.types’ column) and the mean number of times element types are found in a song (‘mean.elm.count’):

song_feat <- song_param(pe_st, song_colm = "song", elm_colm = "elm")

# look at data, exclude some columns just for visualization
head(song_feat[, -c(2:6)])
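
The two element-type columns can be illustrated with a small base R sketch. The toy labels are hypothetical and the calculation is my reading of the description above, not necessarily what song_param does internally.

```r
# toy element labels for two songs (hypothetical)
elms <- data.frame(
  song = c(1, 1, 1, 2, 2),
  elm = c("a", "b", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# number of unique element types in each song
elm_types <- tapply(elms$elm, elms$song, function(x) length(unique(x)))

# mean number of times each element type occurs within a song
mean_elm_count <- tapply(elms$elm, elms$song, function(x) mean(table(x)))

elm_types
mean_elm_count
```

In song 1 the type "a" occurs once and "b" twice, so the mean count is 1.5. In song 2 each type occurs once.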

And if spectral parameters have been measured on the elements they can also be averaged by song as follows:

# measure acoustic parameters
elm_sp <- specan(pe_st)

# add song data
elm_sp <- merge(elm_sp[ , -c(3:4)], pe_st, by = c("sound.files", "selec"))

# calculate mean kurtosis and entropy
song_feat <- song_param(X = elm_sp, song_colm = "song", 
                        mean_colm = c("kurt", "sp.ent"))

# look at data
head(song_feat)

The minimum, maximum and standard deviation can also be returned using the ‘min_colm’, ‘max_colm’ and ‘sd’ arguments respectively.
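
Conceptually, these options boil down to applying a summary function to element-level measurements within each song. The following base R sketch shows that idea on toy data. The helper by_song and the output column names are hypothetical, and sd() here is R's standard deviation.

```r
# toy element-level measurements (hypothetical values)
elm_meas <- data.frame(
  song = c(1, 1, 2, 2),
  kurt = c(10, 14, 8, 12),
  sp.ent = c(0.80, 0.90, 0.70, 0.90)
)

# hypothetical helper: apply one summary function to the chosen columns, by song
by_song <- function(cols, fun, prefix) {
  out <- aggregate(elm_meas[, cols, drop = FALSE],
                   by = list(song = elm_meas$song), FUN = fun)
  names(out)[-1] <- paste0(prefix, ".", cols)
  out
}

# combine the different summaries in a single song-level table
song_stats <- Reduce(function(x, y) merge(x, y, by = "song"), list(
  by_song(c("kurt", "sp.ent"), mean, "mean"),
  by_song(c("kurt", "sp.ent"), sd, "sd"),
  by_song("kurt", min, "min"),
  by_song("sp.ent", max, "max")
))

song_stats
```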

Given that the start, end, bottom frequency and top frequency are returned by song_param, the output can be used as a selection table to measure or compare the songs themselves, rather than the elements. For instance, we can run cross-correlation between songs, perhaps as a metric of song consistency, as follows:

# measure song features (includes start, end, bottom and top frequency)
song_feat <- song_param(X = elm_sp, song_colm = "song")

# run cross correlation using the first 10 songs
xc <- xcorr(song_feat[1:10, ])

head(xc)
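
If the cross-correlation output is a symmetric matrix of pairwise similarities (an assumption about the default output), one simple way to turn it into a single consistency value is to average each pair once, leaving out the diagonal. A sketch with a toy matrix of hypothetical values:

```r
# toy symmetric similarity matrix for three songs (hypothetical values)
song_names <- paste0("song_", 1:3)
sim <- matrix(c(1.0, 0.8, 0.6,
                0.8, 1.0, 0.7,
                0.6, 0.7, 1.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(song_names, song_names))

# keep each pair once and drop self-comparisons on the diagonal
pairwise <- sim[lower.tri(sim)]

# mean pairwise similarity as a rough consistency score
consistency <- mean(pairwise)
consistency
```

Other summaries (median, minimum) could be used in the same way. The mean is just the simplest choice.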

Finally, extended selection tables, which are objects containing both annotations and acoustic data (see this vignette for a detailed description), can be created at the song level. This means that all elements in a song will be contained in a single wave object within the selection table. This enables users to take song level metrics, such as those described above, using this type of object (this is not possible when creating them based on elements, which is the default behavior).

The song level extended selection table can be created using the argument ‘by.song’, which takes the song label column, as follows:

# create extended selection table
pe_est <- selection_table(pe_st, extended = TRUE, 
              confirm.extended = FALSE, by.song = "song")

pe_est

## all selections are OK

## object of class 'extended_selection_table' 
##  contains a selection table data frame with 46 rows and 8 columns: 
##                             sound.files selec     start       end
## 1 Phaethornis-eurynome-15607.wav-song_1     1 0.1000000 0.2779820
## 2 Phaethornis-eurynome-15607.wav-song_1     2 0.3028119 0.4781182
## 3 Phaethornis-eurynome-15607.wav-song_2     1 0.1000000 0.2761906
## 4 Phaethornis-eurynome-15607.wav-song_2     2 0.3009752 0.4768937
## 5 Phaethornis-eurynome-15607.wav-song_3     1 0.1000000 0.2809979
## 6 Phaethornis-eurynome-15607.wav-song_3     2 0.2997734 0.4710887
##   bottom.freq top.freq song elm
## 1     4.07925  8.48925    1   a
## 2     2.97675  9.15075    1   b
## 3     4.29975  6.94575    2   a
## 4     2.97675 10.00000    2   b
## 5     4.07925  6.94575    3   a
## 6     3.19725 10.00000    3   b
## ... and 40 more rows 
## 23 wave objects (as attributes): 
## [1] "Phaethornis-eurynome-15607.wav-song_1"
## [2] "Phaethornis-eurynome-15607.wav-song_2"
## [3] "Phaethornis-eurynome-15607.wav-song_3"
## [4] "Phaethornis-eurynome-15607.wav-song_4"
## [5] "Phaethornis-eurynome-15607.wav-song_5"
## [6] "Phaethornis-eurynome-15607.wav-song_6"
## ... and 17 more 
## and a data frame (check.results) generated by checkres() (as attribute) 
## the selection table was created by song (see 'class_extended_selection_table')

It has 23 wave objects, 1 for each song.
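
Note that in the printed table the first element of every song starts at 0.1 s: times are now relative to the start of each song's own wave object, not to the original recording. The base R sketch below shows that re-referencing on toy data, assuming a 0.1 s margin is kept before the first element of each song. The margin value and the new column names are assumptions for illustration.

```r
# toy element times in the original recording (hypothetical values)
elms <- data.frame(
  song = c(1, 1, 2, 2),
  start = c(2.00, 2.20, 5.00, 5.21),
  end = c(2.18, 2.38, 5.18, 5.40)
)

# margin (in seconds) kept before the first element of each song (assumed)
mar <- 0.1

# position in the recording where each song clip begins
clip_start <- ave(elms$start, elms$song, FUN = min) - mar

# element times relative to the start of their song clip
elms$start_clip <- elms$start - clip_start
elms$end_clip <- elms$end - clip_start

elms
```

A song too close to the beginning of the recording would need a shorter margin, which this sketch does not handle.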

Now we can measure things on songs without having to keep the original sound file. The following code deletes the sound file and measures song level parameters using the extended selection table:

# delete sound file
unlink("Phaethornis-eurynome-15607.wav")

# measure song features
song_feat <- song_param(pe_est, song_colm = "song")

Creating example data

# load warbleR 
library(warbleR)

# set warbleR options
warbleR_options(bp =  c(2, 8), flim = c(2.5, 14), wl = 200, 
                ovlp = 90)

# set temporary working directory
setwd(tempdir())

# query Xeno-Canto by catalog number and download the recording
out <- quer_xc(qword = "nr:15607", download = TRUE)

# Convert mp3 to wav format
mp32wav(samp.rate = 44.1)

# detect signals in time
ad <- auto_detec(wl = 200, threshold = 5, ssmooth = 1200, 
                 bp = c(2.5, 8), mindur = 0.05, 
                 maxdur = 0.25, img = FALSE)

# get frequency range
fr_ad <- freq_range(X = ad, bp = c(2, 10), fsmooth = 0.001, 
                    ovlp = 95, wl = 200, threshold = 20, 
                    img = FALSE, impute = TRUE)

# add song label column
fr_ad$song <- rep(1:(nrow(fr_ad) / 2), each = 2)

# add element label column
fr_ad$elm <- rep(c("a", "b"), nrow(fr_ad) / 2)

# create selection table (not mandatory but advised)
pe_st <- selection_table(fr_ad, extended = FALSE)

# create the first spectrogram in the post
lspec(pe_st, sxrow = 2.5, rows = 7, fast.spec = TRUE, 
      horizontal = TRUE, song = "song")
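
The song labels above were added with rep, which works here because every song has exactly 2 elements. When the number of elements per song varies, one alternative (not the approach used in this post) is to start a new song whenever the silent gap between consecutive elements exceeds some threshold. A minimal sketch with toy times and a hypothetical 0.2 s threshold:

```r
# toy element start and end times, sorted by start (hypothetical values)
starts <- c(0.10, 0.30, 1.00, 1.25, 2.10, 2.30)
ends <- c(0.28, 0.48, 1.20, 1.45, 2.28, 2.50)

# hypothetical threshold: gaps longer than this start a new song
max_gap <- 0.2

# silent gap between each element and the previous one
gaps <- starts[-1] - ends[-length(ends)]

# the first element always starts a song; later ones only after a long gap
song <- cumsum(c(TRUE, gaps > max_gap))
song
```

The threshold would have to be tuned for each species, for instance by looking at the distribution of gap durations.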

Session information

## R version 3.5.1 (2018-07-02)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.10
## 
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.3.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] kableExtra_0.9.0   warbleR_1.1.16     NatureSounds_1.0.1
## [4] seewave_2.1.0      tuneR_1.3.3        maps_3.3.0        
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.0        pracma_2.2.2      highr_0.7        
##  [4] compiler_3.5.1    pillar_1.3.0      bitops_1.0-6     
##  [7] iterators_1.0.10  tools_3.5.1       Sim.DiffProc_4.3 
## [10] digest_0.6.18     viridisLite_0.3.0 evaluate_0.12    
## [13] tibble_1.4.2      fftw_1.0-4        pkgconfig_2.0.2  
## [16] rlang_0.3.1       rstudioapi_0.9.0  yaml_2.2.0       
## [19] parallel_3.5.1    xml2_1.2.0        httr_1.4.0       
## [22] stringr_1.3.1     knitr_1.20        hms_0.4.2        
## [25] rprojroot_1.3-2   R6_2.3.0          dtw_1.20-1       
## [28] jpeg_0.1-8        pbapply_1.4-0     rmarkdown_1.10   
## [31] soundgen_1.3.2    readr_1.3.1       magrittr_1.5     
## [34] scales_1.0.0      backports_1.1.3   htmltools_0.3.6  
## [37] MASS_7.3-50       rvest_0.3.2       colorspace_1.3-2 
## [40] Deriv_3.8.5       stringi_1.2.4     proxy_0.4-22     
## [43] munsell_0.5.0     signal_0.7-6      RCurl_1.95-4.11  
## [46] crayon_1.3.4      rjson_0.2.20
