This post shows how to create and use the new warbleR object class extended_selection_table.

These objects are created with the selection_table() function. The function takes a data frame containing selection data (sound file name, selection, start, end …), checks whether the information is consistent (see the checksels() function for details) and saves the ‘diagnostic’ metadata as an attribute. When the argument extended = TRUE, the function generates an object of class extended_selection_table, which also contains a list of wave objects corresponding to each of the selections in the data frame. Hence, the function turns selection tables into self-contained objects that no longer need the original sound files for running most acoustic analyses in warbleR. This makes storing and sharing (bio)acoustic data much easier. It also speeds up processing, as sound files do not need to be read every time the data is analyzed.

Let’s first install and/or load the development version of warbleR (if an older warbleR version is installed, it has to be removed first):

# remove warbleR
remove.packages("warbleR")

# install devtools if not installed
if (!"devtools" %in% installed.packages()[,"Package"])  
  install.packages("devtools")

# and install warbleR from github
devtools::install_github("maRce10/warbleR")

# load warbleR
library(warbleR)

… set a temporary folder, load the example sound files and set warbleR options (see warbleR_options() documentation):

# set temporary directory
setwd(tempdir())

# load example data
data(list = c("Phae.long1", "Phae.long2", "Phae.long3", "Phae.long4",
              "selec.table"))

# save recordings as wave files
writeWave(Phae.long1,"Phae.long1.wav")
writeWave(Phae.long2,"Phae.long2.wav")
writeWave(Phae.long3,"Phae.long3.wav")
writeWave(Phae.long4,"Phae.long4.wav")

# set warbleR options
warbleR_options(wl = 300, pb = FALSE, 
          parallel = parallel::detectCores() - 1)

Now, as mentioned above, you need the selection_table() function to create an extended selection table. You also need to set the argument extended = TRUE (otherwise the class would be “selection_table”). Here the example data that comes with warbleR is used as the data frame to be converted to an object of class extended_selection_table:

selec.table
sound.files    channel selec   start     end bottom.freq top.freq sel.comment rec.comment
Phae.long1.wav       1     1 1.16935 1.34239      2.2201   8.6044         c24          NA
Phae.long1.wav       1     2 2.15841 2.32146      2.1694   8.8071         c25          NA
Phae.long1.wav       1     3 0.34334 0.51826      2.2183   8.7566         c26          NA
Phae.long2.wav       1     1 0.15960 0.29217      2.3169   8.8223         c27          NA
Phae.long2.wav       1     2 1.45706 1.58321      2.2840   8.8880         c28          NA
Phae.long3.wav       1     1 0.62655 0.75777      3.0068   8.8223         c29          NA
Phae.long3.wav       1     2 1.97421 2.10439      2.7768   8.8880         c30          NA
Phae.long3.wav       1     3 0.12336 0.25458      2.3169   9.3151         c31          NA
Phae.long4.wav       1     1 1.51681 1.66224      2.5140   9.2166         c32          NA
Phae.long4.wav       1     2 2.93269 3.07688      2.5797  10.2351         c33          NA
Phae.long4.wav       1     3 0.14540 0.29050      2.5797   9.7423         c34          NA

The following code converts it to an extended selection table:

# make extended selection table
ext_st <- selection_table(X = selec.table, pb = FALSE, 
          extended = TRUE, confirm.extended = FALSE)

And that’s it. Now the acoustic data and the selection data (as well as the additional metadata) are all together in a single R object.
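To get a feeling for what such an object holds, here is a minimal base R sketch of the idea. It is a toy illustration under stated assumptions, not warbleR's actual implementation: the 'recording' is synthetic noise, the cutting logic is made up for the example, and only the attribute name wave.objects mirrors the one used by the real objects.

```r
# Toy illustration only: NOT warbleR's implementation
set.seed(1)
rate <- 22500                   # sampling rate (Hz)
recording <- rnorm(3 * rate)    # 3 s of synthetic noise standing in for a sound file

# two selections (times in seconds) taken from the example data
sels <- data.frame(sound.files = "Phae.long1.wav", selec = 1:2,
                   start = c(1.16935, 2.15841), end = c(1.34239, 2.32146))

mar <- 0.1  # margin kept at each side of a selection (s)

# cut one clip per selection, margins included
clips <- lapply(seq_len(nrow(sels)), function(i) {
  first <- floor((sels$start[i] - mar) * rate) + 1
  last  <- ceiling((sels$end[i] + mar) * rate)
  recording[first:last]
})
names(clips) <- paste(sels$sound.files, sels$selec, sep = "_")

# times are now relative to the start of each clip
toy <- data.frame(sound.files = names(clips), selec = 1,
                  start = mar, end = mar + (sels$end - sels$start))

# keep the clips together with the table as an attribute
attr(toy, "wave.objects") <- clips

toy
```

Note how every selection now starts at the margin (0.1 s) of its own clip, which is the same pattern you will see in the start and end columns of the printed extended selection tables.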

 

Manipulating extended selection tables

Several functions can be used to deal with objects of this class. You can test if the object belongs to the extended_selection_table class:

is_extended_selection_table(ext_st)
[1] TRUE

You can subset it in the same way as any other data frame, and it will keep its attributes:

ext_st2 <- ext_st[1:2, ]

is_extended_selection_table(ext_st2)
[1] TRUE

There is also a generic version of print() for this class of objects:

## print
print(ext_st)
object of class 'extended_selection_table' 
 contains a selection table data frame with 11 rows and 9 columns: 
       sound.files channel selec start     end bottom.freq top.freq sel.comment rec.comment
1 Phae.long1.wav_1       1     1   0.1 0.27303      2.2201   8.6044         c24          NA
2 Phae.long1.wav_2       1     1   0.1 0.26305      2.1694   8.8071         c25          NA
3 Phae.long1.wav_3       1     1   0.1 0.27492      2.2183   8.7566         c26          NA
4 Phae.long2.wav_1       1     1   0.1 0.23257      2.3169   8.8223         c27          NA
5 Phae.long2.wav_2       1     1   0.1 0.22615      2.2840   8.8880         c28          NA
6 Phae.long3.wav_1       1     1   0.1 0.23122      3.0068   8.8223         c29          NA
... and 5 more rows 
11 wave objects (as attributes): 
[1] "Phae.long1.wav_1" "Phae.long1.wav_2" "Phae.long1.wav_3" "Phae.long2.wav_1" "Phae.long2.wav_2"
[6] "Phae.long3.wav_1"
... and 5 more 
and a data frame (check.results) generated by checkres() (as attribute) 
the selection table was created by element (see 'class_extended_selection_table')
## which is the same as this
ext_st
object of class 'extended_selection_table' 
 contains a selection table data frame with 11 rows and 9 columns: 
       sound.files channel selec start     end bottom.freq top.freq sel.comment rec.comment
1 Phae.long1.wav_1       1     1   0.1 0.27303      2.2201   8.6044         c24          NA
2 Phae.long1.wav_2       1     1   0.1 0.26305      2.1694   8.8071         c25          NA
3 Phae.long1.wav_3       1     1   0.1 0.27492      2.2183   8.7566         c26          NA
4 Phae.long2.wav_1       1     1   0.1 0.23257      2.3169   8.8223         c27          NA
5 Phae.long2.wav_2       1     1   0.1 0.22615      2.2840   8.8880         c28          NA
6 Phae.long3.wav_1       1     1   0.1 0.23122      3.0068   8.8223         c29          NA
... and 5 more rows 
11 wave objects (as attributes): 
[1] "Phae.long1.wav_1" "Phae.long1.wav_2" "Phae.long1.wav_3" "Phae.long2.wav_1" "Phae.long2.wav_2"
[6] "Phae.long3.wav_1"
... and 5 more 
and a data frame (check.results) generated by checkres() (as attribute) 
the selection table was created by element (see 'class_extended_selection_table')

You can also row-bind them together. Here the original extended_selection_table is split in two and bound back together using rbind():

ext_st3 <- ext_st[1:5, ]

ext_st4 <- ext_st[6:11, ]

ext_st5 <- rbind(ext_st3, ext_st4)

#print
ext_st5
object of class 'extended_selection_table' 
 contains a selection table data frame with 11 rows and 9 columns: 
       sound.files channel selec start     end bottom.freq top.freq sel.comment rec.comment
1 Phae.long1.wav_1       1     1   0.1 0.27303      2.2201   8.6044         c24          NA
2 Phae.long1.wav_2       1     1   0.1 0.26305      2.1694   8.8071         c25          NA
3 Phae.long1.wav_3       1     1   0.1 0.27492      2.2183   8.7566         c26          NA
4 Phae.long2.wav_1       1     1   0.1 0.23257      2.3169   8.8223         c27          NA
5 Phae.long2.wav_2       1     1   0.1 0.22615      2.2840   8.8880         c28          NA
6 Phae.long3.wav_1       1     1   0.1 0.23122      3.0068   8.8223         c29          NA
... and 5 more rows 
11 wave objects (as attributes): 
[1] "Phae.long1.wav_1" "Phae.long1.wav_2" "Phae.long1.wav_3" "Phae.long2.wav_1" "Phae.long2.wav_2"
[6] "Phae.long3.wav_1"
... and 5 more 
and a data frame (check.results) generated by checkres() (as attribute) 
the selection table was created by element (see 'class_extended_selection_table')
# the same as the original one
all.equal(ext_st, ext_st5)
[1] TRUE

The wave objects can be individually read using read_wave(), a wrapper on tuneR’s readWave() function that can take extended selection tables:

wv1 <- read_wave(X = ext_st, index = 3, from = 0, to = 0.37)

These are regular wave objects:

class(wv1)
[1] "Wave"
attr(,"package")
[1] "tuneR"
wv1
Wave Object
	Number of Samples:      8325
	Duration (seconds):     0.37
	Samplingrate (Hertz):   22500
	Channels (Mono/Stereo): Mono
	PCM (integer format):   TRUE
	Bit (8/16/24/32/64):    16 
spectro(wv1, wl = 150, grid = FALSE, scale = FALSE, ovlp = 90)

Spectrogram of the third selection in the example 'ext_st' extended selection table

 

par(mfrow = c(3, 2), mar = rep(0, 4))

for(i in 1:6){
  
  wv <- read_wave(X = ext_st, index = i, from = 0.05, to = 0.32)

  spectro(wv, wl = 150, grid = FALSE, scale = FALSE, axisX = FALSE,
          axisY = FALSE, ovlp = 90)

}

Spectrograms of the first 6 selections in the example 'ext_st' extended selection table

The read_wave() function takes the table as well as the index of the selection to be read (i.e. the row number).
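The from and to arguments are times in seconds, so cutting a segment comes down to simple sample arithmetic. The following is a toy sketch of that arithmetic in base R, not how read_wave() is implemented; cut_samples() is a hypothetical helper and the clip is a synthetic tone.

```r
# Toy sketch: NOT how read_wave() is implemented
rate <- 22500                                            # sampling rate (Hz)
clip <- sin(2 * pi * 440 * seq(0, 0.5, by = 1 / rate))   # 0.5 s synthetic tone

# hypothetical helper: keep the samples between 'from' and 'to' (seconds)
cut_samples <- function(x, rate, from, to) {
  first <- round(from * rate) + 1
  last  <- round(to * rate)
  x[first:last]
}

seg <- cut_samples(clip, rate, from = 0, to = 0.37)

length(seg)          # number of samples kept
length(seg) / rate   # duration in seconds
```

With from = 0 and to = 0.37 at 22500 Hz this keeps 8325 samples, the same number reported for wv1 above.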

Keep in mind that other functions that modify data frames are likely to remove the attributes in which the wave objects and metadata are stored. For instance, merging an extended selection table with another data frame will get rid of its attributes:

# create a new data frame 
Y <- data.frame(sound.files = ext_st$sound.files, site = "La Selva", lek = c(rep("SUR", 5), rep("CCL", 6)))

# merge
mrg_ext_st <- merge(ext_st, Y, by = "sound.files")

# check class
is_extended_selection_table(mrg_ext_st)
[1] FALSE

In this case we can use the fix_extended_selection_table() function to transfer the attributes from the original extended selection table:

# fix
mrg_ext_st <- fix_extended_selection_table(X = mrg_ext_st, Y = ext_st)

# check class
is_extended_selection_table(mrg_ext_st)
[1] TRUE

This works as long as some of the original sound files are kept and no other selections are added.
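The same behavior can be reproduced with plain data frames. The sketch below uses only base R: the class name toy_extended_table and the helper restore_attributes() are hypothetical, and the helper only illustrates the general idea of copying attributes back from the original object; it is not the code behind fix_extended_selection_table().

```r
# Toy objects; the class and helper names are hypothetical
tab <- data.frame(sound.files = c("a.wav_1", "a.wav_2", "b.wav_1"),
                  start = 0.1, end = c(0.25, 0.31, 0.28))
attr(tab, "wave.objects") <- list(a.wav_1 = 1:3, a.wav_2 = 4:6, b.wav_1 = 7:9)
class(tab) <- c("toy_extended_table", "data.frame")

extra <- data.frame(sound.files = tab$sound.files, site = "La Selva")

# merge() builds a brand new data frame: extra class and attributes are gone
merged <- merge(tab, extra, by = "sound.files")
class(merged)
is.null(attr(merged, "wave.objects"))

# copy class and attributes back, keeping only the clips still in the table
restore_attributes <- function(X, Y) {
  keep <- names(attr(Y, "wave.objects")) %in% X$sound.files
  attr(X, "wave.objects") <- attr(Y, "wave.objects")[keep]
  class(X) <- class(Y)
  X
}

# works on a subset too, as long as no new selections were added
fixed <- restore_attributes(merged[merged$sound.files != "b.wav_1", ], tab)
names(attr(fixed, "wave.objects"))
```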

 

Object size

Extended selection table size will be a function of the number of selections, sampling rate, selection duration and margin duration (the margin is how much extra time you want to keep at each side of the selection). In this example a data frame with 1000 selections is created just by repeating the example data frame several times and then converted to an extended selection table:

lng.selec.table <- do.call(rbind, replicate(100, selec.table, 
                        simplify = FALSE))[1:1000,]

lng.selec.table$selec <- 1:nrow(lng.selec.table)

nrow(lng.selec.table)

lng_ext_st <- selection_table(X = lng.selec.table, pb = FALSE, 
                        extended = TRUE, confirm.extended = FALSE)

lng_ext_st
object of class 'extended_selection_table' 
 contains a selection table data frame with 1000 rows and 9 columns: 
       sound.files channel selec start     end bottom.freq top.freq sel.comment rec.comment
1 Phae.long1.wav_1       1     1   0.1 0.27303      2.2201   8.6044         c24          NA
2 Phae.long1.wav_2       1     1   0.1 0.26305      2.1694   8.8071         c25          NA
3 Phae.long1.wav_3       1     1   0.1 0.27492      2.2183   8.7566         c26          NA
4 Phae.long2.wav_4       1     1   0.1 0.23257      2.3169   8.8223         c27          NA
5 Phae.long2.wav_5       1     1   0.1 0.22615      2.2840   8.8880         c28          NA
6 Phae.long3.wav_6       1     1   0.1 0.23122      3.0068   8.8223         c29          NA
... and 994 more rows 
1000 wave objects (as attributes): 
[1] "Phae.long1.wav_1" "Phae.long1.wav_2" "Phae.long1.wav_3" "Phae.long2.wav_4" "Phae.long2.wav_5"
[6] "Phae.long3.wav_6"
... and 994 more 
and a data frame (check.results) generated by checkres() (as attribute) 
the selection table was created by element (see 'class_extended_selection_table')
format(object.size(lng_ext_st), units = "auto")
[1] "31.3 Mb"

As you can see the object size is only ~31 MB. So, as a guide, a selection table with 1000 selections similar to those in ‘selec.table’ (mean duration ~0.15 seconds) at 22.5 kHz sampling rate and the default margin (mar = 0.1) will generate an extended selection table of ~31 MB or ~310 MB for a 10000 row selection table.
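You can do this back-of-the-envelope calculation for your own data. The sketch below assumes that each sample is stored as a 4-byte R integer (an assumption about how the wave objects are held in memory) and ignores the overhead of the table and the metadata, so take the result as a rough lower bound.

```r
# Rough size estimate; assumes 4 bytes per sample and ignores overhead
n_sels   <- 1000      # number of selections
duration <- 0.15      # mean selection duration (s)
mar      <- 0.1       # margin at each side of a selection (s)
rate     <- 22500     # sampling rate (Hz)
bytes_per_sample <- 4 # assumption: samples held as R integers

est_bytes <- n_sels * (duration + 2 * mar) * rate * bytes_per_sample

# object.size() reports "Mb" in units of 1024^2 bytes
round(est_bytes / 1024^2, 1)
```

The estimate comes out at about 30 Mb, close to the ~31 MB measured above. Doubling the sampling rate or the clip length (selection plus margins) doubles the estimate.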

 

Running analysis on extended selection tables

These objects can be used as input for most warbleR functions. We need to delete the sound files in order to show the data is actually contained in the new objects:

list.files(pattern = "\\.wav$")
[1] "Phae.long1.wav" "Phae.long2.wav" "Phae.long3.wav" "Phae.long4.wav"
# delete files (be careful not to run this 
# if you have sound files in the working directory!)
unlink(list.files(pattern = "\\.wav$"))

list.files(pattern = "\\.wav$")
character(0)

Here are a few examples of warbleR functions using extended_selection_table:

Spectral parameters

# spectral parameters
sp <- specan(ext_st)

sp
sound.files selec duration meanfreq sd freq.median freq.Q25 freq.Q75 freq.IQR time.median time.Q25 time.Q75 time.IQR skew kurt sp.ent time.ent entropy sfm meandom mindom maxdom dfrange modindx startdom enddom dfslope meanpeakf bottom.freq top.freq
Phae.long1.wav_1 1 0.17303 5.9824 1.3998 6.3317 5.2966 6.8695 1.5729 0.07525 0.05267 0.12039 0.06772 1.9978 7.0216 0.94345 0.93979 0.88664 0.65109 6.6000 4.425 8.250 3.825 3.5882 7.125 7.200 0.43344 7.1351 2.2201 8.6044
Phae.long1.wav_2 1 0.16305 5.9966 1.4244 6.2121 5.3288 6.8808 1.5520 0.07414 0.04448 0.11863 0.07414 1.9229 7.3492 0.94676 0.94412 0.89385 0.66938 6.7043 5.250 8.325 3.075 4.4878 6.900 7.200 1.83995 6.9086 2.1694 8.8071
Phae.long1.wav_3 1 0.17492 6.0208 1.5161 6.4284 5.1528 6.9833 1.8305 0.08749 0.05833 0.13123 0.07291 2.4887 11.0887 0.94484 0.93764 0.88592 0.66888 6.7050 4.200 8.625 4.425 4.5424 7.050 7.200 0.85754 6.8331 2.2183 8.7566
Phae.long2.wav_1 1 0.13257 6.4003 1.3403 6.5960 5.6073 7.3808 1.7735 0.07801 0.05461 0.10922 0.05461 1.5768 6.0674 0.94297 0.95021 0.89602 0.61135 6.3662 5.025 7.575 2.550 4.2941 5.025 5.925 6.78882 7.3617 2.3169 8.8223
Phae.long2.wav_2 1 0.12615 6.3126 1.3707 6.6020 5.6098 7.2132 1.6034 0.07886 0.05520 0.10252 0.04732 2.4717 10.8978 0.93610 0.95051 0.88977 0.62029 6.1721 4.800 7.575 2.775 2.7838 4.800 6.525 13.67418 6.7576 2.2840 8.8880
Phae.long3.wav_1 1 0.13122 6.6120 1.0932 6.6701 6.0672 7.3494 1.2821 0.06176 0.04632 0.09264 0.04632 1.7739 6.6263 0.93026 0.95181 0.88543 0.57033 6.5625 4.875 7.200 2.325 3.1290 6.975 7.050 0.57156 6.7576 3.0068 8.8223
Phae.long3.wav_2 1 0.13018 6.6414 1.1175 6.6742 6.1053 7.4198 1.3145 0.06894 0.04596 0.09958 0.05362 1.5525 5.0724 0.92348 0.95342 0.88047 0.53236 6.2917 4.575 6.900 2.325 2.1613 4.575 6.900 17.86003 6.6821 2.7768 8.8880
Phae.long3.wav_3 1 0.13122 6.5880 1.2534 6.6546 6.0371 7.3941 1.3570 0.06948 0.03860 0.10036 0.06176 1.8047 5.9891 0.91986 0.95764 0.88090 0.53116 6.1458 4.650 6.675 2.025 1.5556 4.650 6.600 14.86088 6.7576 2.3169 9.3151
Phae.long4.wav_1 1 0.14542 6.2233 1.4789 6.2369 5.4593 7.3104 1.8511 0.08422 0.04594 0.11484 0.06891 1.2507 4.2866 0.96418 0.95206 0.91795 0.75803 6.1650 3.975 8.250 4.275 2.8070 5.325 3.975 -9.28314 6.3046 2.5140 9.2166
Phae.long4.wav_2 1 0.14419 6.4692 1.5921 6.3345 5.6335 7.5838 1.9503 0.08350 0.04554 0.11386 0.06832 1.6977 6.4039 0.95838 0.95309 0.91342 0.72121 6.5325 3.750 8.775 5.025 2.8508 8.025 3.750 -29.64911 6.2290 2.5797 10.2351
Phae.long4.wav_3 1 0.14510 6.1237 1.5432 6.0817 5.1786 7.2467 2.0681 0.08404 0.04584 0.11460 0.06876 1.0908 4.1060 0.96434 0.95302 0.91904 0.73709 6.0713 3.975 7.875 3.900 3.8846 7.875 3.975 -26.87821 6.0025 2.5797 9.7423

 

Cross correlation

xc <- xcorr(ext_st, bp = c(1, 11))

xc
Phae.long1.wav_1-1 Phae.long1.wav_2-1 Phae.long1.wav_3-1 Phae.long2.wav_1-1 Phae.long2.wav_2-1 Phae.long3.wav_1-1 Phae.long3.wav_2-1 Phae.long3.wav_3-1 Phae.long4.wav_1-1 Phae.long4.wav_2-1 Phae.long4.wav_3-1
Phae.long1.wav_1-1 1.00000 0.70098 0.67774 0.37965 0.37656 0.41595 0.36969 0.40227 0.31495 0.31186 0.32389
Phae.long1.wav_2-1 0.70098 1.00000 0.64562 0.39796 0.40553 0.40322 0.36674 0.40986 0.31026 0.31066 0.32019
Phae.long1.wav_3-1 0.67774 0.64562 1.00000 0.40397 0.40515 0.40939 0.38032 0.41243 0.30668 0.29962 0.32628
Phae.long2.wav_1-1 0.37965 0.39796 0.40397 1.00000 0.66783 0.64736 0.64770 0.60379 0.32453 0.29737 0.32423
Phae.long2.wav_2-1 0.37656 0.40553 0.40515 0.66783 1.00000 0.60930 0.63169 0.63351 0.31042 0.27856 0.31651
Phae.long3.wav_1-1 0.41595 0.40322 0.40939 0.64736 0.60930 1.00000 0.75640 0.71382 0.30416 0.27581 0.29938
Phae.long3.wav_2-1 0.36969 0.36674 0.38032 0.64770 0.63169 0.75640 1.00000 0.73529 0.30324 0.27773 0.30050
Phae.long3.wav_3-1 0.40227 0.40986 0.41243 0.60379 0.63351 0.71382 0.73529 1.00000 0.31193 0.28016 0.30141
Phae.long4.wav_1-1 0.31495 0.31026 0.30668 0.32453 0.31042 0.30416 0.30324 0.31193 1.00000 0.76814 0.75387
Phae.long4.wav_2-1 0.31186 0.31066 0.29962 0.29737 0.27856 0.27581 0.27773 0.28016 0.76814 1.00000 0.75907
Phae.long4.wav_3-1 0.32389 0.32019 0.32628 0.32423 0.31651 0.29938 0.30050 0.30141 0.75387 0.75907 1.00000

 

Signal-to-noise ratio

# signal-to-noise ratio
snr <- sig2noise(ext_st, mar = 0.05)

snr
sound.files channel selec start end bottom.freq top.freq sel.comment rec.comment SNR
Phae.long1.wav_1 1 1 0.1 0.27303 2.2201 8.6044 c24 NA 21.182
Phae.long1.wav_2 1 1 0.1 0.26305 2.1694 8.8071 c25 NA 20.355
Phae.long1.wav_3 1 1 0.1 0.27492 2.2183 8.7566 c26 NA 19.164
Phae.long2.wav_1 1 1 0.1 0.23257 2.3169 8.8223 c27 NA 23.273
Phae.long2.wav_2 1 1 0.1 0.22615 2.2840 8.8880 c28 NA 26.206
Phae.long3.wav_1 1 1 0.1 0.23122 3.0068 8.8223 c29 NA 25.326
Phae.long3.wav_2 1 1 0.1 0.23018 2.7768 8.8880 c30 NA 25.508
Phae.long3.wav_3 1 1 0.1 0.23122 2.3169 9.3151 c31 NA 24.669
Phae.long4.wav_1 1 1 0.1 0.24542 2.5140 9.2166 c32 NA 27.620
Phae.long4.wav_2 1 1 0.1 0.24419 2.5797 10.2351 c33 NA 28.852
Phae.long4.wav_3 1 1 0.1 0.24510 2.5797 9.7423 c34 NA 24.290

 

Dynamic time warping distance

dtw.dist <- dfDTW(ext_st, img = FALSE)

dtw.dist
calculating DTW distances (step 2 of 2, no progress bar):
Phae.long1.wav_1-1 Phae.long1.wav_2-1 Phae.long1.wav_3-1 Phae.long2.wav_1-1 Phae.long2.wav_2-1 Phae.long3.wav_1-1 Phae.long3.wav_2-1 Phae.long3.wav_3-1 Phae.long4.wav_1-1 Phae.long4.wav_2-1 Phae.long4.wav_3-1
Phae.long1.wav_1-1 0.000 6.972 7.884 16.465 18.364 11.436 16.843 18.746 22.682 22.932 21.909
Phae.long1.wav_2-1 6.972 0.000 9.494 19.740 23.317 13.306 19.885 23.230 25.976 24.078 25.023
Phae.long1.wav_3-1 7.884 9.494 0.000 19.919 19.306 14.960 18.880 23.295 27.062 28.581 25.867
Phae.long2.wav_1-1 16.465 19.740 19.919 0.000 9.064 10.258 7.864 9.698 18.931 24.143 24.309
Phae.long2.wav_2-1 18.364 23.317 19.306 9.064 0.000 10.031 6.988 7.283 16.920 23.744 21.980
Phae.long3.wav_1-1 11.436 13.306 14.960 10.258 10.031 0.000 7.563 8.528 23.179 25.229 22.872
Phae.long3.wav_2-1 16.843 19.885 18.880 7.864 6.988 7.563 0.000 5.606 19.530 24.345 23.278
Phae.long3.wav_3-1 18.746 23.230 23.295 9.698 7.283 8.528 5.606 0.000 16.576 23.187 20.182
Phae.long4.wav_1-1 22.682 25.976 27.062 18.931 16.920 23.179 19.530 16.576 0.000 8.237 10.729
Phae.long4.wav_2-1 22.932 24.078 28.581 24.143 23.744 25.229 24.345 23.187 8.237 0.000 8.542
Phae.long4.wav_3-1 21.909 25.023 25.867 24.309 21.980 22.872 23.278 20.182 10.729 8.542 0.000

 

Performance

Using extended_selection_table objects can improve performance (measured here as run time). Here we use the microbenchmark package to compare the performance of sig2noise() on both kinds of table and ggplot2 to plot the results. We also need to save the wave files again to be able to run the analysis with regular data frames:

# save recordings as wave files
writeWave(Phae.long1,"Phae.long1.wav")
writeWave(Phae.long2,"Phae.long2.wav")
writeWave(Phae.long3,"Phae.long3.wav")
writeWave(Phae.long4,"Phae.long4.wav")

#run this one if microbenchmark is not installed
# install.packages("microbenchmark")
library(microbenchmark)

# install.packages("ggplot2")
library(ggplot2)

# use only 1 core
warbleR_options(parallel = 1, pb = FALSE)

# use the first 100 selections of the long selection tables
mbmrk.snr <- microbenchmark(extended = sig2noise(lng_ext_st[1:100, ], 
      mar = 0.05), regular = sig2noise(lng.selec.table[1:100, ], 
                    mar = 0.05), times = 50)

autoplot(mbmrk.snr) + ggtitle("sig2noise")

Distribution of `sig2noise()` timing on regular and extended selection tables

The function runs much faster on extended selection tables. The gain is likely to be larger with longer recordings and larger data sets, where the time saved by not reading sound files outweighs the computing overhead.
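The reason is easy to see with a toy comparison that needs no sound files at all. The sketch below is not a warbleR benchmark: it simply computes a made-up summary (root mean square amplitude) for 50 synthetic clips, either re-reading each clip from disk every time or using copies already held in memory. Timings will vary between machines.

```r
# Toy comparison (not a warbleR benchmark)
set.seed(42)
clips <- replicate(50, rnorm(8000), simplify = FALSE)

# write each clip to its own temporary file
files <- vapply(seq_along(clips), function(i) {
  f <- tempfile(fileext = ".rds")
  saveRDS(clips[[i]], f)
  f
}, character(1))

# root mean square amplitude of a clip
rms <- function(x) sqrt(mean(x^2))

from_disk   <- function() vapply(files, function(f) rms(readRDS(f)),
                                 numeric(1), USE.NAMES = FALSE)
from_memory <- function() vapply(clips, rms, numeric(1))

# both routes give the same numbers ...
res_disk <- from_disk()
res_mem  <- from_memory()
all.equal(res_disk, res_mem)

# ... but one of them has to open 50 files on every run
system.time(for (i in 1:20) from_disk())
system.time(for (i in 1:20) from_memory())

# clean up the temporary files
unlink(files)
```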

By song

The extended selection tables above were all made ‘by selection’. That is, each wave object inside the table contains a single selection (i.e. 1:1 correspondence between selections and wave objects). Extended selection tables, however, can also be created at a higher hierarchical level with the argument by.song. In this case, ‘song’ represents a higher level that contains one or more selections and that the user may want to keep together for a particular analysis (e.g. gap duration). The argument by.song takes the name of the character or factor column with the IDs of the different ‘songs’ within a sound file (note that the function assumes that a given song can only be found in a single sound file, so selections with the same song ID but from different sound files are taken as different ‘songs’).

For the sake of the example, let’s add an artificial song column to our example data set in which each sound file has 2 songs:

# add column
selec.table$song <- c(1, 1, 2, 1, 2, 1, 1, 2, 1, 2, 2)

The data frame looks like this:

sound.files    channel selec   start     end bottom.freq top.freq sel.comment rec.comment song
Phae.long1.wav       1     1 1.16935 1.34239      2.2201   8.6044         c24          NA    1
Phae.long1.wav       1     2 2.15841 2.32146      2.1694   8.8071         c25          NA    1
Phae.long1.wav       1     3 0.34334 0.51826      2.2183   8.7566         c26          NA    2
Phae.long2.wav       1     1 0.15960 0.29217      2.3169   8.8223         c27          NA    1
Phae.long2.wav       1     2 1.45706 1.58321      2.2840   8.8880         c28          NA    2
Phae.long3.wav       1     1 0.62655 0.75777      3.0068   8.8223         c29          NA    1
Phae.long3.wav       1     2 1.97421 2.10439      2.7768   8.8880         c30          NA    1
Phae.long3.wav       1     3 0.12336 0.25458      2.3169   9.3151         c31          NA    2
Phae.long4.wav       1     1 1.51681 1.66224      2.5140   9.2166         c32          NA    1
Phae.long4.wav       1     2 2.93269 3.07688      2.5797  10.2351         c33          NA    2
Phae.long4.wav       1     3 0.14540 0.29050      2.5797   9.7423         c34          NA    2

Now we can create an extended selection table ‘by song’ using the name of the ‘song’ column (which in this silly example is also ‘song’) as the input for the by.song argument:

bs_ext_st <- selection_table(X = selec.table, extended = TRUE,
                              confirm.extended = FALSE, by.song = "song")

In this case we should only have 8 wave objects instead of the 11 we had when the object was created ‘by selection’:

# by element
length(attr(ext_st, "wave.objects"))
[1] 11
# by song
length(attr(bs_ext_st, "wave.objects"))
[1] 8
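Where does the 8 come from? The following base R sketch reproduces the bookkeeping with the example times. It is only an illustration of the grouping logic, not warbleR's implementation: the column names song.id, rel.start and rel.end are made up for the example, and it assumes the default margin of 0.1 s with clips that cannot start before time 0.

```r
# Toy sketch with the example times; NOT warbleR's implementation
st <- data.frame(
  sound.files = rep(paste0("Phae.long", 1:4, ".wav"), times = c(3, 2, 3, 3)),
  selec = c(1:3, 1:2, 1:3, 1:3),
  start = c(1.16935, 2.15841, 0.34334, 0.15960, 1.45706, 0.62655,
            1.97421, 0.12336, 1.51681, 2.93269, 0.14540),
  end   = c(1.34239, 2.32146, 0.51826, 0.29217, 1.58321, 0.75777,
            2.10439, 0.25458, 1.66224, 3.07688, 0.29050),
  song  = c(1, 1, 2, 1, 2, 1, 1, 2, 1, 2, 2)
)

mar <- 0.1  # margin (s)

# one ID per song within each sound file
st$song.id <- paste0(st$sound.files, "-song_", st$song)

# each clip starts 'mar' seconds before the first selection of its song
clip.start <- pmax(0, ave(st$start, st$song.id, FUN = min) - mar)

# selection times relative to the start of their clip
st$rel.start <- st$start - clip.start
st$rel.end   <- st$end - clip.start

length(unique(st$song.id))   # number of clips needed
st[, c("song.id", "selec", "rel.start", "rel.end")]
```

There are 8 unique combinations of sound file and song, hence 8 clips. Selections that share a clip keep their relative positions, so the second selection of a song no longer starts at 0.1 s.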

Again, these objects can also be used in further analyses:

# signal-to-noise ratio
bs_snr <- sig2noise(bs_ext_st, mar = 0.05)
sound.files channel selec start end bottom.freq top.freq sel.comment rec.comment song SNR
Phae.long1.wav-song_1 1 1 0.1000 0.27303 2.2201 8.6044 c24 NA 1 21.182
Phae.long1.wav-song_1 1 2 1.0891 1.25210 2.1694 8.8071 c25 NA 1 20.357
Phae.long1.wav-song_2 1 1 0.1000 0.27492 2.2183 8.7566 c26 NA 2 19.164
Phae.long2.wav-song_1 1 1 0.1000 0.23257 2.3169 8.8223 c27 NA 1 23.273
Phae.long2.wav-song_2 1 1 0.1000 0.22615 2.2840 8.8880 c28 NA 2 26.206
Phae.long3.wav-song_1 1 1 0.1000 0.23122 3.0068 8.8223 c29 NA 1 25.326
Phae.long3.wav-song_1 1 2 1.4477 1.57784 2.7768 8.8880 c30 NA 1 25.512
Phae.long3.wav-song_2 1 1 0.1000 0.23122 2.3169 9.3151 c31 NA 2 24.669
Phae.long4.wav-song_1 1 1 0.1000 0.24542 2.5140 9.2166 c32 NA 1 27.620
Phae.long4.wav-song_2 1 1 2.8873 3.03148 2.5797 10.2351 c33 NA 2 28.841
Phae.long4.wav-song_2 1 2 0.1000 0.24510 2.5797 9.7423 c34 NA 2 24.290

The margin is an important parameter to consider for downstream functions that produce plots or that use additional time segments around the selections to run the analysis (e.g. sig2noise() or xcorr()).

Sharing acoustic data

The new object class makes it possible to share complete data sets, including the acoustic data. For instance, with the following code you can download a subset of the data used in Araya-Salas et al (2017) (it can also be downloaded here):

URL <- "https://marceloarayasalas.weebly.com/uploads/2/5/5/2/25524573/extended.selection.table.araya-salas.et.al.2017.bioacoustics.100.sels.rds"

dat <- readRDS(gzcon(url(URL)))

nrow(dat)
[1] 100
format(object.size(dat), units = "auto")
[1] "10.1 Mb"

The total size of the 100 sound files from which these selections were taken adds up to 1.1 GB. The size of the extended selection table is just 10.1 MB.
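Since the file above is a regular RDS file, producing one for your own data should only take a call to saveRDS(). The sketch below shows the round trip with a toy stand-in (a data frame with a hypothetical class and a list of numeric vectors as an attribute) rather than a real extended selection table.

```r
# Toy stand-in for an extended selection table (hypothetical class name)
set.seed(7)
toy <- data.frame(sound.files = c("a.wav_1", "a.wav_2"), start = 0.1,
                  end = c(0.25, 0.31))
attr(toy, "wave.objects") <- list(a.wav_1 = rnorm(100), a.wav_2 = rnorm(100))
class(toy) <- c("toy_extended_table", "data.frame")

# save everything to a single compressed file that can be shared ...
rds.file <- tempfile(fileext = ".rds")
saveRDS(toy, rds.file)

# ... and read it back, class and attributes included
toy2 <- readRDS(rds.file)
identical(toy, toy2)
```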

This data is ready to be used:

sp <- specan(dat, bp = c(2, 10))

head(sp)
sound.files selec duration meanfreq sd freq.median freq.Q25 freq.Q75 freq.IQR time.median time.Q25 time.Q75 time.IQR skew kurt sp.ent time.ent entropy sfm meandom mindom maxdom dfrange modindx startdom enddom dfslope meanpeakf
Pyrrhura rupicola Macaulay Library 132 .wav_2 1 0.15048 4.6570 1.7692 4.2876 3.4364 5.6243 2.1879 0.06543 0.03926 0.10469 0.06543 2.6003 11.8598 0.92349 0.95082 0.87807 0.54127 3.9580 2.2395 6.8045 4.5650 3.5660 4.4789 4.0482 -2.8620 4.0208
0.CCE.1971.4.4.ITM70863A-23.wav_1 1 0.16556 6.2556 1.6484 6.3505 5.6072 7.2024 1.5952 0.08279 0.03821 0.12100 0.08279 2.3775 10.1358 0.93975 0.94103 0.88434 0.63367 6.6003 3.5314 8.1826 4.6512 3.2037 8.1826 6.8906 -7.8036 6.0961
0.SAT.1989.6.2.ITM70866A-32.wav_5 1 0.15425 6.0960 1.6472 5.8475 4.9067 7.3982 2.4915 0.08356 0.04499 0.11570 0.07070 1.9681 7.2878 0.93913 0.94322 0.88581 0.60968 6.3221 3.2731 8.0965 4.8234 3.1071 6.3738 3.2731 -20.1021 7.0473
23.CCE.2011.7.21.7.42.wav_6 1 0.15496 5.4239 1.4632 5.1832 4.4278 6.4358 2.0081 0.06458 0.03875 0.10333 0.06458 2.1063 8.0438 0.92290 0.94473 0.87189 0.50696 4.9854 2.3256 7.7519 5.4264 2.1587 7.3213 2.3256 -32.2397 4.3667
Cyanoliseus patagonus Macaulay Library 79 .wav_5 1 0.15989 3.1546 1.2261 2.5692 2.2439 3.6575 1.4136 0.06396 0.03838 0.09595 0.05757 4.5462 29.4729 0.82325 0.93564 0.77027 0.12652 2.5166 2.0672 4.2205 2.1533 3.2000 2.4117 2.2395 -1.0774 2.2913
0.HC1.2011.8.7.9.20.wav_4 1 0.15380 6.0302 1.7597 6.4163 4.9203 7.1837 2.2634 0.07692 0.04487 0.10896 0.06410 4.1566 27.8331 0.92869 0.94595 0.87849 0.60299 6.1017 4.9096 8.5272 3.6176 4.0714 5.3402 4.9096 -2.8002 4.9720

And the spectrograms can be displayed:

par(mfrow = c(3, 2), mar = rep(0, 4))

for(i in 1:6){
  
  wv <- read_wave(X = dat, index = i, from = 0.17, to = 0.4)

  spectro(wv, wl = 250, grid = FALSE, scale = FALSE, axisX = FALSE,
          axisY = FALSE, ovlp = 90, flim = c(0, 12), 
          palette = reverse.gray.colors.1)
}

Spectrograms of the first 6 selections in the 'dat' extended selection table

The ability to compress large data sets and the ease of running analyses that require only a single R object can potentially simplify data sharing and improve the reproducibility of bioacoustic analyses.

Please report any bugs here.


Session information
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_3.0.0        microbenchmark_1.4-4 kableExtra_0.9.0     knitr_1.20           warbleR_1.1.15      
[6] NatureSounds_1.0.0   seewave_2.1.0        tuneR_1.3.3          maps_3.3.0          

loaded via a namespace (and not attached):
 [1] rgl_0.95.1441        Rcpp_0.12.18         fftw_1.0-4           assertthat_0.2.0     rprojroot_1.3-2     
 [6] digest_0.6.16        R6_2.2.2             plyr_1.8.4           Sim.DiffProc_4.1     backports_1.1.2     
[11] signal_0.7-6         evaluate_0.11        pracma_2.1.5         httr_1.3.1           highr_0.7           
[16] pillar_1.3.0         rlang_0.2.2          lazyeval_0.2.1       curl_3.2             rstudioapi_0.7      
[21] rmarkdown_1.10       devtools_1.13.6      moments_0.14         readr_1.1.1          stringr_1.3.1       
[26] RCurl_1.95-4.11      munsell_0.5.0        proxy_0.4-22         compiler_3.4.4       Deriv_3.8.5         
[31] pkgconfig_2.0.2      htmltools_0.3.6      tidyselect_0.2.4     tibble_1.4.2         dtw_1.20-1          
[36] bioacoustics_0.1.5   viridisLite_0.3.0    crayon_1.3.4         dplyr_0.7.6          withr_2.1.2         
[41] MASS_7.3-50          bitops_1.0-6         grid_3.4.4           gtable_0.2.0         git2r_0.23.0        
[46] magrittr_1.5         scales_1.0.0         stringi_1.2.4        pbapply_1.3-4        scatterplot3d_0.3-41
[51] bindrcpp_0.2.2       xml2_1.2.0           rjson_0.2.20         iterators_1.0.10     tools_3.4.4         
[56] glue_1.3.0           purrr_0.2.5          hms_0.4.2            jpeg_0.1-8           parallel_3.4.4      
[61] yaml_2.2.0           colorspace_1.3-2     soundgen_1.3.1       rvest_0.3.2          memoise_1.1.0       
[66] bindr_0.1.1