The spectrogram is a fundamental tool in the study of acoustic communication in vertebrates. It is a visual representation of sound in which the variation in energy (or power spectral density) is shown across both the frequency and time domains. Spectrograms allow us to visually explore acoustic variation in our study systems, making it easier to distinguish structural differences at small temporal/spectral scales that our ears cannot detect.
We will use the seewave package and its sample data:
library(seewave)
# load examples
data(tico)
data(orni)
To understand the information contained in a spectrogram it is necessary to understand, at least briefly, the Fourier transform. In simple terms, this is a mathematical transformation that detects periodicity in a time series, identifying the different frequencies that compose it and their relative energy. This is why it is said to transform signals from the time domain to the frequency domain.
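For a sampled signal, the version of this transform computed in practice is the discrete Fourier transform. As a brief reminder, and using notation introduced here only for illustration ($x_n$ for the $N$ amplitude samples, $f_s$ for the sampling rate), it can be written as:

```latex
X_k = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i \, k n / N}, \qquad k = 0, 1, \dots, N-1
```

The magnitude $|X_k|$ indicates how strongly the frequency $k f_s / N$ is present in the signal.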
To better understand how it works, we can simulate time series composed of pre-defined frequencies. In this example we simulate 3 frequencies and join them in a single time series:
# sampling frequency
f <- 11025
# time sequence
t <- seq(1/f, 1, length.out = f)
# period
pr <- 1/440
# angular frequency
w0 <- 2 * pi/pr
# freq 1
h1 <- 5 * cos(w0 * t)
plot(h1[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
# freq 2
h2 <- 10 * cos(2 * w0 * t)
plot(h2[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
# freq 3
h3 <- 15 * sin(3 * w0 * t)
plot(h3[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
This is what the union of the three frequencies looks like:
H0 <- 0.5 + h1 + h2 + h3
plot(H0[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
Now we can apply the Fourier transform to this time series and graph the frequencies detected using a periodogram:
fspc <- Mod(fft(H0))
plot(fspc, type = "h", col = "blue", xlab = "Frequency (Hz)", ylab = "Amplitude (no units)")
abline(v = f/2, lty = 2)
text(x = (f/2) + 1650, y = 8000, "Nyquist Frequency")
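The right half of this plot is a mirror image of the left half: for a real-valued signal, the magnitudes above the Nyquist frequency repeat those below it and carry no additional information. A minimal base R sketch of this property, using a made-up random signal (`x`, `m` and `mirrored` are illustrative names, not part of the example above):

```r
# Illustrative signal: 100 random real-valued samples
set.seed(42)
x <- rnorm(100)

# Magnitude of the discrete Fourier transform
m <- Mod(fft(x))

# Drop the first element (zero frequency) and compare the rest with its reverse:
# for real input, bin k and bin N - k always have the same magnitude
mirrored <- isTRUE(all.equal(m[-1], rev(m[-1])))
mirrored
```

This is why only the first half of the spectrum is usually kept.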
We can zoom in on the frequencies below the Nyquist frequency:
plot(fspc[1:(length(fspc)/2)], type = "h", col = "blue", xlab = "Frequency (Hz)",
ylab = "Amplitude (no units)")
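To check that the peaks sit where we expect, we can convert the positions in the periodogram to frequencies in Hz and rescale the magnitudes to the original amplitude units. The sketch below rebuilds the same signal so that it runs on its own. The names `freqs`, `amp` and `peaks` are introduced here, and the scaling by 2/n assumes, as in this simulation, that every component completes a whole number of cycles within the series.

```r
# Rebuild the simulated signal (same values as in the example above)
f <- 11025
t <- seq(1/f, 1, length.out = f)
w0 <- 2 * pi * 440
H0 <- 0.5 + 5 * cos(w0 * t) + 10 * cos(2 * w0 * t) + 15 * sin(3 * w0 * t)

# Magnitude spectrum, keeping only the bins below the Nyquist frequency
n <- length(H0)
fspc <- Mod(fft(H0))
half <- seq_len(floor(n / 2))

# Frequency (Hz) of each bin and magnitude rescaled to amplitude units
freqs <- (half - 1) * f / n
amp <- 2 * fspc[half] / n
amp[1] <- amp[1] / 2  # the first bin is the constant offset (0 Hz), not doubled

# Three largest peaks, sorted by frequency
top3 <- order(amp, decreasing = TRUE)[1:3]
peaks <- data.frame(freq = freqs[top3], amp = amp[top3])
peaks <- peaks[order(peaks$freq), ]
peaks
```

The three largest peaks should fall at 440, 880 and 1320 Hz, with amplitudes close to the 5, 10 and 15 used in the simulation.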
This diagram (taken from Sueur 2018) summarizes the process we just simulated:
Taken from Sueur 2018
The periodogram next to the spectrogram of these simulated sounds looks like this:
Spectrograms are built from the spectral decomposition of discrete time segments of amplitude values. Each time segment (or window) becomes a column of spectral density values across a frequency range. Take for example this simple modulated sound, which goes up and down in frequency:
If we divide the sound into 10 segments and make periodograms for each of them we can see this pattern in the frequencies:
This animation shows in a very simple way the logic behind spectrograms: if we calculate Fourier transforms for short time segments along a sound (i.e. amplitude changes over time) and concatenate them, we can visualize the variation in frequencies over time.
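The same logic can be sketched in a few lines of base R. The example below is only an illustration under simple assumptions: it simulates a made-up upward sweep (`sweep`), cuts it into non-overlapping segments of 512 samples with no windowing function, and stores the periodogram of each segment as one column of a matrix (`spec_mat`). Dedicated spectrogram functions do considerably more than this.

```r
# Simulated sound: 1 s sweep rising linearly from 500 Hz to 2000 Hz
f <- 8000                           # sampling rate (Hz)
t <- (seq_len(f) - 1) / f           # time of each sample (s)
sweep <- sin(2 * pi * (500 * t + (2000 - 500) / 2 * t^2))

wl <- 512                           # segment (window) length in samples
n_win <- floor(length(sweep) / wl)  # number of complete segments

# One periodogram per segment, keeping only frequencies below Nyquist
spec_mat <- sapply(seq_len(n_win), function(i) {
  seg <- sweep[((i - 1) * wl + 1):(i * wl)]
  Mod(fft(seg))[1:(wl / 2)]
})

# Rows are frequencies, columns are time segments
dim(spec_mat)

# Frequency (Hz) with the most energy in each segment
peak_freq <- (apply(spec_mat, 2, which.max) - 1) * f / wl
peak_freq
```

The peak frequency rises from segment to segment, following the sweep. Calling image(t(spec_mat)) would display the matrix as a crude spectrogram, with time on the horizontal axis.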
When frequency spectra are combined to produce a spectrogram, the frequency and amplitude modulations are not gradual:
There are several “tricks” to smooth out the contours of highly modulated signals in a spectrogram, although the main and most common one is window overlap. Overlap reuses a percentage of the amplitude samples of one window to calculate the next window. For example, with a window size of 512 points, the example sound is divided into 15 segments:
A 50% overlap generates windows that share 50% of the amplitude values with the adjacent windows. This has the visual effect of making modulations much more gradual:
This increases (somewhat artificially) the number of time windows without changing the frequency resolution. In this example, the number of time windows is doubled:
Therefore, the greater the overlap the greater the smoothing of the contours of the sounds:
This is how the number of windows increases as a function of the overlap for this particular sound:
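The relationship between overlap and number of windows can be approximated with simple arithmetic. In this sketch `n_windows()` is a hypothetical helper, not a seewave function, and it assumes that only complete windows are kept. Actual implementations may round the step or treat the end of the sound differently. The sound length of 8000 samples is also an assumed value.

```r
# Hypothetical helper: number of complete windows of length wl (samples)
# that fit in n samples when consecutive windows overlap by ovlp percent
n_windows <- function(n, wl, ovlp) {
  step <- max(1, round(wl * (1 - ovlp / 100)))  # samples between window starts
  floor((n - wl) / step) + 1
}

# Example: a sound of 8000 samples and a window of 512 samples
ovlp <- c(0, 50, 75, 90, 99)
windows <- sapply(ovlp, n_windows, n = 8000, wl = 512)
data.frame(ovlp = ovlp, windows = windows)
```

With these assumed values, going from 0% to 50% overlap doubles the number of windows (from 15 to 30), and the count grows steeply as the overlap approaches 100%.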
This increase in spectrogram sharpness does not come without a cost. The greater the number of time windows, the greater the number of Fourier transforms to compute and, therefore, the longer the process takes. This plot shows the increase in duration as a function of the number of windows on my computer:
It is necessary to take this cost into account when producing spectrograms of long sound files (> 1 min).
There is also a trade-off between the resolution in the 2 domains: the higher the frequency resolution, the lower the time resolution. The following animation shows, for the sound of the previous example, how the frequency resolution decreases as the time resolution increases:
This is the relationship between frequency resolution and time resolution for the example signal:
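For a given sampling rate, both resolutions follow directly from the window length: each frequency bin spans the sampling rate divided by the window length, while each window lasts the window length divided by the sampling rate. A short worked example, assuming a 22050 Hz sampling rate and ignoring overlap (the object and column names are introduced here):

```r
f <- 22050                          # sampling rate (Hz)
wl <- c(128, 256, 512, 1024, 2048)  # window lengths (samples)

res <- data.frame(
  wl = wl,
  freq_res_hz = f / wl,             # width of each frequency bin (Hz)
  time_res_s = wl / f               # duration of each window (s)
)
res
```

The product of the two resolution columns is always 1, which is the trade-off in numbers: doubling the window length halves the width of each frequency bin but doubles the duration of each window.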
There are several R packages with functions that produce spectrograms in the graphical device. This chart (taken from Sueur 2018) summarizes the functions and their arguments:
We will focus on making spectrograms using the spectro() function of seewave:
tico2 <- cutw(tico, from = 0.55, to = 0.9, output = "Wave")
spectro(tico2, f = 22050, wl = 512, ovlp = 90, collevels = seq(-40, 0, 0.5), flim = c(2,
6), scale = FALSE)
Exercise
How can I increase the overlap between time windows?
How much longer does it take to create a 99%-overlap spectrogram compared to a 5%-overlap spectrogram?
What does the argument ‘collevels’ do? Increase the range and look at the spectrogram.
What do the ‘flim’ and ‘tlim’ arguments determine?
Run the examples that come in the spectro() function documentation
Almost all components of a spectrogram in seewave can be modified. We can add scales:
spectro(tico2, f = 22050, wl = 512, ovlp = 90, collevels = seq(-40, 0, 0.5), flim = c(2,
6), scale = TRUE)
Change the color palette:
spectro(tico2, f = 22050, wl = 512, ovlp = 90, collevels = seq(-40, 0, 0.5), flim = c(2,
6), scale = TRUE, palette = reverse.cm.colors)
spectro(tico2, f = 22050, wl = 512, ovlp = 90, collevels = seq(-40, 0, 0.5), flim = c(2,
6), scale = TRUE, palette = reverse.gray.colors.1)
Remove the grid lines:
spectro(tico2, f = 22050, wl = 512, ovlp = 90, collevels = seq(-40, 0, 0.5), flim = c(2,
6), scale = TRUE, palette = reverse.gray.colors.1, grid = FALSE)
Add oscillograms (waveforms):
spectro(tico2, f = 22050, wl = 512, ovlp = 90, collevels = seq(-40, 0, 0.5), flim = c(2,
6), scale = TRUE, palette = reverse.gray.colors.1, grid = FALSE, osc = TRUE)
Use contours instead of colors:
blanc <- colorRampPalette("white")
spectro(tico2, contlevels = seq(-30, 0, 4), cont = TRUE, colcont = temp.colors(8),
palette = blanc, scale = FALSE, flim = c(2, 6))
Exercise
Change the color of the oscillogram to a ‘heat color’ palette
These are some of the color palettes that suit the gradients in spectrograms well:
From Sueur 2018
Use at least 3 palettes to generate the “tico2” spectrogram
Change the relative height of the oscillogram so that it corresponds to 1/6 of the height of the spectrogram
Change the relative width of the amplitude scale so that it corresponds to 1/8 of the spectrogram width
What does the “zp” argument do? (hint: try zp = 100 and notice the effect on the spectrogram)
Which value of “wl” (window size) generates smoother spectrograms for the example “orni” object?
The package viridis provides some color palettes that are better perceived by people with forms of color blindness and/or color vision deficiency. Install the package and try some of the color palettes available (try ?viridis)
The spectrogram() function of the soundgen package produces spectrograms slightly different from those of other packages:
library(soundgen)
spectrogram(x = as.numeric(tico2@left), samplingRate = tico2@samp.rate, windowLength = 30,
overlap = 90, ylim = c(2, 6))
It also allows you to use spectral derivatives to produce spectrograms (similar to the program Sound Analysis Pro):
spectrogram(x = as.numeric(tico2@left), samplingRate = tico2@samp.rate, windowLength = 30,
overlap = 90, method = "spectralDerivative", ylim = c(2, 6))
It has a large number of arguments that allow modification of the color and “resolution”. For instance, we can change the brightness:
spectrogram(x = as.numeric(tico2@left), samplingRate = tico2@samp.rate, windowLength = 30,
overlap = 90, ylim = c(2, 6), brightness = -0.1)
Or apply smoothing in frequency and time (‘smoothFreq’ and ‘smoothTime’):
spectrogram(x = as.numeric(tico2@left), samplingRate = tico2@samp.rate, windowLength = 30,
overlap = 90, ylim = c(2, 6), smoothFreq = 5, smoothTime = 5)
The monitoR package provides the viewSpec() function to generate spectrograms:
library(monitoR)
viewSpec(tico2, main = NA, frq.lim = c(2, 6), ovlp = 90)
The arguments are very similar to those of spectro() of seewave since viewSpec() uses that function internally.
Other options are specgram() of signal:
library(signal)
# use the sampling rate of the recording so the frequency axis is scaled correctly
specgram(tico2@left, n = 512, Fs = tico2@samp.rate, overlap = round(512 * 0.9))
spectrogram() from phonTools:
library(phonTools)
phonTools::spectrogram(tico2@left, fs = tico2@samp.rate, maxfreq = 6000, windowlength = round(length(tico2@left)/512))
powspec() of tuneR does not generate the display; it only calculates the spectrogram (i.e. the matrix of amplitude values in time and frequency). To visualize it, use the image() function and add the axes manually:
library(tuneR)
# compute the spectrogram (window of 512 samples, step of 1/4 window)
ps <- powspec(tico2@left, sr = tico2@samp.rate,
  wintime = 512 / tico2@samp.rate, steptime = 0.25 * 512 / tico2@samp.rate)
# normalize
ps <- ps / max(ps)
# convert to dB
ps <- 10*log10(ps)
# plot
image(t(ps), col = gray((512:0) / 512),
xlab = "Time (s)", ylab = "Frequency (Hz)", # axes labels
axes=FALSE)
# add axes manually
time <- round(seq(0, duration(tico2), length=5), 1)
frequency <- round(seq(tico2@samp.rate/512, tico2@samp.rate/2, length=5))
axis(side=1, at=seq(0, 1,length=5), labels = time)
axis(side=2, at=seq(0, 1,length=5), labels = frequency)
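The normalization and decibel steps above can be tried in isolation. This small sketch uses made-up power values (`pw`) rather than a real recording. It shows that, after dividing by the maximum, the loudest cell sits at 0 dB and quieter cells take negative values, which is a scale similar to the one implied by collevels = seq(-40, 0, 0.5) in the spectro() calls above.

```r
# Made-up power values: the maximum, half of it, and one hundredth of it
pw <- c(1, 0.5, 0.01) * 250

# Normalize by the maximum and convert to decibels
db <- 10 * log10(pw / max(pw))
round(db, 2)
```

Halving the power lowers the value by about 3 dB, and a hundredfold reduction lowers it by 20 dB.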
Exercise
Pick an example acoustic signal from your own research (or a recording from Xeno-Canto) and make a spectrogram
Improve the visualization by optimizing the parameters ‘wl’, ‘collevels’, ‘palette’ and ‘ovlp’
The package dynaSpec allows the creation of static and dynamic visualizations of sounds, ready for publication or presentation. These dynamic spectrograms are produced natively with base graphics and are saved as an .mp4 video in the working directory:
# read_sound_file() comes from the warbleR package
library(warbleR)
ngh_wren <- read_sound_file("https://www.xeno-canto.org/518334/download")
custom_pal <- colorRampPalette(c("#2d2d86", "#2d2d86", reverse.terrain.colors(10)[5:10]))
library(dynaSpec)
scrolling_spectro(wave = ngh_wren, wl = 600, t.display = 3, ovlp = 95, pal = custom_pal,
grid = FALSE, flim = c(2, 8), width = 700, height = 250, res = 100, collevels = seq(-40,
0, 5), file.name = "../nightingale_wren.mp4", colbg = "#2d2d86", lcol = "#FFFFFFE6")
Exercise
Rerun the example code above but this time using a waveform in the bottom panel
Slow down the spectrogram (see the argument ‘speed’)
Use a viridis color palette
Pick an example acoustic signal from your project (or a recording from Xeno-Canto) and make a dynamic spectrogram
## References
Araya-Salas M, Wilkins MR. 2020. dynaSpec: dynamic spectrogram visualizations in R. R package version 1.0.0.
Sueur J, Aubin T, Simonis C. 2008. Equipment review: seewave, a free modular tool for sound analysis and synthesis. Bioacoustics 18(2):213–226.
Sueur J. 2018. Sound Analysis and Synthesis with R.
Session information
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=es_CR.UTF-8 LC_COLLATE=es_ES.UTF-8
## [5] LC_MONETARY=es_CR.UTF-8 LC_MESSAGES=es_ES.UTF-8
## [7] LC_PAPER=es_CR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=es_CR.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tuneR_1.3.3.1 phonTools_0.2-2.1 signal_0.7-7 soundgen_2.2.0
## [5] shinyBS_0.61 seewave_2.2.0 knitr_1.37
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.8 bslib_0.2.5.1 compiler_4.1.1 formatR_1.11
## [5] later_1.3.0 jquerylib_0.1.4 highr_0.9 tools_4.1.1
## [9] digest_0.6.29 jsonlite_1.7.2 evaluate_0.15 lifecycle_1.0.1
## [13] lattice_0.20-44 rlang_1.0.2 shiny_1.6.0 cli_3.2.0
## [17] rstudioapi_0.13 yaml_2.3.5 xfun_0.30 fastmap_1.1.0
## [21] stringr_1.4.0 sass_0.4.0 grid_4.1.1 R6_2.5.1
## [25] rmarkdown_2.10 monitoR_1.0.7 magrittr_2.0.2 promises_1.2.0.1
## [29] htmltools_0.5.2 ellipsis_0.3.2 MASS_7.3-54 mime_0.11
## [33] xtable_1.8-4 httpuv_1.6.2 stringi_1.7.6 zoo_1.8-9