Sound waves are characterized by compression and expansion of the medium as sound energy moves through it. There is also back and forth motion of the particles making up the medium:
(Figure taken from https://dosits.org)
The variation in pressure that is perceived at a fixed point in space can be represented by a graph of pressure (amplitude) by time:
Sound waves can be represented by 3 kinds of R objects:
Any numerical vector can be treated as a sound if a sampling frequency is provided. For example, a 440 Hz sinusoidal sound sampled at 8000 Hz for one second can be generated like this:
library(seewave)
# create sinewave at 440 Hz
s1 <- sin(2 * pi * 440 * seq(0, 1, length.out = 8000))
is.vector(s1)
## [1] TRUE
mode(s1)
## [1] "numeric"
These sequences of values only make sense when specifying the sampling rate at which they were created:
oscillo(s1, f = 8000, from = 0, to = 0.01)
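One detail worth noting: seq(0, 1, length.out = 8000) includes both endpoints, so consecutive values are spaced 1/7999 s apart rather than exactly 1/8000 s. A minimal sketch of an alternative, which builds the time axis from the sample indices so the spacing matches the sampling rate exactly, is shown below. The variable names (f, duration, freq, time_s, s_exact) are chosen here for illustration only.

```r
# Assumed values for illustration: 8000 Hz sampling rate, 1 s duration, 440 Hz tone
f <- 8000
duration <- 1
freq <- 440

# Time of each sample: n samples spaced exactly 1/f seconds apart
n <- f * duration
time_s <- (0:(n - 1)) / f

# Sine wave evaluated at those sample times
s_exact <- sin(2 * pi * freq * time_s)

length(s_exact)     # number of samples
diff(time_s[1:2])   # spacing between consecutive samples (1/f)
```

For short tones the difference is negligible, but an index-based time axis keeps the nominal frequency and the sampling rate consistent.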
Any single-column matrix can also be read as a sound:
s2 <- as.matrix(s1)
is.matrix(s2)
## [1] TRUE
dim(s2)
## [1] 8000 1
oscillo(s2, f = 8000, from = 0, to = 0.01)
If the matrix has more than one column, only the first column will be considered:
x <- rnorm(8000)
s3 <- cbind(s2, x)
is.matrix(s3)
## [1] TRUE
dim(s3)
## [1] 8000 2
oscillo(s3, f = 8000, from = 0, to = 0.01)
The class ts and the related functions ts(), as.ts() and is.ts() can also be used to generate sound objects in R. The following command similarly generates a time series corresponding to a 440 Hz sinusoidal sound sampled at 8000 Hz for one second:
s4 <- ts(data = s1, start = 0, frequency = 8000)
str(s4)
## Time-Series [1:8000] from 0 to 1: 0 0.339 0.637 0.861 0.982 ...
To generate 0.5 seconds of random noise:
s4 <- ts(data = runif(4000, min = -1, max = 1), start = 0, end = 0.5, frequency = 8000)
str(s4)
## Time-Series [1:4001] from 0 to 0.5: 0.851 -0.08 -0.304 -0.889 -0.145 ...
The frequency() and deltat() functions return the sampling frequency (\(f\)) and the time resolution (\(\Delta t\)), respectively:
frequency(s4)
## [1] 8000
deltat(s4)
## [1] 0.000125
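The two values are reciprocals of each other (\(\Delta t = 1 / f\)). A minimal base R sketch of that relationship follows, using a hypothetical series named noise that is created only for this illustration:

```r
# Hypothetical example series: 4000 uniform random samples at 8000 Hz
noise <- ts(data = runif(4000, min = -1, max = 1), start = 0, frequency = 8000)

frequency(noise)       # sampling frequency in Hz
deltat(noise)          # time between consecutive samples in seconds
1 / frequency(noise)   # same value as deltat(noise)

# Duration in seconds: number of samples divided by the sampling frequency
length(noise) / frequency(noise)
```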
As the frequency is incorporated into the ts objects, it is not necessary to specify it when used within functions dedicated to audio:
oscillo(s4, from = 0, to = 0.01)
In the case of multiple time series, seewave functions will consider only the first series:
s5 <- ts(data = s3, frequency = 8000)
class(s5)
## [1] "mts" "ts" "matrix"
oscillo(s5, from = 0, to = 0.01)
There are 3 kinds of objects corresponding to the wav binary format or the mp3 compressed format:
Wave class of the package tuneR
sound class of the package phonTools
audioSample class of the package audio
Wave class (tuneR)
The Wave class comes with the tuneR package. This S4 class includes different “slots” with the amplitude data (left or right channel), the sampling frequency, the number of bits (8/16/24/32) and the type of sound (mono/stereo). Files with high sampling rates (> 44100 Hz) can be read into this type of object.
The function to import .wav files from the hard drive is readWave():
# load packages
library(tuneR)
s6 <- readWave("./examples/Phae.long1.wav")
We can verify the class of the object like this:
# object class
class(s6)
## [1] "Wave"
## attr(,"package")
## [1] "tuneR"
S4 objects have a structure similar to lists but use ‘@’ to access each position (slot):
# structure
str(s6)
## Formal class 'Wave' [package "tuneR"] with 6 slots
## ..@ left : int [1:56251] 162 -869 833 626 103 -2 43 19 47 227 ...
## ..@ right : num(0)
## ..@ stereo : logi FALSE
## ..@ samp.rate: int 22500
## ..@ bit : int 16
## ..@ pcm : logi TRUE
# extract one slot
s6@samp.rate
## [1] 22500
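Slot access works the same way for any S4 class. The sketch below defines a hypothetical toy class (ToySound, invented for this illustration and not related to tuneR's Wave class) to show the ‘@’ operator alongside the equivalent slot() and slotNames() functions from the methods package:

```r
library(methods)

# Hypothetical toy class that loosely imitates a sound object (not tuneR's Wave)
setClass("ToySound", representation(samples = "numeric", samp.rate = "numeric"))

toy <- new("ToySound",
           samples = sin(2 * pi * 440 * (0:7999) / 8000),
           samp.rate = 8000)

slotNames(toy)           # names of the available slots
toy@samp.rate            # access a slot with '@'
slot(toy, "samp.rate")   # equivalent functional form
isVirtualClass("ToySound")  # FALSE: the class can be instantiated
```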
“Pulse-code modulation (PCM) is a method used to digitally represent sampled analog signals. It is the standard form of digital audio. In a PCM stream, the amplitude of the analog signal is sampled regularly at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps” (Wikipedia).
The samples come in the slot ‘@left’:
# samples
s6@left[1:40]
## [1] 162 -869 833 626 103 -2 43 19 47 227 -4 205 564 171 457
## [16] 838 -216 60 76 -623 -213 168 -746 -248 175 -512 -58 651 -85 -213
## [31] 586 40 -407 371 -51 -587 -92 94 -527 40
The number of samples is given by the duration and the sampling rate.
Exercise
wave object using the information in the object?
s6 using indexing (and square brackets)
An advantage of readWave() is the ability to read specific segments of sound files, which is especially useful with long files. This is done using the from and to arguments and specifying the unit of time with the units argument. The units can also be set to “samples”, “minutes” or “hours”. For example, to read only the section of the file “Phae.long1.wav” that begins at 1 s and ends at 5 s (the file lasts only about 2.5 s, so the segment read stops at the end of the file):
s7 <- readWave("./examples/Phae.long1.wav", from = 1, to = 5, units = "seconds")
s7
##
## Wave Object
## Number of Samples: 33751
## Duration (seconds): 1.5
## Samplingrate (Hertz): 22500
## Channels (Mono/Stereo): Mono
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
The .mp3 files can be imported into R, although they are stored in Wave format. This is done using the readMP3() function:
s7 <- readMP3("./examples/Phae.long1.mp3")
s7
##
## Wave Object
## Number of Samples: 56448
## Duration (seconds): 2.56
## Samplingrate (Hertz): 22050
## Channels (Mono/Stereo): Mono
## PCM (integer format): TRUE
## Bit (8/16/24/32/64): 16
To obtain information about the object (sampling frequency, number of bits, mono/stereo), it is necessary to use the indexing of S4 class objects:
s7@samp.rate
## [1] 22050
s7@bit
## [1] 16
s7@stereo
## [1] FALSE
A property that does not appear in these calls is that readWave() does not normalize the sound. For signed integer PCM, the values that describe the sound lie between \(-2^{bit - 1}\) and \(2^{bit - 1} - 1\):
range(s7@left)
## [1] -32768 32767
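The limits for other bit depths follow the same rule. The sketch below assumes signed integer samples (8-bit WAV files are conventionally unsigned, so they are a separate case). The helper pcm_range() is a hypothetical function written for this illustration and is not part of tuneR.

```r
# Range of signed integer PCM sample values for a given bit depth
# (pcm_range is a hypothetical helper, not a tuneR function)
pcm_range <- function(bit) {
  c(min = -2^(bit - 1), max = 2^(bit - 1) - 1)
}

pcm_range(16)
pcm_range(24)

# Rescale 16-bit integer samples to the interval [-1, 1)
samples <- c(-32768, 0, 16384, 32767)
samples / 2^(16 - 1)
```

Dividing by \(2^{bit - 1}\) is one common way to bring integer samples onto a common scale before comparing recordings with different bit depths.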
Exercise
The function Wave() can be used to create wave objects.
Run the example code in the function documentation
Plot the oscillogram for the first 0.01 s of ‘Wobj’
Note that the function sine() provides a shortcut that can be used to create a wave object with a sine wave. Check out other similar functions described in the sine() function documentation. Try 4 of these alternative functions and plot the oscillogram of the first 0.01 s for each of them.
The function read_sound_file() from warbleR is a wrapper over several sound file reading functions that can read files in ‘wav’, ‘mp3’, ‘flac’ and ‘wac’ format:
library(warbleR)
# wave
rsf1 <- read_sound_file("Phaethornis-eurynome-15607.wav", path = "./examples")
class(rsf1)
## [1] "Wave"
## attr(,"package")
## [1] "tuneR"
# mp3
rsf2 <- read_sound_file("Phaethornis-striigularis-154074.mp3", path = "./examples")
class(rsf2)
## [1] "Wave"
## attr(,"package")
## [1] "tuneR"
# flac
rsf3 <- read_sound_file("Phae.long1.flac", path = "./examples")
class(rsf3)
## [1] "Wave"
## attr(,"package")
## [1] "tuneR"
# wac
rsf4 <- read_sound_file("recording_20170716_230503.wac", path = "./examples")
## WAC with trigger(s), converting individual segment(s) to WAV file(s).
class(rsf4)
## [1] "Wave"
## attr(,"package")
## [1] "tuneR"
The function can also read recordings hosted in an online repository:
rsf5 <- read_sound_file(X = "https://xeno-canto.org/35340/download")
class(rsf5)
## [1] "Wave"
## attr(,"package")
## [1] "tuneR"
rsf5 <- read_sound_file(X = "https://github.com/maRce10/BOKU-Analysis-of-animal-acoustic-signals-in-R-2022/raw/master/examples/Phae.long1.flac")
class(rsf5)
## [1] "Wave"
## attr(,"package")
## [1] "tuneR"
sound (phonTools)
The loadsound() function of phonTools also imports ‘wave’ sound files into R, in this case as objects of class sound:
library(phonTools)
s8 <- loadsound("./examples/Phae.long1.wav")
s8
##
## Sound Object
##
## Read from file: ./examples/Phae.long1.wav
## Sampling frequency: 22500 Hz
## Duration: 2500.044 ms
## Number of Samples: 56251
str(s8)
## List of 5
## $ filename : chr "./examples/Phae.long1.wav"
## $ fs : int 22500
## $ numSamples: num 56251
## $ duration : num 2500
## $ sound : Time-Series [1:56251] from 0 to 2.5: 0.00494 -0.02652 0.02542 0.0191 0.00314 ...
## - attr(*, "class")= chr "sound"
This function only imports files with a dynamic range of 8 or 16 bits.
audioSample (audio)
The audio package is another option to handle .wav files. The sound can be imported using the load.wave() function. The class of the resulting object is audioSample, which is essentially a numerical vector (for mono) or a numerical matrix with two rows (for stereo). The sampling frequency and resolution are saved as attributes:
library(audio)
s10 <- load.wave("./examples/Phae.long1.wav")
head(s10)
## sample rate: 22500Hz, mono, 16-bits
## [1] 4.943848e-03 -2.652058e-02 2.542114e-02 1.910400e-02 3.143311e-03
## [6] -6.103702e-05
s10$rate
## [1] 22500
s10$bits
## [1] 16
The main advantage of the audio package is that the sound can be acquired directly within an R session. This is achieved by first preparing a vector of NAs and then using the record() function. For example, to obtain a mono sound of 5 seconds sampled at 16 kHz:
s11 <- rep(NA_real_, 16000 * 5)
record(s11, 16000, 1)
A recording session can be controlled by three complementary functions: pause(), rewind(), and resume().
For maximum compatibility with other sound programs, it may be useful to save a sound as a simple .txt file. The following commands will write a “tico.txt” file:
data(tico)
export(tico, f = 22050)
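To illustrate what a plain-text representation involves, here is a minimal base R sketch of a round trip through a text file. It is not how seewave's export() is implemented. The tone, the temporary file and the variable names are assumptions made for this example.

```r
# Hypothetical one-second 440 Hz tone sampled at 8000 Hz
f <- 8000
tone <- sin(2 * pi * 440 * (0:(f - 1)) / f)

# Write one amplitude value per line to a temporary text file
txt_file <- tempfile(fileext = ".txt")
write.table(tone, file = txt_file, row.names = FALSE, col.names = FALSE)

# Read the values back; the sampling rate is not stored in this file,
# so it has to be recorded separately
tone_back <- scan(txt_file, quiet = TRUE)
all.equal(tone, tone_back)
```

A text file written this way holds only the amplitude values, so the sampling rate must be supplied again when the data are read into another program.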
tuneR and audio have a function to write .wav files: writeWave() and save.wave(), respectively. Within seewave, the savewav() function, which is based on writeWave(), can be used to save data in .wav format. By default, the object name will be used for the name of the .wav file:
savewav(tico)
Free Lossless Audio Codec (FLAC) is a file format for lossless audio data compression. FLAC reduces bandwidth and storage requirements without sacrificing the integrity of the audio source. Audio sources encoded in FLAC are generally reduced in size by 40 to 50 percent. See the flac website for more details (flac.sourceforge.net).
The .flac format cannot be used as such with R. However, the wav2flac() function allows you to call the FLAC software directly from the console. Therefore, FLAC must be installed on your operating system. If you have a .wav file that you want to compress to .flac, call:
wav2flac(file = "./examples/Phae.long1.wav", overwrite = FALSE)
To decompress a .flac file back to .wav format, the argument reverse = TRUE must be used:
wav2flac("Phae.long1.flac", reverse = TRUE)
wave objects
Wave objects can be played with the play() function of tuneR. The default players of the play() function may not be installed in the operating system. In that case, setWavPlayer() can be used to define the command that will be used by play(). For example, if Audacious is the player to use on Linux:
setWavPlayer("audacious")
play(tico)
The function of the same name in the audio package does the same on audioSample objects:
x <- audioSample(sin(1:8000/10), 8000)
play(x)
The seewave package includes the listen() function (based on play() of tuneR), which works similarly but also accepts all specific and non-specific kinds of sound objects in R and can play segments selected with the arguments from and to:
x <- sin(1:160000/10)
listen(x, f = 16000, from = 0, to = 2)
This table, taken from Sueur (2018), summarizes the functions available to import and export sound files in R. The table is incomplete since it does not mention the functions of the phonTools package:
Exercise
How does the sampling rate affect the size of an audio file?
How does the dynamic range affect the size of an audio file?
Use the system.time() function to compare the performance of the different functions to import audio files in R. For this, use the file “LBH.374.SUR.wav” (Long-billed hermit songs), which lasts about 2 min
Earlier we created wave objects using the sine() function and analogous functions. Save those wave objects as audio files and play them. Try increasing their duration to get a better sense of what they sound like.
The following code creates a plot similar to oscillo() but using dots instead of lines:
# generate sine wave
wav <- sine(freq = 440, duration = 500, xunit = "samples", samp.rate = 44100)
# plot
plot(wav@left)
Use downsample() to reduce the sampling rate of ‘wav’ (below 44100) and plot the output object. Decrease the sampling rate until you cannot recognize the wave pattern from the original wave object. Try several values so you get a sense of the sampling rate at which this happens.
Sueur, J., Aubin, T., & Simonis, C. (2008). Equipment review: seewave, a free modular tool for sound analysis and synthesis. Bioacoustics, 18(2), 213–226.
Sueur, J. (2018). Sound Analysis and Synthesis with R.
Sueur, J. (2018). I/O of sound with R. seewave package vignette. URL: https://cran.r-project.org/web/packages/seewave/vignettes/seewave_IO.pdf
Session information
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=es_CR.UTF-8 LC_COLLATE=es_ES.UTF-8
## [5] LC_MONETARY=es_CR.UTF-8 LC_MESSAGES=es_ES.UTF-8
## [7] LC_PAPER=es_CR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=es_CR.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] audio_0.1-10 phonTools_0.2-2.1 warbleR_1.1.27 NatureSounds_1.0.4
## [5] tuneR_1.3.3.1 seewave_2.2.0 knitr_1.37
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.8 bslib_0.2.5.1 compiler_4.1.1 formatR_1.11
## [5] jquerylib_0.1.4 highr_0.9 moments_0.14 bitops_1.0-7
## [9] tools_4.1.1 digest_0.6.29 jsonlite_1.7.2 evaluate_0.15
## [13] fftw_1.0-6.1 rlang_1.0.2 cli_3.2.0 rstudioapi_0.13
## [17] yaml_2.3.5 parallel_4.1.1 xfun_0.30 fastmap_1.1.0
## [21] stringr_1.4.0 sass_0.4.0 R6_2.5.1 dtw_1.22-3
## [25] pbapply_1.5-0 rmarkdown_2.10 magrittr_2.0.2 htmltools_0.5.2
## [29] bioacoustics_0.2.8 MASS_7.3-54 stringi_1.7.6 proxy_0.4-26
## [33] signal_0.7-7 RCurl_1.98-1.6 rjson_0.2.21