One aspect of scientific reproducibility that is specific to the biological sciences is how taxonomic information is obtained, used, and reported. In the discussion we covered the use of vouchers and the importance of properly documenting the justification for taxonomic identifications. Here we look at a tool for obtaining and curating taxonomic information reproducibly.
We will use the R package ‘taxize’. Incidentally, the article describing the package was published in F1000Research, one of the fully open journals we saw at the beginning of the semester.
Online databases provide taxonomic information for a wide range of organisms, but there are advantages to running these searches programmatically.
The idea behind taxize is to make extracting and using taxonomic information easy and reproducible.
Image: Rohan Chakravarty/CC BY-NC-ND 3.0.
‘taxize’ connects to several taxonomic databases, and more can be added over time. This information can be used to carry out common tasks in the research process. For example:
If we have a list of specimens, we probably want to know whether the names we are using are current and spelled correctly. We can check this with the Global Names Resolver (GNR) application of the Encyclopedia of Life, through taxize.
As an example, let's look at occurrence data that I downloaded from GBIF. I downloaded the records of birds of the genus Ramphocelus from Costa Rica held in the Colección Nacional de Zoología. Perhaps I am doing, or plan to do, a study with these specimens.
The data are here (https://doi.org/10.15468/dl.d8frtc)
and this is an example of the bird:
Ramphocelus sanguinolentus, La Fortuna, Costa Rica
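# read the data
dat <- read.csv(file = "./additional_files/0098054-200613084148143.csv",
    header = TRUE, sep = "\t")
# which species occur in CR?
# read.csv() returns character columns by default since R 4.0.0, so
# levels(dat$species) gives NULL; unique() works for both types
Ram.names <- sort(unique(as.character(dat$species)))
Ram.names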
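A note on extracting the species names: since R 4.0.0, `read.csv()` leaves text columns as character vectors unless `stringsAsFactors = TRUE` is set, and `levels()` returns `NULL` for a character vector. The following minimal sketch illustrates the difference with a made-up data frame (`toy` is hypothetical and is not the GBIF download).

```r
# Toy data frame, made up for illustration (not the GBIF download)
toy <- data.frame(species = c("Ramphocelus passerinii",
                              "Ramphocelus sanguinolentus",
                              "Ramphocelus passerinii"),
                  stringsAsFactors = FALSE)

# A character vector has no levels, so this returns NULL
lev.chr <- levels(toy$species)

# unique() works on character vectors; sort() orders the names
spp.unique <- sort(unique(toy$species))

# Converting to a factor first gives the same set of names
spp.factor <- levels(factor(toy$species))

spp.unique
```

Either `unique()` or an explicit `factor()` conversion gives a usable vector of names to pass on to taxize.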
Let's see which databases I can use to look up the names of my species
require(taxize)
require(kableExtra)
data.sources <- gnr_datasources()
data.sources[, c(1, 5, 8, 9)] %>%
    kbl() %>%
    kable_minimal()
created_at | id | refresh_period_days | title |
---|---|---|---|
2012-07-06T11:36:36Z | 1 | 14 | Catalogue of Life Checklist |
2012-07-06T11:38:14Z | 2 | 14 | Wikispecies |
2012-02-09T10:31:13Z | 3 | 14 | Integrated Taxonomic Information SystemITIS |
2012-02-09T10:47:55Z | 4 | 14 | National Center for Biotechnology Information |
2012-02-09T11:16:43Z | 5 | 14 | Index Fungorum (Species Fungorum) |
2012-02-09T11:28:38Z | 6 | 14 | GRIN Taxonomy for Plants |
2012-02-09T11:32:18Z | 7 | 14 | Union 4 |
2012-02-09T12:08:54Z | 8 | 14 | The Interim Register of Marine and Nonmarine Genera |
2012-02-09T12:40:45Z | 9 | 14 | World Register of Marine Species |
2012-02-09T12:55:04Z | 10 | 14 | Freebase |
2012-02-09T13:01:40Z | 11 | 14 | GBIF Backbone Taxonomy |
2012-02-09T15:36:33Z | 12 | 14 | Encyclopedia of Life |
2012-02-09T18:21:08Z | 93 | 14 | Passiflora vernacular names |
2012-02-09T18:21:09Z | 94 | 14 | Inventory of Fish Species in the Wami River Basin |
2012-02-09T18:21:09Z | 95 | 14 | Pheasant Diversity and Conservation in the Mt. Gaoligonshan Region |
2012-02-09T18:21:09Z | 96 | 14 | Finding Species |
2012-02-09T18:21:10Z | 97 | 14 | Birds of Lindi Forests Plantation |
2012-02-09T18:21:11Z | 98 | 14 | Nemertea |
2012-02-09T18:21:12Z | 99 | 14 | Kihansi Gorge Amphibian Species Checklist |
2012-02-09T18:21:12Z | 100 | 14 | Mushroom Observer |
2012-02-09T18:21:14Z | 101 | 14 | TaxonConcept |
2012-02-09T18:21:15Z | 102 | 14 | Amphibia and Reptilia of Yunnan |
2012-02-09T18:21:17Z | 103 | 14 | Common names of Chilean Plants |
2012-07-06T11:49:07Z | 104 | 14 | Invasive Species of Belgium |
2012-02-09T18:21:20Z | 105 | 14 | ZooKeys |
2012-02-09T18:21:23Z | 106 | 14 | COA Wildlife Conservation List |
2012-02-09T18:21:25Z | 107 | 14 | AskNature |
2012-02-09T18:21:31Z | 108 | 14 | China: Yunnan, Southern Gaoligongshan, Rapid Biological Inventories Report No. 04 |
2012-02-09T18:21:34Z | 109 | 14 | Native Orchids from Gaoligongshan Mountains, China |
2012-02-09T18:21:37Z | 110 | 14 | Illinois Wildflowers |
2012-02-09T18:21:45Z | 112 | 14 | Coleorrhyncha Species File |
2012-02-09T18:21:46Z | 113 | 14 | /home/dimus/files/dwca/zoological names.zip |
2012-02-09T18:21:57Z | 114 | 14 | Peces de la zona hidrogeográfica de la Amazonia, Colombia (Spreadsheet) |
2012-02-09T18:22:04Z | 115 | 14 | Eastern Mediterranean Syllidae |
2012-02-09T18:22:06Z | 116 | 14 | Gaoligong Shan Medicinal Plants Checklist |
2012-02-09T18:22:14Z | 117 | 14 | birds_of_tanzania |
2012-02-09T18:22:23Z | 118 | 14 | AmphibiaWeb |
2012-02-09T18:22:38Z | 119 | 14 | tanzania_plant_sepecimens |
2012-02-09T18:22:45Z | 120 | 14 | Papahanaumokuakea Marine National Monument |
2012-02-09T18:23:21Z | 121 | 14 | Taiwanese IUCN species list |
2012-02-09T18:23:27Z | 122 | 14 | BioPedia |
2012-02-09T18:24:06Z | 123 | 14 | AnAge |
2012-02-09T18:24:25Z | 124 | 14 | Embioptera Species File |
2012-02-09T18:24:28Z | 125 | 14 | Global Invasive Species Database |
2012-02-09T18:24:38Z | 126 | 14 | Sendoya S., Fernández F. AAT de hormigas (Hymenoptera: Formicidae) del Neotrópico 1.0 2004 (Spreadsheet) |
2012-02-09T18:25:00Z | 127 | 14 | Flora of Gaoligong Mountains |
2012-02-09T18:25:16Z | 128 | 14 | ARKive |
2012-02-09T18:25:27Z | 129 | 14 | True Fruit Flies (Diptera, Tephritidae) of the Afrotropical Region |
2012-02-09T18:25:30Z | 130 | 14 | 3i - Typhlocybinae Database |
2012-02-09T18:26:09Z | 131 | 14 | CATE Sphingidae |
2012-02-09T18:26:28Z | 132 | 14 | ZooBank |
2012-02-09T18:26:44Z | 133 | 14 | Diatoms |
2012-02-09T18:27:14Z | 134 | 14 | AntWeb |
2012-02-09T18:27:40Z | 135 | 14 | Endemic species in Taiwan |
2012-02-09T18:28:15Z | 136 | 14 | Dermaptera Species File |
2012-02-09T18:28:21Z | 137 | 14 | Mantodea Species File |
2012-02-09T18:28:29Z | 138 | 14 | Birds of the World: Recommended English Names |
2012-02-09T18:29:01Z | 139 | 14 | New Zealand Animalia |
2012-02-09T18:30:39Z | 140 | 14 | Blattodea Species File |
2012-02-09T18:30:57Z | 141 | 14 | Plecoptera Species File |
2012-02-09T18:31:58Z | 143 | 14 | Coreoidea Species File |
2012-02-09T18:32:28Z | 144 | 14 | Freshwater Animal Diversity Assessment - Normalized export |
2012-02-09T18:33:38Z | 145 | 14 | Catalogue of Vascular Plant Species of Central and Northeastern Brazil |
2012-02-09T18:35:12Z | 146 | 14 | Wikipedia in EOL |
2012-02-09T18:36:49Z | 147 | 14 | Database of Vascular Plants of Canada (VASCAN) |
2012-02-09T18:38:13Z | 148 | 14 | Phasmida Species File |
2012-02-09T18:38:29Z | 149 | 14 | OBIS |
2012-02-09T18:40:09Z | 150 | 14 | USDA NRCS PLANTS Database |
2012-02-09T18:42:04Z | 151 | 14 | Catalog of Fishes |
2012-02-09T18:43:41Z | 152 | 14 | Aphid Species File |
2012-02-09T18:44:03Z | 153 | 14 | The National Checklist of Taiwan |
2012-02-09T18:46:06Z | 154 | 14 | Psocodea Species File |
2012-02-09T18:46:24Z | 155 | 14 | FishBase |
2012-02-09T18:48:19Z | 156 | 14 | 3i - Typhlocybinae Database |
2012-02-09T18:48:44Z | 157 | 14 | Belgian Species List |
2012-02-09T18:51:49Z | 158 | 14 | EUNIS |
2012-02-09T18:58:36Z | 159 | 14 | CU*STAR |
2012-02-09T19:10:42Z | 161 | 14 | Orthoptera Species File |
2012-02-09T19:11:37Z | 162 | 14 | Bishop Museum |
2012-02-09T19:18:20Z | 163 | 14 | IUCN Red List of Threatened Species |
2012-02-09T19:20:46Z | 164 | 14 | BioLib.cz |
2012-02-09T19:43:03Z | 165 | 14 | Tropicos - Missouri Botanical Garden |
2012-02-09T20:05:41Z | 166 | 14 | nlbif |
2012-02-09T20:36:27Z | 167 | 14 | The International Plant Names Index |
2012-05-07T13:45:07Z | 168 | 14 | Index to Organism Names |
2012-05-07T13:50:15Z | 169 | 14 | uBio NameBank |
2013-05-31T01:17:28Z | 170 | 14 | Arctos |
2013-12-10T03:02:58Z | 171 | 14 | Checklist of Beetles (Coleoptera) of Canada and Alaska. Second Edition. |
2014-12-08T11:17:24Z | 172 | 14 | The Paleobiology Database |
2014-12-08T19:50:56Z | 173 | 14 | The Reptile Database |
2014-12-09T21:27:18Z | 174 | 14 | The Mammal Species of The World |
2014-12-11T00:19:59Z | 175 | 14 | BirdLife International |
2015-03-03T13:48:51Z | 176 | 14 | Checklist da Flora de Portugal (Continental, Açores e Madeira) |
2016-07-20T11:13:25Z | 177 | 14 | FishBase Cache |
2016-10-18T20:00:31Z | 178 | 14 | Silva |
2016-10-19T10:13:10Z | 179 | 14 | Open Tree of Life Reference Taxonomy |
2016-10-30T00:46:40Z | 180 | 14 | iNaturalist |
2016-11-03T16:09:05Z | 181 | 14 | The Interim Register of Marine and Nonmarine Genera |
2017-03-22T15:26:50Z | 182 | 14 | Gymno |
2020-05-25T02:43:22Z | 183 | 14 | Index Animalium |
2020-05-25T10:32:16Z | 184 | 14 | ASM Mammal Diversity Database |
2020-05-27T01:41:11Z | 185 | 14 | IOC World Bird List |
2020-05-28T00:01:22Z | 186 | 14 | MCZbase |
2020-05-28T16:50:17Z | 187 | 14 | The eBird/Clements Checklist of Birds of the World |
2020-05-30T00:58:32Z | 188 | 14 | American Ornithological Society |
2020-05-31T00:36:35Z | 189 | 14 | Howard and Moore Complete Checklist of the Birds of the World |
2020-05-31T01:23:24Z | 193 | 14 | Myriatrix |
2021-03-19T16:48:41Z | 194 | 14 | PLAZI treatments |
2021-10-21T12:27:48Z | 195 | 14 | AlgaeBase |
2021-12-28T13:04:34Z | 196 | 14 | World Flora Online consortium |
2021-12-29T13:29:11Z | 197 | 14 | World Checklist of Vascular Plants |
2021-12-30T14:32:51Z | 198 | 14 | The Leipzig Catalogue of Vascular Plants |
2022-01-14T22:16:40Z | 200 | 14 | The Terrestrial Parasite Tracker |
2022-02-14T15:59:43Z | 201 | 14 | ICTV Virus Taxonomy |
2022-02-18T21:58:40Z | 202 | 14 | Discover Life Bee Species Guide |
Let's check whether they are spelled correctly
name.res <- gnr_resolve(sci = Ram.names, data_source_ids = c(3:4))
name.res[, -1] %>%
    kbl() %>%
    kable_minimal()
And what if they were not?
Ram.names2 <- Ram.names
Ram.names2[2] <- "Ramphocelus passerini"
name.res2 <- gnr_resolve(sci = Ram.names2, data_source_ids = c(3:4))
name.res2[, -1] %>%
    kbl() %>%
    kable_minimal()
submitted_name | matched_name | data_source_title | score |
---|---|---|---|
Ramphocelus passerini | Ramphocelus passerinii Bonaparte, 1831 | Integrated Taxonomic Information SystemITIS | 0.75 |
Ramphocelus passerini | Ramphocelus passerinii | National Center for Biotechnology Information | 0.75 |
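To act on a result like this, it helps to flag the submitted names whose spelling differs from the matched name. The sketch below rebuilds the table above by hand so that it runs without a network connection. The object `res.tab` and the helper `binomial()` are hypothetical names introduced only for this illustration. It assumes that the first two words of `matched_name` are the genus and the specific epithet.

```r
# Values copied from the table above (output of gnr_resolve)
res.tab <- data.frame(
  submitted_name = c("Ramphocelus passerini", "Ramphocelus passerini"),
  matched_name = c("Ramphocelus passerinii Bonaparte, 1831",
                   "Ramphocelus passerinii"),
  data_source_title = c("Integrated Taxonomic Information SystemITIS",
                        "National Center for Biotechnology Information"),
  score = c(0.75, 0.75),
  stringsAsFactors = FALSE
)

# Keep only the first two words (genus + epithet), dropping authorship
binomial <- function(x) {
  vapply(strsplit(x, " ", fixed = TRUE),
         function(w) paste(w[seq_len(min(2, length(w)))], collapse = " "),
         character(1))
}

res.tab$matched_binomial <- binomial(res.tab$matched_name)
res.tab$differs <- res.tab$submitted_name != res.tab$matched_binomial

# One row per submitted name and suggested spelling
to.fix <- unique(res.tab[res.tab$differs,
                         c("submitted_name", "matched_binomial")])
to.fix
```

Names listed in `to.fix` still need a human decision: a different spelling in the database is a suggestion, not proof that the submitted name is wrong.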
Let's look for synonyms of these species
Some databases require an ‘API key’. ‘taxize’ cannot obtain one automatically, but it provides instructions on how to get an API key and store it for use from R. Let's look at a couple of examples:
use_tropicos()
use_iucn()
use_entrez()
# for more information
?key_helpers
?`taxize-authentication`
# open .Renviron to store the key, then restart R
install.packages("usethis")
usethis::edit_r_environ()
# check that the key can be found
getkey(service = "entrez")
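Once a key has been saved in `.Renviron`, R exposes it as an environment variable. The sketch below checks which keys are visible in the current session, using base R only. The variable names follow the taxize authentication documentation as I understand it (`TROPICOS_KEY`, `IUCN_REDLIST_KEY`, `ENTREZ_KEY`). Treat them as an assumption and confirm them in the help page for your installed version. `key.vars` and `key.set` are names introduced here.

```r
# Environment variable names assumed from the taxize authentication docs
key.vars <- c(tropicos = "TROPICOS_KEY",
              iucn = "IUCN_REDLIST_KEY",
              entrez = "ENTREZ_KEY")

# Sys.getenv() returns "" for unset variables; nzchar() turns that into FALSE
key.set <- nzchar(Sys.getenv(key.vars))
names(key.set) <- names(key.vars)

# TRUE means a key is available in this session
key.set
```

A `FALSE` right after editing `.Renviron` usually just means that R has not been restarted yet.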
We can obtain information on the higher taxonomic classification of our species. If your key is for ‘tropicos’ or ‘entrez’, you can use the corresponding databases (tropicos and ncbi). For example, if we only want to know the family:
Ram.fam <- tax_name(sci = Ram.names, get = "family", db = "ncbi")
Ram.fam %>%
    kbl() %>%
    kable_minimal()
We may want to know which, or how many, taxa belong to a given taxonomic group. For example, how many species are there in the genus Ramphocelus?
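One way to approach this kind of question is `downstream()`, which taxize provides for retrieving the taxa below a given name. What follows is only a sketch: the helper `species_in_genus()` is a hypothetical name introduced here, it assumes that taxize is installed and that a network connection is available, and the number of species returned depends on the database queried.

```r
# Hypothetical helper: list the species a database places in a genus.
# Assumes taxize is installed and the machine is online.
species_in_genus <- function(genus, db = "ncbi") {
  res <- taxize::downstream(genus, db = db, downto = "species")
  # downstream() returns a list with one element per queried name
  res[[1]]
}

# Run only in an interactive session, because the lookup needs the
# network and may ask which record to use if several match
if (interactive() && requireNamespace("taxize", quietly = TRUE)) {
  ram.spp <- species_in_genus("Ramphocelus")
  nrow(ram.spp)
}
```

Different databases follow different taxonomic authorities, so it is worth reporting which database and which date the count comes from.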
If you have the ‘API key’ for IUCN, you can obtain information on conservation status.
NOTE: the authors of ‘taxize’ warn that this should be used with great care, because there may be errors
Let's look at some examples of how tools such as ‘taxize’ contribute to more reproducible research.
Article by Gibb et al. 2020, Nature
From the methods section:
“We compiled animal host–pathogen associations from several source databases, to provide as comprehensive a dataset as possible of zoonotic host species and their pathogens: the Enhanced Infectious Diseases (EID2) database; the Global Mammal Parasite Database v.2.0 (GMPD2) which collates records of parasites of cetartiodactyls, carnivores and primates; a reservoir hosts database; a mammal–virus associations database; and a rodent zoonotic reservoirs database augmented with pathogen data from the Global Infectious Disease and Epidemiology Network (GIDEON) (Supplementary Table 8). We harmonized species names across all databases, excluding instances in which either hosts or pathogens could not be classified to species level. To prevent erroneous matches due to misspelling or taxonomic revision, all host species synonyms were accessed from Catalogue Of Life using ‘taxize’ v.0.8.9 (ref. 39). Combined, the dataset contained 20,382 associations between 3,883 animal host species and 5,694 pathogen species.”
Let's look at the code from the article and make a small change so that we can apply the function to our data
# taxize/GBIFr
require(taxize)
require(rgbif)
library(plyr)
# function to find and resolve taxonomic synonyms based on
# Encyclopedia of Life
findSyns2 = function(x) {
  # get specific species name
  # taxname = hosts_vec[x]  (original line in the article)
  # small change so the function can be used with our data
  taxname = x
  # print progress
  print(paste("Processing:", taxname, sep = " "))
  # phyla
  phyla = c("Chordata", "Arthropoda", "Gastropoda", "Mollusca")
  # (1) resolve misspellings
  taxname_resolved = gnr_resolve(taxname, with_canonical_ranks = TRUE)$matched_name2[1]
  if (!is.null(taxname_resolved)) {
    if (length(strsplit(taxname_resolved, " ", fixed = TRUE)[[1]]) ==
        2) {
      taxa = taxname_resolved
    }
  }
  if (!is.null(taxname_resolved)) {
    if (length(strsplit(taxname_resolved, " ", fixed = TRUE)[[1]]) >
        2) {
      taxa = paste(strsplit(taxname_resolved, " ", fixed = TRUE)[[1]][1:2],
          collapse = " ")
    }
  }
  # if taxa == NA, return list with nothing defined
  if (is.na(taxa)) {
    if (class(syns)[1] == "simpleError") {
      return(data.frame(Original = taxname, Submitted = taxname_resolved,
          Accepted_name = NA, Selected_family = NA, Selected_order = NA,
          Selected_class = NA, Synonyms = NA))
    }
  }
  # (2) remove sub-species categorisations and set 'genus' and
  # 'species' variables
  genus = NULL
  if (length(strsplit(taxa, " ", fixed = TRUE)[[1]]) %in% c(2, 3)) {
    genus = strsplit(taxa, " ", fixed = TRUE)[[1]][1]
    species = strsplit(taxa, " ", fixed = TRUE)[[1]][2]
  }
  if (length(strsplit(taxa, "_", fixed = TRUE)[[1]]) %in% c(2, 3)) {
    genus = strsplit(taxa, "_", fixed = TRUE)[[1]][1]
    species = strsplit(taxa, "_", fixed = TRUE)[[1]][2]
  }
  if (length(strsplit(taxa, " ", fixed = TRUE)[[1]]) > 3 | length(strsplit(taxa,
      "_", fixed = TRUE)[[1]][1]) > 3) {
    return("name error")
  }
  if (is.null(genus)) {
    genus = taxa
    species = NA
  }
  # (3) use genus to lookup family, order, class
  syns = tryCatch(name_lookup(genus)$data, error = function(e) e)
  if (class(syns)[1] == "simpleError") {
    return(data.frame(Original = taxname, Submitted = taxa, Accepted_name = NA,
        Selected_family = NA, Selected_order = NA, Selected_class = NA,
        Synonyms = NA))
  }
  # for cases where the lookup does not find a phylum within
  # the specified range
  if (all(!syns$phylum %in% phyla)) {
    fam1 = syns$family[!is.na(syns$family) & !is.na(syns$phylum)]
    order1 = syns$order[!is.na(syns$family) & !is.na(syns$phylum)]
    class1 = syns$class[!is.na(syns$family) & !is.na(syns$phylum)]
    datfam = data.frame(fam1 = fam1, order = 1:length(fam1), order1 = order1,
        class1 = class1)
    # select highest frequency fam/class/order combo
    fam2 = as.data.frame(table(datfam[, c(1, 3, 4)]))
    family2 = as.vector(fam2[fam2$Freq == max(fam2$Freq, na.rm = TRUE),
        "fam1"])
    order2 = as.vector(fam2[fam2$Freq == max(fam2$Freq, na.rm = TRUE),
        "order1"])
    class2 = as.vector(fam2[fam2$Freq == max(fam2$Freq, na.rm = TRUE),
        "class1"])
    if (length(fam2) > 1) {
      datfam2 = datfam[datfam$fam1 %in% family2, ]
      family2 = as.vector(datfam2[datfam2$order == min(datfam2$order,
          na.rm = TRUE), "fam1"])
      order2 = as.vector(datfam2[datfam2$order == min(datfam2$order,
          na.rm = TRUE), "order1"])
      class2 = as.vector(datfam2[datfam2$order == min(datfam2$order,
          na.rm = TRUE), "class1"])
    }
  } else {
    # for everything else
    fam1 = syns$family[!is.na(syns$family) & !is.na(syns$phylum) &
        (syns$phylum %in% phyla)]
    order1 = syns$order[!is.na(syns$family) & !is.na(syns$phylum) &
        (syns$phylum %in% phyla)]
    class1 = syns$class[!is.na(syns$family) & !is.na(syns$phylum) &
        (syns$phylum %in% phyla)]
    datfam = data.frame(fam1 = fam1, order = 1:length(fam1), order1 = order1,
        class1 = class1)
    # select highest frequency fam/class/order combo
    fam2 = as.data.frame(table(datfam[, c(1, 3, 4)]))
    family2 = as.vector(fam2[fam2$Freq == max(fam2$Freq, na.rm = TRUE),
        "fam1"])
    order2 = as.vector(fam2[fam2$Freq == max(fam2$Freq, na.rm = TRUE),
        "order1"])
    class2 = as.vector(fam2[fam2$Freq == max(fam2$Freq, na.rm = TRUE),
        "class1"])
    # select highest in list if more than one max
    if (length(family2) > 1) {
      datfam2 = datfam[datfam$fam1 %in% family2, ]
      family2 = as.vector(datfam2[datfam2$order == min(datfam2$order,
          na.rm = TRUE), "fam1"])
      order2 = as.vector(datfam2[datfam2$order == min(datfam2$order,
          na.rm = TRUE), "order1"])
      class2 = as.vector(datfam2[datfam2$order == min(datfam2$order,
          na.rm = TRUE), "class1"])
    }
  }
  # (4) search for species synonyms in ITIS
  syns = tryCatch(suppressMessages(synonyms(taxa, db = "itis")),
      error = function(e) e)
  if (class(syns)[1] == "simpleError") {
    return(data.frame(Original = taxname, Submitted = taxa, Accepted_name = "failed",
        Selected_family = family2, Selected_order = order2, Selected_class = class2,
        Synonyms = "failed"))
  }
  syns = as.data.frame(syns[[1]])
  # get info
  original = taxa
  accepted_name = taxa  # save accepted name as original searched name
  if ("acc_name" %in% names(syns)) {
    accepted_name = syns$acc_name
  }  # unless search shows that this is not the accepted name
  if ("syn_name" %in% names(syns)) {
    synonyms = unique(syns$syn_name)
  } else {
    synonyms = NA
  }
  # combine into list and add synonyms
  result = data.frame(Original = taxname, Submitted = taxa, Accepted_name = accepted_name,
      Selected_family = family2, Selected_order = order2, Selected_class = class2)
  result = do.call("rbind", replicate(length(synonyms), result[1,
      ], simplify = FALSE))
  result$Synonyms = synonyms
  return(result)
}
# nest function within a tryCatch call in case of any errors
findSyns3 = function(x) {
  result = tryCatch(findSyns2(x), error = function(e) NULL)
  return(result)
}
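Because the wrapper returns `NULL` when the inner function fails, it can be run over a whole vector of names without one bad name stopping the loop. The sketch below shows that pattern without a network connection. `lookup_one()` is a made-up stand-in for a lookup that can fail, and `safe_lookup()` plays the role of `findSyns3`. Neither name comes from the article.

```r
# Hypothetical stand-in for a lookup that can fail (no network needed)
lookup_one <- function(x) {
  if (!grepl(" ", x, fixed = TRUE)) stop("not a binomial name: ", x)
  parts <- strsplit(x, " ", fixed = TRUE)[[1]]
  data.frame(Original = x, Genus = parts[1], Species = parts[2],
             stringsAsFactors = FALSE)
}

# Same pattern as findSyns3: return NULL instead of stopping on an error
safe_lookup <- function(x) {
  tryCatch(lookup_one(x), error = function(e) NULL)
}

names.in <- c("Ramphocelus passerinii", "Ramphocelus",
              "Ramphocelus sanguinolentus")
res.list <- lapply(names.in, safe_lookup)

# Drop the failed queries (NULL) and stack the rest into one data frame
res <- do.call(rbind, Filter(Negate(is.null), res.list))

# Keep track of the names that failed so they can be checked by hand
failed <- names.in[vapply(res.list, is.null, logical(1))]
res
failed
```

Recording the failed names matters for reproducibility: a silent `NULL` would otherwise make those records disappear from the analysis without a trace.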
Earlier we saw that there is a name that is a synonym of Ramphocelus sanguinolentus. What would the function from Gibb et al. do with it?
## [1] "Processing: Phlogothraupis sanguinolenta"
## ══ 1 queries ═══════════════
## ✔ Found: Phlogothraupis sanguinolenta
## ══ Results ═════════════════
##
## • Total: 1
## • Found: 1
## • Not Found: 0
Original | Submitted | Accepted_name | Selected_family | Selected_order | Selected_class | Synonyms |
---|---|---|---|---|---|---|
Phlogothraupis sanguinolenta | Phlogothraupis sanguinolenta | Ramphocelus sanguinolentus | Thraupidae | Passeriformes | Aves | Phlogothraupis sanguinolenta |
And with one that is misspelled?
## [1] "Processing: Ramphocelus passerini"
## ══ 1 queries ═══════════════
## ✔ Found: Ramphocelus passerinii
## ══ Results ═════════════════
##
## • Total: 1
## • Found: 1
## • Not Found: 0
Original | Submitted | Accepted_name | Selected_family | Selected_order | Selected_class | Synonyms |
---|---|---|---|---|---|---|
Ramphocelus passerini | Ramphocelus passerinii | Ramphocelus passerinii | Thraupidae | Passeriformes | Aves | NA |
Article by Niedballa et al. 2016, Methods Ecol. Evol.
From the methods section:
“Users are free to use any species names (or abbreviations or codes) they wish. If scientific or common species names are used, the function checkSpeciesNames can check them against the ITIS taxonomic database (www.itis.gov) and returns their matching counterparts (utilizing the R package taxize (Chamberlain & Szöcs 2013) internally), making sure species names and spelling are standardized and taxonomically sound, and thus making it easier to combine data sets from different studies.”
Let's look at some examples from the camtrapR vignettes. We will not install the package, because the CRAN version of ‘camtrapR’ is not compatible with the CRAN version of ‘taxize’.
We will rewrite the function, updating the ‘taxize’ arguments
checkSpeciesNames <- function(speciesNames, searchtype, accepted = TRUE,
    ask = TRUE) {
  if (!requireNamespace("taxize", quietly = TRUE)) {
    stop("Please install the package taxize to run this function")
  }
  if (!requireNamespace("ritis", quietly = TRUE)) {
    stop("Please install the package ritis to run this function")
  }
  searchtype <- match.arg(searchtype, choices = c("scientific",
      "common"))
  stopifnot(is.logical(accepted))
  stopifnot(is.character(speciesNames) | is.factor(speciesNames))
  speciesNames <- unique(as.character(speciesNames))
  file.sep <- .Platform$file.sep
  tsns <- try(taxize::get_tsn(sci_com = speciesNames, searchtype = searchtype,
      accepted = accepted, ask = ask, messages = FALSE))
  if (inherits(tsns, "try-error")) {
    message(paste("error in get_tsn. Exiting without results:\n",
        tsns, sep = ""))
    return(invisible(NULL))
  }
  tsns <- taxize::as.tsn(unique(tsns), check = FALSE)
  if (any(is.na(tsns))) {
    not.matched <- which(is.na(tsns))
    warning(paste("found no matches for", length(not.matched),
        "name(s):\n", paste(speciesNames[not.matched], collapse = ", ")),
        immediate. = TRUE, call. = FALSE)
    tsns_worked <- taxize::as.tsn(tsns[-not.matched], check = FALSE)
  } else {
    tsns_worked <- tsns
  }
  if (length(tsns_worked) >= 1) {
    scientific <- common <- author <- rankname <- taxon_status <- data.frame(matrix(NA,
        nrow = length(tsns_worked), ncol = 2), stringsAsFactors = FALSE)
    colnames(scientific) <- c("tsn", "combinedname")
    colnames(common) <- c("tsn", "commonName")
    colnames(author) <- c("tsn", "authorship")
    colnames(rankname) <- c("tsn", "rankname")
    colnames(taxon_status) <- c("tsn", "taxonUsageRating")
    for (i in 1:length(tsns_worked)) {
      scientific_tmp <- ritis::scientific_name(tsns_worked[i])
      common_tmp <- ritis::common_names(tsns_worked[i])
      author_tmp <- ritis::taxon_authorship(tsns_worked[i])
      rankname_tmp <- ritis::rank_name(tsns_worked[i])
      if ("tsn" %in% colnames(scientific_tmp)) {
        scientific[i, ] <- scientific_tmp[c("tsn", "combinedname")]
      }
      if ("tsn" %in% colnames(common_tmp)) {
        if (table(common_tmp$tsn) > 1) {
          common2 <- tapply(common_tmp$commonName, INDEX = common_tmp$tsn,
              FUN = paste, collapse = file.sep)
          common_tmp <- data.frame(commonName = common2, tsn = rownames(common2),
              stringsAsFactors = FALSE)
        }
        common[i, ] <- common_tmp[, c("tsn", "commonName")]
      }
      if ("tsn" %in% colnames(author_tmp)) {
        author[i, ] <- author_tmp[c("tsn", "authorship")]
      }
      if ("tsn" %in% colnames(rankname_tmp)) {
        rankname[i, ] <- rankname_tmp[c("tsn", "rankname")]
      }
      if (accepted == FALSE) {
        taxon_status_tmp <- ritis::core_metadata(tsns_worked[i])
        if ("tsn" %in% colnames(taxon_status_tmp)) {
          taxon_status[i, ] <- taxon_status_tmp[c("tsn", "taxonUsageRating")]
        }
      }
    }
    dat.out <- data.frame(user_name = speciesNames, tsn = as.numeric(tsns))
    dat.out <- merge(x = dat.out, y = scientific, by = "tsn",
        all.x = TRUE, sort = FALSE)
    dat.out <- merge(x = dat.out, y = common, by = "tsn", all.x = TRUE,
        sort = FALSE)
    dat.out <- merge(x = dat.out, y = author, by = "tsn", all.x = TRUE,
        sort = FALSE)
    dat.out <- merge(x = dat.out, y = rankname, by = "tsn", all.x = TRUE,
        sort = FALSE)
    dat.out$itis_url <- NA
    dat.out$itis_url[match(tsns_worked, dat.out$tsn)] <- attributes(tsns_worked)$uri
    colnames(dat.out)[colnames(dat.out) == "combinedname"] <- "scientificName"
    if (accepted == FALSE) {
      dat.out <- merge(x = dat.out, y = taxon_status, by = "tsn",
          all.x = TRUE, sort = FALSE)
    } else {
      dat.out$taxon_status[!is.na(dat.out$tsn)] <- "valid"
    }
    return(dat.out)
  } else {
    stop("found no TSNs for speciesNames", call. = FALSE)
  }
}
Now it works! What can we do with it?
checkNames1 <- checkSpeciesNames(speciesNames = c("Bearded Pig", "Malayan Civet"),
    searchtype = "common")
checkNames1 %>%
    kbl() %>%
    kable_minimal()
tsn | user_name | scientificName | commonName | authorship | rankname | itis_url | taxon_status |
---|---|---|---|---|---|---|---|
625012 | Bearded Pig | Sus barbatus | bearded pig/Bearded Pig | Müller, 1838 | Species | https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=625012 | valid |
622004 | Malayan Civet | Viverra tangalunga | Malayan Civet | Gray, 1832 | Species | https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=622004 | valid |
checkNames2 <- checkSpeciesNames(speciesNames = "Viverra tangalunga tangalunga",
    searchtype = "scientific")
checkNames2 %>%
    kbl() %>%
    kable_minimal()
tsn | user_name | scientificName | commonName | authorship | rankname | itis_url | taxon_status |
---|---|---|---|---|---|---|---|
726578 | Viverra tangalunga tangalunga | Viverra tangalunga tangalunga | NA | Gray, 1832 | Subspecies | https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=726578 | valid |
checkNames3 <- checkSpeciesNames(speciesNames = "Felis bengalensis",
    searchtype = "scientific", accepted = FALSE)
checkNames3 %>%
    kbl() %>%
    kable_minimal()
tsn | user_name | scientificName | commonName | authorship | rankname | itis_url | taxonUsageRating |
---|---|---|---|---|---|---|---|
183793 | Felis bengalensis | Felis bengalensis | leopard cat | Kerr, 1792 | Species | https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=183793 | invalid |
checkNames4 <- checkSpeciesNames(speciesNames = "Chevrotain", searchtype = "common")
# choose from the interactive menu (here, option 1)
1
checkNames4 %>%
    kbl() %>%
    kable_minimal()
Session information
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=es_ES.UTF-8
## [2] LC_NUMERIC=C
## [3] LC_TIME=es_CR.UTF-8
## [4] LC_COLLATE=es_ES.UTF-8
## [5] LC_MONETARY=es_CR.UTF-8
## [6] LC_MESSAGES=es_ES.UTF-8
## [7] LC_PAPER=es_CR.UTF-8
## [8] LC_NAME=C
## [9] LC_ADDRESS=C
## [10] LC_TELEPHONE=C
## [11] LC_MEASUREMENT=es_CR.UTF-8
## [12] LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets
## [6] methods base
##
## other attached packages:
## [1] plyr_1.8.7 rgbif_3.7.3
## [3] taxize_0.9.100 tufte_0.12
## [5] rticles_0.24 revealjs_0.9
## [7] rmdformats_1.0.4 kableExtra_1.3.4
## [9] rmarkdown_2.14 sketchy_1.0.2
## [11] remotes_2.4.2 leaflet_2.1.1
## [13] knitr_1.39 xaringanExtra_0.7.0
## [15] emo_0.0.0.9000 cowsay_0.8.0
##
## loaded via a namespace (and not attached):
## [1] uuid_1.1-0 workflowr_1.7.0
## [3] systemfonts_1.0.4 lazyeval_0.2.2
## [5] crosstalk_1.2.0 usethis_2.1.6
## [7] ggplot2_3.3.6 urltools_1.7.3
## [9] digest_0.6.29 foreach_1.5.2
## [11] htmltools_0.5.3 rsconnect_0.8.26
## [13] fansi_1.0.3 magrittr_2.0.3
## [15] memoise_2.0.1 vertical_0.1.0.0000
## [17] svglite_2.1.0 prettyunits_1.1.1
## [19] colorspace_2.0-3 rvest_1.0.2
## [21] xfun_0.31 dplyr_1.0.9
## [23] callr_3.7.0 crayon_1.5.1
## [25] jsonlite_1.8.0 zoo_1.8-10
## [27] iterators_1.0.14 ape_5.6-2
## [29] glue_1.6.2 gtable_0.3.0
## [31] emmeans_1.7.4-1 webshot_0.5.3
## [33] pkgbuild_1.3.1 scales_1.2.0
## [35] oai_0.3.2 mvtnorm_1.1-3
## [37] solrium_1.2.0 DBI_1.1.3
## [39] Rcpp_1.0.9 viridisLite_0.4.0
## [41] xtable_1.8-4 bold_1.2.0
## [43] clisymbols_1.2.0 datawizard_0.4.1
## [45] htmlwidgets_1.5.4 httr_1.4.3
## [47] papaja_0.1.1 RColorBrewer_1.1-3
## [49] ellipsis_0.3.2 pkgconfig_2.0.3
## [51] reshape_0.8.9 sass_0.4.1
## [53] utf8_1.2.2 here_1.0.1
## [55] conditionz_0.1.0 crul_1.2.0
## [57] tidyselect_1.1.2 rlang_1.0.4
## [59] later_1.3.0 effectsize_0.7.0
## [61] munsell_0.5.0 tools_4.1.1
## [63] cachem_1.0.6 fortunes_1.5-4
## [65] cli_3.3.0 generics_0.1.2
## [67] tinylabels_0.2.3 devtools_2.4.3
## [69] evaluate_0.15 stringr_1.4.0
## [71] fastmap_1.1.0 yaml_2.3.5
## [73] processx_3.6.1 fs_1.5.2
## [75] purrr_0.3.4 ritis_1.0.0
## [77] packrat_0.8.0 rrtools_0.1.5
## [79] nlme_3.1-152 whisker_0.4
## [81] formatR_1.12 xml2_1.3.3
## [83] brio_1.1.3 compiler_4.1.1
## [85] rstudioapi_0.13 curl_4.3.2
## [87] testthat_3.1.4 tibble_3.1.8
## [89] bslib_0.3.1 stringi_1.7.8
## [91] highr_0.9 ps_1.7.1
## [93] parameters_0.18.1 desc_1.4.1
## [95] lattice_0.20-44 vctrs_0.4.1
## [97] pillar_1.8.0 lifecycle_1.0.1
## [99] rmsfact_0.0.3 triebeard_0.3.0
## [101] jquerylib_0.1.4 estimability_1.3
## [103] data.table_1.14.2 insight_0.17.1
## [105] httpuv_1.6.5 R6_2.5.1
## [107] bookdown_0.27 promises_1.2.0.1
## [109] sessioninfo_1.2.2 codetools_0.2-18
## [111] assertthat_0.2.1 pkgload_1.2.4
## [113] rprojroot_2.0.3 withr_2.5.0
## [115] httpcode_0.3.0 bayestestR_0.12.1
## [117] parallel_4.1.1 grid_4.1.1
## [119] coda_0.19-4 git2r_0.30.1
## [121] getPass_0.2-2 lubridate_1.8.0