Online ecology - Individual species description

June 19, 2017

  R ecology open-data open-science

  biomod2 kableExtra knitr magrittr raster rfishbase rglobi robis sdmpredictors sf taxize

David Beauchesne   Kevin Cazelles   Rémi Daigle  

Online ecology

Let’s imagine that we are interested in a species in a given area and wish to know as much as possible about it. However, we can’t go out in the field because funding is running short. What we do have is a working knowledge of the open data science tools at our disposal. In a series of posts about online ecology, we will find out just how far these tools allow us to delve into the ecology of the species that interest us.

Online ecology posts:

Individual species description

This post focuses on the simplest yet still complicated aspect of ecology, i.e. describing a species as thoroughly as possible with the tools at our disposal.

Special thanks to the developers at rOpenSci, who built many of the R packages used to access the open-access tools we present in this post.

Setting up R

R version used to build the last update of this post

#R> [1] "R version 3.5.0 (2017-01-27)"
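The chunk that produced this version string is not shown in the post. A minimal sketch of one way to print it with base R alone (not necessarily what the authors ran):

```r
# Print the version of R running the current session
print(R.version.string)
```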

Defining species and area of interest

We start by selecting the species and the area in which we are interested. For this post, we focus on the Atlantic cod (Gadus morhua) in the Estuary and Gulf of St. Lawrence in eastern Canada.

Let’s set the parameters and create the spatial bounding box that we will be using for the area of interest. Note that all these parameters can be changed to extract information for other species in other habitats.

# The pipe operator (%>%) used throughout this post comes from 'magrittr'
    library(magrittr)

# Species of interest
    sp <- 'Gadus morhua'

# Extent of area of interest
    latmax <- 52.01312
    latmin <- 45.52399
    lonmax <- -55.73636
    lonmin <- -71.06333

# Create a spatial bounding box for the area of interest using the 'sf' package:
# create a matrix:
    bb <- cbind(c(lonmin,lonmax,lonmax,lonmin,lonmin),
                   c(latmin,latmin,latmax,latmax,latmin)) %>%
        # put that matrix into a list, because that's what `st_polygon()` needs
        list() %>%
        # Make the matrix a 'simple features' polygon:
        sf::st_polygon() %>%
        # and let's make it a simple feature column and give it information about the projection:
        sf::st_sfc(crs="+proj=longlat +datum=WGS84") %>%
        # finally, let's put the sfc in a simple features data.frame in the variable `geometry`:
        sf::st_sf(name="Study Site",geometry=.)
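Further down, the bounding box is passed to OBIS as a well-known text (WKT) string. To make it clearer what that string contains, here is a small base R sketch that builds a comparable WKT polygon directly from the four extent values. The helper `bbox_wkt()` is hypothetical (it is not part of sf or robis), and the exact formatting produced by `sf::st_as_text()` may differ slightly.

```r
# Build a WKT polygon string for a rectangular extent (base R only).
# `bbox_wkt` is a hypothetical helper, not a function from sf or robis.
bbox_wkt <- function(lonmin, lonmax, latmin, latmax) {
  stopifnot(lonmin < lonmax, latmin < latmax)
  # Corners listed counter-clockwise; the first corner is repeated to close the ring
  lon <- c(lonmin, lonmax, lonmax, lonmin, lonmin)
  lat <- c(latmin, latmin, latmax, latmax, latmin)
  paste0("POLYGON ((", paste(lon, lat, collapse = ", "), "))")
}

# Same extent as the study area defined above
wkt <- bbox_wkt(lonmin = -71.06333, lonmax = -55.73636,
                latmin = 45.52399, latmax = 52.01312)
print(wkt)
```

Changing the four extent values is all that is needed to target another study area.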

Describe your species

Retrieve miscellaneous ecological information: Fishbase

We’ll start with a description of the species. First, let’s see what FishBase has to offer. This online data repository, along with SeaLifeBase, contains a lot of valuable information on marine and aquatic species all over the world and is accessible through the package rfishbase.

# Species ecology
ecol <- rfishbase::ecology(sp)
ecol <- cbind(colnames(ecol), t(ecol))
rownames(ecol) <- NULL
ecol <- ecol[ecol[,2] != 0, ] # remove 0
ecol <- ecol[!is.na(ecol[,2]), ] # remove NAs

knitr::kable(ecol, col.names = c('Descriptors', 'Attributes'), "html") %>%
    kableExtra::kable_styling(full_width = FALSE)
Descriptors Attributes
autoctr 33
sciname Gadus morhua
StockCode 79
EcologyRefNo 1371
HabitatsRef 1371
Neritic -1
Intertidal -1
Oceanic -1
Estuaries -1
Herbivory2 mainly animals (troph. 2.8 and up)
HerbivoryRef 5743
FeedingType hunting macrofauna (predator)
FeedingTypeRef 5743
DietTroph 4.09
DietSeTroph 0.179
DietTLu 4.34
DietseTLu 0.72
DietRemark Troph of adults from 7 studies.
DietRef 26813
FoodTroph 4.29
FoodSeTroph 1
FoodRemark Trophic level estimated from a number of food items using a randomized resampling routine.
AddRems Opportunistic predator that forages mainly at dawn and dusk (Refs. 1371, 46189). Larvae feed mainly on zooplankton while juveniles prey predominantly on benthic crustaceans; adults feed mainly on zoobenthos and fish (Refs. 5743, 9604, 26813) including juvenile cod. Fish prey becomes more common in the diet with increasing body size (Refs. 1371, 89387). Adults may cover large distances during the feeding period (Ref. 89387). Young cod are also preyed upon by different fish species and octopus. Adult cod are prey items of top predators like sharks, rays, whales, dolphins, seals, and sea birds (Refs. 9023, 9581, 26954, 43651, 45735). In the Baltic it grows up to 5 kg weight in 7-8 years; in the North Sea it reaches 8 kg in the same time span . Natural mortality for adults of both stocks is assumed to be around M=0.2, resulting in a mean adult life expectancy and mean duration of the reproductive phase of 5 years (Ref. 88171). Parasites of the species include protozoans (trypanosome), myxosporidians, monogeneid, trematodes, cestodes, nematodes, acanthocephalan, hirudinid and copepods (Ref. 5951).
Schooling -1
SchoolingFrequency sometimes
SchoolingLifestage juveniles and adults
SchoolShoalRef 1371
AssociationsRemarks Generally considered a demersal fish although its habitat may become pelagic under certain hydrogrphic conditions, when feeding or spawning. There is some evidence that cod leave the bottom and school pelagically to spawn in preferred temperatures when bottom tempetatures are unsuitable. Gregarious during the day, forming compact schools that swim between 30-80 m above the bottom, and scatter at night (Ref. 1371). Schooling behavior may be adaptive for feeding. Reproductive behavior during spawning involves the circling of a female often by only one male per spawning bout (Ref. 86779).
SoftBottom -1
HardBottom -1
Rocky -1
SeaGrassBeds -1
Entered 2
Dateentered 1991-10-17T00:00:00.000Z
Modified 2374
Datemodified 2014-02-06T00:00:00.000Z
SpecCode 69
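The cleaning steps used above (transpose the one-row result, then drop descriptors whose value is 0 or missing) can be tried without an internet connection. The sketch below uses a toy one-row data frame: `Neritic` and `DietTroph` are taken from the table above, while `ZeroFlag` and `MissingField` are made-up columns added only to show the filtering. It also combines the two filters into a single logical index, which avoids the intermediate NA rows that filtering on `!= 0` first can create.

```r
# Toy stand-in for the one-row table returned by rfishbase::ecology().
# ZeroFlag and MissingField are hypothetical columns used for illustration.
toy <- data.frame(
  sciname      = "Gadus morhua",
  Neritic      = -1,
  ZeroFlag     = 0,
  MissingField = NA,
  DietTroph    = 4.09,
  stringsAsFactors = FALSE
)

# Reshape to two columns: descriptor name and its value stored as text
long <- data.frame(
  Descriptors = names(toy),
  Attributes  = unname(vapply(toy, function(x) as.character(x[1]), character(1))),
  stringsAsFactors = FALSE
)

# Keep only descriptors that are neither missing nor equal to 0
keep  <- !is.na(long$Attributes) & long$Attributes != "0"
clean <- long[keep, ]
print(clean)
```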

Retrieving taxonomic information: Taxize

We can also extract taxonomic information using the package taxize. Among other things, this package lets you extract and validate the taxonomy of millions of species by querying a large number of online databases through their Application Programming Interfaces (APIs).

# Export the taxonomy of the species of interest
taxize::classification(sp, db = 'worms', accepted = TRUE, verbose = FALSE)[[1]] %>%
    knitr::kable("html") %>% kableExtra::kable_styling(full_width = FALSE)
name            rank        id
Animalia        Kingdom     2
Chordata        Phylum      1821
Vertebrata      Subphylum   146419
Gnathostomata   Superclass  1828
Pisces          Superclass  11676
Actinopterygii  Class       10194
Gadiformes      Order       10313
Gadidae         Family      125469
Gadus           Genus       125732
Gadus morhua    Species     126436
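Once the classification is available as a table, pulling out a single rank is a one-line lookup. The sketch below retypes part of the table printed above as a plain data frame so that it runs offline. `get_rank()` is a hypothetical helper written for this example, not a taxize function.

```r
# Part of the classification table printed above, retyped as a data frame
classif <- data.frame(
  name = c("Animalia", "Chordata", "Actinopterygii", "Gadiformes",
           "Gadidae", "Gadus", "Gadus morhua"),
  rank = c("Kingdom", "Phylum", "Class", "Order",
           "Family", "Genus", "Species"),
  id   = c(2, 1821, 10194, 10313, 125469, 125732, 126436),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the taxon name at a given rank, or NA if absent
get_rank <- function(classif, rank) {
  hit <- classif$name[tolower(classif$rank) == tolower(rank)]
  if (length(hit) == 0) NA_character_ else hit[1]
}

print(get_rank(classif, "Family"))
print(get_rank(classif, "Order"))
```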

# Retrieve a tsn for Gadus morhua, i.e. a unique identifier from the itis db
idtsn <- taxize::get_tsn(sp, accepted=TRUE, verbose = FALSE, rows=1)[1]
# We can also extract the common or scientific names using sci2comm() & comm2sci(), respectively.
taxize::sci2comm(taxize::as.tsn(idtsn), db = 'itis')
#R> $`164712`
#R> [1] "morue de l'Atlantique" "bacalao del Atlántico" "cod"                  
#R> [4] "rock cod"              "morue franche"         "Atlantic cod"

# Or find out whether there are other names under which the species is known
taxize::synonyms(taxize::as.tsn(idtsn), db = 'itis')
#R> $`164712`
#R>   sub_tsn acc_tsn       message
#R> 1  164712  164712 no syns found
# Another really interesting feature is to extract all known species at a given
# taxonomic scale. With the itis db, you should first find the tsn associated
# with Gadus. Using `taxize::get_tsn('gadus')` you'll find out that it is 164710
knitr::kable(taxize::children(164710, db = 'itis')[[1]], "html") %>%
  kableExtra::kable_styling(full_width = F)
parentname  parenttsn  rankname  taxonname            tsn
Gadus       164710     Species   Gadus macrocephalus  164711
Gadus       164710     Species   Gadus morhua         164712
Gadus       164710     Species   Gadus ogac           164717
Gadus       164710     Species   Gadus chalcogrammus  934083

Sounds like this is consistent with what Wikipedia says!

Retrieving trophic information: GloBI

We can also retrieve information on known biotic interactions involving our species of interest. The Global Biotic Interactions (GloBI) web platform contains thousands of empirical binary interactions of multiple types from all over the world, and is accessible using the package rglobi.

# There are multiple types of interactions available on GloBI
    knitr::kable(rglobi::get_interaction_types()[,1:3], 'html') %>%
        kableExtra::kable_styling(full_width = FALSE)
interaction         source      target
eats                consumer    food
eatenBy             food        consumer
preysOn             predator    prey
preyedUponBy        prey        predator
kills               killer      victim
killedBy            victim      killer
parasiteOf          parasite    host
hasParasite         host        parasite
hostOf              host        symbiont
hasHost             symbiont    host
pollinates          pollinator  plant
pollinatedBy        plant       pollinator
pathogenOf          pathogen    host
hasPathogen         host        pathogen
vectorOf            vector      pathogen
hasVector           pathogen    vector
dispersalVectorOf   vector      seed
hasDispersalVector  seed        vector
symbiontOf          source      target
flowersVisitedBy    plant       visitor
visitsFlowersOf     visitor     plant
interactsWith       source      target
# For now let's focus on predator-prey interactions
    prey <- rglobi::get_prey_of(sp)$target_taxon_name # Retrieve prey
    pred <- rglobi::get_predators_of(sp)$target_taxon_name # Retrieve predators
    length(prey) # Number of prey
#R> [1] 170
    length(pred) # Number of predators
#R> [1] 53
    prey[1:20] # First 20 prey
#R>  [1] "Ammodytes tobianus"        "Rossia moelleri"          
#R>  [3] "Bathypolypus bairdii"      "Natatolana borealis"      
#R>  [5] "Arctica islandica"         "Bathypolypus arcticus"    
#R>  [7] "Rossia macrosoma"          "Buenia jeffreysii"        
#R>  [9] "Zeugopterus punctatus"     "Phrynorhombus norvegicus" 
#R> [11] "Neocalanus tonsus"         "Lithodes maja"            
#R> [13] "Buccinum undatum"          "Lethenteron camtschaticum"
#R> [15] "Eledone cirrhosa"          "Crangon allmanni"         
#R> [17] "Atelecyclus rotundatus"    "Corystes cassivelanus"    
#R> [19] "Ammodytes dubius"          "Myxine glutinosa"
    pred[1:20] # First 20 predators
#R>  [1] "Thalasseus sandvicensis"      "Myxine glutinosa"            
#R>  [3] "Sebastes mentella"            "Scomber scombrus"            
#R>  [5] "Anarhichas lupus"             "Molva molva"                 
#R>  [7] "Eutrigla gurnardus"           "Lophius piscatorius"         
#R>  [9] "Merlangius merlangus"         "Sebastes"                    
#R> [11] "Xiphias gladius"              "Reinhardtius hippoglossoides"
#R> [13] "Phoca vitulina"               "Petromyzon marinus"          
#R> [15] "no name"                      "Thalasseus sandvicensis"     
#R> [17] "Myxine glutinosa"             "Sebastes mentella"           
#R> [19] "Scomber scombrus"             "Anarhichas lupus"
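The predator list printed above contains repeated names and a "no name" placeholder, so the raw `length()` overstates the number of distinct predators. A minimal base R sketch of a clean-up step, using the 20 names shown above retyped as a character vector so that it runs offline (the full list from GloBI would be handled the same way):

```r
# The first 20 predator names printed above, retyped for an offline example
pred20 <- c(
  "Thalasseus sandvicensis", "Myxine glutinosa", "Sebastes mentella",
  "Scomber scombrus", "Anarhichas lupus", "Molva molva",
  "Eutrigla gurnardus", "Lophius piscatorius", "Merlangius merlangus",
  "Sebastes", "Xiphias gladius", "Reinhardtius hippoglossoides",
  "Phoca vitulina", "Petromyzon marinus", "no name",
  "Thalasseus sandvicensis", "Myxine glutinosa", "Sebastes mentella",
  "Scomber scombrus", "Anarhichas lupus"
)

# Drop repeated names, then drop the placeholder for unnamed taxa
pred_clean <- unique(pred20)
pred_clean <- pred_clean[pred_clean != "no name"]

print(length(pred_clean)) # number of distinct, named predators
print(pred_clean)
```

Note that "Sebastes" is a genus-level record while "Sebastes mentella" is a species; whether to merge such entries depends on the question being asked.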

Making our search spatially explicit

Since we are interested in a specific area, making our search spatially explicit would be highly valuable. Luckily, there are tools that allow us to do just that.

Retrieving occurrence data: OBIS & GBIF

OBIS is the Ocean Biogeographic Information System and their vision is: “To be the most comprehensive gateway to the world’s ocean biodiversity and biogeographic data and information required to address pressing coastal and world ocean concerns.” We can get access to their HUGE database through the robis package.

Similarly, GBIF is the Global Biodiversity Information Facility, which aims to provide free and open access to biodiversity data. This open source platform can be accessed through the rgbif package.

We only cover the robis package in this post since the target species is marine, but visit the rgbif GitHub repository for more information on its use.

# Download occurrence data for species and area of interest between 2010 and 2017
    OBIS <- robis::occurrence(
        scientificname = sp,
        geometry = sf::st_as_text(bb$geometry),
        startdate = as.Date("2010-01-01"),
        enddate = as.Date("2017-01-01"),
        fields = c("species", "yearcollected", "decimalLongitude", "decimalLatitude")
    )
Retrieved 1342 records of 1342 (100%)

# Remove duplicates
    OBIS <- unique(OBIS)

# Transform as spatial file
    OBIS <- sf::st_as_sf(OBIS,
                     coords = c("decimalLongitude", "decimalLatitude"),
                     crs="+proj=longlat +datum=WGS84")

# Visualize with mapview
    mapview::mapview(OBIS, cex = 4)@map
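Before mapping, it can help to check how the records are spread over time. The sketch below shows the de-duplication step followed by a per-year tally. It runs on a small made-up data frame, not real OBIS records: the column names match the fields requested above, but every value is invented for illustration.

```r
# Made-up records with the same columns as the fields requested from OBIS.
# All coordinates and years are invented for illustration only.
obis_toy <- data.frame(
  species          = "Gadus morhua",
  yearcollected    = c(2010, 2010, 2011, 2011, 2011, 2013),
  decimalLongitude = c(-60.1, -60.1, -62.5, -62.5, -58.3, -65.0),
  decimalLatitude  = c(48.2, 48.2, 49.0, 49.0, 47.5, 49.8),
  stringsAsFactors = FALSE
)

# unique() on a data frame drops rows that are identical in every column
obis_toy <- unique(obis_toy)

# Number of distinct records per collection year
counts <- table(obis_toy$yearcollected)
print(counts)
```

The same tally should apply to the real download as long as it is run before the conversion to a spatial object, while `OBIS` is still a plain data frame.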