R in Space - Import spatial objects

April 11, 2018

  R in Space R Spatial

  sf raster

David Beauchesne   Marie-Hélène Brice   Nicolas Casajus   Kevin Cazelles   Elliot Dreujou   Steve Vissault  


Reading and writing spatial objects with sf and raster

Spatial data are encoded in various GIS file formats such as Shapefiles (.shp), Geodatabase (.gdb), GeoPackage (.gpkg) and GeoJSON (.geojson), GeoTIFF (.tiff). In this post we show how to read commonly used formats and import the data they include in you R session. We start by describing the steps to turn a set of coordinates included in a .csv file into a spatial object and write it to a GIS file format. We then exemplify the use of two functions: st_read() from the sf package and raster() from the raster package and import different GIS file available on the Portail de données ouvertes de Montreal.

Let us start by loading the packages required:

library(sf)
library(raster)

Import spatial data from .csv file

Spatial data are sometimes stored in a text file format (.txt or .csv), in such case columns providing the coordinates must be included and the user must know the CRS (Coordinate Reference System). A sf object allows us to store both the coordinates of each point as well as associated attribute data, i.e. columns describing each feature in the spatial object.

The dataset we work with below represents sampling points of a monitoring program of water quality in Montreal (available here).

# Create a new directory to download data
dir.create("data")

# Download csv file from web page in your working directory
if (!file.exists("data/ruisso.csv"))
  download.file("http://donnees.ville.montreal.qc.ca/dataset/86843d31-4251-4002-b10d-620aaa751092/resource/adad6c48-fb48-40fc-a031-1ac870d978b4/download/scri03.-infor03.02-informatique03.02.07-donneesouvertesrsmaruissostationsstations_ruisso.csv", destfile = "data/ruisso.csv")

# Read csv file in R
pts <- read.csv("data/ruisso.csv", header = T, dec = ",")

Convert a data frame to simple feature objects

The data frame containing sampling points can be converted into simple feature objects using st_as_sf(). To do so, we need to specify where coordinates are, i.e. columns LATITUDE and LONGITUDE as well as the CRS. In our example, the CRS is specified in the metadata of the data set: the datum is WGS84 and the EPSG is 4326. Remember that without the CRS, you cannot locate correctly your coordinates on the Earth’s surface. In the package sf, the reference system can be define using the proj4 format or directly with the EPSG code.

pts_sf <- st_as_sf(x = pts,
  coords = c("LONGITUDE", "LATITUDE"),
  crs = 4326)
head(pts_sf)
## Simple feature collection with 6 features and 5 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -73.93704 ymin: 45.43462 xmax: -73.9012 ymax: 45.45022
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##         Plan.d.eau Point.d.échantillonnage
## 1 Rivière à l'Orme                 AAO-0.0
## 2 Rivière à l'Orme               AAO-1.5P1
## 3 Rivière à l'Orme               AAO-2.0P4
## 4 Rivière à l'Orme               AAO-3.3P6
## 5 Rivière à l'Orme                 AAO-3.5
## 6 Rivière à l'Orme                 AAO-3.6
##                                                                                                           Localisation
## 1            Pierrefonds, boul. Gouin O, 40m au nord de la rue de l'Anse à l'Orme, exutoire au lac des Deux Montagnes.
## 2                             Pierrefonds, N ponceau boul.Gouin, 1500m en amont exutoire,  branche provenant de l'est.
## 3                            Ste-A.-de-Bellevue, branche drainant secteur ouest, 140m à l'est de la rue Leslie Dowker.
## 4 Kirkland, 60m au sud de l'intersection des rues de l'Anse à l'Orme et de Timberley trail, derrière le dépôt à neige.
## 5                          Sainte-Anne-de-Bellevue, 10m au nord du ch. Ste-Marie, 200m à l'ouest du ch. Anse à l'Orme.
## 6                       Beaconsfield, 250m à l'est de la rue Lee et 25m au sud de l'autoroute 40, en amont du pluvial.
##            Administration Activité                   geometry
## 1     Pierrefonds-Roxboro    Actif POINT (-73.93704 45.45022)
## 2     Pierrefonds-Roxboro  Inactif POINT (-73.91931 45.44744)
## 3 Sainte-Anne-de-Bellevue  Inactif POINT (-73.91535 45.44288)
## 4                Kirkland    Actif POINT (-73.90147 45.43689)
## 5 Sainte-Anne-de-Bellevue    Actif POINT (-73.90144 45.43566)
## 6             Baie d'Urfé    Actif  POINT (-73.9012 45.43462)

As you can see, we now have a MULTIPOINT geometry, and the spatial information is now stored in a simple feature list-column (sfc). The other columns contain all the attributes related to the sample points.

The default plot of a simple features object is a multi-facet of all attributes.

plot(pts_sf)  

As you can see, instead of creating a single map, as with sp object, the default plot of sf object creates multiple maps, one for each attribute, which can sometimes be useful for exploring the spatial distribution of different variables.

Export your points in a Shapefile

We can write a simple features object to a file (e.g. a shapefile) using the st_write() function in sf, which needs at least two arguments, the object and a filename:

st_write(pts_sf, "data/pts_sf.shp", delete_dsn = T)
## Warning in abbreviate(fld_names, minlength = 7): abbreviate used with non-
## ASCII chars
## Warning in abbreviate_shapefile_names(obj): Field names abbreviated for
## ESRI Shapefile driver
## Warning in CPL_write_ogr(obj, dsn, layer, driver,
## as.character(dataset_options), : GDAL Error 1: data/pts_sf.shp does not
## appear to be a file or directory.
## Deleting source `data/pts_sf.shp' failed
## Writing layer `pts_sf' to data source `data/pts_sf.shp' using driver `ESRI Shapefile'
## features:       66
## fields:         5
## geometry type:  Point

Note that st_write() selected one driver base on the file extension provided. The driver can also be made explicit using the driver argument like so: st_write(pts_sf, "data/pts_sf.shp", driver = "ESRI Shapefile"). Last but not least, in order to have details about drivers available, have a look at the data frame st_drivers() returns.

Import Shapefile data

The shapefile contains polygons delimiting the woods of the Montreal agglomeration and information about the forest composition (found here).

# Download shapefile from web page in your working directory
if (!file.exists("data/bois.zip"))
  download.file("http://donnees.ville.montreal.qc.ca/dataset/29791562-f050-401e-b405-5c1fbf427f65/resource/9fa20d3a-5dee-43d6-9bc9-5d86fe225e16/download/bois.zip", destfile = "data/bois.zip")

unzip("data/bois.zip", exdir = "data")

# Read shapefile in R
bois <- st_read(dsn = "data/", layer = "bois")
## Reading layer `bois' from data source `/home/travis/build/inSileco/inSileco.github.io/content/post/rinspace_import/data' using driver `ESRI Shapefile'
## Simple feature collection with 2716 features and 4 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -73.97557 ymin: 45.40957 xmax: -73.48204 ymax: 45.69943
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
head(bois)
## Simple feature collection with 6 features and 4 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -73.86056 ymin: 45.43293 xmax: -73.49655 ymax: 45.69584
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##   OBJECTID GR_ESSENC Shape_Leng Shape_Area                       geometry
## 1        2  Feuillus   515.4464   7199.995 MULTIPOLYGON (((-73.49961 4...
## 2        3  Feuillus   364.7666   8354.159 MULTIPOLYGON (((-73.51461 4...
## 3        5  Feuillus   134.2015   1139.293 MULTIPOLYGON (((-73.56527 4...
## 4        6  Feuillus   336.2568   4222.548 MULTIPOLYGON (((-73.4967 45...
## 5        8  Feuillus   510.9489   5168.669 MULTIPOLYGON (((-73.85932 4...
## 6        9  Feuillus   605.2992  10936.294 MULTIPOLYGON (((-73.85413 4...

The bois dataset has been turned into a MULTIPOLYGON object and has the same CRS (EPSG: 4326) than the sample points we have manipulated above. This allows us to work directly with the two objects otherwise we should have transformed one dataset using the CRS of the other. To plot only the geometry and not all attributes, we retrieve the geometry list-column using st_geometry():

plot(st_geometry(bois))

To plot the polygons with a thematic color scale according to one attribute of interest, we actually subset the object (here we use the name of the column):

plot(bois["Shape_Area"])

Import GeoJSON file

This GeoJSON dataset contains watercourses (stream, river) and main ditches of the Montreal agglomeration (available here). Hence, it is a MULTILINE object.

# Download shapefile from web page in your working directory
if (!file.exists("data/courseau.geojson"))
  download.file("http://donnees.ville.montreal.qc.ca/dataset/c128aff5-325c-4599-ab66-1c9d0b3abc94/resource/0f64976e-19c1-4d29-bcc5-4b663a392617/download/courseau.geojson", destfile="data/courseau.geojson")

# For GeoJSON, dsn may be the character string holding the geojson data

courseau <- st_read(dsn = "data/courseau.geojson")
## Reading layer `courseau' from data source `/home/travis/build/inSileco/inSileco.github.io/content/post/rinspace_import/data/courseau.geojson' using driver `GeoJSON'
## Simple feature collection with 1306 features and 5 fields
## geometry type:  MULTILINESTRING
## dimension:      XY
## bbox:           xmin: -73.97268 ymin: 45.41593 xmax: -73.4975 ymax: 45.69939
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
head(courseau)
## Simple feature collection with 6 features and 5 fields
## geometry type:  MULTILINESTRING
## dimension:      XY
## bbox:           xmin: -73.9206 ymin: 45.42815 xmax: -73.9066 ymax: 45.47614
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##   OBJECTID_1              NOM     TYPE Shape_Le_1 NuméroRui
## 1          1 rivière à l'Orme  rivière  177.95299      <NA>
## 2          2 rivière à l'Orme  rivière  128.51146      <NA>
## 3          3             <NA>    fossé  172.42988      <NA>
## 4          4 rivière à l'Orme  rivière  216.66838      <NA>
## 5          8             <NA>    fossé  540.29539      <NA>
## 6          9             <NA> ruisseau   97.66412      <NA>
##                         geometry
## 1 MULTILINESTRING ((-73.9107 ...
## 2 MULTILINESTRING ((-73.90824...
## 3 MULTILINESTRING ((-73.90667...
## 4 MULTILINESTRING ((-73.91472...
## 5 MULTILINESTRING ((-73.91029...
## 6 MULTILINESTRING ((-73.9206 ...
plot(st_geometry(courseau))

Import raster data

We now import raster data use a .tif file file of a canopy index from Montreal (found here). The canopy index, computed by the Montreal Metropolitain Community (CMM) from an NDVI index and an elevation surface model, represents the low vegetation cover, the high vegetation cover of more than 3 meters (the canopy), the low mineral surfaces and the high mineral surfaces more than 3 meters (roof).

# Download tif file from web page in your working directory
if (!file.exists("data/canopee.zip"))
  download.file("http://cmm.qc.ca/fileadmin/user_upload/geomatique/IndiceCanopee/2015/660_Canopee2015_3m.zip", destfile="data/canopee.zip")

unzip("data/canopee.zip", exdir = "data")

# Read tif in R using raster
# The file named "660_CLASS_3m.tif" contains the canopy index for all the Montreal area, so we can read this file only

canopee_mtl <- raster("data/660_CLASS_3m.tif")

The canopy index raster has values from 1 to 5, has nrow(canopee_mtl) pixels by row and ncol(canopee_mtl) pixels by column. Note that raster uses the proj4string representation of the coordinate reference system.

canopee_mtl
## class       : RasterLayer 
## dimensions  : 35754, 40854, 1460693916  (nrow, ncol, ncell)
## resolution  : 1, 1  (x, y)
## extent      : 265961, 306815, 5027324, 5063078  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=tmerc +lat_0=0 +lon_0=-73.5 +k=0.9999 +x_0=304800 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## data source : /home/travis/build/inSileco/inSileco.github.io/content/post/rinspace_import/data/660_CLASS_3m.tif 
## names       : X660_CLASS_3m 
## values      : 1, 5  (min, max)

Similar to the sf package, raster also provides plot methods for its own classes.

plot(canopee_mtl)