library(sf)
library(terra)
library(tidyverse)
library(spData)
library(spDataLarge)
library(magrittr)
So far, we have learned how to perform operations on your data according to the attribute (non-spatial data) and then using the spatial data itself. This section will teach you how to perform operations on the actual spatial geometry data which will allow you to manipulate the geometries, add buffers, or create centroids.
Vector geometries can be manipulated using either unary or binary operations. Unary operations only use the geometry that you are manipulating. Binary operations rely on a separate geometry to make the changes to your target geometry.
Raster geometries can also be manipulated to change the resolution (number of pixels) and the extent and origin of the raster. This is useful when trying to align different raster data so that you can perform map algebra.
Finally, we will discuss raster-vector interactions.
This section focuses on operations performed on the geometry of ‘sf’ objects which means we will be working with objects of class ‘sfc’ in addition to objects of class ‘sf’
Simplification generalizes lines and polygons to remove detail from the geometries. This is useful to reduce the amount of memory required to visualize these geometries. st_simplify() uses an algorithm for simplifying known as the Douglas-Peucker algorithm with an argument of dTolerance to control the intensity of the simplication.
plot(st_geometry(seine))
seine_simp = st_simplify(seine, dTolerance = 2000) # 2000 m
plot(st_geometry(seine_simp))
We can also do this with polygons
us_states2163 = st_transform(us_states, 2163)
plot(st_geometry(us_states2163))
us_states_simp1 = st_simplify(us_states2163, dTolerance = 100000) # 100 km
plot(st_geometry(us_states_simp1))
One issue that you can see is that st_simplify() works on a per-geometry basis. In order to make the borders line up we can use ms_simplify() from ‘rmapshaper’
#install.packages("rmapshaper")
library(rmapshaper)
## Registered S3 method overwritten by 'geojsonlint':
## method from
## print.location dplyr
# proportion of points to retain (0-1; default 0.05)
us_states2163$AREA = as.numeric(us_states2163$AREA)
us_states_simp2 = rmapshaper::ms_simplify(us_states2163, keep = 0.01,
keep_shapes = TRUE)
plot(st_geometry(us_states_simp2))
Centroid operations identify the center of geographic objects. Like the center of anything, there are different ways to define it (e.g. how would you define the center of your body?). One of the most common ways to define the center of geographic objects is to use the center of mass (imagine cutting out the map of Georgia from a piece of paper and balancing it on your finger). st_centroid() allows for calculation of geographic centroids.
nz_centroid <- st_centroid(nz)
## Warning in st_centroid.sf(nz): st_centroid assumes attributes are constant over
## geometries of x
seine_centroid <- st_centroid(seine)
## Warning in st_centroid.sf(seine): st_centroid assumes attributes are constant
## over geometries of x
It’s possible that the centroid could fall outside of the geometry (consider a donut). In this case, you can use st_point_on_surface which will ensure that points are on the surface of the geometry
nz_pos = st_point_on_surface(nz)
## Warning in st_point_on_surface.sf(nz): st_point_on_surface assumes attributes
## are constant over geometries of x
seine_pos = st_point_on_surface(seine)
## Warning in st_point_on_surface.sf(seine): st_point_on_surface assumes attributes
## are constant over geometries of x
library(tmap)
p_centr1 = tm_shape(nz) + tm_borders() +
tm_shape(nz_centroid) + tm_symbols(shape = 1, col = "black", size = 0.5) +
tm_shape(nz_pos) + tm_symbols(shape = 1, col = "red", size = 0.5)
p_centr2 = tm_shape(seine) + tm_lines() +
tm_shape(seine_centroid) + tm_symbols(shape = 1, col = "black", size = 0.5) +
tm_shape(seine_pos) + tm_symbols(shape = 1, col = "red", size = 0.5)
tmap_arrange(p_centr1, p_centr2, ncol = 2)
Buffers are polygons representing the area within a given distance of a geometric feature: regardless of whether the input is a point, line or polygon, the output is a polygon. Buffers are often used for data analysis when you want to know answers to questions such as: how many points are within this distance of this line?
seine_buff_5km = st_buffer(seine, dist = 5000)
seine_buff_50km = st_buffer(seine, dist = 50000)
plot(st_geometry(seine))
plot(st_geometry(seine_buff_5km))
plot(st_geometry(seine_buff_50km))
Affine transformation is any transformation that preserves lines and parallelism. However, angles or length are not necessarily preserved. Affine transformations include, among others, shifting (translation), scaling and rotation. Additionally, it is possible to use any combination of these. Affine transformations are an essential part of geocomputation.
The ‘sf’ package implements affine transformation for objects of classes ‘sfg’ and ‘sfc’.
First, make an object of class ‘sfc’
nz_sfc = st_geometry(nz)
Shifting moves every point by the same distance in map units.
nz_shift = nz_sfc + c(0, 100000)
Scaling enlarges or shrinks objects by a factor. This can be done globally or locally. If global, the relative position of the geometry features will stay the same and be scaled together. If scaling locally, then each of the geometry features will scale relative to a point that you must define (usually the centroid).
nz_centroid_sfc = st_centroid(nz_sfc)
nz_scale = (nz_sfc - nz_centroid_sfc) * 0.5 + nz_centroid_sfc
Rotating requires a rotation matrix to be defined
rotation = function(a){
r = a * pi / 180 #degrees to radians
matrix(c(cos(r), sin(r), -sin(r), cos(r)), nrow = 2, ncol = 2)
}
nz_rotate can then be used with the defined rotation matrix wrapped in the rotation function
nz_rotate = (nz_sfc - nz_centroid_sfc) * rotation(30) + nz_centroid_sfc
st_crs(nz_shift) = st_crs(nz_sfc)
st_crs(nz_scale) = st_crs(nz_sfc)
st_crs(nz_rotate) = st_crs(nz_sfc)
p_at1 = tm_shape(nz_sfc) + tm_polygons() +
tm_shape(nz_shift) + tm_polygons(col = "red") +
tm_layout(main.title = "Shift")
p_at2 = tm_shape(nz_sfc) + tm_polygons() +
tm_shape(nz_scale) + tm_polygons(col = "red") +
tm_layout(main.title = "Scale")
p_at3 = tm_shape(nz_sfc) + tm_polygons() +
tm_shape(nz_rotate) + tm_polygons(col = "red") +
tm_layout(main.title = "Rotate")
tmap_arrange(p_at1, p_at2, p_at3, ncol = 3)
Finally, the newly created geometries can replace the old ones with the st_set_geometry() function:
nz_scale_sf = st_set_geometry(nz, nz_scale)
Spatial clipping is a form of spatial subsetting that only affects the geometry columns of some of the affected features. Clipping only applies to features more complex than points.
This creates a venn diagram with two points x and y for us to use as an example
b = st_sfc(st_point(c(0, 1)), st_point(c(1, 1))) # create 2 points
b = st_buffer(b, dist = 1) # convert points to circles
plot(b)
text(x = c(-0.5, 1.5), y = 1, labels = c("x", "y")) # add text
This selects the area that is shared between x and y
x = b[1]
y = b[2]
x_and_y = st_intersection(x, y)
plot(b)
plot(x_and_y, col = "lightgrey", add = TRUE) # color intersecting area
Use these other functions instead of st_intersection() to try and select other areas of the venn diagram: st_union(), st_difference(), st_sym_difference()
When working with attribute data, we were still able to aggregate features and dissolve polygons together. When we did this we used summarize() and aggregate() but the function that was working behind the scenes to dissolve the polygons was actually st_union(). This function can take two geometries and then unite them.
us_west = us_states[us_states$REGION == "West", ]
us_west_union = st_union(us_west)
Geometry casting is a powerful operation that enables transformation of the geometry type. st_cast() is used for this operation and it is important to note that it will behave differently on ‘sfg’, ‘sfc’, and ‘sf’ objects. st_cast() will allow you to change between points, lines, and polygons. When changing between multi versions of these, it will take a multi-object and turn it into multiple different objects. Refer to the online textbook for more information on using st_cast and perform geometric type transformations.
Geometric raster operations include the shift, flipping, mirroring, scaling, rotation or warping of images.
There are certain geometric operations on raster data that R does not handle well so if you need to perform these then you can work with dedicated GIS softwares and the online textbook also goes in to bridging between R and GIS software in chapter 9.
However, R can handle operations such as changing the extent, resolution, and origin of a raster. This can be important when trying to match raster layers and it can also be useful to aggregate (lower the resolution) of a raster so that it is less computationally intensive.
We previously saw how to extract values from a raster overlaid by other spatial objects. To retrieve a spatial output, we can use almost the same subsetting syntax. The only difference is that we have to make clear that we would like to keep the matrix structure by setting the drop argument to FALSE. This will return a raster object containing the cells whose midpoints overlap with clip.
elev = rast(system.file("raster/elev.tif", package = "spData"))
clip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,
resolution = 0.3, vals = rep(1, 9))
crop(elev,clip)
## class : SpatRaster
## dimensions : 2, 1, 1 (nrow, ncol, nlyr)
## resolution : 0.5, 0.5 (x, y)
## extent : 1, 1.5, -0.5, 0.5 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## name : elev
## min value : 18
## max value : 24
For the same operation we can also use the intersect() and crop() command.
In order for rasters to match, the extent and origin of the raster need to match. When using publicly available data such as satellite imagery, the files may often come in different resolution or projections.
If the rasters only differ in their extent then you can use extend() with both rasters as the argument so that the smaller one extends to match the larger one. Also, terra will conveniently throw an error if two rasters do not match.
elev = rast(system.file("raster/elev.tif", package = "spData"))
elev_2 = extend(elev, c(1, 2)) #extended to be longer than elev
Try to run this code chunk. What happens?
#elev_3 = elev + elev_2
elev_4 = extend(elev, elev_2)
By default, the origin is the cell corner closest to the coordinates (0,0). origin() will retrieve the origin of a raster.
origin(elev_4)
## [1] 0 0
elev_4 conveniently has a corner with coordinates (0,0)
If the origin does not match then the rasters will not match. This can be fixed by using origin() again with the raster as the argument.
# change the origin
#origin(elev_4) <- c(0.25, 0.25)
plot(elev_4)
plot(elev, add = TRUE) # and add the original raster
Raster datasets can also differ with regard to their resolution. To match resolutions, one can either decrease, using aggregate(), or increase, using disagg(), the resolution of one raster.
In this example, we decrease the resolution by aggregating by a factor of 5. The resulting cell uses the mean of the input cells but the function could be set to other functions such as median() or sum().
data(dem, package="spDataLarge")
dem <- rast(dem)
dem_agg = aggregate(dem, fact = 5, fun = mean)
plot(dem)
plot(dem_agg)
By contrast, the disaggregate() function increases the resolution. However, we have to specify a method on how to fill the new cells. The disaggregate() function provides two methods. The default one (method = “near”) simply gives all output cells the value of the input cell, and hence duplicates values which leads to a blocky output image.
dem_disagg = terra::disaggregate(dem_agg, fact = 5, method = "near")
bilinear is another method that may produce better looking results.
dem_disagg = terra::disaggregate(dem_agg, fact = 5, method = "bilinear")
However, with disaggregation we are always interpolating the values for the new cells so the outputs will always be limited by the lower resolution input. Unfortunately,there is no enhance() function.