Monday, April 27, 2020

Using R to quantify land use around a sampling point

Backstory: I was in Argentina in 2017 for a bird meeting at Iguazu Falls. The meeting was fantastic and I was able to see a bunch of old friends from my tropical days and meet a bunch of new friends and there were many great talks. The two students that went with me did a great job presenting our poster. There were two big downers. One, a student was robbed by a taxi driver in Iguazu. She had to hand over her phone and wallet and he dropped her off along the road in the middle of a national park. Very not cool. Two, in Buenos Aires we were robbed by a taxi when our driver took off with our backpacks as we were removing our luggage. Now I've been taking pictures of taxi drivers and my photos are automatically uploaded. 

We're all told to back up our important documents and that I did. I had code on my hard drive and on a USB drive but both were stolen in the backpack - so I now I have everything on a Google drive and one other place. This will be my third place for this code. 

The problem: So, let's suppose you're doing something ecological at a particular site and you have a bunch of those sites. If you are working in a heterogeneous system you might be interested in the interaction between what is happening at a site and the landscape around the site. Since what constitutes the landscape is nebulous, we can examine the effect of spatial extent on site properties. For example, take the decision of a bird to use a bird house or not. The bird likes trees around the nesting site but does it need to be a forest or just a woodlot? I often refer to larger spatial extents as the context (city, forest, suburban) and smaller spatial extents of local conditions (e.g., a woodlot in a city, a parking lot in the country, and, of course, a woodlot in the woods and a parking lot in the city). So maybe the bird's decision is based on context or local conditions or a combination (much much tougher to sort out since everything in the landscape tends to correlated). 

The general approach: Your ecological points need to geo-referenced in a spreadsheet. To that we will add land use information that we will extract from remote sensing. Then we can run statistics linking the ecological information to the land use. A focal analysis is used where a sampling point serves as the center of a circular buffer. We will count the number of pixels and estimate the proportions of land cover by the relative number of pixels. We will use a 1000 m buffer and a 200 m buffer nested within the large buffer. 

An example of a point (star) where ecological data were collected. The larger circle gives the context of the point (urban) and the smaller circle indicates the local conditions (wood lot).

The land use data set: I use the NLCD 2016 Land Use data from the MRLC website. The map is a raster image (matrix of pixels with numbers representing different land uses). Each pixel is 20 x 20 m pixels so a few km can represent a great deal of data. The image below is roughly, the whole spatial extent that I work. The red banana is the Scranton/Wilkes-Barre greater urban area (aka "The Valley"). To the east is NJ and NY to the northeast. The squiggly line leading to the Scranton/Wilkes-Barre area is the North Branch of the Susquehanna. 
I like examining how ecology changes over a gradient and, unfortunately, the Scranton/Wilkes-Barre area does not really have much of a gradient. The red is high intensity urban and green is forest. You can see that The Valley is intensely urban and the area surrounding the value is very green and beyond that a mix. That ring of green around The Valley is the steep slopes of the surrounding land that is heavily forested (which is nice). To the west and north is farm land (yellow) and to the east and south is a mix of small towns, lakes, and game lands. 

The data are downloaded as a Geo-TIF

The ecology: For this example, I am using predation rates on model clay caterpillars we put on branches to examine predation by birds. When birds bite the clay they leave behind a mark and we consider the caterpillar predated. We put out 20 at a site and those sites are georeferenced with lat and long recorded in degrees decimal (e.g., 72.234). We go back a week later and count to the bite marks to get a predation rate.

And, finally, the code

install.packages("raster", "tmap", "sf", “sp”)

library(raster)
library(sp)
library(sf)
library(tmap)

# the file is NLCD_2016.tiff
landcov <- raster(file.choose()) 
# check it out
plot(landcov)

#get the coordinate reference system to show the spatial info 
crs(landcov)


# get the ecological sampling/clay model data
clay <- read.csv(file.choose(), header=T)

#  turn the points into spatial points and reference them to the location
clay.pts <- st_as_sf(clay, coords=c("long","lat"), crs = 4326)

# look at the coordinate reference system (to make sure the above worked) 
crs(clay.pts)

library(tmap, gstat)
# create a map with the points on top of the map
tm_shape(landcov) + tm_raster() +
  tm_shape(clay.pts) + tm_dots(size = 0.5)

# extract 
landextract2 <- extract(landcov,             # raster layer
  clay.pts,   # SPDF with centroids for buffer
    buffer = 1000,     # buffer size, units depend on CRS
normalizeWeights=TRUE,
    fun=NULL,         # what to value to extract
    df=TRUE)         # return a dataframe? 

# make the output a data frame so I can mess with it as I know how
land <- as.data.frame(landextract2)
# make R count thousands and thousands of pixels for me
a <- table(land$ID, land$NLCD_2016)
write.table(a, "clipboard", sep="\t")

From this run a bunch of models and we can create this graph

Hope this helps someone!! 






No comments:

Post a Comment