Michael T. Hallworth
Smithsonian Conservation Biology Institute
Migratory Bird Center
The following document outlines the steps for analyzing data from archival light-level geolocators (hereafter geolocators). Geolocators have been used to track individuals since the early 1990s, but their size restricted their use to large organisms. With the recent miniaturization of geolocators, researchers can now deploy them on smaller and smaller species. Geolocators are devices that record ambient light levels every 2, 5, or 10 min, depending on the model. They are attached to individuals, which migrate with the device while it records ambient light levels throughout the year. Once the device is recovered, the data are downloaded and analyzed to determine the times of sunrise and sunset. From the sunrise/sunset data, the times of relative noon and midnight are determined. Geographic coordinates are then derived from the relative 'noon' and 'midnight' times to give an approximate location of the individual throughout the year.
This tutorial uses geolocator data from two male Ovenbirds (Seiurus aurocapilla), one breeding at Prince Albert National Park, Saskatchewan, Canada and another at Hubbard Brook Experimental Forest, NH. These individuals were part of a previous project that quantified migratory connectivity of Ovenbirds, published in 2015. Click here for more details. The geolocators used in the two projects were purchased from the British Antarctic Survey (BAS) and Lotek.
Below is an example of how to analyze light-level geolocator data using the SGAT package (Tags spelled backwards).
We will need a few packages to complete the analysis. Two of the packages are only available through GitHub. You can use the devtools library to download them; you will need the username and package name to install a package with devtools.
Load the required packages for the analysis.
# Install necessary packages from Github using the devtools library #
library(devtools)
install_github("SWotherspoon/SGAT")
install_github("SWotherspoon/BAStag")
Once you have the required packages you can load them into the R environment.
library(raster)
library(sp)
library(rgeos)
library(geosphere)
library(SGAT)
library(BAStag)
library(MASS)
library(maptools)
data(wrld_simpl)
We next need to read the raw data from the geolocator into R. The function used to read and format the raw data from geolocators purchased from BAS is loaded using source().
source("read_lig.R")
The following code runs through the analysis of light-level geolocators. The general workflow is as follows:
# Get the names of the files for birds you want to analyze #
GLfiles<-list.files(paste0(getwd(),"/data"),pattern = ".lig")
# Create an empty list to store the data #
GLdata<-vector('list',length(GLfiles))
# Read each lig file into R #
for(i in 1:length(GLfiles)){
GLdata[[i]]<-read.lig(paste0("data/",GLfiles[i]))
# save only the date and light data from the lig file #
GLdata[[i]]<-GLdata[[i]][,c("Date","Light")]
}
# Confirm that we only have two columns with date and light #
# show the first 5 rows of data from the first .lig file #
head(GLdata[[1]])
## Date Light
## 1 2011-06-09 02:05:27 44
## 2 2011-06-09 02:07:27 26
## 3 2011-06-09 02:09:27 64
## 4 2011-06-09 02:11:27 36
## 5 2011-06-09 02:13:27 36
## 6 2011-06-09 02:15:27 36
In order to determine the error distribution around transition events (sunrise/sunset), the true sunrise/sunset times are needed. Here, you need to provide the coordinates of where the bird was captured (as close to the true location as possible).
# Set the capture coordinates for each bird #
cap.lat<-c(53.898,43.94)
cap.long<- c(-106.168,-71.73)
The next portion of code will estimate the predicted twilight times given the capture location and a preset zenith angle (sun-elevation).
# Create an empty list to store the predicted twilight times given location and zenith #
# zenith here is analogous to the sun-elevation angle in GeoLight - different values but a similar concept #
tm<-rise<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
tm[[i]] <- seq(GLdata[[i]][1,1], GLdata[[i]][nrow(GLdata[[i]]),1], by = "day")
rise[[i]] <- rep(c(TRUE, FALSE), length(tm[[i]]))
}
# making predicted twilight times given location and zenith #
c.dat<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
c.dat[[i]] <- data.frame(Twilight = twilight(rep(tm[[i]], each = 2),
lon = cap.long[i],
lat = cap.lat[i],
rise = rise[[i]], zenith = 93.17),
Rise = rise[[i]])
}
offset=19
# Plot the expected sunrise/sunset times per day and the measured sunrise/sunset times
# Dark periods are shown as white areas on the graph - the black represents daylight recorded by the geolocator #
# Sunrise is shown in blue - sunset in red #
lightImage(GLdata[[2]], offset = offset, zlim = c(0,12))
tsimagePoints(c.dat[[2]]$Twilight, offset = offset,
pch = 16, cex = 0.5,
col = ifelse(c.dat[[2]]$Rise, "dodgerblue", "firebrick"))
# adds lines at the two equinoxes - change the dates if necessary (they can vary by year) #
eqnx<-as.POSIXct(c("2011-09-23", "2012-03-20"), tz = "GMT")
abline(v = eqnx, lwd=3, lty=3, col="purple")
The above figure shows the light data recorded by the geolocator. The light areas of the figure indicate daylight hours; the dark portions indicate no recorded light. The blue and red points show sunrise and sunset times, respectively, at the capture location. The vertical purple lines represent the equinoxes. Notice that day length gets shorter during the winter months; in other words, the expected dark period gets longer in Nov-Jan at the capture location. This is exactly what is expected during the winter months in North America. Large deviations from the anticipated sunrise and sunset times likely indicate that the bird had moved from that location.
Exercise:
Can you tell approximately when the bird left the study area and when it returned?
This next step is interactive and can't be reproduced in this document. Here, I provide the code and some screenshots of the process.
# set threshold #
threshold <- 1
# Create a list to hold the transition data #
twl<-vector('list',length(GLfiles))
# Here we define the transitions #
# NOTE - this is an interactive process !
# Two plot windows will open for each bird (even in a loop it goes through one individual at a time) #
# The two plot windows may open on top of one another - be sure to move them so you can see both #
# The process takes 4 steps - pressing "a" moves you to the next step
# Step 1) Determine the range of dates in which to find twilights
# Plot heading = "Select Subset"
# The current selection is shown as a red bar over the plot
# 1) To change the selection - left click on the start date and right click on the end date
# press "a"
# Step 2) Determine a few transitions so that the whole file can be done 'automatically'
# Plot heading = "Find Twilights"
# 1) left click on the plot with that heading in an area that looks "decent"
# the second plot should zoom to that area
# 2) click on the night (white) portion of the figure for both sunrise and sunset
# where there is a nice solid black-to-white transition - continue to do this until
# the transitions show up automatically - basically you are providing a "training"
# dataset used to search for transitions - once all transitions show up (orange and blue)
# 3) press "a"
# Step 3) Adding transitions - not discussed here - we will keep all the ones that were found
# Step 4) Editing transitions - not discussed here - we will keep all the ones that were found
# IMPORTANT - press "q" to save the data to R
for(i in 1:length(GLfiles)){
twl[[i]] <- preprocessLight(GLdata[[i]], threshold, offset = offset, zlim = c(0, 12))
}
First, select the time frame you're interested in - here I selected all the data. The current selection is shown as a red bar across the top of one of the figures. Press "a" to select all data. If you want to select only a portion of the data, left click on the first date and right click on the second date - the red bar at the top shows which dates are selected. Press "a" to continue.
Once you accept the time frame by pressing "a", select an area with strong contrasts between sunrise and sunset - a zoomed-in window should appear (bottom panel below). Strong contrasts are shown as hard breaks between light and dark - areas that look flared or feathered (for lack of a better term) are not good areas to select.
Using the zoomed-in image, click at the edge of the black where the contrasts are large - continue to do so until the transitions are highlighted. Once highlighted, press "a" to make your selection.
We will not insert any new transitions or edit any transitions in the dataset - press "a" until you see the individual transitions. However, if you wanted to edit transitions you could do that here. For example, some transitions at the start or end of the file may look odd - these may have occurred before the unit was deployed but after it started collecting data, or after the unit was recovered but before the data were downloaded. Those data can influence the calibration later on - the spurious transitions can be removed now or later in the process. I chose to remove them later.
Important: To save the transitions into the R environment press "q".
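If you want a record of the twilights you just assigned, you can write each twl data frame to a csv file right after pressing "q". A minimal sketch (the file names below are placeholders, not the files used in this tutorial):
# Write the assigned twilights to csv for reproducibility #
# NOTE - the file names here are placeholders - use names that identify your tags #
for(i in 1:length(GLfiles)){
write.csv(twl[[i]], paste0("data/twl_bird", i, ".csv"), row.names = FALSE)
}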
Exercise:
Find and assign twilights. Unfortunately, the interactive process only works on PCs.
I saved the twilights into a csv file. I suggest doing this so that 1) you don't need to assign twilights every time you open a new R session and 2) you have a record of the twilight times used in the analysis (for reproducibility). Use the following code to read the data back into R in the correct format for the analysis.
twl<-vector('list',length(GLfiles))
twl[[1]]<-read.csv("data/twl1561_25060.csv",
colClasses=c("character","logical","logical","integer","logical","character","integer"))
twl[[2]]<-read.csv("data/twl2391_68559.csv",
colClasses=c("character","logical","logical","integer","logical","character","integer"))
for(i in 1:length(GLfiles)){
twl[[i]][,1]<-as.POSIXct(strptime(twl[[i]][,1],"%Y-%m-%d %H:%M:%S",tz="GMT"))
twl[[i]][,6]<-as.POSIXct(strptime(twl[[i]][,6],"%Y-%m-%d %H:%M:%S",tz="GMT"))
}
The data should look like this now. Note that the edits you made in the previous step (if any) are now included in the data.frame. Having this information documented is a great step forward for reproducibility of geolocator analyses.
## Twilight Rise Deleted Marker Inserted Twilight3 Marker3
## 1 2011-06-09 03:56:25 FALSE FALSE 0 FALSE 2011-06-09 03:57:25 0
## 2 2011-06-09 10:51:27 TRUE FALSE 0 FALSE 2011-06-09 10:51:27 0
## 3 2011-06-10 03:57:27 FALSE FALSE 0 FALSE 2011-06-10 03:58:27 0
## 4 2011-06-10 09:59:27 TRUE FALSE 0 FALSE 2011-06-10 09:59:27 0
## 5 2011-06-11 04:03:27 FALSE FALSE 0 FALSE 2011-06-11 04:04:27 0
## 6 2011-06-11 09:59:27 TRUE FALSE 0 FALSE 2011-06-11 09:59:27 0
Sunsets need to be corrected for the sampling interval of the tag you're using. These particular tags sampled every 2 minutes, so sunsets are adjusted by 60 seconds - the following code does just that.
for(i in 1:length(GLfiles)){
twl[[i]]<-twilightAdjust(twilights=twl[[i]], interval=60)
}
This next step will use the transitions we just defined to determine:
1) the error distribution around the true twilight times, and
2) the zenith angle (the angle measured from directly overhead to the geometric center of the sun's disc). This value is roughly 90 degrees offset from the sun-elevation angle.
Next we define the dates when the individual was known to be at the deployment site. These dates are important because the zenith angle is calculated using data from a known capture location. Including dates when the bird was not at that location can bias the zenith angle and lead to incorrect location estimates later in the analysis. The longer the period the better - here I decided to cut the calibration period off at July 31st because I know Ovenbirds were still at the breeding site on that date (the date I stopped fieldwork). I can't be sure birds were there after that point, so to be safe I used July 31st. It's important to consider the species' ecology when defining these periods. For example, if birds use wildly different habitats after breeding, the apparent zenith angle may change as they move between habitat types. Keep these kinds of things in mind when defining the calibration period.
# Create a vector with the dates known to be at deployment #
calib.dates<-vector('list',length(GLfiles))
calib.dates[[1]]<-as.POSIXct(c("2011-06-10","2011-07-31"))
calib.dates[[2]]<-as.POSIXct(c("2011-06-30","2011-07-31"))
Once those dates are in R, we can subset the twilight data to include only that period so we can calculate the twilight error - the difference between the known sunrise/sunset times and the times when the geolocator indicated sunrise/sunset.
# Extract the twilight data for that period #
cal.data<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
cal.data[[i]]<-subset(twl[[i]],twl[[i]]$Twilight>=calib.dates[[i]][1] & twl[[i]]$Twilight<=calib.dates[[i]][2])
}
Determining the zenith angle of the sun when the geolocator identifies a transition is very important. This angle can drastically change the estimated locations. Here, we also determine the error distribution between the known sunrise times and the times when the geolocator indicated the sun had risen. This is needed to generate error around each location later in the analysis.
# Determine the sun elevation angle - here called the Zenith angle #
# create empty vectors to store data #
sun<-z<-zenith0<-zenith1<-twl_t<-twl_dev<-alpha<-fitml<-vector("list",length(GLfiles))
# loop through each of the two individuals #
for(i in 1:length(GLfiles)){
# Calculate solar time from calibration data
sun[[i]] <- solar(cal.data[[i]][,1])
# Adjust the solar zenith angle for atmospheric refraction
z[[i]]<- refracted(zenith(sun[[i]], cap.long[i], cap.lat[i]))
twl_t[[i]] <- twilight(cal.data[[i]][,1], cap.long[i], cap.lat[i], rise = cal.data[[i]][,2], zenith = max(z[[i]])+0.1)
# Determine the difference in minutes from when the sun rose and the geolocator said it rose
twl_dev[[i]] <- ifelse(cal.data[[i]]$Rise, as.numeric(difftime(cal.data[[i]][,1], twl_t[[i]], units = "mins")),
as.numeric(difftime(twl_t[[i]], cal.data[[i]][,1], units = "mins")))
# Describe the distribution of the error
fitml[[i]] <- fitdistr(twl_dev[[i]], "log-Normal")
# save the Twilight model parameters
alpha[[i]] <- c(fitml[[i]]$estimate[1], fitml[[i]]$estimate[2])
}
Here are the median and 95% confidence interval for the zenith (sun-elevation) angle during the calibration period.
## 50% 2.5% 97.5%
## 93.03000 89.51742 94.36493
## 50% 2.5% 97.5%
## 92.11759 88.49339 95.29984
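The code that produced these summaries isn't shown above; presumably they are quantiles of the refracted zenith angles. A minimal sketch, assuming the z list created in the calibration loop:
# Summarize the refracted zenith angles from the calibration period #
# a sketch of how the values above could be computed #
for(i in 1:length(GLfiles)){
print(quantile(z[[i]], probs = c(0.5, 0.025, 0.975)))
}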
Let's plot the error distribution to see what it looks like. The code used to generate the plots is below.
seq <- seq(0,60, length = 100)
par(mfrow=c(1,2))
for(i in 1:length(GLfiles)){
hist(twl_dev[[i]], freq = F,
yaxt="n",
ylim = c(0, 0.06),
xlim = c(0, 60),
breaks=15,
main = paste(GLfiles[i]),
xlab = "Twilight error (mins)")
axis(2,las=2)
lines(seq, dlnorm(seq, alpha[[i]][1], alpha[[i]][2]), col = "firebrick", lwd = 3, lty = 2)
}
Exercise:
Does the log-Normal distribution fit the data well? Try changing how the data are displayed in the histogram by changing the breaks value. Does the model still fit the data? Try some different distributions to see if they fit better or worse (see the sketch below). Use ??fitdistr to see which distributions are available. Note: some distributions will return errors when fitting.
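As a starting point for the exercise, here is a minimal sketch that fits and overlays a gamma distribution for the first bird (the choice of distribution is just an example, not a recommendation; it reuses the seq vector defined above):
# Fit a gamma distribution to the twilight errors of the first bird #
fitgamma <- fitdistr(twl_dev[[1]], "gamma")
hist(twl_dev[[1]], freq = F, breaks = 15, xlim = c(0, 60),
main = GLfiles[1], xlab = "Twilight error (mins)")
lines(seq, dgamma(seq, shape = fitgamma$estimate["shape"], rate = fitgamma$estimate["rate"]),
col = "dodgerblue", lwd = 3, lty = 2)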
Now that we have the error distribution around the known sunrise and sunset times and the zenith angle, we can look at the location data. First, we subset the data to keep only transitions that 1) occurred after the first calibration date (presumably the deployment date) and 2) were not deleted above. We didn't add or delete any transitions, but this statement would remove the twilights that were deleted above had edits been made.
Here we generate the first general path taken by the individual.
Important - the tol setting in the thresholdPath model sets the tolerance around the equinoxes (i.e. filters out those points). Latitudinal estimates are unreliable around the equinox periods because the change in day length is similar everywhere. A tol value of 0 keeps all points - larger tol values (e.g. 0.2) filter out quite a few points. For this example, I set the tolerance to 0 (not the default) so I can look at all the data - even around the equinox periods.
# Create empty vectors to store objects #
d.twl<-path<-vector('list',length(GLfiles))
zenith0<-zenith1<-rep(NA,length(GLfiles))
# loop through the two birds #
for(i in 1:length(GLfiles)){
# Store the zenith (sun-elevation angle)
zenith0[i] <-quantile(z[[i]],prob=0.5)
zenith1[i]<-quantile(z[[i]],prob=0.95)
}
# subset the twilight file for dates after the first calibration date (presumably the deployment date)
# and exclude points that were deleted
# note we didn't delete any transitions here
for(i in 1:length(GLfiles)){
d.twl[[i]]<-subset(twl[[i]],twl[[i]]$Twilight>=calib.dates[[i]][1] & !Deleted)
path[[i]] <- thresholdPath(d.twl[[i]]$Twilight, d.twl[[i]]$Rise, zenith=zenith0[i],tol=0)
}
Make a plot that shows the path on a map and the change in latitude. The solid gray horizontal line represents the deployment latitude; the points should be very close to this line at the start. The red dotted lines represent the autumnal and vernal equinoxes. Note the variability in latitude around those periods.
par(mfrow = c(2, 2), mar = c(2,4,1,1)+0.1)
for(i in 1:length(GLfiles)){
plot(path[[i]]$time, path[[i]]$x[, 2], type = "b", pch = 16, cex = 0.5, ylab = "Lat", xlab = '')
abline(h = cap.lat[[i]])
abline(v = as.POSIXct("2011-09-23"),col="red",lty=2,lwd=1.5)
abline(v = as.POSIXct("2012-03-20"),col="red",lty=2,lwd=1.5)
plot(path[[i]]$x, type = "n")
plot(wrld_simpl, add = T, col = "grey95")
box()
lines(path[[i]]$x, col = "blue")
points(path[[i]]$x, pch = 16, cex = 0.5, col = "blue")
}
Exercise:
Play around with the tol values, changing them by small increments (0.01), and see how the positions change (see the sketch below). What might be the benefit of filtering out the locations around the equinox periods and replacing them with modeled locations? What assumptions are being made about the data / positions / movements of the individuals? How would you know that you used the "correct" tol value? Note - after changing the tol value in the model you will need to display the figure again.
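A minimal sketch for experimenting with tol on the first bird - re-run thresholdPath with a different tolerance (the value 0.08 below is just an example) and re-plot the latitudes:
# Re-estimate the path with a small tolerance around the equinoxes #
path.tol <- thresholdPath(d.twl[[1]]$Twilight, d.twl[[1]]$Rise, zenith = zenith0[1], tol = 0.08)
plot(path.tol$time, path.tol$x[, 2], type = "b", pch = 16, cex = 0.5, ylab = "Lat", xlab = '')
abline(h = cap.lat[1])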
We can now use this initial path to initialize the analysis, which uses a Markov chain Monte Carlo (MCMC) model. This model samples the data, while incorporating error, to predict the path for each individual. However, we need to give it some starting values. In this first example we won't restrict the path to fall only on land or within the species' distribution; that is done further below.
The workflow for creating the final path is:
1) Give the model an initial path - just created above
2) Define the mid-points between locations (needed to generate the path)
3) Define a flight distance parameter - this penalizes really long, unlikely flights (the likelihood a bird will fly x distance based on flight speed)
4) Run the model multiple times until it converges and throw out the first iterations as burn-in
5) Refine the model further from previous runs
6) Create the final path
Here, we define the initial path (x0) from our results above (the mapped route).
# Use path as the initial path to start the Estelle model #
x0<-z0<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
# Take the location estimates created above
x0[[i]]<- path[[i]]$x
# the model also needs the mid-points - generate those here
z0[[i]]<- trackMidpts(x0[[i]])
}
This next bit of code sets the flight distance parameter (beta), sets the capture and recapture locations as fixed locations, and sets the area in which to search for the path (xlim, ylim). It is very important to set limits around your data - or wherever the bird could be. If the longitude and latitude limits don't include your data, the map will be empty.
Here we set the flight distance parameter. We use a gamma distribution for flight distance because, at least for Ovenbirds, individuals are stationary most of the time (the entire breeding and non-breeding seasons), but when they do move they move relatively large distances.
# Set the flight distance parameter #
beta <- c(0.7, 0.08)
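To get a feel for what this prior implies, you can plot the corresponding gamma density. A minimal sketch, assuming beta is interpreted as the shape and rate of a gamma distribution on speed:
# Visualize the movement prior (assumed shape = beta[1], rate = beta[2]) #
curve(dgamma(x, shape = beta[1], rate = beta[2]), from = 0, to = 100,
xlab = "Speed", ylab = "Density")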
We also need to set some of the locations as fixed locations where we know the bird was - the capture and recapture locations. This next bit of code sets the first few locations as fixed points; the last few are fixed after the model is defined below.
# This loop sets the known capture location of the bird as fixed (the recapture location is fixed after the model is defined)
fixedx<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
fixedx[[i]]<- rep(F, nrow(x0[[i]]))
fixedx[[i]][1:5] <- T
#fixedx[[i]][(nrow(x0[[i]])-5):nrow(x0[[i]])] <-T
x0[[i]][fixedx[[i]], 1] <- cap.long[i]
x0[[i]][fixedx[[i]], 2] <- cap.lat[i]
z0[[i]] <- trackMidpts(x0[[i]]) # update z0 positions
}
Here we set the longitudinal and latitudinal limits of our dataset. The error distribution around the tracks will be limited to within this box. Set these limits to cover a larger area than you think you need.
# xlim and ylim values need to span the range of your dataset #
xlim <- c(-115, -60)
ylim <- c(0, 90)
Now we can define the model. After the model is defined we set the fixed locations.
# Create a list to store our model results
model<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
# Define the threshold model - similar to above #
model[[i]] <- thresholdModel(d.twl[[i]]$Twilight,d.twl[[i]]$Rise,
twilight.model="ModifiedLogNormal",
alpha=alpha[[i]],beta=beta,
# Here is where we will set the constraints for land/spp dist
#logp.x = log.prior, logp.z = log.prior,
x0=x0[[i]],z0=z0[[i]],zenith=zenith1[i],fixedx=fixedx[[i]])
# Here you need to set the last few locations as "known" (fixed) locations. These are periods
# when you know the bird was at a specific location - the capture site, where the tag was deployed and later recovered.
model[[i]]$fixedx<-c(model[[i]]$fixedx,rep(FALSE,(dim(model[[i]]$x0)[1]-length(model[[i]]$fixedx))))
model[[i]]$fixedx[(length(model[[i]]$fixedx)-3):length(model[[i]]$fixedx)]<-TRUE
}
Here we define the error distribution around each location.
# This defines the error distribution around each location #
proposal.x<-proposal.z<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
proposal.x[[i]] <- mvnorm(S=diag(c(0.005,0.005)),n=nlocation(x0[[i]]))
proposal.z[[i]] <- mvnorm(S=diag(c(0.005,0.005)),n=nlocation(z0[[i]]))
}
Finally, fit the model - this can take a while. Set the number of iterations to run with iters; the larger the number of iterations, the longer it takes but the more precise the estimates. To help save memory, use the thin option to save every nth iteration. The last parameter, chains, defines how many chains you want to run - if you run 2 chains of 1000 iterations you will have 2000 total iterations.
Note: this package does not currently run in parallel.
fit<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
fit[[i]] <- estelleMetropolis(model[[i]],proposal.x[[i]],proposal.z[[i]],iters=200,thin=25,chains=2)
}
Here we compile the necessary information from the model and plot the initial results, which include the location error.
# set plot parameters
xlim = c(-115, -60)
ylim = c(0, 90)
opar <- par(mar=c(2,2,2,2)+0.1, mfrow=c(1,2))
# Gather the data needed to make the plot
s<-fixedz<-dt<-im<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
s[[i]] <- locationSummary(fit[[i]]$x,time=model[[i]]$time,collapse=F)
fixedz[[i]] <- fixedx[[i]][-length(fixedx[[i]])] > 0 & fixedx[[i]][-length(fixedx[[i]])]==fixedx[[i]][-1]
dt[[i]] <- ifelse(fixedz[[i]],0,model[[i]]$dt)
im[[i]] <- locationImage(fit[[i]]$z,xlim=xlim,ylim=ylim,nx=4*diff(xlim),ny=4*diff(ylim),
weight=dt[[i]][1:dim(fit[[i]]$z[[1]])[1]],collapse=TRUE)
# Generate the plot #
plot(wrld_simpl,col= "grey90",border="grey10" , xlim = range(s[[i]][[i]][,"Lon.mean"]),
ylim = range(s[[i]][[i]][,"Lat.mean"]))
plot(elide(wrld_simpl,shift=c(360,0)),xlim=xlim,ylim=ylim,add=TRUE,col= "grey90",border="grey10")
image(im[[i]]$x,im[[i]]$y,im[[i]]$W,xlab="",ylab="",cex.axis=0.7, add = T,
col = c("transparent", rev(topo.colors(200))))
plot(wrld_simpl, add = TRUE)
for(k in 1:2) {
lines(s[[i]][[k]][,"Lon.mean"],s[[i]][[k]][,"Lat.mean"],
col=rgb(t(col2rgb(c("darkblue", "darkgreen")))/255,alpha=0.4))
}
abline(h=cap.lat[i],v=cap.long[i])
box()
}
The above maps look pretty good. They include the error distribution (color ramp) and the estimated paths (lines).
Take the previous model results (the path above) and feed them into the same model as the initial path. This will fine-tune the final model.
# Fine Tuning
for(i in 1:length(GLfiles)){
x0[[i]] <- chainLast(fit[[i]]$x)
z0[[i]] <- chainLast(fit[[i]]$z)
model[[i]] <- thresholdModel(d.twl[[i]]$Twilight,d.twl[[i]]$Rise,
twilight.model="LogNormal",
alpha=alpha[[i]],beta=beta,
#logp.x = log.prior, logp.z = log.prior,
x0=x0[[i]],z0=z0[[i]],zenith=zenith1[i],fixedx=fixedx[[i]])
model[[i]]$fixedx<-c(model[[i]]$fixedx,rep(FALSE,(dim(model[[i]]$x0)[1]-length(model[[i]]$fixedx))))
model[[i]]$fixedx[(length(model[[i]]$fixedx)-3):length(model[[i]]$fixedx)]<-TRUE
}
proposal.x<-proposal.z<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
proposal.x[[i]] <- mvnorm(S=diag(c(0.005,0.005)),n=nlocation(x0[[i]]))
proposal.z[[i]] <- mvnorm(S=diag(c(0.005,0.005)),n=nlocation(z0[[i]]))
}
fit<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
fit[[i]] <- estelleMetropolis(model[[i]],proposal.x[[i]],proposal.z[[i]],iters=200,thin=25,chains=2)
# Final Run
proposal.x[[i]] <- mvnorm(chainCov(fit[[i]]$x),s=0.25)
proposal.z[[i]] <- mvnorm(chainCov(fit[[i]]$z),s=0.25)
# Note the increase in the number of iterations - this takes a bit longer to run
fit[[i]] <- estelleMetropolis(model[[i]],proposal.x[[i]],proposal.z[[i]],
x0=chainLast(fit[[i]]$x),
z0=chainLast(fit[[i]]$z),
iters=1000,thin=20,chains=2)
}
# xlim and ylim values need to span the range of your dataset #
xlim = c(-115, -60)
ylim = c(0, 90)
opar <- par(mar=c(2,2,2,2)+0.1, mfrow=c(1,2))
s<-fixedz<-dt<-im<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
s[[i]] <- locationSummary(fit[[i]]$x,time=model[[i]]$time,collapse=F)
fixedz[[i]] <- fixedx[[i]][-length(fixedx[[i]])] > 0 & fixedx[[i]][-length(fixedx[[i]])]==fixedx[[i]][-1]
dt[[i]] <- ifelse(fixedz[[i]],0,model[[i]]$dt)
im[[i]] <- locationImage(fit[[i]]$z,xlim=xlim,ylim=ylim,nx=4*diff(xlim),ny=4*diff(ylim),
weight=dt[[i]][1:dim(fit[[i]]$z[[1]])[1]],collapse=TRUE)
plot(wrld_simpl,col= "grey90",border="grey10" , xlim = range(s[[i]][[i]][,"Lon.mean"]),
ylim = range(s[[i]][[i]][,"Lat.mean"]))
plot(elide(wrld_simpl,shift=c(360,0)),xlim=xlim,ylim=ylim,add=TRUE,col= "grey90",border="grey10")
image(im[[i]]$x,im[[i]]$y,im[[i]]$W,xlab="",ylab="",cex.axis=0.7, add = T,
col = c("transparent", rev(topo.colors(200))))
plot(wrld_simpl, add = TRUE)
for(k in 1:2) {
lines(s[[i]][[k]][,"Lon.mean"],s[[i]][[k]][,"Lat.mean"],
col=rgb(t(col2rgb(c("darkblue", "darkgreen")))/255,alpha=0.4))
}
abline(h=cap.lat[i],v=cap.long[i])
box()
}
You can restrict the path to locations on land - birds are still able to fly over water, but stationary locations must fall on land. This makes sense for a terrestrial bird like the Ovenbird; it may not for other species. As always, consider the ecology of the species when conducting the analysis. Here, we will restrict the paths to the Ovenbird distribution.
The process is exactly the same as above, except that we now include a prior distribution so that only locations within the species' distribution are used. We first create a function to convert the distribution shapefile to a binary surface.
## Function to construct a binary mask from a shapefile (e.g. land/sea or species distribution)
distribution.mask <- function(xlim, ylim, n = 4, land = TRUE,shape) {
r <- raster(nrows = n * diff(ylim), ncols = n * diff(xlim), xmn = xlim[1],
xmx = xlim[2], ymn = ylim[1], ymx = ylim[2], crs = proj4string(shape))
r <- cover(rasterize(elide(shape, shift = c(-360, 0)), r, 1, silent = TRUE),
rasterize(shape, r, 1, silent = TRUE), rasterize(elide(shape,
shift = c(360, 0)), r, 1, silent = TRUE))
r <- as.matrix(is.na(r))[nrow(r):1, ]
if (land)
r <- !r
xbin <- seq(xlim[1], xlim[2], length = ncol(r) + 1)
ybin <- seq(ylim[1], ylim[2], length = nrow(r) + 1)
function(p) {
r[cbind(.bincode(p[, 2], ybin), .bincode(p[, 1], xbin))]
}
}
Next we read in the Ovenbird distribution shapefile and set the projection of the file.
OVENdist<-raster::shapefile("spatial_layers/seiu_auro_pl.shp")
WGS84<-"+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
crs(OVENdist)<-WGS84
Run the function created above on the Ovenbird’s distribution shapefile.
## Define mask for Ovenbird distribution
is.dist <- distribution.mask(shape=OVENdist, xlim = xlim, ylim = ylim, n = 4, land = T)
Now set the prior based on whether the location falls within the distribution or not.
## Define the log prior for x and z
log.prior <- function(p) {
f <- is.dist(p)
ifelse(f | is.na(f), 0, -10)
}
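As a quick sanity check (a minimal sketch), you can confirm that the capture locations fall inside the distribution - locations inside the distribution return 0 and locations outside are penalized with -10:
# Capture locations should return 0 (inside the Ovenbird distribution) #
log.prior(cbind(cap.long, cap.lat))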
Now we do the same process we did before:
1. creating the initial path
2. fine tuning the path
3. creating the final path
Here I only show the first run and the final product.
x0<-z0<-model<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
# Take the location estimates created above
x0[[i]]<- path[[i]]$x
# the model also needs the mid-points - generate those here
z0[[i]]<- trackMidpts(x0[[i]])
# Define the threshold model - similar to above #
model[[i]] <- thresholdModel(d.twl[[i]]$Twilight,d.twl[[i]]$Rise,
twilight.model="ModifiedLogNormal",
alpha=alpha[[i]],beta=beta,
# Here is where we will set the constraints for land/spp dist
logp.x = log.prior, logp.z = log.prior,
x0=x0[[i]],z0=z0[[i]],zenith=zenith1[i],fixedx=fixedx[[i]])
# Here you need to set the last few locations as "known" (fixed) locations. These are periods
# when you know the bird was at a specific location - the capture site, where the tag was deployed and later recovered.
model[[i]]$fixedx<-c(model[[i]]$fixedx,rep(FALSE,(dim(model[[i]]$x0)[1]-length(model[[i]]$fixedx))))
model[[i]]$fixedx[(length(model[[i]]$fixedx)-3):length(model[[i]]$fixedx)]<-TRUE
}
proposal.x<-proposal.z<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
proposal.x[[i]] <- mvnorm(S=diag(c(0.005,0.005)),n=nlocation(x0[[i]]))
proposal.z[[i]] <- mvnorm(S=diag(c(0.005,0.005)),n=nlocation(z0[[i]]))
}
fit<-vector('list',length(GLfiles))
for(i in 1:length(GLfiles)){
fit[[i]] <- estelleMetropolis(model[[i]],proposal.x[[i]],proposal.z[[i]],iters=200,thin=25,chains=2)
}
The figure below shows the posterior mean twilight times overlaid on the light data. Only the figure for the Ovenbird captured at Hubbard Brook Experimental Forest, NH is shown.
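The code for that figure isn't included here. One way to produce something similar is to predict twilights from the posterior mean positions and overlay them on the light image; a minimal sketch (the locationSummary column names and the use of zenith1 are assumptions, not the exact code used for the figure):
# Summarize the posterior positions (one row per twilight) and predict twilights from the mean positions #
s2 <- locationSummary(fit[[2]]$x, time = model[[2]]$time, collapse = TRUE)
pred.twl <- twilight(d.twl[[2]]$Twilight, lon = s2[, "Lon.mean"], lat = s2[, "Lat.mean"],
rise = d.twl[[2]]$Rise, zenith = zenith1[2])
lightImage(GLdata[[2]], offset = offset, zlim = c(0, 12))
tsimagePoints(pred.twl, offset = offset, pch = 16, cex = 0.5,
col = ifelse(d.twl[[2]]$Rise, "dodgerblue", "firebrick"))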
This is just a brief tutorial on how to analyze geolocator data with the SGAT package. There are many more things that can be done. Hopefully this tutorial 1) helps you better understand the analyses involved with geolocator data and 2) gives you a good jumping-off point for implementing these analyses in your own work.
Advanced Exercise:
Take a look at the posterior estimates for flight distance (beta) - do they fit our original model? Change the priors that you set in your original model - how do the locations differ? Is the final result sensitive to the priors? Constrain the paths using only the wrld_simpl shapefile (see the sketch below). Does the path differ from our current model using the species distribution map?
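To get started on the last question, here is a minimal sketch that builds a land mask from wrld_simpl with the same helper function and swaps it into the prior (the rest of the model code is unchanged):
## Define a land mask from wrld_simpl instead of the Ovenbird distribution
is.land <- distribution.mask(shape = wrld_simpl, xlim = xlim, ylim = ylim, n = 4, land = TRUE)
## Redefine the log prior so stationary locations are constrained to land
log.prior <- function(p) {
f <- is.land(p)
ifelse(f | is.na(f), 0, -10)
}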