A Unified Interface to SCAN/SNOTEL Data
D.E. Beaudette and J. Skovlin
2017-03-16
Introduction
This document demonstrates how to use the soilDB package to download climate data from the SCAN/SNOTEL network. Stay tuned for updates and more detailed examples.
Note
This interface is very much a work in progress. There are some SCAN / SNOTEL site with multiple (above-ground) sensors per sensor prefix, which can lead to confusing results. For now the first above-ground sensor is selected for each above-ground sensor prefix: c('TAVG', 'PRCP', 'PREC', 'SNWD', 'WTEQ', 'WDIRV', 'WSPDV', 'LRADT').
There are some sites with multiple below-ground sensors installed at the same depth. A message is printed when this happens. You can use the sensor.id column to access specific sensors. We are working on a solution.
Installation
With a recent version of R (>= 2.15), it is possible to get all of the packages that this tutorial depends on via:
# run these commands in the R console
install.packages('latticeExtra', dep=TRUE)
install.packages('plyr', dep=TRUE)
install.packages('rvest', dep=TRUE)
install.packages('httr', dep=TRUE)
install.packages('reshape2', dep=TRUE)
install.packages('soilDB', dep=TRUE)
You will also need the latest version of soilDB and aqp:
install.packages('devtools', dep=TRUE)
devtools::install_github("ncss-tech/aqp", dependencies=FALSE, upgrade_dependencies=FALSE)
devtools::install_github("ncss-tech/soilDB", dependencies=FALSE, upgrade_dependencies=FALSE)
Quick Example: getting sensor metadata
Each SCAN / SNOTEL station may have a different set of sensors.
# load required packages
library(aqp)
library(soilDB)
library(reshape2)
library(plyr)
library(latticeExtra)
# get basic sensor metadata for several SCAN/SNOTEL sites
m <- SCAN_sensor_metadata(site.code=c(2072,356,2148,2187))
# print a list of sensor types by station
dlply(m, 'site.code', function(i) unique(i$Ecode))
## $`356`
## [1] "WTEQ" "PREC" "TOBS" "TMAX" "TMIN" "TAVG" "SNWD" "SMS" "STO" "SAL" "RDC" "BATT"
##
## $`2072`
## [1] "PREC" "TOBS" "TMAX" "TMIN" "TAVG" "PRCP" "SMS" "STO" "SAL" "RDC" "BATT" "WDIRV"
## [13] "WSPDV" "RHUM" "RHENC" "LRADT"
##
## $`2148`
## [1] "PREC" "TOBS" "TMAX" "TMIN" "TAVG" "PRCP" "SMS" "STO" "SAL" "RDC" "BATT" "WDIRV"
## [13] "WSPDV" "RHUM" "RHENC"
##
## $`2187`
## [1] "PREC" "TOBS" "TMAX" "TMIN" "TAVG" "PRCP" "SMS" "STO" "SAL" "RDC" "BATT" "WDIRV"
## [13] "WSPDV" "RHUM" "RHENC" "LRADT"
##
## attr(,"split_type")
## [1] "data.frame"
## attr(,"split_labels")
## site.code
## 1 356
## 2 2072
## 3 2148
## 4 2187
Getting Data from a Single Station
Lets get some recent (daily) climate data from the Rogers Farm #1 SCAN station; site number 2001. The fetchSCAN() function does all of the work. Notice that the resulting object is a list, each element is a suite of sensor data. For example, the SMS element contains soil moisture from several depths
# fetchSCAN can accept multiple years
x <- fetchSCAN(site.code=2001, year=c(2013,2014,2015,2016))
# check the results
str(x, 1)
## List of 11
## $ SMS :'data.frame': 7325 obs. of 5 variables:
## $ STO :'data.frame': 7325 obs. of 5 variables:
## $ SAL :'data.frame': 0 obs. of 0 variables
## $ TAVG :'data.frame': 1465 obs. of 5 variables:
## $ PRCP :'data.frame': 1465 obs. of 5 variables:
## $ PREC :'data.frame': 1465 obs. of 5 variables:
## $ SNWD :'data.frame': 0 obs. of 0 variables
## $ WTEQ :'data.frame': 0 obs. of 0 variables
## $ WDIRV:'data.frame': 0 obs. of 0 variables
## $ WSPDV:'data.frame': 0 obs. of 0 variables
## $ LRADT:'data.frame': 0 obs. of 0 variables
# tabulate number of soil moisture measurements per depth (cm)
table(x$SMS$depth)
| 5 | 10 | 20 | 51 | 102 |
|---|---|---|---|---|
| 1465 | 1465 | 1465 | 1465 | 1465 |
# get unique set of soil moisture sensor depths
sensor.depths <- unique(x$SMS$depth)
# generate a better axis for dates
date.axis <- seq.Date(as.Date(min(x$SMS$Date)), as.Date(max(x$SMS$Date)), by='3 months')
# plot soil moisture time series, panels are depth
xyplot(value ~ Date | factor(depth), data=x$SMS, as.table=TRUE, type=c('l','g'), strip=strip.custom(bg=grey(0.80)), layout=c(1,length(sensor.depths)), scales=list(alternating=3, x=list(at=date.axis, format="%b\n%Y")), ylab='Volumetric Water Content', main='Soil Moisture at Several Depths')
Getting Data from Multiple Stations
That was interesting, but most of the time we want to make comparisons between stations. Note that the format of data returned makes it possible "stack" data from several stations that do not share the same collection of sensors.
# get more data
x <- fetchSCAN(site.code=c(356, 2072), year=c(2015, 2016))
# same format as before, sensor data are "stacked" within each list element
str(x, 1)
## List of 11
## $ SMS :'data.frame': 5864 obs. of 5 variables:
## $ STO :'data.frame': 5864 obs. of 5 variables:
## $ SAL :'data.frame': 0 obs. of 0 variables
## $ TAVG :'data.frame': 1466 obs. of 5 variables:
## $ PRCP :'data.frame': 733 obs. of 5 variables:
## $ PREC :'data.frame': 1466 obs. of 5 variables:
## $ SNWD :'data.frame': 733 obs. of 5 variables:
## $ WTEQ :'data.frame': 733 obs. of 5 variables:
## $ WDIRV:'data.frame': 0 obs. of 0 variables
## $ WSPDV:'data.frame': 0 obs. of 0 variables
## $ LRADT:'data.frame': 0 obs. of 0 variables
# deeper look
str(x$SMS)
## 'data.frame': 5864 obs. of 5 variables:
## $ Site : int 356 356 356 356 356 356 356 356 356 356 ...
## $ Date : Date, format: "2015-01-01" "2015-01-02" "2015-01-03" ...
## $ value : num 0 0 0 0 0 0 0 0 0 0 ...
## $ depth : num 5 5 5 5 5 5 5 5 5 5 ...
## $ sensor.id: Factor w/ 5 levels "SMS.I_2","SMS.I_8",..: 1 1 1 1 1 1 1 1 1 1 ...
# get unique set of soil moisture sensor depths
sensor.depths <- unique(x$SMS$depth)
# generate a better axis for dates
date.axis <- seq.Date(as.Date(min(x$SMS$Date)), as.Date(max(x$SMS$Date)), by='3 months')
# plot soil moisture time series, panels are depth
xyplot(value ~ Date | factor(depth), groups=factor(Site), data=x$SMS, as.table=TRUE, type=c('l','g'), auto.key=list(columns=2, lines=TRUE, points=FALSE), strip=strip.custom(bg=grey(0.80)), layout=c(1,length(sensor.depths)), scales=list(alternating=3, x=list(at=date.axis, format="%b\n%Y")), ylab='Volumetric Water Content', main='Soil Moisture at Several Depths')
Example Figures
# combine sensors
g <- make.groups('soil moisture'=x$SMS, 'soil temperature'=x$STO)
# get unique set of soil moisture sensor depths
sensor.depths <- unique(g$depth)
# generate a better axis for dates
date.axis <- seq.Date(as.Date(min(g$Date)), as.Date(max(g$Date)), by='3 months')
# plot soil moisture time series, panels are depth
p <- xyplot(value ~ Date | factor(Site) + which, groups=factor(depth), data=g, as.table=TRUE, type=c('l','g'), auto.key=list(columns=length(sensor.depths), lines=TRUE, points=FALSE), strip=strip.custom(bg=grey(0.80)), scales=list(alternating=3, x=list(at=date.axis, format="%b\n%Y"), y=list(relation='free', rot=0)), ylab='', main='Soil Moisture and Temperature at Several Depths')
useOuterStrips(p)
# plot a single suite of sensors
# compare sites / depths
# no SAL sensors for these sites
# xyplot(value ~ Date | factor(depth), groups=factor(Site), data=x$SAL, as.table=TRUE, type=c('l','g'), auto.key=list(title='site', columns=2, lines=TRUE, points=FALSE), layout=c(1,length(sensor.depths)), scales=list(alternating=3, y=list(relation='free', rot=0)))
xyplot(value ~ Date, groups=factor(Site), data=x$TAVG, as.table=TRUE, type=c('l','g'), auto.key=list(title='site', columns=2, lines=TRUE, points=FALSE), scales=list(alternating=3, x=list(at=date.axis, format="%b\n%Y")), main='Average Air Temperature by Site')
xyplot(value ~ Date, groups=factor(Site), data=x$PRCP, as.table=TRUE, type=c('l','g'), auto.key=list(title='site', columns=2, lines=TRUE, points=FALSE), scales=list(alternating=3, x=list(at=date.axis, format="%b\n%Y")), main='Daily Precipitation by Site')
xyplot(value ~ Date, groups=factor(Site), data=x$PREC, as.table=TRUE, type=c('l','g'), auto.key=list(title='site', columns=2, lines=TRUE, points=FALSE), scales=list(alternating=3, x=list(at=date.axis, format="%b\n%Y")), main='Accumulated Precipitation by Site')
xyplot(value ~ Date, groups=factor(Site), data=x$WTEQ, as.table=TRUE, type=c('l','g'), auto.key=list(title='site', columns=2, lines=TRUE, points=FALSE), scales=list(alternating=3, x=list(at=date.axis, format="%b\n%Y")), main='Daily Snow Water Equivalent by Site')
xyplot(value ~ Date, groups=factor(Site), data=x$SNWD, as.table=TRUE, type=c('l','g'), auto.key=list(title='site', columns=2, lines=TRUE, points=FALSE), scales=list(alternating=3, x=list(at=date.axis, format="%b\n%Y")), main='Daily Snow Depth (in) by Site')
Case Study: Pine Nut SCAN Site
Get SCAN data from the Pine Nut site(2144) for years 2008 through 2015. Note that this site has multiple soil temperature sensors installed at each depth.
x <- fetchSCAN(site.code=c(2144), year=2008:2015)
# check available data
str(x, 1)
## List of 11
## $ SMS :'data.frame': 14650 obs. of 5 variables:
## $ STO :'data.frame': 29300 obs. of 5 variables:
## $ SAL :'data.frame': 0 obs. of 0 variables
## $ TAVG :'data.frame': 2930 obs. of 5 variables:
## $ PRCP :'data.frame': 2930 obs. of 5 variables:
## $ PREC :'data.frame': 2930 obs. of 5 variables:
## $ SNWD :'data.frame': 1465 obs. of 5 variables:
## $ WTEQ :'data.frame': 0 obs. of 0 variables
## $ WDIRV:'data.frame': 0 obs. of 0 variables
## $ WSPDV:'data.frame': 0 obs. of 0 variables
## $ LRADT:'data.frame': 0 obs. of 0 variables
## still working on a solution for this issue: https://github.com/ncss-tech/soilDB/issues/14
# ! multiple soil temperature sensors installed at each depth
table(sensor.id=x$STO$sensor.id, sensor.depth=x$STO$depth)
| sensor.id/sensor.depth | 5 | 10 | 20 | 51 | 102 |
|---|---|---|---|---|---|
| STO.I_2 | 2930 | 0 | 0 | 0 | 0 |
| STO.I_4 | 0 | 2930 | 0 | 0 | 0 |
| STO.I_8 | 0 | 0 | 2930 | 0 | 0 |
| STO.I_20 | 0 | 0 | 0 | 2930 | 0 |
| STO.I_40 | 0 | 0 | 0 | 0 | 2930 |
| STO.I-2_2 | 2930 | 0 | 0 | 0 | 0 |
| STO.I-2_4 | 0 | 2930 | 0 | 0 | 0 |
| STO.I-2_8 | 0 | 0 | 2930 | 0 | 0 |
| STO.I-2_20 | 0 | 0 | 0 | 2930 | 0 |
| STO.I-2_40 | 0 | 0 | 0 | 0 | 2930 |
# extract SMS data to a new data.frame
sms <- x$SMS
# add parts of date for plotting
sms$year <- format(sms$Date, "%Y")
sms$doy <- as.numeric(format(sms$Date, "%j"))
# cast long -> wide format for later use
sms.wide <- dcast(sms, Site + Date ~ depth, value.var = 'value', mean, na.rm=TRUE)
# check:
head(sms)
| Site | Date | value | depth | sensor.id | year | doy |
|---|---|---|---|---|---|---|
| 2144 | 2008-01-01 | 2.1 | 5 | SMS.I_2 | 2008 | 1 |
| 2144 | 2008-01-02 | 1.8 | 5 | SMS.I_2 | 2008 | 2 |
| 2144 | 2008-01-03 | 2.6 | 5 | SMS.I_2 | 2008 | 3 |
| 2144 | 2008-01-04 | 3.3 | 5 | SMS.I_2 | 2008 | 4 |
| 2144 | 2008-01-05 | 5.1 | 5 | SMS.I_2 | 2008 | 5 |
| 2144 | 2008-01-06 | 12.1 | 5 | SMS.I_2 | 2008 | 6 |
head(sms.wide)
| Site | Date | 5 | 10 | 20 | 51 | 102 |
|---|---|---|---|---|---|---|
| 2144 | 2008-01-01 | 2.1 | 4.4 | 8.8 | 9.2 | 2.0 |
| 2144 | 2008-01-02 | 1.8 | 4.1 | 8.1 | 9.3 | 1.6 |
| 2144 | 2008-01-03 | 2.6 | 5.2 | 8.2 | 9.2 | 1.6 |
| 2144 | 2008-01-04 | 3.3 | 6.2 | 8.2 | 9.0 | NaN |
| 2144 | 2008-01-05 | 5.1 | 7.7 | 8.2 | 9.1 | 2.0 |
| 2144 | 2008-01-06 | 12.1 | 18.8 | 12.7 | 9.1 | 2.1 |
# make some pretty colors for plotting daily VWC
cols <- colorRampPalette(brewer.pal(11, 'RdYlBu'), space='Lab', interpolate='spline', bias=2)
# plot VWC by year/day, panels by depth
levelplot(value ~ doy * factor(year) | factor(depth), main='Daily Mean VWC (%)',
data=sms, layout=c(1,5), col.regions=cols,
colorkey=list(space='top'), as.table=TRUE, scales=list(alternating=3, cex=0.75, x=list(tick.number=20)),
par.strip.text=list(cex=0.85), strip=strip.custom(bg='grey'),
xlab='Julian Day', ylab='Year')
# plot VWC by year/day, panels by year with depths stacked
levelplot(value ~ doy * rev(factor(depth)) | factor(year), main='Daily Mean VWC (%)',
data=sms, layout=c(1,8), col.regions=cols,
colorkey=list(space='top'), as.table=TRUE, scales=list(alternating=3, cex=0.75, x=list(tick.number=20)),
par.strip.text=list(cex=0.85), strip=strip.custom(bg='grey'),
xlab='Julian Day', ylab='Sensor Depth (cm)', ylim=rev(levels(factor(sms$depth))))
Water Retention Curve Development
Get KSSL data for this site, via user pedon ID. This will include estimated parameters for the van Genuchten model which can be used to derive a water retention curve.
Previously the VG parameters had to be obtained manually from these reports.
# get KSSL data
s <- fetchKSSL(pedon_id = 'S08NV003003')
# VG parameters now returned by fetchKSSL() as of 2016-11-17
(horizons(s)[, c('hzn_desgn', 'hzn_top', 'hzn_bot', 'theta_r', 'theta_s', 'alpha', 'npar')])
| hzn_desgn | hzn_top | hzn_bot | theta_r | theta_s | alpha | npar |
|---|---|---|---|---|---|---|
| A | 0 | 9 | 0.0337216 | 0.4864061 | -1.581517 | 0.1227247 |
| ABk | 9 | 31 | 0.0445697 | 0.4701070 | -1.605478 | 0.1267858 |
| Btk1 | 31 | 48 | 0.0516780 | 0.4647200 | -1.795169 | 0.1317818 |
| Btk2 | 48 | 83 | 0.0562656 | 0.4179194 | -2.058838 | 0.1377132 |
| 2Bk | 83 | 152 | NA | NA | NA | NA |
Conversion of VWC to Matric Potential: Single Depth
Generate water retention curve for sensor at 20cm. This requires extracting VG parameters from the KSSL data from the associated horizon. See ?KSSL_VG_model or this tutorial for details.
# get VG parameters for sensor at 20cm depth
s.vg.hz <- slice(s, 20 ~ theta_r + theta_s + alpha + npar, just.the.data = TRUE)
# get VG curve and inverse function
# this is based on several assumptions about the source data... details pending
vg.hz <- KSSL_VG_model(VG_params = s.vg.hz)
# check: looks good
p.model <- xyplot(phi ~ theta, data=vg.hz$VG_curve, type=c('l', 'g'), scales=list(alternating=3, x=list(tick.number=10), y=list(log=10, tick.number=10)), yscale.components=yscale.components.logpower, ylab=expression("Suction " (kPa)), xlab=expression("Volumetric Water Content " (cm^3/cm^3)), par.settings = list(plot.line=list(col='RoyalBlue', lwd=2)))
update(p.model, main='Estimated Water Retention Curve at 20cm', sub='van Genuchten Model Parameters fit by USDA-ARS Rosetta')
Lets add some points on the water retention curve from the measured VWC values.
# add quantiles from measured values
sms.q <- data.frame(theta=quantile(sms.wide$`20`/100, probs=c(0, 0.05, 0.25, 0.5, 0.75, 0.95, 1), na.rm=TRUE))
sms.q$phi <- vg.hz$VG_inverse_function(sms.q$theta)
p.points <- xyplot(phi ~ theta, data=sms.q, type='p', par.settings = list(plot.symbol=list(col='orange', pch=16, cex=1.25)), scales=list(alternating=3, x=list(tick.number=10), y=list(log=10, tick.number=10)), yscale.components=yscale.components.logpower, ylab=expression("Suction " (kPa)), xlab=expression("Volumetric Water Content " (cm^3/cm^3)), panel=function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.text(x, y, row.names(sms.q), adj=c(0,-0.75), cex=0.65)
})
update(p.model + p.points, main='Select Percentiles of Data Measured at 20cm', sub='van Genuchten Model Parameters fit by USDA-ARS Rosetta')
Conversion of VWC to Matric Potential: All Depths
This time we will iterate over all sensors, develop water retention curves, and convert VWV to water potential values. Note that this approach will only work with a SoilProfileCollection object that contains a single profile.
# split soil moisture data by sensor depth
sms.list <- split(sms, sms$depth)
# convert VWC -> phi (KPa)
sms.phi <- ldply(sms.list, function(i) {
# i is a DF associated with a single sensor depth
this.depth <- unique(i$depth)
# get VG params for this depth
fm <- as.formula(paste0(this.depth, ' ~ theta_r + theta_s + alpha + npar'))
s.vg.hz <- slice(s, fm, just.the.data = TRUE)
# get VG curve and inverse function
# this is based on several assumptions about the source data...
vg.hz <- KSSL_VG_model(VG_params = s.vg.hz)
# estimate phi
if(!is.null(vg.hz$VG_inverse_function))
i$phi <- vg.hz$VG_inverse_function(i$value / 100.0)
else
i$phi <- NA
# done
return(i)
})
# classify each day as <= 1500 kPa (15 bar)
sms.phi$phi_leq_15_bar <- factor(sms.phi$phi <= 1500, levels=c('FALSE', 'TRUE'))
# new color palette
cols.phi <- colorRampPalette(rev(brewer.pal(11, 'RdYlBu')), space='Lab', interpolate='spline', bias=1.5)
# plot daily mean water potential in log10-kPa
levelplot(log(phi, base=10) ~ doy * factor(year) | factor(depth), main='Daily Mean Water Potential log10(-kPa)',
data=sms.phi, layout=c(1,5), col.regions=cols.phi,
colorkey=list(space='top'), as.table=TRUE, scales=list(alternating=3, cex=0.75, x=list(tick.number=20)),
par.strip.text=list(cex=0.85), strip=strip.custom(bg='grey'),
xlab='Julian Day', ylab='Year')
# plot daily mean water potential in log10-kPa by year and sensor stack
levelplot(log(phi, base=10) ~ doy * rev(factor(depth)) | factor(year), main='Daily Mean Water Potential log10(-kPa)',
data=sms.phi, layout=c(1,8), col.regions=cols.phi,
colorkey=list(space='top'), as.table=TRUE, scales=list(alternating=3, cex=0.75, x=list(tick.number=20)),
par.strip.text=list(cex=0.85), strip=strip.custom(bg='grey'),
xlab='Julian Day', ylab='Sensor Depth (cm)', ylim=rev(levels(factor(sms.phi$depth))))
levelplot(phi_leq_15_bar ~ doy * factor(year) | factor(depth), main='Daily Mean Water Potential <= 15 bar (1500 kPa)',
data=sms.phi, layout=c(1,5), col.regions=c(grey(0.85), 'RoyalBlue'), cuts=1,
colorkey=FALSE, as.table=TRUE, scales=list(alternating=3, cex=0.75, x=list(tick.number=20)),
par.strip.text=list(cex=0.85), strip=strip.custom(bg='grey'),
xlab='Julian Day', ylab='Year')
This document is based on aqp version 1.10 and soilDB version 1.8.1.
This document is based on
aqp version and
soilDB version .