Since it’s original release in 2019, data access in nhdplusTools has evolved and grown considerably. This vignette, prepared for a workshop in Summer 2024 shows the diversity of data available with nhdplusTools in one holistic overview.
The function naming in nhdplusTools
related to data
access uses three common terms:
1. “download” will generally download complete data sets for use
locally. 2. “get” will pull a data subset from a web service. 3.
“subset” will pull a subset from a local file and, when available, a web
service.
The following summarizes specific data access functions available for
the datasets that nhdplusTools
has supporting functionality
for.
download_nhdplusv2()
- downloads a National seamless
geodatabaseget_nhdplus()
- access NHDPlusV2 from a web
servicesubset_nhdplus()
- subsets the National geodatabase OR
the web servicesubset_rpu()
- subsets loaded NHDPlusV2 by raster
processing unitsubset_vpu()
- subsets loaded NHDPlusV2 by vector
processing unitget_boundaries()
- retrieves NHDPlusV2 processing unit
boundariesdownload_nhdplushr()
- downloads staged
geodatabasesget_nhdplushr()
- assembles nhdplushr datasets from
many staged databasesget_hr_data()
- helps pull data out of staged databases
(used by get_nhdplushr()
)download_nhd()
- downloads staged geodatabasesget_3dhp
- access 3DHP from a web serviceget_geoconnex_reference()
- access geoconnex reference
features from a web servicedownload_wbd()
- download a National WBD
geodatabaseget_huc()
- get a particular Hydrologic Unit Code from
a servicedownload_rf1()
- download a National RF1
geodatabaseThe Network Linked Data Index plays a key role in discovery
nhdplusTools
. It provides access to an easy network
navigation and basin boundary delineation tool. Behind the scenes, the
NLDI is based on the network of the NHDPlusV2 – so has some special
functionality related directly to that dataset. Functions that utilize
the NLDI include:
subset_nhdplus()
plot_nhdplus()
map_nhdplus()
discover_nhdplus_id()
get_nldi_basin()
get_nldi_feature()
get_nldi_index()
navigate_network()
navigate_nldi()
get_split_catchment()
get_raindrop_trace()
In the demo below, we’ll choose a stream gage and show how to access data related to it from all the datasets that nhdplusTools works with as well as use some of the NLDI functionality in question.
For the sake of demonstration, we’ll look at the Wolf River at Langlade, WI – site ID “04074950”.
nhdplusTools allows us to start from a stream gage to build a subset of NHDPlusV2 data. Below, we’ll do just that and plot the results on a default map.
demo_dir <- nhdplusTools_data_dir()
site_id <- "USGS-04074950"
# use dataRetrieval::get_nldi_sources() to find other nldi sources
site <- list(featureSource = "nwissite",
featureID = "USGS-04074950")
site_feature <- get_nldi_feature(site)
upstream_network <- navigate_nldi(site,
mode = "UT", distance_km = 9999)
demo_data <- file.path(demo_dir, "data_demo.gpkg")
dataset <- subset_nhdplus(as.integer(upstream_network$UT_flowlines$nhdplus_comid),
nhdplus_data = "download", # download from a service
output_file = demo_data, # write the data to disk
return_data = TRUE, # return the data rather
flowline_only = FALSE, overwrite = TRUE)
names(dataset)
sapply(dataset, nrow)
old_par <- par(mar = c(0, 0, 0, 0))
plot_nhdplus(outlets = list(featureSource = "nwissite",
featureID = "USGS-04074950"),
nhdplus_data = demo_data, flowline_only = TRUE)
The above is the original way NHDPlusTools supported access NHDPlusV2
data. A dedicated web service subset utility is available in
get_nhdplus
– which is what subset_nhdplus()
calls behind the scenes.
Here we grab the basin for our site and request NHDPlus with its geometry as the Area of Interest.
basin <- get_nldi_basin(site)
subset <- get_nhdplus(AOI = basin, realization = "flowline")
par(mar = c(0, 0, 0, 0))
plot(sf::st_geometry(basin))
plot(sf::st_geometry(subset), col = "blue", add = TRUE)
For NHDPlusHR data, which is much denser than NHDPlusV2,
nhdplusTools
supports downloading four-digit Hydrologic
Unit Code staged geodatabases. The function get_huc()
is
useful to discover the code needed here.
wolf_huc <- get_huc(basin, type = 'huc04')
nrow(wolf_huc)
# it straddles hucs? Not really.
par(mar = c(0, 0, 0, 0))
plot(sf::st_geometry(basin), col = "grey")
plot(sf::st_geometry(wolf_huc), add = TRUE)
wolf_huc <- get_huc(site_feature, type = "huc04")
nrow(wolf_huc)
# better!!
par(mar = c(0, 0, 0, 0))
plot(sf::st_geometry(wolf_huc))
plot(sf::st_geometry(basin), col = "grey", add = TRUE)
outdir <- file.path(nhdplusTools_data_dir(), "hr_access_demo")
dir.create(outdir)
download_dir <- download_nhdplushr(outdir, wolf_huc$huc4)
list.files(download_dir)
If we had asked for more HUC4 codes, additional gdb files would be in the directory we specified.
With this, we can use one of the two functions for access NHDPlusHR
data to load data from the directory. Here, we use the more complete
get_nhdplus()
and set check_terminals=TRUE
which uses make_standalone()
to ensure that the nhdplus
attributes are complete and self-consistent within the subset of data
returned.
nhdplushr <- get_nhdplushr(
download_dir,
layers = c("NHDFlowline", "NHDPlusCatchment", "NHDWaterbody",
"NHDArea", "NHDLine", "NHDPlusSink", "NHDPlusWall",
"NHDPoint", "NHDPlusBurnWaterbody", "NHDPlusBurnLineEvent",
"HYDRO_NET_Junctions", "WBDHU2", "WBDHU4","WBDHU6",
"WBDHU8", "WBDHU10", "WBDHU12", "WBDLine"),
check_terminals = TRUE)
sapply(nhdplushr, nrow)
At a lower level, we can use get_hr_data()
to access
particular layers. Here, rename=TRUE
causes the nhdplushr
names to be normalized to nhdplusTools
conventions using
align_nhdplus_names()
.
NOTE: “NHDPlusID” from nhdplushr is replaced with the name “COMID”. This attribute is merely a unique integer identifier and should not be assumed to relate to anything outside the context of a given dataset.
nhdplushr <- get_hr_data(list.files(download_dir, pattern = ".gdb", full.names = TRUE),
layer = "NHDFlowline", rename = TRUE)
names(nhdplushr)
# Great Lakes coast are flowlines -- remove for visuals
gl_coast <- c(get_DM(nhdplushr, 60002700000331),
get_DM(nhdplushr, 60002700049157))
plot_data <- dplyr::filter(nhdplushr, FCODE != 56600 & StreamOrde > 2 & !COMID %in% gl_coast)
par(mar = c(0, 0, 0, 0))
plot(sf::st_geometry(wolf_huc))
plot(sf::st_geometry(basin), col = "grey", add = TRUE)
plot(sf::st_geometry(plot_data), lwd = plot_data$StreamOrde / 3, col = "blue", add = TRUE)
par(mar = c(0, 0, 0, 0))
plot(sf::st_geometry(basin), col = "grey")
plot(sf::st_geometry(plot_data), lwd = plot_data$StreamOrde / 2, col = "blue", add = TRUE)
BONUS DEMO: Say we want to know where our stream gage is on the river in question…
get_flowline_index()
and
disambiguate_flowline_indexes()
are your friends.
potential_matches <- get_flowline_index(nhdplushr,
points = site_feature,
search_radius = units::as_units(500, "m"),
max_matches = 5)
potential_matches
site_meta <- dataRetrieval::readNWISsite(gsub("USGS-", "", site_feature$identifier))
sqmi_to_sqkm <- 2.59
da_df <- data.frame(id = 1, drainagearea = site_meta$drain_area_va * sqmi_to_sqkm)
# uses nearest drainage area match
disambiguate_flowline_indexes(potential_matches,
flowpath = dplyr::select(nhdplushr, COMID, TotDASqKM),
hydro_location = da_df)
If what you really need is the base NHD, which is now a static
dataset, the pattern is just the same as with nhdplushr using
download_nhd()
.
outdir <- file.path(nhdplusTools_data_dir(), "nhd_access_demo")
dir.create(outdir)
download_dir <- download_nhd(outdir, wolf_huc$huc4)
list.files(download_dir)
nhd_gdb <- list.files(download_dir, pattern = ".gdb", full.names = TRUE)
sf::st_layers(nhd_gdb)
nhd_fline <- sf::read_sf(nhd_gdb, "NHDFlowline")
We’ll wait to plot this up until after we’ve done some work with
3DHP. As of writing, the 3DHP is more or less the same as the NHD but in
a new database schema. The get_3dhp()
uses a web service to
pull data for subsets much the same as the other get_*
functions. See vignette("get_3dhp_data.Rmd")
for more on
how to work with 3DHP data.
sub_3dhp <- get_3dhp(basin, type = "flowline")
plot(sub_3dhp)
sub_3dhp <- st_compatibalize(sub_3dhp, nhd_fline)
par(mar = c(0, 0, 0, 0))
plot(sf::st_geometry(basin), col = "grey")
plot(sf::st_geometry(sf::st_zm(sub_3dhp)), col = "skyblue", lwd = 2.5, add = TRUE)
plot(sf::st_geometry(sf::st_zm(nhd_fline)), col = "blue", lwd = 0.1, add = TRUE)
A brand new, May 2024, feature in nhdplusTools is access to the
geoconnex.us reference feature server. The reference feature server
provides easy access to representations of datasets that people use to
cross reference other data. discover_geoconnex_reference()
provides access to a full table of what’s available.
Each of these sets of “reference features” includes a “uri” which is a persistent way to identify these features and will always lead you back to a representation of the feature when you look it up.
For example, let’s get a subset of data for our Wolf River basin.
wolf_gages <- get_geoconnex_reference(basin, type = "gages")
geoconnex_gage <- dplyr::filter(wolf_gages,
provider_id == gsub("USGS-", "", site_feature$identifier))
wolf_mainstems <- get_geoconnex_reference(basin, type = "mainstems")
wolf_mainstem <- dplyr::filter(wolf_mainstems, uri == geoconnex_gage$mainstem_uri)
par(mar = c(0, 0, 0, 0))
plot(sf::st_geometry(basin), col = "grey")
plot(sf::st_geometry(wolf_mainstems), lwd = 0.5, col = "blue", add = TRUE)
plot(sf::st_geometry(wolf_mainstem), lwd = 3, col = "blue", add = TRUE)
plot(sf::st_geometry(wolf_gages), pch = 2, cex = 0.75, add = TRUE)
plot(sf::st_geometry(geoconnex_gage), pch = 17, cex = 1.5, add = TRUE)
The Watershed Boundary Dataset contains hydrologic units that are
used as cataloging units for the country. It was developed and improved
over the last twenty years and is nearing a complete static state. There
are three prominent versions of it available – a snapshot available as
part of the NHDPlusV2, a snapshot available as part of and retrieved
with get_huc()
, and the latest (soon to be final) snapshot
available using download_wbd()
.
wbd_dir <- file.path(nhdplusTools_data_dir(), "wbd_access_demo")
wbd_out <- download_wbd(wbd_dir)
# zip::unzip doesn't always work
if(length(wbd_out == 0)) {
f <- list.files(wbd_dir, pattern = ".zip", full.names = TRUE)
utils::unzip(f, exdir = wbd_dir)
}
wbd_gdb <- list.files(wbd_dir, pattern = ".gdb", full.names = TRUE)
sf::st_layers(wbd_gdb)