Title: | 'NetCDF' Geometry and Time Series |
---|---|
Description: | Tools to create time series and geometry 'NetCDF' files. |
Authors: | David Blodgett [aut, cre], Luke Winslow [ctb] |
Maintainer: | David Blodgett <[email protected]> |
License: | CC0 |
Version: | 1.3.0 |
Built: | 2024-11-19 20:23:50 UTC |
Source: | https://github.com/doi-usgs/ncdfgeom |
Returns the fractional percent of each feature in x that is covered by each intersecting feature in y. These can be used as the weights in an area-weighted mean overlay analysis where x is the data **source** and area- weighted means are being generated for the **target**, y.
This function is a lightwieght wrapper around the functions aw_intersect aw_total and aw_weight from the areal package.
calculate_area_intersection_weights(x, y, normalize, allow_lonlat = FALSE)
calculate_area_intersection_weights(x, y, normalize, allow_lonlat = FALSE)
x |
sf data.frame source features including one geometry column and one identifier column |
y |
sf data.frame target features including one geometry column and one identifier column |
normalize |
logical return normalized weights or not. Normalized weights express the fraction of **target** polygons covered by a portion of each **source** polygon. They are normalized in that the area of each **source** polygon has already been factored into the weight. Un-normalized weights express the fraction of **source** polygons covered by a portion of each **target** polygon. This is a more general form that requires knowledge of the area of each **source** polygon to derive area-weighted statistics from **source** to **target. See details and examples for more regarding this distinction. |
allow_lonlat |
boolean If FALSE (the default) lon/lat target features are not allowed. Intersections in lon/lat are generally not valid and problematic at the international date line. |
Two versions of weights are available:
'normalize = FALSE', if a polygon from x (source) is entirely within a polygon in y (target), w will be 1. If a polygon from x (source) is 50 and 50 in each. Weights will sum to 1 per **SOURCE** polygon if the target polygons fully cover that feature.
For 'normalize = FALSE' the area weighted mean calculation must include the area of each x (source) polygon as in:
> *in this case, 'area' is the area of source polygons and you would do this operation grouped by target polygon id.*
> 'sum( (val * w * area), na.rm = TRUE ) / sum(w * area)'
If 'normalize = TRUE', weights are divided by the target polygon area such that weights sum to 1 per TARGET polygon if the target polygon is fully covered by source polygons.
For 'normalize = FALSE' the area weighted mean calculation no area is required as in:
> 'sum( (val * w), na.rm = TRUE ) / sum(w)'
See examples for illustration of these two modes.
data.frame containing fraction of each feature in x that is covered by each feature in y.
library(sf) source <- st_sf(source_id = c(1, 2), val = c(10, 20), geom = st_as_sfc(c( "POLYGON ((0.2 1.2, 1.8 1.2, 1.8 2.8, 0.2 2.8, 0.2 1.2))", "POLYGON ((-1.96 1.04, -0.04 1.04, -0.04 2.96, -1.96 2.96, -1.96 1.04))"))) source$area <- as.numeric(st_area(source)) target <- st_sf(target_id = "a", geom = st_as_sfc("POLYGON ((-1.2 1, 0.8 1, 0.8 3, -1.2 3, -1.2 1))")) plot(source['val'], reset = FALSE) plot(st_geometry(target), add = TRUE) (w <- calculate_area_intersection_weights(source[c("source_id", "geom")], target[c("target_id", "geom")], normalize = FALSE, allow_lonlat = TRUE)) (res <- merge(st_drop_geometry(source), w, by = "source_id")) sum(res$val * res$w * res$area) / sum(res$w * res$area) (w <- calculate_area_intersection_weights(source[c("source_id", "geom")], target[c("target_id", "geom")], normalize = TRUE, allow_lonlat = TRUE)) (res <- merge(st_drop_geometry(source), w, by = "source_id")) sum(res$val * res$w) / sum(res$w)
library(sf) source <- st_sf(source_id = c(1, 2), val = c(10, 20), geom = st_as_sfc(c( "POLYGON ((0.2 1.2, 1.8 1.2, 1.8 2.8, 0.2 2.8, 0.2 1.2))", "POLYGON ((-1.96 1.04, -0.04 1.04, -0.04 2.96, -1.96 2.96, -1.96 1.04))"))) source$area <- as.numeric(st_area(source)) target <- st_sf(target_id = "a", geom = st_as_sfc("POLYGON ((-1.2 1, 0.8 1, 0.8 3, -1.2 3, -1.2 1))")) plot(source['val'], reset = FALSE) plot(st_geometry(target), add = TRUE) (w <- calculate_area_intersection_weights(source[c("source_id", "geom")], target[c("target_id", "geom")], normalize = FALSE, allow_lonlat = TRUE)) (res <- merge(st_drop_geometry(source), w, by = "source_id")) sum(res$val * res$w * res$area) / sum(res$w * res$area) (w <- calculate_area_intersection_weights(source[c("source_id", "geom")], target[c("target_id", "geom")], normalize = TRUE, allow_lonlat = TRUE)) (res <- merge(st_drop_geometry(source), w, by = "source_id")) sum(res$val * res$w) / sum(res$w)
Creates cell geometry from vectors of X and Y positions.
create_cell_geometry( X_coords, Y_coords, prj, geom = NULL, buffer_dist = 0, regularize = FALSE, eps = 1e-10 )
create_cell_geometry( X_coords, Y_coords, prj, geom = NULL, buffer_dist = 0, regularize = FALSE, eps = 1e-10 )
X_coords |
numeric center positions of X axis indices |
Y_coords |
numeric center positions of Y axis indices |
prj |
character proj4 string for x and y |
geom |
sf data.frame with geometry that cell geometry should cover |
buffer_dist |
numeric a distance to buffer the cell geometry in units of geom projection |
regularize |
boolean if TRUE, grid spacing will be adjusted to be exactly equal. Only applies to 1-d coordinates. |
eps |
numeric sets tolerance for grid regularity. |
Intersection is performed with cell centers then geometry is constructed. A buffer may be required to fully cover geometry with cells.
dir <- tempdir() ncf <- file.path(dir, "metdata.nc") try(zip::unzip(system.file("extdata/metdata.zip", package = "ncdfgeom"), exdir = dir)) if(file.exists(ncf)) { nc <- RNetCDF::open.nc(ncf) ncmeta::nc_vars(nc) variable_name <- "precipitation_amount" cv <- ncmeta::nc_coord_var(nc, variable_name) x <- RNetCDF::var.get.nc(nc, cv$X, unpack = TRUE) y <- RNetCDF::var.get.nc(nc, cv$Y, unpack = TRUE) prj <- ncmeta::nc_gm_to_prj(ncmeta::nc_grid_mapping_atts(nc)) geom <- sf::read_sf(system.file("shape/nc.shp", package = "sf")) geom <- sf::st_transform(geom, 5070) cell_geometry <- create_cell_geometry(x, y, prj, geom, 0) plot(sf::st_geometry(cell_geometry), lwd = 0.25) plot(sf::st_transform(sf::st_geometry(geom), prj), add = TRUE) }
dir <- tempdir() ncf <- file.path(dir, "metdata.nc") try(zip::unzip(system.file("extdata/metdata.zip", package = "ncdfgeom"), exdir = dir)) if(file.exists(ncf)) { nc <- RNetCDF::open.nc(ncf) ncmeta::nc_vars(nc) variable_name <- "precipitation_amount" cv <- ncmeta::nc_coord_var(nc, variable_name) x <- RNetCDF::var.get.nc(nc, cv$X, unpack = TRUE) y <- RNetCDF::var.get.nc(nc, cv$Y, unpack = TRUE) prj <- ncmeta::nc_gm_to_prj(ncmeta::nc_grid_mapping_atts(nc)) geom <- sf::read_sf(system.file("shape/nc.shp", package = "sf")) geom <- sf::st_transform(geom, 5070) cell_geometry <- create_cell_geometry(x, y, prj, geom, 0) plot(sf::st_geometry(cell_geometry), lwd = 0.25) plot(sf::st_transform(sf::st_geometry(geom), prj), add = TRUE) }
Gets attribute data from a NetCDF-DSG file and returns it in a data.frame
.
This function is intended as a convenience to be used within workflows where
the netCDF file is already open and well understood.
read_attribute_data(nc, instance_dim)
read_attribute_data(nc, instance_dim)
nc |
A NetCDF path or urlto be opened. |
instance_dim |
The NetCDF instance/station dimension. |
hucPolygons <- sf::read_sf(system.file('extdata','example_huc_eta.json', package = 'ncdfgeom')) hucPolygons_nc <- ncdfgeom::write_geometry(tempfile(), hucPolygons) read_attribute_data(hucPolygons_nc, "instance")
hucPolygons <- sf::read_sf(system.file('extdata','example_huc_eta.json', package = 'ncdfgeom')) hucPolygons_nc <- ncdfgeom::write_geometry(tempfile(), hucPolygons) read_attribute_data(hucPolygons_nc, "instance")
Attempts to convert a NetCDF-CF DSG Simple Geometry file into a sf data.frame.
read_geometry(nc_file)
read_geometry(nc_file)
nc_file |
character file path to the nc file to be read. |
sf data.frame
containing spatial geometry of type found in the NetCDF-CF DSG file.
http://cfconventions.org/index.html
huc_eta_nc <- tempfile() file.copy(system.file('extdata','example_huc_eta.nc', package = 'ncdfgeom'), huc_eta_nc, overwrite = TRUE) vars <- ncmeta::nc_vars(huc_eta_nc) hucPolygons <- sf::read_sf(system.file('extdata','example_huc_eta.json', package = 'ncdfgeom')) plot(sf::st_geometry(hucPolygons)) names(hucPolygons) hucPolygons_nc <- ncdfgeom::write_geometry(nc_file=huc_eta_nc, geom_data = hucPolygons, instance_dim_name = "station", variables = vars$name) huc_poly <- read_geometry(huc_eta_nc) plot(sf::st_geometry(huc_poly)) names(huc_poly)
huc_eta_nc <- tempfile() file.copy(system.file('extdata','example_huc_eta.nc', package = 'ncdfgeom'), huc_eta_nc, overwrite = TRUE) vars <- ncmeta::nc_vars(huc_eta_nc) hucPolygons <- sf::read_sf(system.file('extdata','example_huc_eta.json', package = 'ncdfgeom')) plot(sf::st_geometry(hucPolygons)) names(hucPolygons) hucPolygons_nc <- ncdfgeom::write_geometry(nc_file=huc_eta_nc, geom_data = hucPolygons, instance_dim_name = "station", variables = vars$name) huc_poly <- read_geometry(huc_eta_nc) plot(sf::st_geometry(huc_poly)) names(huc_poly)
This function reads a timeseries discrete sampling geometry NetCDF file and returns a list containing the file's contents.
read_timeseries_dsg(nc_file, read_data = TRUE)
read_timeseries_dsg(nc_file, read_data = TRUE)
nc_file |
character file path to the nc file to be read. |
read_data |
logical whether to read metadata only or not. |
The current implementation checks several NetCDF-CF specific conventions prior to attempting to read the file. The Conventions and featureType global attributes are checked but not strictly required.
Variables with standard_name and/or cf_role of station_id and/or timeseries_id are searched for to indicate which variable is the 'timeseries identifier'. The function stops if one is not found.
All variables are introspected for a coordinates attribute. This attribute is used to determine which variables are coordinate variables. If none are found an attempt to infer data variables by time and timeseries_id dimensions is made.
The coordinates variables are introspected and their standard_names used to determine which coordinate they are. Lat, lon, and time are required, height is not.
Variables with a coordinates attribute are assumed to be the 'data variables'.
Data variables are traversed and their metadata and data content put into lists within the main response list.
See the timeseries vignette for more information.
list containing the contents of the NetCDF file.
https://www.unidata.ucar.edu/software/netcdf-java/v4.6/reference/FeatureDatasets/CFpointImplement.html
Creates a NetCDF file with an instance dimension, and any attributes from a data frame. Use to create the start of a NetCDF-DSG file. One character length dimension is created long enough to contain the longest provided character string. This function does not implement any CF convention attributes or standard names. Any columns of class date will be converted to character.
write_attribute_data( nc_file, att_data, instance_dim_name = "instance", units = rep("unknown", ncol(att_data)), overwrite = FALSE )
write_attribute_data( nc_file, att_data, instance_dim_name = "instance", units = rep("unknown", ncol(att_data)), overwrite = FALSE )
nc_file |
|
att_data |
|
instance_dim_name |
|
units |
|
overwrite |
boolean overwrite existing file? Will append if FALSE. |
sample_data <- sf::st_set_geometry(sf::read_sf(system.file("shape/nc.shp", package = "sf")), NULL) example_file <-write_attribute_data(tempfile(), sample_data, units = rep("unknown", ncol(sample_data))) try({ ncdump <- system(paste("ncdump -h", example_file), intern = TRUE) cat(ncdump ,sep = "\n") }, silent = TRUE)
sample_data <- sf::st_set_geometry(sf::read_sf(system.file("shape/nc.shp", package = "sf")), NULL) example_file <-write_attribute_data(tempfile(), sample_data, units = rep("unknown", ncol(sample_data))) try({ ncdump <- system(paste("ncdump -h", example_file), intern = TRUE) cat(ncdump ,sep = "\n") }, silent = TRUE)
Creates a file with point, line or polygon instance data ready for the extended NetCDF-CF timeSeries featuretype format.
Will also add attributes if provided data has them.
write_geometry( nc_file, geom_data, instance_dim_name = NULL, variables = list() )
write_geometry( nc_file, geom_data, instance_dim_name = NULL, variables = list() )
nc_file |
|
geom_data |
sf |
instance_dim_name |
|
variables |
|
hucPolygons <- sf::read_sf(system.file('extdata','example_huc_eta.json', package = 'ncdfgeom')) hucPolygons_nc <- ncdfgeom::write_geometry(nc_file=tempfile(), geom_data = hucPolygons) try({ ncdump <- system(paste("ncdump -h", hucPolygons_nc), intern = TRUE) cat(ncdump ,sep = "\n") }, silent = TRUE)
hucPolygons <- sf::read_sf(system.file('extdata','example_huc_eta.json', package = 'ncdfgeom')) hucPolygons_nc <- ncdfgeom::write_geometry(nc_file=tempfile(), geom_data = hucPolygons) try({ ncdump <- system(paste("ncdump -h", hucPolygons_nc), intern = TRUE) cat(ncdump ,sep = "\n") }, silent = TRUE)
This function creates a timeseries discrete sampling geometry NetCDF file.
It uses the orthogonal array encoding to write one data.frame
per
function call. This encoding is best suited to data with the same number of
timesteps per instance (e.g. geometry or station).
write_timeseries_dsg( nc_file, instance_names, lats, lons, times, data, alts = NA, data_unit = "", data_prec = "double", data_metadata = list(name = "data", long_name = "unnamed data"), time_units = "days since 1970-01-01 00:00:00", instance_dim_name = "instance", dsg_timeseries_id = "instance_name", coordvar_long_names = list(instance = "Station Names", time = "time of measurement", lat = "latitude of the measurement", lon = "longitude of the measurement", alt = "altitude of the measurement"), attributes = list(), add_to_existing = FALSE, overwrite = FALSE )
write_timeseries_dsg( nc_file, instance_names, lats, lons, times, data, alts = NA, data_unit = "", data_prec = "double", data_metadata = list(name = "data", long_name = "unnamed data"), time_units = "days since 1970-01-01 00:00:00", instance_dim_name = "instance", dsg_timeseries_id = "instance_name", coordvar_long_names = list(instance = "Station Names", time = "time of measurement", lat = "latitude of the measurement", lon = "longitude of the measurement", alt = "altitude of the measurement"), attributes = list(), add_to_existing = FALSE, overwrite = FALSE )
nc_file |
|
instance_names |
|
lats |
|
lons |
|
times |
|
data |
|
alts |
|
data_unit |
|
data_prec |
|
data_metadata |
|
time_units |
|
instance_dim_name |
the |
dsg_timeseries_id |
the |
coordvar_long_names |
|
attributes |
list An optional list of attributes that will be added at the global level. See details for useful attributes. |
add_to_existing |
|
overwrite |
boolean unless set to true, error if file exists. |
Suggested Global Variables: c(title = "title", abstract = "history", provider site = "institution", provider name ="source", description = "description")
Note regarding add_to_existing: add_to_existing = TRUE should only be used to add variables to an existing NetCDF discrete sampling geometry file. All other inputs should be the same as are already in the file. If the functions is called with add_to_existing=FALSE (the default), it will overwrite an existing file with the same name. The expected usage is to call this function repeatedly only changing the data, data_unit, data_prec and data_metadata inputs.
See the timeseries vignette for more information.