pynhd.pynhd#

Access NLDI and WaterData databases.

Module Contents#

class pynhd.pynhd.NHD(layer, outfields='*', crs=4326)#

Access National Hydrography Dataset (NHD), both meduim and high resolution.

Notes

For more info visit: https://hydro.nationalmap.gov/arcgis/rest/services/nhd/MapServer

Parameters:
  • layer (str, optional) – A valid service layer. Layer names with _hr are high resolution and _mr are medium resolution. Also, layer names with _nonconus are for non-conus areas, i.e., Alaska, Hawaii, Puerto Rico, the Virgin Islands , and the Pacific Islands. Valid layers are:

    • point

    • point_event

    • line_hr

    • flow_direction

    • flowline_mr

    • flowline_hr_nonconus

    • flowline_hr

    • area_mr

    • area_hr_nonconus

    • area_hr

    • waterbody_mr

    • waterbody_hr_nonconus

    • waterbody_hr

  • outfields (str or list, optional) – Target field name(s), default to “*” i.e., all the fields.

  • crs (str, int, or pyproj.CRS, optional) – Target spatial reference, default to EPSG:4326

class pynhd.pynhd.NHDPlusHR(layer, outfields='*', crs=4326)#

Access National Hydrography Dataset (NHD) Plus high resolution.

Notes

For more info visit: https://hydro.nationalmap.gov/arcgis/rest/services/NHDPlus_HR/MapServer

Parameters:
  • layer (str, optional) – A valid service layer. Valid layers are:

    • gage for NHDPlusGage layer

    • sink for NHDPlusSink layer

    • point for NHDPoint layer

    • flowline for NetworkNHDFlowline layer

    • non_network_flowline for NonNetworkNHDFlowline layer

    • flow_direction for FlowDirection layer

    • wall for NHDPlusWall layer

    • line for NHDLine layer

    • area for NHDArea layer

    • waterbody for NHDWaterbody layer

    • catchment for NHDPlusCatchment layer

    • boundary_unit for NHDPlusBoundaryUnit layer

    • huc12 for WBDHU12 layer

  • outfields (str or list, optional) – Target field name(s), default to “*” i.e., all the fields.

  • crs (str, int, or pyproj.CRS, optional) – Target spatial reference, default to EPSG:4326

class pynhd.pynhd.NLDI#

Access the Hydro Network-Linked Data Index (NLDI) service.

comid_byloc(coords, loc_crs=4326)#

Get the closest ComID based on coordinates using hydrolocation endpoint.

Notes

This function tries to find the closest ComID based on flowline grid cells. If such a cell is not found, it will return the closest ComID using the flowtrace endpoint of the PyGeoAPI service to find the closest downstream ComID. The returned dataframe has a measure column that indicates the location of the input coordinate on the flowline as a percentage of the total flowline length.

Parameters:
  • coords (tuple or list of tuples) – A tuple of length two (x, y) or a list of them.

  • loc_crs (str, int, or pyproj.CRS, optional) – The spatial reference of the input coordinate, defaults to EPSG:4326.

Returns:

geopandas.GeoDataFrame or (geopandas.GeoDataFrame, list) – NLDI indexed ComID(s) and points in EPSG:4326. If some coords don’t return any ComID a list of missing coords are returned as well.

feature_byloc(coords, loc_crs=4326)#

Get the closest feature ID(s) based on coordinates using position endpoint.

Parameters:
  • coords (tuple or list) – A tuple of length two (x, y) or a list of them.

  • loc_crs (str, int, or pyproj.CRS, optional) – The spatial reference of the input coordinate, defaults to EPSG:4326.

Returns:

geopandas.GeoDataFrame or (geopandas.GeoDataFrame, list) – NLDI indexed feature ID(s) and flowlines in EPSG:4326. If some coords don’t return any IDs a list of missing coords are returned as well.

get_basins(feature_ids, fsource='nwissite', split_catchment=False, simplified=True)#

Get basins for a list of station IDs.

Parameters:
  • feature_ids (str or list) – Target feature ID(s).

  • fsource (str) – The name of feature(s) source, defaults to nwissite. The valid sources are:

    • ‘comid’ for NHDPlus comid.

    • ‘ca_gages’ for Streamgage catalog for CA SB19

    • ‘gfv11_pois’ for USGS Geospatial Fabric V1.1 Points of Interest

    • ‘huc12pp’ for HUC12 Pour Points

    • ‘nmwdi-st’ for New Mexico Water Data Initative Sites

    • ‘nwisgw’ for NWIS Groundwater Sites

    • ‘nwissite’ for NWIS Surface Water Sites

    • ‘ref_gage’ for geoconnex.us reference gages

    • ‘vigil’ for Vigil Network Data

    • ‘wade’ for Water Data Exchange 2.0 Sites

    • ‘WQP’ for Water Quality Portal

  • split_catchment (bool, optional) – If True, split basins at their outlet locations. Default to False.

  • simplified (bool, optional) – If True, return a simplified version of basin geometries. Default to True.

Returns:

geopandas.GeoDataFrame or (geopandas.GeoDataFrame, list) – NLDI indexed basins in EPSG:4326. If some IDs don’t return any features a list of missing ID(s) are returned as well.

getcharacteristic_byid(feature_ids, char_type, fsource=..., char_ids=..., values_only=...)#
getcharacteristic_byid(feature_ids: str | int | Sequence[str | int], char_type: str, fsource: str = ..., char_ids: str | list[str] = ..., values_only: Literal[False] = ...) tuple[pandas.DataFrame, pandas.DataFrame]

Get characteristics using a list ComIDs.

Parameters:
  • feature_ids (str or list) – Target feature ID(s).

  • char_type (str) – Type of the characteristic. Valid values are local for individual reach catchments, tot for network-accumulated values using total cumulative drainage area and div for network-accumulated values using divergence-routed.

  • fsource (str, optional) – The name of feature(s) source, defaults to comid. The valid sources are:

    • ‘comid’ for NHDPlus comid.

    • ‘ca_gages’ for Streamgage catalog for CA SB19

    • ‘gfv11_pois’ for USGS Geospatial Fabric V1.1 Points of Interest

    • ‘huc12pp’ for HUC12 Pour Points

    • ‘nmwdi-st’ for New Mexico Water Data Initative Sites

    • ‘nwisgw’ for NWIS Groundwater Sites

    • ‘nwissite’ for NWIS Surface Water Sites

    • ‘ref_gage’ for geoconnex.us reference gages

    • ‘vigil’ for Vigil Network Data

    • ‘wade’ for Water Data Exchange 2.0 Sites

    • ‘WQP’ for Water Quality Portal

  • char_ids (str or list, optional) – Name(s) of the target characteristics, default to all.

  • values_only (bool, optional) – Whether to return only characteristic_value as a series, default to True. If is set to False, percent_nodata is returned as well.

Returns:

pandas.DataFrame or tuple of pandas.DataFrame – Either only characteristic_value as a dataframe or or if values_only is Fale return percent_nodata as well.

getfeature_byid(fsource, fids)#

Get feature(s) based ID(s).

Parameters:
  • fsource (str) – The name of feature(s) source. The valid sources are:

    • ‘comid’ for NHDPlus comid.

    • ‘ca_gages’ for Streamgage catalog for CA SB19

    • ‘gfv11_pois’ for USGS Geospatial Fabric V1.1 Points of Interest

    • ‘huc12pp’ for HUC12 Pour Points

    • ‘nmwdi-st’ for New Mexico Water Data Initative Sites

    • ‘nwisgw’ for NWIS Groundwater Sites

    • ‘nwissite’ for NWIS Surface Water Sites

    • ‘ref_gage’ for geoconnex.us reference gages

    • ‘vigil’ for Vigil Network Data

    • ‘wade’ for Water Data Exchange 2.0 Sites

    • ‘WQP’ for Water Quality Portal

  • fid (str or list of str) – Feature ID(s).

Returns:

geopandas.GeoDataFrame or (geopandas.GeoDataFrame, list) – NLDI indexed features in EPSG:4326. If some IDs don’t return any features a list of missing ID(s) are returned as well.

navigate_byid(fsource, fid, navigation, source, distance=500, trim_start=False, stop_comid=None)#

Navigate the NHDPlus database from a single feature id up to a distance.

Parameters:
  • fsource (str) – The name of feature(s) source. The valid sources are:

    • ‘comid’ for NHDPlus comid.

    • ‘ca_gages’ for Streamgage catalog for CA SB19

    • ‘gfv11_pois’ for USGS Geospatial Fabric V1.1 Points of Interest

    • ‘huc12pp’ for HUC12 Pour Points

    • ‘nmwdi-st’ for New Mexico Water Data Initative Sites

    • ‘nwisgw’ for NWIS Groundwater Sites

    • ‘nwissite’ for NWIS Surface Water Sites

    • ‘ref_gage’ for geoconnex.us reference gages

    • ‘vigil’ for Vigil Network Data

    • ‘wade’ for Water Data Exchange 2.0 Sites

    • ‘WQP’ for Water Quality Portal

  • fid (str or int) – The ID of the feature.

  • navigation (str) – The navigation method.

  • source (str, optional) – Return the data from another source after navigating the features using fsource, defaults to None.

  • distance (int, optional) – Limit the search for navigation up to a distance in km, defaults is 500 km. Note that this is an expensive request so you have be mindful of the value that you provide. The value must be between 1 to 9999 km.

  • trim_start (bool, optional) – If True, trim the starting flowline at the source feature, defaults to False.

  • stop_comid (str or int, optional) – The ComID to stop the navigationation, defaults to None.

Returns:

geopandas.GeoDataFrame – NLDI indexed features in EPSG:4326.

navigate_byloc(coords, navigation=None, source=None, loc_crs=4326, distance=500, trim_start=False, stop_comid=None)#

Navigate the NHDPlus database from a coordinate.

Notes

This function first calls the feature_byloc function to get the comid of the nearest flowline and then calls the navigate_byid function to get the features from the obtained comid.

Parameters:
  • coords (tuple) – A tuple of length two (x, y).

  • navigation (str, optional) – The navigation method, defaults to None which throws an exception if comid_only is False.

  • source (str, optional) – Return the data from another source after navigating the features based on comid, defaults to None which throws an exception if comid_only is False.

  • loc_crs (str, int, or pyproj.CRS, optional) – The spatial reference of the input coordinate, defaults to EPSG:4326.

  • distance (int, optional) – Limit the search for navigation up to a distance in km, defaults to 500 km. Note that this is an expensive request so you have be mindful of the value that you provide. If you want to get all the available features you can pass a large distance like 9999999.

  • trim_start (bool, optional) – If True, trim the starting flowline at the source feature, defaults to False.

  • stop_comid (str or int, optional) – The ComID to stop the navigationation, defaults to None.

Returns:

geopandas.GeoDataFrame – NLDI indexed features in EPSG:4326.

class pynhd.pynhd.PyGeoAPI#

Access PyGeoAPI service.

cross_section(coord, width, numpts, crs=4326)#

Return a GeoDataFrame from the xsatpoint service.

Parameters:
  • coord (tuple) – The coordinate of the point to extract the cross-section as a tuple,e.g., (lon, lat).

  • width (float) – The width of the cross-section in meters.

  • numpts (int) – The number of points to extract the cross-section from the DEM.

  • crs (str, int, or pyproj.CRS, optional) – The coordinate reference system of the coordinates, defaults to EPSG:4326.

Returns:

geopandas.GeoDataFrame – A GeoDataFrame containing the cross-section at the requested point.

Examples

>>> from pynhd import PyGeoAPI
>>> pga = PyGeoAPI()
>>> gdf = pga.cross_section((-103.80119, 40.2684), width=1000.0, numpts=101, crs=4326)  
>>> print(gdf.iloc[-1, 1])  
1000.0
elevation_profile(coords, numpts, dem_res, crs=4326)#

Return a GeoDataFrame from the xsatendpts service.

Parameters:
  • coords (list) – A list of two coordinates to trace as a list of tuples, e.g., [(x1, y1), (x2, y2)].

  • numpts (int) – The number of points to extract the elevation profile from the DEM.

  • dem_res (int) – The target resolution for requesting the DEM from 3DEP service.

  • crs (str, int, or pyproj.CRS, optional) – The coordinate reference system of the coordinates, defaults to EPSG:4326.

Returns:

geopandas.GeoDataFrame – A GeoDataFrame containing the elevation profile along the requested endpoints.

Examples

>>> from pynhd import PyGeoAPI
>>> pga = PyGeoAPI()
>>> gdf = pga.elevation_profile(
...     [(-103.801086, 40.26772), (-103.80097, 40.270568)], numpts=101, dem_res=1, crs=4326
... )  
>>> print(gdf.iloc[-1, 1])  
411.5906
flow_trace(coord, crs=4326, direction='none')#

Return a GeoDataFrame from the flowtrace service.

Parameters:
  • coord (tuple) – The coordinate of the point to trace as a tuple,e.g., (lon, lat).

  • crs (str) – The coordinate reference system of the coordinates, defaults to EPSG:4326.

  • direction (str, optional) – The direction of flowpaths, either down, up, or none. Defaults to none.

Returns:

geopandas.GeoDataFrame – A GeoDataFrame containing the traced flowline.

Examples

>>> from pynhd import PyGeoAPI
>>> pga = PyGeoAPI()
>>> gdf = pga.flow_trace(
...     (1774209.63, 856381.68), crs="ESRI:102003", direction="none"
... )  
>>> print(gdf.comid.iloc[0])  
22294818
split_catchment(coord, crs=4326, upstream=False)#

Return a GeoDataFrame from the splitcatchment service.

Parameters:
  • coord (tuple) – The coordinate of the point to trace as a tuple,e.g., (lon, lat).

  • crs (str, int, or pyproj.CRS, optional) – The coordinate reference system of the coordinates, defaults to EPSG:4326.

  • upstream (bool, optional) – If True, return all upstream catchments rather than just the local catchment, defaults to False.

Returns:

geopandas.GeoDataFrame – A GeoDataFrame containing the local catchment or the entire upstream catchments.

Examples

>>> from pynhd import PyGeoAPI
>>> pga = PyGeoAPI()
>>> gdf = pga.split_catchment((-73.82705, 43.29139), crs=4326, upstream=False)  
>>> print(gdf.catchmentID.iloc[0])  
22294818
class pynhd.pynhd.WaterData(layer, crs=4326, validation=True)#

Access to WaterData service.

Parameters:
  • layer (str) – A valid layer from the WaterData service. Valid layers are:

    • catchmentsp

    • gagesii

    • gagesii_basins

    • nhdarea

    • nhdflowline_network

    • nhdflowline_nonnetwork

    • nhdwaterbody

    • wbd02

    • wbd04

    • wbd06

    • wbd08

    • wbd10

    • wbd12

    Note that the layers’ namespace for the WaterData service is wmadata and will be added to the given layer argument if it is not provided.

  • crs (str, int, or pyproj.CRS, optional) – The target spatial reference system, defaults to epsg:4326.

  • validation (bool, optional) – Whether to validate the input data, defaults to True.

bybox(bbox, box_crs=4326, sort_attr=None)#

Get features within a bounding box.

Parameters:
  • bbox (tuple of floats) – A bounding box in the form of (minx, miny, maxx, maxy).

  • box_crs (str, int, or pyproj.CRS, optional) – The spatial reference system of the bounding box, defaults to epsg:4326.

  • sort_attr (str, optional) – The column name in the database to sort request by, defaults to the first attribute in the schema that contains id in its name.

Returns:

geopandas.GeoDataFrame – The requested features in a GeoDataFrames.

bydistance(coords, distance, loc_crs=4326, sort_attr=None)#

Get features within a radius (in meters) of a point.

Parameters:
  • coords (tuple of float) – The x, y coordinates of the point.

  • distance (int) – The radius (in meters) to search within.

  • loc_crs (str, int, or pyproj.CRS, optional) – The CRS of the input coordinates, default to epsg:4326.

  • sort_attr (str, optional) – The column name in the database to sort request by, defaults to the first attribute in the schema that contains id in its name.

Returns:

geopandas.GeoDataFrame – Requested features as a GeoDataFrame.

byfilter(cql_filter, method='GET', sort_attr=None)#

Get features based on a CQL filter.

Parameters:
  • cql_filter (str) – The CQL filter to use for requesting the data.

  • method (str, optional) – The HTTP method to use for requesting the data, defaults to GET.

  • sort_attr (str, optional) – The column name in the database to sort request by, defaults to the first attribute in the schema that contains id in its name.

Returns:

geopandas.GeoDataFrame – The requested features as a GeoDataFrames.

bygeom(geometry, geo_crs=4326, xy=True, predicate='INTERSECTS', sort_attr=None)#

Get features within a geometry.

Parameters:
  • geometry (shapely.geometry) – The input geometry

  • geo_crs (str, int, or pyproj.CRS, optional) – The CRS of the input geometry, default to epsg:4326.

  • xy (bool, optional) – Whether axis order of the input geometry is xy or yx.

  • predicate (str, optional) – The geometric prediacte to use for requesting the data, defaults to INTERSECTS. Valid predicates are:

    • EQUALS

    • DISJOINT

    • INTERSECTS

    • TOUCHES

    • CROSSES

    • WITHIN

    • CONTAINS

    • OVERLAPS

    • RELATE

    • BEYOND

  • sort_attr (str, optional) – The column name in the database to sort request by, defaults to the first attribute in the schema that contains id in its name.

Returns:

geopandas.GeoDataFrame – The requested features in the given geometry.

byid(featurename, featureids)#

Get features based on IDs.

pynhd.pynhd.pygeoapi(coords, service)#

Return a GeoDataFrame from the flowtrace service.

Parameters:
  • coords (geopandas.GeoDataFrame) – A GeoDataFrame containing the coordinates to query. The required columns services are:

    • flow_trace: direction that indicates the direction of the flow trace. It can be up, down, or none.

    • split_catchment: upstream that indicates whether to return all upstream catchments or just the local catchment.

    • elevation_profile: numpts that indicates the number of points to extract along the flowpath and 3dep_res that indicates the target resolution for requesting the DEM from 3DEP service.

    • cross_section: numpts that indicates the number of points to extract along the flowpath and width that indicates the width of the cross-section in meters.

  • service (str) – The service to query, can be flow_trace, split_catchment, elevation_profile, or cross_section.

Returns:

geopandas.GeoDataFrame – A GeoDataFrame containing the results of requested service.

Examples

>>> from shapely import Point
>>> gdf = gpd.GeoDataFrame(
...     {
...         "direction": [
...             "none",
...         ]
...     },
...     geometry=[Point((1774209.63, 856381.68))],
...     crs="ESRI:102003",
... )
>>> trace = nhd.pygeoapi(gdf, "flow_trace")
>>> print(trace.comid.iloc[0])
22294818