pygeoutils.pygeoutils#

Some utilities for manipulating GeoSpatial data.

Module Contents#

class pygeoutils.pygeoutils.Coordinates#

Generate validated and normalized coordinates in WGS84.

Parameters
  • lon (float or list of floats) – Longitude(s) in decimal degrees.

  • lat (float or list of floats) – Latitude(s) in decimal degrees.

  • bounds (tuple of length 4, optional) – The bounding box that the input coordinates are checked against. Defaults to WGS84 bounds.

Examples

>>> from pygeoutils import Coordinates
>>> c = Coordinates([460, 20, -30], [80, 200, 10])
>>> c.points.x.tolist()
[100.0, -30.0]

property points: geopandas.GeoSeries#

Get validated coordinates as a geopandas.GeoSeries.
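
The normalization shown in the example above can be approximated in plain Python. This is a minimal sketch, not the library's implementation: it assumes longitudes are wrapped into [-180, 180) and that points falling outside the bounds are dropped. The helper name normalize_coords is hypothetical.

```python
def normalize_coords(lon, lat, bounds=(-180.0, -90.0, 180.0, 90.0)):
    """Wrap longitudes into [-180, 180) and drop points outside ``bounds``.

    Hypothetical helper for illustration only; not part of pygeoutils.
    """
    west, south, east, north = bounds
    kept = []
    for x, y in zip(lon, lat):
        # Wrap the longitude so that, e.g., 460 becomes 100.
        x = (x + 180.0) % 360.0 - 180.0
        # Keep the point only if it falls inside the bounding box.
        if west <= x <= east and south <= y <= north:
            kept.append((x, y))
    return kept


pts = normalize_coords([460, 20, -30], [80, 200, 10])
print([x for x, _ in pts])
```

With the same inputs as the doctest above, the second point is dropped because its latitude (200) lies outside the default bounds.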

class pygeoutils.pygeoutils.GeoBSpline(points, npts_sp, degree=3)#

Create B-spline from a geo-dataframe of points.

Parameters
  • points (geopandas.GeoDataFrame or geopandas.GeoSeries) – Input points as a GeoDataFrame or GeoSeries in a projected CRS.

  • npts_sp (int) – Number of points in the output spline curve.

  • degree (int, optional) – Degree of the spline. Should be less than the number of points and greater than 1. Default is 3.

Examples

>>> from pygeoutils import GeoBSpline
>>> import geopandas as gpd
>>> xl, yl = zip(
...     *[
...         (-97.06138, 32.837),
...         (-97.06133, 32.836),
...         (-97.06124, 32.834),
...         (-97.06127, 32.832),
...     ]
... )
>>> pts = gpd.GeoSeries(gpd.points_from_xy(xl, yl, crs=4326))
>>> sp = GeoBSpline(pts.to_crs("epsg:3857"), 5).spline
>>> pts_sp = gpd.GeoSeries(gpd.points_from_xy(sp.x, sp.y, crs="epsg:3857"))
>>> pts_sp = pts_sp.to_crs("epsg:4326")
>>> list(zip(pts_sp.x, pts_sp.y))
[(-97.06138, 32.837),
(-97.06135, 32.83629),
(-97.06131, 32.83538),
(-97.06128, 32.83434),
(-97.06127, 32.83319)]

property spline: Spline#

Get the spline as a Spline object.

pygeoutils.pygeoutils.arcgis2geojson(arcgis, id_attr=None)#

Convert ESRIGeoJSON format to GeoJSON.

Notes

Based on arcgis2geojson.

Parameters
  • arcgis (str or binary) – The ESRIGeoJSON format str (or binary)

  • id_attr (str, optional) – ID of the attribute of interest, defaults to None.

Returns

dict – A GeoJSON dictionary readable by GeoPandas.
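
To illustrate the kind of mapping involved, the sketch below converts a single Esri JSON point feature into a GeoJSON feature using only the standard library. It is not the code used by arcgis2geojson: the helper esri_point_to_geojson is hypothetical, it handles point geometries only, and it assumes that id_attr names the attribute to copy into the GeoJSON id member.

```python
import json


def esri_point_to_geojson(esri_feature, id_attr=None):
    """Convert one Esri JSON point feature to a GeoJSON Feature dict.

    Hypothetical helper for illustration; handles point geometries only.
    """
    geom = esri_feature.get("geometry") or {}
    props = dict(esri_feature.get("attributes") or {})
    geometry = None
    if "x" in geom and "y" in geom:
        # Esri stores point coordinates as separate x/y members.
        geometry = {"type": "Point", "coordinates": [geom["x"], geom["y"]]}
    feature = {"type": "Feature", "geometry": geometry, "properties": props}
    # Assumption: id_attr selects the attribute used as the feature id.
    if id_attr is not None and id_attr in props:
        feature["id"] = props[id_attr]
    return feature


esri = json.loads(
    '{"geometry": {"x": -97.06, "y": 32.83},'
    ' "attributes": {"OBJECTID": 7, "name": "gauge"}}'
)
feature = esri_point_to_geojson(esri, id_attr="OBJECTID")
print(json.dumps(feature))
```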

pygeoutils.pygeoutils.break_lines(lines, points, tol=0.0)#

Break lines at the specified points in the given direction.

Parameters
  • lines (geopandas.GeoDataFrame) – Lines to break at intersection points.

  • points (geopandas.GeoDataFrame) – Points to break lines at. It must contain a column named direction with values up or down. This column is used to determine which part of the lines to keep, i.e., upstream or downstream of points.

  • tol (float, optional) – Tolerance for snapping points to the nearest lines in meters. The default is 0.0.

Returns

geopandas.GeoDataFrame – Original lines except for the parts that have been broken at the specified points.

pygeoutils.pygeoutils.geo2polygon(geometry, geo_crs, crs)#

Convert a geometry to a Shapely Polygon and transform it to any CRS.

Parameters
  • geometry (Polygon or tuple of length 4) – Polygon or bounding box (west, south, east, north).

  • geo_crs (int, str, or pyproj.CRS) – Spatial reference of the input geometry

  • crs (int, str, or pyproj.CRS) – Target spatial reference.

Returns

Polygon – A Polygon in the target CRS.
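
The bounding-box form of the geometry argument follows the (west, south, east, north) order. As a minimal sketch of what that order means, the hypothetical helper below builds the closed exterior ring of the corresponding rectangle in plain Python. It does not reproject anything and is not the library's implementation.

```python
def bbox_to_ring(bbox):
    """Return the closed exterior ring of a (west, south, east, north) box.

    Hypothetical helper for illustration; no CRS transformation is done.
    """
    west, south, east, north = bbox
    if west >= east or south >= north:
        raise ValueError("bbox must be (west, south, east, north)")
    # Counter-clockwise ring; the first and last vertices coincide.
    return [
        (west, south),
        (east, south),
        (east, north),
        (west, north),
        (west, south),
    ]


ring = bbox_to_ring((-97.1, 32.8, -97.0, 32.9))
print(ring)
```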

pygeoutils.pygeoutils.geometry_list(geometry)#

Get a list of polygons, points, and lines from a geometry.

pygeoutils.pygeoutils.get_transform(ds, ds_dims=('y', 'x'))#

Get the transform of an xarray.Dataset or xarray.DataArray.

Parameters
  • ds (xarray.Dataset or xarray.DataArray) – The dataset (or data array) whose transform is computed.

  • ds_dims (tuple, optional) – Names of the coordinates in the dataset, defaults to ("y", "x"). The order of the dimension names must be (vertical, horizontal).

Returns

rasterio.Affine, int, int – The affine transform, width, and height
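
The sketch below shows one common way to derive an affine transform, width, and height from coordinate values. It is an assumption-laden illustration rather than the library's code: it assumes a regularly spaced grid with at least two cells per axis, coordinates that mark cell centers, and coefficients in the (a, b, c, d, e, f) order used by affine transforms. The helper affine_from_coords is hypothetical.

```python
def affine_from_coords(x, y):
    """Return ((a, b, c, d, e, f), width, height) for a regular grid.

    Hypothetical helper for illustration. ``x`` and ``y`` hold cell-center
    coordinates along the horizontal and vertical dimensions.
    """
    width, height = len(x), len(y)
    # Cell size along each axis; res_y is negative when y decreases.
    res_x = (x[-1] - x[0]) / (width - 1)
    res_y = (y[-1] - y[0]) / (height - 1)
    # Shift by half a cell to move from the first cell center
    # to the outer corner of the grid.
    transform = (res_x, 0.0, x[0] - res_x / 2, 0.0, res_y, y[0] - res_y / 2)
    return transform, width, height


x = [0.5, 1.5, 2.5, 3.5]
y = [2.5, 1.5, 0.5]
transform, width, height = affine_from_coords(x, y)
print(transform, width, height)
```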

pygeoutils.pygeoutils.gtiff2xarray(r_dict, geometry=None, geo_crs=None, ds_dims=None, driver=None, all_touched=False, nodata=None, drop=True)#

Convert (Geo)Tiff byte responses to xarray.Dataset.

Parameters
  • r_dict (dict) – Dictionary of (Geo)Tiff byte responses where the keys are names used for labeling each response, and the values are bytes.

  • geometry (Polygon, MultiPolygon, or tuple, optional) – The geometry to mask the data that should be in the same CRS as the r_dict. Defaults to None.

  • geo_crs (int, str, or pyproj.CRS, optional) – The spatial reference of the input geometry, defaults to None. This argument should be given when geometry is given.

  • ds_dims (tuple of str, optional) – The names of the vertical and horizontal dimensions (in that order) of the target dataset, defaults to None. If None, dimension names are determined from a list of common names.

  • driver (str, optional) – A GDAL driver for reading the content, defaults to automatic detection. A list of the drivers can be found here: https://gdal.org/drivers/raster/index.html

  • all_touched (bool, optional) – Include a pixel in the mask if it touches any of the shapes. If False (default), include a pixel only if its center is within one of the shapes, or if it is selected by Bresenham’s line algorithm.

  • nodata (float or int, optional) – The nodata value of the raster, defaults to None, i.e., is determined from the raster.

  • drop (bool, optional) – If True, drop the data outside of the extent of the mask geometries. Otherwise, it will return the same raster with the data masked. Default is True.

Returns

xarray.Dataset or xarray.DataArray – Parallel (with dask) dataset or dataarray.

pygeoutils.pygeoutils.json2geodf(content, in_crs=4326, crs=4326)#

Create GeoDataFrame from (Geo)JSON.

Parameters
  • content (dict or list of dict) – A (Geo)JSON dictionary e.g., response.json() or a list of them.

  • in_crs (int, str, or pyproj.CRS, optional) – CRS of the content, defaults to epsg:4326.

  • crs (int, str, or pyproj.CRS, optional) – The target CRS of the output GeoDataFrame, defaults to epsg:4326.

Returns

geopandas.GeoDataFrame – Generated geo-data frame from a GeoJSON

pygeoutils.pygeoutils.snap2nearest(lines, points, tol)#

Find the nearest points on a line to a set of points.

Parameters
Returns

geopandas.GeoDataFrame or geopandas.GeoSeries – Points snapped to lines.
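
Finding the nearest point on a line amounts to projecting the point onto each segment and keeping the closest result. The sketch below does this in plain Python for a single point and a single polyline. It is a hypothetical illustration, not the library's implementation, and it assumes that a point farther than tol from the line is left unsnapped.

```python
import math


def snap_to_line(point, line, tol):
    """Return the nearest point on a polyline, or None if beyond ``tol``.

    Hypothetical helper for illustration. ``line`` is a list of (x, y)
    vertices in a projected CRS; ``tol`` uses the same units.
    """
    px, py = point
    best, best_dist = None, math.inf
    for (x1, y1), (x2, y2) in zip(line[:-1], line[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len2 = dx * dx + dy * dy
        # Fraction along the segment of the orthogonal projection,
        # clamped so the result stays on the segment.
        t = 0.0 if seg_len2 == 0 else ((px - x1) * dx + (py - y1) * dy) / seg_len2
        t = max(0.0, min(1.0, t))
        qx, qy = x1 + t * dx, y1 + t * dy
        dist = math.hypot(px - qx, py - qy)
        if dist < best_dist:
            best, best_dist = (qx, qy), dist
    return best if best_dist <= tol else None


line = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
snapped = snap_to_line((4.0, 3.0), line, tol=5.0)
missed = snap_to_line((4.0, 3.0), line, tol=1.0)
print(snapped, missed)
```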

pygeoutils.pygeoutils.xarray2geodf(da, dtype, mask_da=None, connectivity=8)#

Vectorize an xarray.DataArray to a geopandas.GeoDataFrame.

Parameters
  • da (xarray.DataArray) – The dataarray to vectorize.

  • dtype (type) – The data type of the dataarray. Valid types are int16, int32, uint8, uint16, and float32.

  • mask_da (xarray.DataArray, optional) – The dataarray to use as a mask, defaults to None.

  • connectivity (int, optional) – Use 4 or 8 pixel connectivity for grouping pixels into features, defaults to 8.

Returns

geopandas.GeoDataFrame – The vectorized dataarray.
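
The effect of the connectivity parameter can be seen on a tiny grid. The sketch below is a plain-Python illustration, not the library's implementation: it assumes a feature is a group of connected pixels that share the same value, and the helper count_features is hypothetical.

```python
def count_features(grid, connectivity=8):
    """Count groups of connected, equal-valued pixels in a 2D list.

    Hypothetical helper for illustration only.
    """
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        # Diagonal neighbors also count as connected.
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    nrow, ncol = len(grid), len(grid[0])
    seen = set()
    count = 0
    for r in range(nrow):
        for c in range(ncol):
            if (r, c) in seen:
                continue
            # Start a new feature and flood-fill its pixels.
            count += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                i, j = stack.pop()
                for di, dj in steps:
                    ni, nj = i + di, j + dj
                    if (
                        0 <= ni < nrow
                        and 0 <= nj < ncol
                        and (ni, nj) not in seen
                        and grid[ni][nj] == grid[i][j]
                    ):
                        seen.add((ni, nj))
                        stack.append((ni, nj))
    return count


grid = [[1, 0], [0, 1]]
print(count_features(grid, 4), count_features(grid, 8))
```

On this checkerboard, 4-connectivity treats every pixel as its own feature, while 8-connectivity joins the diagonal pairs.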

pygeoutils.pygeoutils.xarray_geomask(ds, geometry, crs, all_touched=False, drop=True, from_disk=False)#

Mask an xarray.Dataset based on a geometry.

Parameters
  • ds (xarray.Dataset or xarray.DataArray) – The dataset(array) to be masked

  • geometry (Polygon, MultiPolygon, or tuple of length 4) – The geometry to mask the data

  • crs (int, str, or pyproj.CRS) – The spatial reference of the input geometry

  • all_touched (bool, optional) – Include a pixel in the mask if it touches any of the shapes. If False (default), include a pixel only if its center is within one of the shapes, or if it is selected by Bresenham’s line algorithm.

  • drop (bool, optional) – If True, drop the data outside of the extent of the mask geometries. Otherwise, it will return the same raster with the data masked. Default is True.

  • from_disk (bool, optional) – If True, it will clip from disk using rasterio.mask.mask if possible. This is beneficial when the size of the data is larger than memory. Default is False.

Returns

xarray.Dataset or xarray.DataArray – The input dataset with a mask applied (np.nan)
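
The difference between the two all_touched settings can be sketched for a rectangular geometry on a grid of unit pixels. This is a simplified, hypothetical illustration in plain Python, not the library's implementation: it ignores the Bresenham line rule mentioned above, and it assumes pixel (row, col) covers the square from (col, row) to (col + 1, row + 1).

```python
def bbox_mask(width, height, bbox, all_touched=False):
    """Return a 2D list of booleans marking pixels selected by ``bbox``.

    Hypothetical helper for illustration. ``bbox`` is
    (xmin, ymin, xmax, ymax) in pixel units.
    """
    xmin, ymin, xmax, ymax = bbox
    mask = []
    for row in range(height):
        line = []
        for col in range(width):
            if all_touched:
                # Select the pixel if its square overlaps the box at all.
                inside = col < xmax and col + 1 > xmin and row < ymax and row + 1 > ymin
            else:
                # Select the pixel only if its center lies inside the box.
                cx, cy = col + 0.5, row + 0.5
                inside = xmin <= cx <= xmax and ymin <= cy <= ymax
            line.append(inside)
        mask.append(line)
    return mask


bbox = (0.6, 0.2, 2.4, 0.8)
center = bbox_mask(3, 2, bbox)
touched = bbox_mask(3, 2, bbox, all_touched=True)
print(center)
print(touched)
```

Here the center-based rule selects only the middle pixel of the first row, whereas all_touched=True selects the whole first row.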