
Use python for carbon calculation from raster files (VRT)

Sifeddine Biri requested to merge sif/py-stats into main

Intro

In this PR, we switch the underlying technology for carbon stats calculation from PostGIS to Python. Previously, the large TIF data was queried through PostGIS, which required loading the rasters into the DB; that step is time- and resource-consuming and adds operational complications (backups, uploading, restoring, tuning…). Now we read directly from TIF files on disk, using Python packages such as rasterio and numpy together with parallel computation.

In local testing, returning the stats for any polygon selection takes at most 10 seconds. The time should stay predictable and stable because we always sample up to 2500 random points from the polygon (this cap can be changed).

Our backend is in Clojure; we use libpython-clj to bridge Python and the JVM (on which Clojure runs) and receive the result as a map. The new system aims to return data in the same shape as the previous one, so that it can act as a drop-in replacement.

The main requirement for this to work is a VRT mosaic raster. Its default location, carbon_tiles/mosaic.vrt, can be changed in the RASTER_PATH constant; making this configurable from config.edn is a possible future improvement. The script carbon_tiles/create_vrt.sh generates the VRT (a quick step). The script carbon_tiles/optimize_tifs.sh is also provided for an optional step that optimizes the raster tiles (enabling tiling and craning in 512x512 block size).
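The fixed-size sampling described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function names, the stats keys, and the rejection-sampling approach are assumptions, and the sampling helpers use only the standard library (the PR itself relies on shapely and numpy). Only `read_values` touches rasterio, and it assumes the points are already in the raster's CRS.

```python
import random

MAX_POINTS = 2500  # sample cap mentioned in this PR; described as changeable


def point_in_polygon(x, y, ring):
    """Ray-casting test against a single exterior ring of (x, y) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Toggle on each edge that a horizontal ray from (x, y) crosses.
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def sample_points(ring, n=MAX_POINTS, seed=None):
    """Rejection-sample up to n uniformly distributed points inside the ring."""
    rng = random.Random(seed)
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    points = []
    # Bounded attempts so a degenerate (zero-area) ring cannot loop forever.
    for _ in range(n * 100):
        if len(points) == n:
            break
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, ring):
            points.append((x, y))
    return points


def read_values(raster_path, points):
    """Read band 1 at each point (hypothetical helper, needs rasterio)."""
    import rasterio  # imported lazily so the helpers above stay stdlib-only

    with rasterio.open(raster_path) as src:
        values = [float(v[0]) for v in src.sample(points)]
        nodata = src.nodata
    return [v for v in values if nodata is None or v != nodata]


def summarize(values):
    """Return a small stats map; the key names here are hypothetical."""
    if not values:
        return {"count": 0, "mean": None, "min": None, "max": None}
    return {
        "count": len(values),
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }
```

Because the number of raster reads is capped by `n` rather than by the polygon's area, the cost of a query stays roughly constant for small and large selections alike, which is the property the timing claim above depends on.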

Requirements:

  • Java 17 (for libpython-clj to work properly)
  • GDAL should be installed
  • pip install numpy rasterio shapely geojson pyproj (packages we depend on)
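A quick way to confirm the Python side of these requirements is to check that each package is importable. The helper below is a hypothetical convenience, not part of this PR; it should be run with the same Python interpreter that libpython-clj is set up to use, which is an assumption about the local environment.

```python
from importlib.util import find_spec

# Packages listed in the requirements above (import names match the pip names).
REQUIRED = ("numpy", "rasterio", "shapely", "geojson", "pyproj")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be imported in this interpreter."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```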

To Test:

  • Put some TIFs inside the carbon_tiles directory
  • Optionally optimize them with optimize_tifs.sh (output goes to optimized)
  • Run create_vrt.sh to generate mosaic.vrt; the Python script reads it to do the calculations
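For readers without the scripts at hand, the steps above roughly correspond to two GDAL command-line calls. The sketch below only assembles the commands; the contents of create_vrt.sh and optimize_tifs.sh are not shown in this PR description, so the exact flags, the `*.tif` glob, and the function names are assumptions. `gdalbuildvrt` and `gdal_translate` are the standard GDAL tools, and `TILED`, `BLOCKXSIZE`, and `BLOCKYSIZE` are GeoTIFF creation options.

```python
import subprocess
from pathlib import Path


def build_vrt_command(tile_dir, vrt_name="mosaic.vrt"):
    """Assemble a gdalbuildvrt call covering every .tif in tile_dir."""
    tile_dir = Path(tile_dir)
    tifs = sorted(str(p) for p in tile_dir.glob("*.tif"))
    if not tifs:
        raise FileNotFoundError(f"no .tif files found in {tile_dir}")
    return ["gdalbuildvrt", str(tile_dir / vrt_name), *tifs]


def optimize_command(src, dst, block=512):
    """Assemble a gdal_translate call that rewrites a TIF with internal tiling."""
    return [
        "gdal_translate",
        "-co", "TILED=YES",
        "-co", f"BLOCKXSIZE={block}",
        "-co", f"BLOCKYSIZE={block}",
        str(src),
        str(dst),
    ]


if __name__ == "__main__":
    # Requires GDAL on PATH and at least one TIF in carbon_tiles.
    subprocess.run(build_vrt_command("carbon_tiles"), check=True)
```

If the optimized tiles are written to a separate directory, the VRT would presumably be built from that directory instead, so that the mosaic references the tiled files.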

Closes FCF-72

Edited by Sifeddine Biri
