MODERN GIS CERTIFICATION
E1: Cloud-Native Geospatial Files
This brick introduces the essential file formats powering modern, scalable spatial data workflows.
You’ll get hands-on experience with formats like:
- Cloud-Optimized GeoTIFF (COG) for raster data
- GeoParquet for columnar vector data
Brick E1 – Core Competencies (High-Level)
Cloud Storage & Streaming I/O
Authenticate to S3-style buckets and read/write objects on the fly without full downloads.
Raster → Cloud-Optimized GeoTIFF
Transform plain GeoTIFFs into COGs with internal tiling, overviews, and best-practice compression.
Vector → GeoParquet & Spatial Partitioning
Convert vector data in-memory to GeoParquet (v1.1), apply row-group tuning, and split by spatial key.
Lightweight Validation & CI
Extract minimal metadata (COG footprint, Parquet schema), write pytest checks, and automate grading.
Lesson Skills
E1 Certified Skills
The full list of skills and tools used in this certification track. Anyone holding the validated badge for this track has demonstrated these skills and tools.
– Understand AWS S3 concepts (buckets, objects, prefixes, authentication)
– Configure and connect to an object store (anonymous vs. signed access)
– Use fsspec and obstore (and the s3fs backend) to read and write remote files as if they were local (see the streaming sketch after this list)
– Stream data tile-by-tile (no full file download)
– Use Rasterio and rio-cogeo to transform plain GeoTIFFs into COGs (see the COG sketch after this list)
– Tune internal tiling (blockxsize, blockysize) and overviews for optimal HTTP range reads
– Apply best-practice compression (DEFLATE, ZSTD, etc.) and predictors
– Read Shapefiles (or GeoPackages) in-memory with GeoPandas (and VSIMEM)
– Write out GeoParquet with PyArrow, setting row-group size, compression, and GeoParquet version (see the GeoParquet sketch after this list)
– Understand how row-grouping and spatial keys (geohash, Hilbert) impact query performance
– Leverage Rasterio’s MemoryFile (VSIMEM) for zero-scratch-disk conversions (see the in-memory sketch after this list)
– Use io.BytesIO or Obstore’s file-like readers/writers for pure in-Python pipelines
– Choose appropriate tile sizes (256 × 256 vs. 512 × 512) based on use case
– Select compression codecs (lossless vs. lossy) for data vs. visualization needs
– Partition vector data (quad-tree, k-d tree) for spatial pruning
– Inspect COG metadata and validate with rio_cogeo.cog_validate / cog_info (see the validation sketch after this list)
– Read GeoParquet footers to assert schema, row groups, and GeoParquet metadata
– Write automated tests (pytest) that verify your outputs without downloading entire files
– Package your workflow into a Gitpod/GitHub Codespaces environment
– Automate dependencies (GDAL, rasterio, geopandas, rio-cogeo) via Docker/Gitpod config
– Integrate CI checks that grade your notebook outputs and issue badges
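A few sketches of these skills follow. First, the streaming sketch: reading a single window from a COG on S3 without downloading the whole file. The bucket and object key are hypothetical placeholders, and anonymous (unsigned) access is assumed.

```python
# A minimal streaming-read sketch; "example-bucket/scene.tif" is a
# hypothetical public object, read anonymously.
import rasterio
from rasterio.windows import Window

# AWS_NO_SIGN_REQUEST tells GDAL's /vsis3/ layer to skip credentials.
with rasterio.Env(AWS_NO_SIGN_REQUEST="YES"):
    with rasterio.open("s3://example-bucket/scene.tif") as src:
        # Only the bytes covering this 512 x 512 window travel over
        # the wire, fetched via HTTP range requests.
        block = src.read(1, window=Window(0, 0, 512, 512))
        print(block.shape, src.crs)
```

The same idea works through fsspec/s3fs or Obstore file-like objects when you need plain byte streams rather than raster windows.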
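Next, the COG sketch: a minimal rio-cogeo conversion. The paths are placeholders; "deflate" is one of the library's built-in profiles.

```python
# A COG-conversion sketch; input.tif and output_cog.tif are placeholders.
from rio_cogeo.cogeo import cog_translate
from rio_cogeo.profiles import cog_profiles

# Start from the built-in DEFLATE profile, then tune internal tiling.
profile = cog_profiles.get("deflate")
profile.update(blockxsize=512, blockysize=512)

# cog_translate retiles the data and builds overviews in one pass.
cog_translate("input.tif", "output_cog.tif", profile)
```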
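On the vector side, the GeoParquet sketch: writing with GeoPandas, which delegates to PyArrow. It assumes a recent GeoPandas with pyarrow installed; parcels.shp is a placeholder dataset.

```python
# A GeoParquet-writing sketch; parcels.shp is a placeholder dataset.
import geopandas as gpd

gdf = gpd.read_file("parcels.shp")

# Extra kwargs are forwarded to pyarrow.parquet.write_table, so the
# row-group size can be tuned alongside compression and spec version.
gdf.to_parquet(
    "parcels.parquet",
    compression="zstd",
    row_group_size=50_000,   # smaller row groups = finer-grained pruning
    schema_version="1.1.0",  # target GeoParquet v1.1
)
```

Sorting by a spatial key (e.g. a geohash or Hilbert index column) before writing keeps nearby features in the same row group, which is what makes spatial pruning pay off.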
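The in-memory sketch shows the zero-scratch-disk pattern with Rasterio's MemoryFile (VSIMEM). Paths are placeholders, and note this writes a tiled GeoTIFF; a full COG would also need overviews.

```python
# A zero-scratch-disk sketch using Rasterio's MemoryFile (VSIMEM);
# input.tif is a placeholder.
import rasterio
from rasterio.io import MemoryFile

with rasterio.open("input.tif") as src:
    profile = src.profile
    profile.update(tiled=True, blockxsize=256, blockysize=256,
                   compress="deflate")
    with MemoryFile() as memfile:
        with memfile.open(**profile) as dst:
            dst.write(src.read())      # written into memory, not to disk
        memfile.seek(0)
        payload = memfile.read()       # raw GeoTIFF bytes, ready to upload
```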
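Finally, the validation sketch: metadata-only pytest checks of the kind a CI grader can run. File paths are placeholders.

```python
# test_outputs.py -- metadata-only checks; file paths are placeholders.
import pyarrow.parquet as pq
from rio_cogeo.cogeo import cog_validate

def test_cog_is_valid():
    # cog_validate returns (is_valid, errors, warnings).
    is_valid, errors, _warnings = cog_validate("output_cog.tif")
    assert is_valid, errors

def test_geoparquet_footer():
    pf = pq.ParquetFile("parcels.parquet")
    # The footer carries the schema, row-group layout, and the
    # GeoParquet "geo" metadata blob; no data pages are read.
    assert b"geo" in (pf.metadata.metadata or {})
    assert pf.metadata.num_row_groups >= 1
```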
Affordable plans
Flexible pricing to suit every team size
Choose from scalable pricing options that grow with your team’s needs and budget.
🔥 START E1 BRICK 🔥
E1 Brick
Learn the complete E1 Brick content and earn its certification badge
- Full E1 Brick content
- Certification badge on completion
- Cloud resources and code
Full Brick Track
Learn the complete track and how to interweave skills in a Capstone seminar
- Access to all 4 E-Track Bricks
- Certifications for each Brick
- Capstone seminar for the track
- Capstone certification

Spatial Lab Membership
Get full access to additional tracks, community resources, and live cohort events.
- Access to the E-Track and other upcoming tracks
- Community-supported learning with fellow learners
- Access to live events for this track
- Much more...