The Real State of Geospatial AI: What Foundation Models, Embeddings, and Earth Observation Can (and Can’t) Do Today

June 8, 2026 Matt Forrest

Geospatial AI feels like it’s moving fast, but if you look closely, it’s still early days. Not in a discouraging way. In an exciting way. A “wild west” moment where incredible tools exist, but the rules, workflows, and best practices are still taking shape.

In a recent Spatial Stack conversation with Chris Ren and Isaac Corley from Wherobots, we dug into what’s really happening across Earth observation, foundation models, embeddings, and the challenge of scaling geospatial machine learning.

This article breaks down the big ideas in a practical way, without the hype.

Earth Observation Isn’t Waiting on AI

One of the clearest points from this discussion is that we already have the tools to derive a huge amount of insight from Earth observation data.

Sensors like MODIS, Landsat, and Sentinel were designed to capture meaningful physical signals. Vegetation dynamics. Moisture. Land surface change.

A lot of today’s problems, from deforestation detection to crop mapping, can still be solved with traditional approaches and domain knowledge.

Chris shared an example where he mapped tea plantations in Indonesia with no labeled training data.

No deep learning. No embeddings. No giant model.

Just:

A few Google searches
A visual understanding of how tea fields appear
An HSV transform to pick out the “yellowish” color signature of tea fields in Sentinel-2 imagery

The result? A scalable, interpretable solution built in a day.
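Here is a minimal sketch of what an HSV threshold like that could look like. This is not Chris's actual code. The hue window, the saturation and value floors, and the pixel values are all assumptions made up for illustration. In practice you would tune them by inspecting Sentinel-2 true-color composites (bands B04, B03, B02) over known tea fields, and you would vectorize the loop with an array library.

```python
import colorsys

# Hypothetical thresholds, chosen only for illustration.
HUE_RANGE = (0.12, 0.25)   # yellow to yellow-green, hue as a fraction of the color wheel
MIN_SATURATION = 0.25      # reject washed-out, grayish pixels
MIN_VALUE = 0.20           # reject very dark pixels


def is_yellowish(red, green, blue):
    """Return True if an RGB pixel (each channel in 0..1) falls in the yellowish HSV window."""
    hue, sat, val = colorsys.rgb_to_hsv(red, green, blue)
    in_hue_window = HUE_RANGE[0] <= hue <= HUE_RANGE[1]
    return in_hue_window and sat >= MIN_SATURATION and val >= MIN_VALUE


def yellowish_mask(rgb_rows):
    """Apply the test to a 2D grid of (r, g, b) tuples and return a grid of booleans."""
    return [[is_yellowish(*pixel) for pixel in row] for row in rgb_rows]


# Toy 2x2 "image" with made-up values, not real Sentinel-2 data.
image = [
    [(0.55, 0.60, 0.20), (0.10, 0.30, 0.10)],  # yellow-green, dark green
    [(0.40, 0.35, 0.30), (0.05, 0.10, 0.40)],  # grayish soil, water-like blue
]
mask = yellowish_mask(image)
```

Only the yellow-green pixel passes the test. The appeal of this kind of rule is that every threshold is readable and easy to adjust when the output looks wrong.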

This is the heart of modern remote sensing. The best solutions usually come from combining data, physics, intuition, and simple methods before jumping to neural networks.

Where Embeddings Fit (and Where They Don’t)

Embeddings are one of the most talked-about ideas right now, especially with projects like AlphaEarth and the growing number of Earth-focused foundation models.

Here’s the simplest way to think about embeddings:

They are compressed representations of an image or time series.

A mean, median, or standard deviation computed over a pixel's time series is technically an embedding too. Neural embeddings simply pack more information into the vector.

The question isn’t whether embeddings are useful. It’s whether they capture the right signals for the problems we care about.

For example:

Corn and soy have different NDVI curves
Floodwater has a different spectral signature from bare soil
Built-up areas behave differently in SAR backscatter

An embedding is valuable if it preserves the patterns needed to separate those real-world categories.
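To make that concrete, here is a small sketch with two synthetic NDVI curves that contain the same values but peak at different times. The curves are invented for illustration and do not represent real corn or soy phenology. The summary-statistic "embedding" is the simple kind described above, and `peak_index` is a hypothetical extra feature that keeps the timing.

```python
from math import isclose
from statistics import mean, median, pstdev


def summary_embedding(series):
    """Compress a time series into three numbers: mean, median, standard deviation."""
    return (mean(series), median(series), pstdev(series))


def peak_index(series):
    """Index of the maximum value, a timing feature the statistics above discard."""
    return max(range(len(series)), key=series.__getitem__)


# Synthetic NDVI curves for two hypothetical crops: same values, different timing.
crop_a = [0.2, 0.5, 0.8, 0.6, 0.3, 0.2]  # greens up early
crop_b = [0.2, 0.3, 0.6, 0.8, 0.5, 0.2]  # greens up late

# The summary statistics are order-independent, so they cannot see the timing shift.
same_stats = all(
    isclose(a, b)
    for a, b in zip(summary_embedding(crop_a), summary_embedding(crop_b))
)
different_peaks = peak_index(crop_a) != peak_index(crop_b)
```

The statistics match while the peak timing differs, so a classifier given only those three numbers could not tell the two curves apart. The same question applies to a neural embedding: does it keep the signal that separates your classes?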

Right now, it’s an open question whether the most popular embedding models consistently do this across geography, seasonality, and sensor variations.

Earth observation benchmarks have not yet reached the maturity of ImageNet. Many are built on noisy labels derived from existing land cover products, which makes single-number performance claims hard to trust.

Foundation Models: Impressive, But Not Magic

Tools like SAM, OWLv2, and general-purpose vision transformers sparked a wave of excitement in the geospatial community.

But there’s a catch.

Most were trained on natural images, not satellite imagery. That means:

They struggle with top-down views
They miss fine-grained features
They require careful prompting or patch sizing
They break under geographic shift or on small objects

Isaac described how combining text-to-box models with SAM often works better than using SAM alone, because it narrows the focus of the segmentation task.

Even then, you need very specific prompts like:

“A baseball field in the middle of a park”

instead of

“baseball field”
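The detect-then-segment pattern Isaac described can be sketched without committing to any particular library. Everything named here is hypothetical: `detect_boxes` stands in for a text-to-box model, `segment_in_box` stands in for a box-prompted segmenter such as SAM, and `min_score` is an arbitrary cutoff. None of these are real APIs. The stub functions below exist only so the sketch runs.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max) in pixels
Image = Sequence[Sequence[int]]    # placeholder type for a 2D raster
Mask = List[List[bool]]


def detect_then_segment(
    image: Image,
    prompt: str,
    detect_boxes: Callable[[Image, str], List[Tuple[Box, float]]],
    segment_in_box: Callable[[Image, Box], Mask],
    min_score: float = 0.3,
) -> List[Mask]:
    """Run a text-prompted detector, then segment only inside confident boxes."""
    masks = []
    for box, score in detect_boxes(image, prompt):
        if score >= min_score:
            masks.append(segment_in_box(image, box))
    return masks


# Stub models so the sketch is runnable; real models would replace both.
def fake_detector(image, prompt):
    return [((0, 0, 2, 2), 0.9), ((1, 1, 3, 3), 0.1)]


def fake_segmenter(image, box):
    x0, y0, x1, y1 = box
    return [
        [x0 <= x < x1 and y0 <= y < y1 for x in range(len(image[0]))]
        for y in range(len(image))
    ]


image = [[0] * 4 for _ in range(4)]
masks = detect_then_segment(
    image, "A baseball field in the middle of a park", fake_detector, fake_segmenter
)
```

The low-scoring box is dropped before segmentation, which is the narrowing effect: the segmenter only ever sees regions the detector is reasonably confident about.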

These tools are great for demos, prototyping, and human-in-the-loop mapping. They are not yet reliable replacements for purpose-built geospatial models.

Scaling Is Not the Same as Solving the Problem

We need to talk about scale, because it’s used as shorthand for “progress” in the field.

But scale comes in different shapes:

Geographic scale
Computational scale
Operational scale
Business scale

And as Chris pointed out, customers rarely need global models. Most want their region, not the world.

A global model inevitably performs worse in some regions. Local models can outperform global ones with a fraction of the data and compute.

This is why regional modeling will likely grow, not disappear. It mirrors how humans work. We generalize within a geography and adapt our mental models when the environment changes.

The Real Bottleneck: IO, Storage, and Compute

One of the strongest themes in this conversation was that the biggest innovation gap is not modeling.

It’s the plumbing.

How fast can you read 10 TB of Sentinel-2?
Can you run inference without downloading anything?
Can compute run next to storage instead of across the internet?
Can we reduce costs from thousands of dollars to hundreds?

This is where the largest practical gains will happen. Solving IO unlocks everything else.
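A quick back-of-envelope calculation shows why the first question matters. The throughput figures below are assumptions picked for illustration, not measurements of any platform, and the function ignores request latency, decompression, and parallel readers.

```python
def read_time_hours(size_tb: float, throughput_gbps: float) -> float:
    """Hours needed to read size_tb terabytes at a sustained rate in gigabits per second."""
    size_bits = size_tb * 1e12 * 8                 # decimal terabytes to bits
    seconds = size_bits / (throughput_gbps * 1e9)  # gigabits per second to bits per second
    return seconds / 3600


# Hypothetical sustained throughputs for a single reader.
over_internet = read_time_hours(10, 1)    # about 22.2 hours at 1 Gbps
next_to_data = read_time_hours(10, 25)    # about 0.9 hours at 25 Gbps
```

Under these assumed rates, the same 10 TB read drops from most of a day to under an hour, before any modeling has happened.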

Zarr, COGs, optimized access patterns, and cloud-native computation are more impactful today than another architecture tweak in a transformer.

This is why platforms and tools like Wherobots, Xarray and Zarr, Earth Engine alternatives, and cloud-native raster processing libraries matter more than ever. They reduce operational friction so teams can actually apply the models they build.

Where Geospatial AI Goes Next

A few clear paths emerged in this conversation:

1. Better Benchmarking

We need Earth observation’s version of ImageNet.

Not another land cover dataset built on noisy labels.

A challenging, diverse dataset that forces real generalization.

2. Region-Specific Models

Instead of one model to rule them all, we’ll see:

Continental models
Biome-based models
Rapid fine-tuning workflows for a specific area of interest (AOI)

This mirrors real-world variation.

3. Human-AI Hybrid Mapping

Humans provide the intuition.

AI provides the speed.

Together they outperform either one alone.

4. Efficient, Localized Compute

The next breakthroughs will come from:

Decoupled storage and compute
Zero-copy IO
Distributed tiling
GPU-friendly formats
Fast raster chunking

This shifts geospatial AI from “demo” to “production.”
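As one small example of what tiling means in practice, here is a sketch that splits a raster into fixed-size windows that independent workers could read and process. The function name and the 512-pixel tile size are assumptions. The 10980 x 10980 grid matches a Sentinel-2 tile at 10 m resolution.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int, int, int]  # (row_offset, col_offset, height, width)


def tile_windows(height: int, width: int, tile: int) -> Iterator[Window]:
    """Yield non-overlapping windows that cover a raster; edge windows are clipped."""
    for row in range(0, height, tile):
        for col in range(0, width, tile):
            yield (row, col, min(tile, height - row), min(tile, width - col))


# 512 is a hypothetical tile size; in practice it should align with the file's chunking.
windows = list(tile_windows(10980, 10980, 512))
covered_pixels = sum(h * w for _, _, h, w in windows)
```

Each window is a self-contained unit of work, so reads can be spread across machines and matched to the chunk layout of formats like COG or Zarr.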

5. More Real Problems, Fewer Demos

Counting baseball fields is fun.

But impact comes from:

Agriculture
Climate
Crisis response
Insurance
Infrastructure
Energy

The field grows when solutions make someone’s job easier.

If You Want to Learn This Stuff: Start Building

Both Chris and Isaac emphasized the same advice.

Don’t just read. Don’t scroll LinkedIn.

Download some data and map something.

Pick a problem:

Aquaculture in Bali
Flood damage in Pakistan
Tree canopy in your city
Wetlands in Louisiana
Solar farms across Texas

You’ll learn:

Where the data breaks
Where models fail
How bad labels are
What tasks are easy or impossible
Why scale is expensive
What tools actually help

This is the fastest path to understanding the field.

Final Thoughts

Geospatial AI is moving fast, but the real progress is happening under the hood. Better data access. Better infrastructure. Better workflows. The flashy models matter, but practical systems thinking matters more.

The future of Earth observation won’t be decided by model size.

It will be decided by:

How fast we can move data
How well we integrate domain knowledge
How efficiently we compute
How targeted our models become
How useful our outputs are to real people

As always, the future is spatial.

And right now, we’re just getting started.