The Real State of Geospatial AI: What Foundation Models, Embeddings, and Earth Observation Can (and Can’t) Do Today
Geospatial AI feels like it’s moving fast, but if you look closely, it’s still early days. Not in a discouraging way. In an exciting way. A “wild west” moment where incredible tools exist, but the rules, workflows, and best practices are still taking shape.
In a recent Spatial Stack conversation with Chris Ren and Isaac Corley from Wherobots, we dug into what’s really happening across Earth observation, foundation models, embeddings, and the challenge of scaling geospatial machine learning.
This article breaks down the big ideas in a practical way, without the hype.
Earth Observation Isn’t Waiting on AI
One of the clearest points from this discussion is that we already have the tools to derive a huge amount of insight from Earth observation data.
Sensors and missions like MODIS, Landsat, and Sentinel were designed to capture meaningful physical signals. Vegetation dynamics. Moisture. Land surface change.
A lot of today’s problems, from deforestation detection to crop mapping, can still be solved with traditional approaches and domain knowledge.
Chris shared an example where he mapped tea plantations in Indonesia with no labeled training data.
No deep learning. No embeddings. No giant model.
Just:
- A few Google searches
- A visual understanding of how tea fields appear
- An HSV transform to detect spectral “yellowish” signatures in Sentinel-2
The result? A scalable, interpretable solution built in a day.
This is the heart of modern remote sensing. The best solutions usually come from combining data, physics, intuition, and simple methods before jumping to neural networks.
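The HSV step from the tea example can be sketched in a few lines. This is a minimal illustration, not Chris’s actual workflow: the hue range, the saturation and brightness thresholds, and the tiny synthetic image are all assumptions chosen for the demo, and a real Sentinel-2 scene would need its red, green, and blue bands scaled to 0..1 first.

```python
import colorsys


def yellowish_mask(rgb_rows, hue_range=(0.10, 0.20), min_saturation=0.25, min_value=0.20):
    """Flag pixels whose hue falls inside an assumed 'yellowish' band.

    rgb_rows: 2-D list of (r, g, b) tuples, each channel scaled to 0..1.
    hue_range, min_saturation, min_value: illustrative thresholds, not tuned values.
    """
    lo, hi = hue_range
    mask = []
    for row in rgb_rows:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            # Require enough saturation and brightness so grey or dark pixels are ignored.
            mask_row.append(lo <= h <= hi and s >= min_saturation and v >= min_value)
        mask.append(mask_row)
    return mask


# Synthetic 2x2 "image": yellow-green, dark green, brownish soil, blue water.
image = [
    [(0.60, 0.62, 0.20), (0.10, 0.35, 0.12)],
    [(0.45, 0.35, 0.28), (0.05, 0.10, 0.30)],
]
mask = yellowish_mask(image)
print(mask)
```

Only the yellow-green pixel passes. The appeal of this kind of rule is that every threshold is readable and can be adjusted by someone who knows what the crop looks like.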
Where Embeddings Fit (and Where They Don’t)
Embeddings are one of the most talked-about ideas right now, especially with projects like AlphaEarth and the growing number of Earth-focused foundation models.
Here’s the simplest way to think about embeddings:
They are compressed representations of an image or time series.
Averages, medians, or standard deviations computed over a pixel’s time series are technically embeddings too. Neural embeddings simply pack more information into the vector.
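To make that concrete, here is a minimal sketch of a summary-statistics embedding. The function name and the NDVI values are made up for illustration; the point is only that a series of any length collapses into a fixed-size vector.

```python
from statistics import mean, median, pstdev


def summary_embedding(series):
    """Compress a time series of any length into a fixed 3-value vector."""
    return (mean(series), median(series), pstdev(series))


# Hypothetical NDVI values for one pixel across a growing season.
ndvi_series = [0.2, 0.3, 0.5, 0.8, 0.7, 0.4, 0.2, 0.1]
embedding = summary_embedding(ndvi_series)
print(embedding)
```

Eight observations go in and three numbers come out. A neural embedding does the same kind of compression with a learned function and a longer vector.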
The question isn’t whether embeddings are useful. It’s whether they capture the right signals for the problems we care about.
For example:
- Corn and soy have different NDVI curves
- Floodwater has a different spectral signature from bare soil
- Built-up areas behave differently in SAR backscatter
An embedding is valuable if it preserves the patterns needed to separate those real-world categories.
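A toy check shows what “preserving the pattern” means in practice. The two curves below are synthetic, not real corn or soy data: they share the same mean but peak at different times. An embedding that keeps only the mean cannot tell them apart, while one that also keeps peak timing can. Both embedding functions are hypothetical helpers written for this sketch.

```python
import math


def mean_embedding(series):
    """One-value embedding: the seasonal mean only."""
    return (math.fsum(series) / len(series),)


def mean_and_peak_embedding(series):
    """Two-value embedding: seasonal mean plus the index of the peak observation."""
    peak_index = max(range(len(series)), key=lambda i: series[i])
    return (math.fsum(series) / len(series), peak_index)


def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Synthetic NDVI-like curves: same values, shifted peak.
early_peak = [0.2, 0.5, 0.8, 0.5, 0.2, 0.2]
late_peak = [0.2, 0.2, 0.5, 0.8, 0.5, 0.2]

d_mean = distance(mean_embedding(early_peak), mean_embedding(late_peak))
d_peak = distance(mean_and_peak_embedding(early_peak), mean_and_peak_embedding(late_peak))
print(d_mean, d_peak)
```

The same test applies to any embedding model: pick two classes you care about and check whether their vectors actually end up apart.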
Right now, it’s an open question whether the most popular embedding models consistently do this across geography, seasonality, and sensor variations.
Benchmarks in Earth observation simply aren’t at ImageNet’s level of maturity yet. And many of them are built on noisy labels from existing land cover products. That makes it hard to trust single-number performance claims.
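A little arithmetic shows why noisy labels blur those single numbers. The sketch below assumes a binary task with symmetric label noise that is independent of the model’s own errors. That is a simplification, and the 10% noise rate is an illustrative figure, not a measurement of any real land cover product.

```python
def measured_accuracy(true_accuracy, label_noise_rate):
    """Expected agreement with noisy reference labels for a binary task.

    Assumes symmetric label noise that is independent of the model's errors.
    """
    a, rho = true_accuracy, label_noise_rate
    # The model agrees with a noisy label when both are right or both are wrong.
    return a * (1 - rho) + (1 - a) * rho


for true_acc in (0.90, 0.95, 1.00):
    print(true_acc, round(measured_accuracy(true_acc, 0.10), 3))
```

Under these assumptions a perfect model scores only 0.90 against the noisy reference, and a five-point gap in true accuracy shrinks to four points on the leaderboard.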
Foundation Models: Impressive, But Not Magic
Tools like SAM, OWLv2, and general-purpose vision transformers sparked a wave of excitement in geospatial.
But there’s a catch.
Most were trained on natural images, not satellite imagery. That means:
- They struggle with top-down views
- They miss fine-grained features
- They require careful prompting or patch sizing
- They break on geography shifts or small objects
Isaac described how combining text-to-box models with SAM often works better than using SAM alone, because it narrows the focus of the segmentation task.
Even then, you need very specific prompts like:
“A baseball field in the middle of a park”
instead of
“baseball field”
These tools are great for demos, prototyping, and human-in-the-loop mapping. They are not yet reliable replacements for purpose-built geospatial models.
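The detect-then-segment pattern Isaac described can be sketched as plain composition. Everything below is a stand-in: `detect_boxes` and `segment_in_box` are hypothetical callables where a real text-to-box model and a box-prompted segmenter such as SAM would plug in, the stub outputs are invented, and the 0.3 score cutoff is an assumption.

```python
def detect_then_segment(image, prompt, detect_boxes, segment_in_box, min_score=0.3):
    """Run a text-prompted detector, then segment only inside confident boxes.

    detect_boxes(image, prompt) -> iterable of (box, score), box = (x0, y0, x1, y1).
    segment_in_box(image, box)  -> any mask-like object for that box.
    """
    results = []
    for box, score in detect_boxes(image, prompt):
        if score < min_score:
            continue  # Skip weak detections so the segmenter never sees them.
        results.append((box, score, segment_in_box(image, box)))
    return results


# Stub models so the sketch runs without any ML dependencies.
def fake_detector(image, prompt):
    return [((10, 10, 50, 40), 0.82), ((200, 120, 230, 150), 0.12)]


def fake_segmenter(image, box):
    x0, y0, x1, y1 = box
    return {"box": box, "area_px": (x1 - x0) * (y1 - y0)}


hits = detect_then_segment(
    image=None,
    prompt="A baseball field in the middle of a park",
    detect_boxes=fake_detector,
    segment_in_box=fake_segmenter,
)
print(hits)
```

The design choice is that the text prompt narrows where the segmenter looks, so it is asked about one box at a time rather than the whole scene.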
Scaling Is Not the Same as Solving the Problem
We need to talk about scale, because it’s often used as shorthand for “progress” in the field.
But scale comes in different shapes:
- Geographic scale
- Computational scale
- Operational scale
- Business scale
And as Chris pointed out, customers rarely need global models. Most want their region, not the world.
A global model inevitably performs worse somewhere. Within their own region, local models can outperform global ones with a fraction of the data and compute.
This is why regional modeling will likely grow, not disappear. It mirrors how humans work. We generalize within a geography and adapt our mental models when the environment changes.
The Real Bottleneck: IO, Storage, and Compute
One of the strongest themes in this conversation was that the biggest innovation gap is not modeling.
It’s the plumbing.
- How fast can you read 10 TB of Sentinel-2?
- Can you run inference without downloading anything?
- Can compute run next to storage instead of across the internet?
- Can we reduce costs from thousands of dollars to hundreds?
This is where the largest practical gains will happen. Solving IO unlocks everything else.
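A back-of-envelope calculation makes the first question tangible. The throughput and worker counts below are assumed round numbers, not benchmarks of any provider, and the sketch assumes reads scale perfectly linearly across workers, which real systems rarely achieve.

```python
def read_time_hours(total_bytes, per_worker_bytes_per_s, workers):
    """Hours needed to read total_bytes, assuming perfectly linear scaling."""
    return total_bytes / (per_worker_bytes_per_s * workers) / 3600


TEN_TB = 10 * 10**12             # 10 TB in bytes (decimal terabytes)
ASSUMED_THROUGHPUT = 100 * 10**6  # 100 MB/s per worker, an illustrative figure

single_worker = read_time_hours(TEN_TB, ASSUMED_THROUGHPUT, workers=1)
many_workers = read_time_hours(TEN_TB, ASSUMED_THROUGHPUT, workers=64)
print(round(single_worker, 1), round(many_workers, 2))
```

With these assumptions one worker needs about 27.8 hours and 64 workers need under half an hour. The model itself does not appear anywhere in that calculation, which is the point.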
Zarr, Cloud-Optimized GeoTIFFs (COGs), optimized access patterns, and cloud-native computation are more impactful today than another architecture tweak in a transformer.
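Chunked formats pay off because a reader can fetch only the chunks that overlap its area of interest. Here is a minimal sketch of that bookkeeping. The 10980-pixel raster size matches a Sentinel-2 10 m band tile, while the 512-pixel chunk size and the window position are assumptions for illustration. Real COG or Zarr readers handle this internally.

```python
def chunks_for_window(row_off, col_off, height, width, chunk_size):
    """Return (chunk_row, chunk_col) indices of every chunk a pixel window touches."""
    first_row = row_off // chunk_size
    last_row = (row_off + height - 1) // chunk_size
    first_col = col_off // chunk_size
    last_col = (col_off + width - 1) // chunk_size
    return [
        (r, c)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]


raster_size = 10980  # pixels per side
chunk_size = 512     # assumed internal tile size

# Ceiling division: chunks per side, counting the partial chunk at the edge.
chunks_per_side = -(-raster_size // chunk_size)
total_chunks = chunks_per_side ** 2

# A 1000 x 1000 pixel area of interest somewhere inside the raster.
needed = chunks_for_window(row_off=3000, col_off=4000, height=1000, width=1000,
                           chunk_size=chunk_size)
fraction_read = len(needed) / total_chunks
print(len(needed), total_chunks, round(fraction_read, 4))
```

For this window the reader touches 9 of 484 chunks, under 2% of the raster. That is the difference between downloading a scene and reading from it.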
This is why platforms and tools like Wherobots, Xarray and Zarr, Earth Engine alternatives, and cloud-native raster processing libraries matter more than ever. They reduce operational friction so teams can actually apply the models they build.
Where Geospatial AI Goes Next
A few clear paths emerged in this conversation:
1. Better Benchmarking
We need Earth observation’s version of ImageNet.
Not another land cover dataset built on noisy labels.
A challenging, diverse dataset that forces real generalization.
2. Region-Specific Models
Instead of one model to rule them all, we’ll see:
- Continental models
- Biome-based models
- Rapid fine-tuning workflows for a specific area of interest (AOI)
This mirrors real-world variation.
3. Human-AI Hybrid Mapping
Humans provide the intuition.
AI provides the speed.
Together they outperform either one alone.
4. Efficient, Localized Compute
The next breakthroughs will come from:
- Decoupled storage and compute
- Zero-copy IO
- Distributed tiling
- GPU-friendly formats
- Fast raster chunking
This shifts geospatial AI from “demo” to “production.”
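Distributed tiling usually starts with something simple: cutting a large raster into windows that independent workers can process. Here is one possible sketch that adds overlap between neighbors so objects on a tile edge are seen whole by at least one tile. The function name, tile size, and overlap are assumptions, and merging the overlapping predictions afterwards is left out.

```python
def tile_windows(height, width, tile, overlap):
    """Return (row, col, tile_height, tile_width) windows that cover the raster.

    Neighboring windows share `overlap` pixels; the last window on each axis is
    shifted back so it ends exactly at the raster edge instead of running past it.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be >= 0 and smaller than tile")
    stride = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]
        offsets = list(range(0, size - tile, stride))
        offsets.append(size - tile)  # Final window flush with the edge.
        return offsets

    tile_h, tile_w = min(tile, height), min(tile, width)
    return [(r, c, tile_h, tile_w) for r in starts(height) for c in starts(width)]


windows = tile_windows(height=1000, width=1000, tile=512, overlap=64)
print(len(windows), windows[0], windows[-1])
```

Each window is an independent unit of work, so the list can be handed to any job queue or distributed framework without the workers needing to coordinate.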
5. More Real Problems, Fewer Demos
Counting baseball fields is fun.
But impact comes from:
- Agriculture
- Climate
- Crisis response
- Insurance
- Infrastructure
- Energy
The field grows when solutions make someone’s job easier.
If You Want to Learn This Stuff: Start Building
Both Chris and Isaac emphasized the same advice.
Don’t just read. Don’t scroll LinkedIn.
Download some data and map something.
Pick a problem:
- Aquaculture in Bali
- Flood damage in Pakistan
- Tree canopy in your city
- Wetlands in Louisiana
- Solar farms across Texas
You’ll learn:
- Where the data breaks
- Where models fail
- How bad labels are
- Which tasks are easy and which are impossible
- Why scale is expensive
- What tools actually help
This is the fastest path to understanding the field.
Final Thoughts
Geospatial AI is moving fast, but the real progress is happening under the hood. Better data access. Better infrastructure. Better workflows. The flashy models matter, but practical systems thinking matters more.
The future of Earth observation won’t be decided by model size.
It will be decided by:
- How fast we can move data
- How well we integrate domain knowledge
- How efficiently we compute
- How targeted our models become
- How useful our outputs are to real people
As always, the future is spatial.
And right now, we’re just getting started.
