Yes, Terraform has a graph. That's not the point.
Every time I talk about Stategraph, someone eventually says "Doesn't Terraform already have graph walking?" Yes. Terraform builds a DAG and walks it during plan and apply. It parallelizes nodes. You can tune it with -parallelism. And none of that solves the problem we're solving.
This is not a "we discovered graphs" moment. Terraform absolutely has a graph. It's a well-engineered DAG scheduler embedded in the CLI. The issue isn't whether Terraform walks a graph. The issue is that Terraform's graph is an implementation detail.
If you think "graph walking" is the story, you've confused a data structure for a control plane.
What Terraform's graph actually does
Let's be precise. Terraform:
- Parses configuration
- Builds a dependency graph
- Resolves references
- Orders operations
- Walks the DAG
- Executes up to N nodes in parallel (default 10)
- Exits
- Discards the graph
The graph lives inside a single process, for a single run, against a single state. It exists to schedule work. It does not coordinate systems.
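To make the list above concrete, here is a minimal sketch of a bounded-parallelism DAG walk. It is written in Python for brevity and is not Terraform's actual Go implementation. The function name `walk_dag`, the `deps` mapping, and the resource names in the usage note are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def walk_dag(deps, apply, parallelism=10):
    """Call apply(node) for every node, dependencies first, at most `parallelism` at once.

    deps maps each node to the set of nodes it depends on. Every node,
    including leaf dependencies, must appear as a key.
    Returns the nodes in the order they finished.
    """
    remaining = {node: set(d) for node, d in deps.items()}
    dependents = {node: set() for node in deps}
    for node, d in deps.items():
        for dep in d:
            dependents[dep].add(node)

    ready = [n for n, d in remaining.items() if not d]
    finished = []
    in_flight = {}  # future -> node
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        while ready or in_flight:
            # Fill the pool up to the parallelism limit.
            while ready and len(in_flight) < parallelism:
                node = ready.pop()
                in_flight[pool.submit(apply, node)] = node
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for fut in done:
                node = in_flight.pop(fut)
                fut.result()  # re-raise any failure from apply()
                finished.append(node)
                # Unblock nodes whose last dependency just finished.
                for child in dependents[node]:
                    remaining[child].discard(node)
                    if not remaining[child]:
                        ready.append(child)

    if len(finished) != len(deps):
        raise ValueError("dependency cycle detected")
    return finished  # the graph itself is discarded when the function returns
```

Calling `walk_dag({"vpc": set(), "subnet": {"vpc"}, "db": {"subnet"}, "cdn": set()}, print, parallelism=2)` would run `vpc` and `cdn` concurrently and only then move on to `subnet` and `db`. Everything here lives in local variables of one call, which is the point: nothing survives the run.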
Observation
Terraform's graph is ephemeral by design. It's reconstructed on every run from configuration and state. This means the graph can never be more than a local scheduling optimization. It has no memory, no history, and no awareness of other concurrent operations.
Parallelism is not the bottleneck
When someone says "Terraform already does graph walking," what they usually mean is "Terraform already parallelizes things." Sure. But parallelism inside a single apply was never the fundamental scaling constraint.
If your system is small, Terraform is fine.
If your system is large, your bottlenecks look like this:
- State-level contention
- CI queues that serialize everything
- Monorepo blast radius
- Refreshing the world when you changed a pebble
- Re-planning massive surfaces for tiny diffs
- Cross-state coordination via shell scripts and optimism
- Terragrunt run-all trying to impersonate a transaction manager
None of those are fixed by increasing -parallelism from 10 to 50. You can't thread-pool your way out of architectural limits.
Terraform's safety model is effectively one state, one writer. Whatever the backend, the coordination boundary is still the state. One operation owns the write path, and everyone else waits. That means independent subgraphs can't proceed concurrently. The graph inside Terraform only optimizes work after you've acquired the global lock.
CI queues, PR serialization, Terragrunt wrappers. Ceremony around a file lock.
Which is like optimizing the fuel efficiency of a car that's stuck in traffic.
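A toy simulation shows why the lock boundary matters more than the thread count. This is a sketch under simplifying assumptions: runs arrive in order, each run starts as soon as every lock it needs is free, and the durations and resource names are made up for illustration. The `makespan` function and its `lock_key` parameter are hypothetical.

```python
def makespan(runs, lock_key):
    """Return the time at which the last run finishes.

    runs is a list of (duration_minutes, resources) in arrival order.
    lock_key maps a resource to the thing that actually gets locked.
    """
    free_at = {}  # lock key -> time at which it becomes free
    finish = 0
    for duration, resources in runs:
        keys = {lock_key(r) for r in resources}
        start = max((free_at.get(k, 0) for k in keys), default=0)
        end = start + duration
        for k in keys:
            free_at[k] = end
        finish = max(finish, end)
    return finish


# Illustrative inputs only: three runs, two of which touch the same database.
runs = [
    (10, {"aws_db_instance.main"}),
    (5, {"aws_cloudfront_distribution.cdn"}),
    (8, {"aws_db_instance.main"}),
]

# One state, one writer: every resource maps to the same lock.
state_level = makespan(runs, lambda r: "terraform.tfstate")

# Resource-level locking: only runs that share a resource wait on each other.
resource_level = makespan(runs, lambda r: r)

print(state_level, resource_level)  # 23 18
```

Under the single state lock, the CDN change waits behind a database change it has nothing to do with. No value of -parallelism changes that, because the waiting happens before the graph walk starts.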
Terraform walks a DAG. Stategraph operates a DAG.
This is the core difference.
Terraform builds a graph to execute a run. Stategraph treats the graph as the system.
And before someone says "Terraform Cloud already coordinates runs," let's be clear. It coordinates runs around Terraform. Stategraph coordinates runs through the graph. Those are not the same thing.
Treating the graph as the system means:
- The graph is persisted
- The graph is indexed
- The graph is queryable
- The graph coordinates execution
- The graph enforces resource-level locking
- The graph spans states
Terraform's graph is ephemeral. Stategraph's graph is infrastructure. That's not a performance tweak. That's a different execution model.
Design Principle
When the graph is persistent, it stops being an internal structure and becomes a system of record. Once it's a system of record, it becomes a control plane. And once it's a control plane, you can coordinate at resource granularity instead of file granularity.
In-run parallelism versus subgraph execution
Within a single apply, if two nodes don't depend on each other, they can run concurrently. Great. But that still happens inside one process, under one state lock, in one isolated execution context, with no awareness of other runs.
Stategraph takes a different view. Instead of asking "How do we parallelize within one apply?" we ask a different question.
What is the minimal impacted subgraph of this change?
If a change touches 5 resources in a graph of 10,000, why should we reason about 10,000? If two changes touch disjoint subgraphs, why should they block each other? If a database and a CDN are independent in the dependency graph, why are they serialized by a state file boundary?
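One way to sketch the "minimal impacted subgraph" question is as a reverse-dependency traversal. This assumes that "impacted" means the changed resources plus everything that transitively depends on them, which is my reading for illustration and not a statement of how Stategraph defines it. The function name and resource names are hypothetical.

```python
from collections import deque


def impacted_subgraph(deps, changed):
    """Return the changed nodes plus every node that transitively depends on them.

    deps maps each node to the set of nodes it depends on.
    """
    # Invert the edges so we can walk from a node to its dependents.
    dependents = {}
    for node, ds in deps.items():
        for dep in ds:
            dependents.setdefault(dep, set()).add(node)

    impacted = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, ()):
            if child not in impacted:
                impacted.add(child)
                queue.append(child)
    return impacted


deps = {
    "aws_vpc.main": set(),
    "aws_subnet.private": {"aws_vpc.main"},
    "aws_db_instance.main": {"aws_subnet.private"},
    "aws_cloudfront_distribution.cdn": set(),
}

db_change = impacted_subgraph(deps, {"aws_subnet.private"})
cdn_change = impacted_subgraph(deps, {"aws_cloudfront_distribution.cdn"})
print(db_change.isdisjoint(cdn_change))  # True
```

The traversal only visits nodes reachable from the change, so its cost scales with the size of the impacted subgraph and not with the 10,000 resources around it.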
Subgraph execution means:
- Identify only what's affected
- Lock only what's necessary
- Allow disjoint work to proceed
- Coordinate at the resource level, not the file level
That's not "more parallelism." That's removing the global mutex.
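Here is a rough sketch of what "lock only what's necessary" could look like as an in-memory lock table. It is an illustration of resource-granularity locking in general, not Stategraph's implementation. The class name `ResourceLocks`, its methods, and the run IDs are all hypothetical.

```python
import threading


class ResourceLocks:
    """All-or-nothing locks keyed by resource address instead of by state file."""

    def __init__(self):
        self._mutex = threading.Lock()  # protects the table, not the resources
        self._owners = {}  # resource address -> run id holding it

    def try_acquire(self, run_id, resources):
        """Lock every resource for run_id, or lock none and report the conflicts."""
        with self._mutex:
            conflicts = {
                r for r in resources
                if self._owners.get(r, run_id) != run_id
            }
            if conflicts:
                return False, conflicts
            for r in resources:
                self._owners[r] = run_id
            return True, set()

    def release(self, run_id):
        """Drop every lock held by run_id."""
        with self._mutex:
            held = [r for r, owner in self._owners.items() if owner == run_id]
            for r in held:
                del self._owners[r]


locks = ResourceLocks()
db_ok, _ = locks.try_acquire("run-a", {"aws_db_instance.main"})
cdn_ok, _ = locks.try_acquire("run-b", {"aws_cloudfront_distribution.cdn"})
clash_ok, clash = locks.try_acquire("run-c", {"aws_db_instance.main"})
print(db_ok, cdn_ok, clash_ok, clash)  # True True False {'aws_db_instance.main'}
```

Acquiring the whole set atomically means a run never holds half of what it needs while waiting for the rest, which avoids deadlock between overlapping runs. Disjoint runs never see each other at all.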
If your coordination primitive is a file lock, your scaling story is a queue.
Persistence changes everything
Here's the part most people miss. Terraform rebuilds the graph every run. Stategraph persists it.
Once the graph is persistent, you unlock:
- Incremental recomputation
- Fast impact analysis
- Resource-level history
- Execution metadata attached to nodes
- Queryable blast radius
- Attachable cost and security data
- True change intelligence
Once the graph is persistent, it stops being a data structure and starts being infrastructure.
That's when coordination moves from "hope the pipeline runs in order" to "the system enforces invariants."
Implementation Detail
Stategraph stores the graph in PostgreSQL as a normalized schema: resources table, dependencies table, transactions log. This enables SQL queries over your infrastructure, ACID transactions across states, and resource-level concurrency control for safe parallel operations.
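To give a feel for what "SQL queries over your infrastructure" can mean, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table and column names are my guesses for illustration, not Stategraph's actual schema, and the transactions log is left out. The `WITH RECURSIVE` syntax used for the query is also supported by PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resources (
    id      INTEGER PRIMARY KEY,
    state   TEXT NOT NULL,   -- which state the resource belongs to
    address TEXT NOT NULL,
    UNIQUE (state, address)
);
CREATE TABLE dependencies (
    resource_id   INTEGER NOT NULL REFERENCES resources(id),
    depends_on_id INTEGER NOT NULL REFERENCES resources(id),
    PRIMARY KEY (resource_id, depends_on_id)
);
""")

# Hypothetical sample data: edges are allowed to cross state boundaries.
conn.executemany(
    "INSERT INTO resources (id, state, address) VALUES (?, ?, ?)",
    [
        (1, "network", "aws_vpc.main"),
        (2, "network", "aws_subnet.private"),
        (3, "data", "aws_db_instance.main"),
        (4, "edge", "aws_cloudfront_distribution.cdn"),
    ],
)
conn.executemany("INSERT INTO dependencies VALUES (?, ?)", [(2, 1), (3, 2)])

BLAST_RADIUS = """
WITH RECURSIVE impacted(id) AS (
    SELECT id FROM resources WHERE state = ? AND address = ?
    UNION
    SELECT d.resource_id
    FROM dependencies d
    JOIN impacted i ON d.depends_on_id = i.id
)
SELECT r.state, r.address
FROM resources r
JOIN impacted i ON r.id = i.id
ORDER BY r.id
"""


def blast_radius(state, address):
    """Return (state, address) rows for a resource and everything downstream of it."""
    return conn.execute(BLAST_RADIUS, (state, address)).fetchall()


for row in blast_radius("network", "aws_vpc.main"):
    print(row)
```

In this toy data, changing the VPC reaches into the `data` state through the subnet, while the CDN in the `edge` state is untouched. That cross-state answer comes from one query over stored edges, with no plan or refresh required to compute it.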
Why Terraform works this way
This isn't an accident. Terraform's architecture optimizes for portability and simplicity.
- State is a file.
- The CLI is stateless.
- The execution model is local.
That makes Terraform easy to reason about and easy to distribute.
It also means Terraform can't accumulate institutional memory. Every run starts fresh.
And it hard-codes a coordination boundary at the state file.
Stategraph makes a different tradeoff. We accept a persistent control plane because coordination, not portability, is the scaling constraint in 2026.
A cleaner mental model
Here's the simplest way to think about it:
Terraform:
- Builds a graph
- Walks the graph
- Throws the graph away
Stategraph:
- Builds the graph
- Stores the graph
- Coordinates through the graph
- Executes through the graph
- Queries the graph
- Evolves the graph
Terraform's graph is an execution detail. Stategraph's graph is the system.
Why this matters now
At small scale, none of this matters. At enterprise scale, it's everything.
When you have hundreds of engineers, dozens of states, monorepos, regulated environments, slow CI queues, and constant lock contention, the bottleneck isn't CPU. It's coordination.
And Terraform's architecture centralizes coordination at the state file boundary. Stategraph decentralizes coordination to the resource boundary.
That's the shift.
Terraform walks a graph.
Stategraph makes the graph the control plane.
Terraform optimized single-run execution. Stategraph optimizes organizational coordination.
At scale, the bottleneck isn't CPU. It's who's allowed to touch what, when.
Stategraph moves that boundary from the state file to the resource.
If your infrastructure execution model is a queue, your engineering org eventually becomes one too.
Stop coordinating. Start shipping
Graph-based state. Resource-level locking. Multi-state transactions.
The graph becomes infrastructure, not an execution detail.