
Parallel execution: Faster tests and faster Terraform


Parallel execution is a simple idea with an outsized impact. If work can run independently, don't make it wait. However, many engineering workflows still behave like a single-threaded program. Test suites queue behind each other, pipelines serialize steps that could overlap, and infrastructure changes block on a global lock.

TL;DR
$ cat parallel-execution.tldr
• Independent tasks should run in parallel. Dependencies decide what actually cannot.
• Parallel testing cuts pipeline time dramatically, but requires isolated test data and deterministic cleanup.
• Terraform builds a dependency DAG but bottlenecks on a flat state file and global lock.
• Stategraph's subgraph execution and resource-level locking enable parallel infra changes without state fragmentation.

What is parallel execution?

This guide connects the dots across two domains where the pain is obvious: software testing and infrastructure as code (IaC). In testing, parallel test execution lets you run more tests, broaden coverage, and shorten overall testing time, but only if you handle shared test data and interdependent tests. In infrastructure, Terraform and OpenTofu already understand dependency ordering, but the flat state file and coarse locking can turn independent work into sequential waiting.

Parallel execution means multiple operations are executed simultaneously instead of one at a time. That can be parallel threads on one machine, worker processes, or distributed testing across multiple machines. The requirement is the same in every case: tasks must be safe to run concurrently.

A helpful way to think about it is "fan-out / fan-in." You split a workload into multiple test cases, jobs, or resource operations, run them in parallel, then combine outputs into one set of test results or one applied infrastructure state. When the work is actually independent, parallel execution can be the difference between a 30-minute pipeline and a 5-minute pipeline, significantly reducing wait time and making new features easier to ship.
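To make the fan-out / fan-in shape concrete, here's a minimal Java sketch (the class and method names are illustrative, not from any framework): independent computations fan out to a thread pool, and the results fan back in to a single total.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FanOutFanIn {
    // Fan out: submit each independent task to a worker pool.
    // Fan in: wait for every result and combine them into one answer.
    static int sumSquares(List<Integer> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int n : inputs) {
                tasks.add(() -> n * n); // each task depends on nothing else
            }
            int total = 0;
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                total += f.get(); // fan-in: block until this result is ready
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The tasks here are trivially independent, which is exactly the property that makes the fan-out safe; the moment two tasks share mutable state, this shape stops being free.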

Of course, that only helps if the workflow wasn't accidentally forced into serial execution, which is exactly what happens in many systems.

Pattern Recognition

Fan-out / fan-in is the fundamental shape of parallel work: split into independent units, execute concurrently, collect results. Every parallelization problem reduces to identifying what can safely fan out and what must converge before proceeding.

Why sequential execution creates bottlenecks

In a testing process, sequential testing is when one test suite finishes before the next begins, even if they target different operating systems, different environments, or different browser configurations. In CI/CD, testing one stage at a time is often convenient to implement, but it stretches overall test execution time and delays feedback.

Two common root causes:

  • Tooling defaults: serial execution is the easiest thing to implement, so pipelines and runners serialize stages even when nothing forces them to.
  • Shared state: tests, jobs, or infrastructure changes touch the same data, environment, or lock, so concurrency looks unsafe and everything queues.

Once you see the bottleneck as a scheduling problem, parallel execution becomes the obvious lever. The next step is to see how teams actually apply it in software testing.

Parallel execution in software testing

The bottlenecks above are why parallel testing has become standard in modern software testing. In continuous testing pipelines, teams run suites concurrently, often across a matrix of environments, so tests execute and report faster.

Parallel test execution typically works by starting N workers (the right number depends on CPU, CI budget, and resource constraints) and distributing tests across them. You might shard by test file, by test script, or by individual test case, depending on how balanced your suite is. In a larger setup, distributed testing runs the same matrix across multiple machines, covering multiple platforms simultaneously.
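As a sketch of sharding by test file (TestSharder is a hypothetical helper, not a real framework API), here's one way to assign files to workers deterministically, so the same file always lands on the same shard:

```java
import java.util.ArrayList;
import java.util.List;

class TestSharder {
    // Deterministically assign each test file to one of N workers.
    // Hashing the file name keeps the assignment stable across runs,
    // so a failure reproduces on the same shard every time.
    static List<List<String>> shard(List<String> testFiles, int workers) {
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            shards.add(new ArrayList<>());
        }
        for (String file : testFiles) {
            int idx = Math.floorMod(file.hashCode(), workers);
            shards.get(idx).add(file);
        }
        return shards;
    }
}
```

Hash-based assignment is simple but can produce unbalanced shards; timing-based rebalancing fixes that, at the cost of determinism unless you pin the timing data per run.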

You can run tests across multiple browsers, devices, and configurations while still shrinking test execution time, which lets teams increase coverage (more tests, more environments) without paying a linear cost in testing time.

But parallelization also changes failure modes. Interdependent tests that passed in serial order can fail when multiple tests run simultaneously. Shared test data can collide. To make parallel testing work well, you need a dependency strategy.

The challenge of dependencies

Parallel execution in testing teaches a constraint you can't ignore: dependencies decide what can run concurrently.

Dependencies show up as ordering (A must happen before B), shared data (two tests mutate the same record), or shared infrastructure (one database, one rate limit, one expensive third-party call). If you ignore these, parallel tests become flaky and your test coverage gains evaporate.

The most reliable pattern is to design autonomous tests:

  • Each test creates its own data rather than reusing shared fixtures.
  • No test assumes another test has already run.
  • Cleanup is deterministic, so one failed run doesn't poison the next.

Here's a simple example where unique identifiers are generated for each test run, rather than reusing the same data:

import java.util.UUID;

class TestDataFactory {
    // Every call yields a unique email, so parallel tests never collide on a user record.
    static String uniqueUserEmail() {
        return "user+" + UUID.randomUUID() + "@example.com";
    }
}

This is tiny, but the core idea is that you don't let parallel execution turn into shared-state chaos.

If dependencies limit tests, they limit infrastructure operations too. Terraform already models dependencies as a graph, so the question becomes: why do teams still wait?

Design Principle

Dependencies decide what can run concurrently. Ordering constraints, shared data, and shared infrastructure are all forms of dependency. Map them explicitly before parallelizing, or you will discover them the hard way through intermittent failures.

Parallel execution for infrastructure as code

Dependencies matter just as much for users of Terraform and OpenTofu. Terraform builds a dependency DAG to determine ordering and can execute independent resource operations in parallel within that graph. Yet teams still hit long waits, because the state layer is file-based and locking is global.

In practice, Terraform bottlenecks often come from:

  1. A flat state file that forces broad reads and writes
  2. A global lock that serializes all applies
  3. Plan/refresh scopes that are bigger than the change itself

Stategraph Velocity is built to remove those constraints by replacing the flat state file with a database-backed dependency graph. It lets independent changes run in parallel without resorting to state fragmentation and the coordination overhead that comes with it.

To see how that works mechanically, we'll walk through three building blocks: subgraph execution, resource-level locking, and intelligent parallelization.

Subgraph execution

Following from the scope problem above, subgraph execution narrows execution to the minimal change cone.

Instead of refreshing and reasoning about the entire state, the system identifies the affected resources plus the dependencies that must be considered for correctness. Graph-aware execution processes only the changed subgraph, including a partial refresh of just the affected resources.

This matters because most real changes are small compared to the total graph. If you can avoid the "scan everything" tax, you drop minutes of work from every plan and apply. And with the work scope reduced, you can reduce the lock scope too.
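To illustrate the idea of a change cone (a simplified sketch, not Stategraph's actual implementation), picture the dependency graph as a map from each resource to its dependents. The affected set is the changed resources plus everything reachable downstream; resources outside that set can be skipped entirely:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ChangeCone {
    // Walk from the changed resources to everything downstream of them.
    // The result is the minimal set a plan must consider for correctness.
    static Set<String> cone(Map<String, Set<String>> dependents, Set<String> changed) {
        Set<String> affected = new HashSet<>(changed);
        Deque<String> queue = new ArrayDeque<>(changed);
        while (!queue.isEmpty()) {
            String resource = queue.pop();
            for (String d : dependents.getOrDefault(resource, Set.of())) {
                if (affected.add(d)) { // visit each resource only once
                    queue.push(d);
                }
            }
        }
        return affected;
    }
}
```

For example, with edges vpc → subnet → instance plus an unrelated bucket, changing the subnet yields a cone of {subnet, instance}: the bucket and the vpc never enter the plan.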

Resource-level locking

Subgraph execution makes it possible to lock precisely what's changing.

With a file backend, locking is coarse: the whole state is locked, so unrelated work can't proceed. Teams often respond by splitting state into multiple files, but that shifts the problem into coordination and cross-state dependency management.

The alternative, resource-level locking, protects only the specific resources being modified. That's how you eliminate lock contention while still protecting correctness.
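A minimal sketch of the idea (again illustrative, not Stategraph's implementation): keep one lock per resource ID instead of one lock for the whole state, so changes to disjoint resources never contend.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

class ResourceLocks {
    // One lock per resource ID rather than one global lock.
    // Two applies that touch disjoint resources never block each other.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    void withLock(String resourceId, Runnable change) {
        ReentrantLock lock = locks.computeIfAbsent(resourceId, id -> new ReentrantLock());
        lock.lock();
        try {
            change.run();
        } finally {
            lock.unlock(); // always release, even if the change throws
        }
    }
}
```

The key design point is that the lock boundary matches the modification boundary: correctness is protected per resource, and concurrency falls out for free everywhere else.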

Locking prevents conflicts, but scheduling still needs to respect ordering. That's where intelligent parallelization finishes the story.

Lock Granularity Principle

Global locks force serial execution. Resource-level locking protects correctness at the boundary that actually matters. The goal is not to eliminate locking; it is to shrink the lock to match exactly what is being modified.

Intelligent parallelization

If resource-level locking solves contention, intelligent parallelization solves orchestration.

A dependency graph tells you which operations must be sequential and which can run in parallel. The scheduler runs independent branches with simultaneous execution, then enforces ordering at the points where edges demand it.

The result is controlled concurrency: run concurrently where the graph allows, and serialize only the dependent steps.
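One simple way to realize this (a sketch, not the actual scheduler) is Kahn's algorithm grouped into "waves": every node in a wave has all its dependencies satisfied by earlier waves, so each wave can run with full parallelism while the waves themselves stay ordered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class WaveScheduler {
    // deps maps each node to the set of nodes it depends on.
    // Returns ordered "waves"; everything inside one wave may run concurrently.
    static List<List<String>> waves(Map<String, Set<String>> deps) {
        Map<String, Integer> remaining = new HashMap<>();
        deps.forEach((node, ds) -> remaining.put(node, ds.size()));
        List<List<String>> result = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<String> wave = new ArrayList<>();
            for (Map.Entry<String, Integer> e : remaining.entrySet()) {
                if (e.getValue() == 0) wave.add(e.getKey()); // all deps satisfied
            }
            if (wave.isEmpty()) {
                throw new IllegalStateException("cycle in dependency graph");
            }
            wave.sort(null); // deterministic output for reproducible runs
            for (String done : wave) {
                remaining.remove(done);
                deps.forEach((node, ds) -> {
                    if (ds.contains(done) && remaining.containsKey(node)) {
                        remaining.merge(node, -1, Integer::sum); // one dep satisfied
                    }
                });
            }
            result.add(wave);
        }
        return result;
    }
}
```

For a graph like vpc → {subnet, sg} → vm, this yields three waves: the vpc alone, then the subnet and security group together, then the vm; only the edges force ordering.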

This dependency-aware parallel execution runs on the database-backed graph and is driven through ordinary CLI commands at apply time.

Once you have these mechanics, the real question is: What changes for a team's daily workflow?

Real-world performance gains

So far we've covered the mechanics; now let's map them to outcomes: faster feedback loops, fewer queues, and lower coordination costs.

In many Terraform pipelines, slow plan times aren't just provider latency: every plan touches far more than the change requires. If a plan takes 6 minutes and an engineer runs it 8 times a day, that's almost an hour of waiting, before you factor in lock contention across teams.

Subgraph execution flips that ratio. When the change cone is small, the system spends time on the change, not on global bookkeeping. Multiply that across multiple team members and you get a compounding reduction in overall costs: less idle time, fewer retries, and less pressure to avoid small changes.

By replacing the flat state file with a database-backed dependency graph, Stategraph Velocity can isolate the minimal change cone and execute independent changes in parallel.

Speed is only valuable if it's trustworthy, so the next section focuses on implementing parallel execution safely, whether you're running tests or infrastructure.

Implementing parallel execution safely

The transition from sequential to parallel should feel controlled, not risky. The safety playbook is similar across testing and infrastructure:

  1. Isolate state. For testing, isolate test data per worker so you avoid the same data being mutated across parallel tests. For infrastructure, isolate the changed subgraph so unrelated work doesn't contend.
  2. Use the right lock granularity. Global locks force serial execution; resource-level locking preserves correctness while letting teams move.
  3. Respect dependencies. Build explicit dependency edges when needed, such as setup and teardown steps or shared fixtures in tests, and true resource dependencies in infrastructure, then let the graph enforce the correct ordering.
  4. Keep runs reproducible. Sharding strategies should be deterministic, and failures should be debuggable in the same way, whether a suite ran on one machine or in a distributed testing grid.

Parallel execution doesn't mean you run everything at once; it's about running what's independent, and proving what isn't. With that framing, it's also clear there are times you shouldn't parallelize.

Implementation Detail

Parallel execution does not mean running everything at once. It means running what is independent and proving what is not. The dependency graph is both the enabler and the constraint. Build it explicitly and you get parallelism for free where it is safe.

When not to use parallel execution

Even with the safety patterns, parallel execution isn't always the right choice.

Avoid parallelization when:

  • The workload is so small that worker startup and coordination cost more than they save.
  • The work shares an environment or resource that can't be isolated.
  • The dependency graph is so tightly coupled that almost nothing can fan out.

In testing, this might be a tiny project where sequential testing is faster, or an integration test that must run alone because it uses a shared environment. In infrastructure, it might be a change that touches a tightly coupled cluster where the dependency graph leaves little room for fan-out.

The point isn't to parallelize everything; it's to parallelize the right things.

Dig into how Velocity works

We began with the core promise of parallel execution: tasks that can run independently should run in parallel.

For software testing, parallel testing lets you run suites concurrently, shard tests across multiple machines, and validate across multiple environments, operating systems, and browsers, so you can increase coverage without increasing overall execution time. The catch is dependency hygiene: you need to isolate test data, avoid interdependent tests, and keep test results reproducible.

For IaC, Terraform understands dependency ordering, but file-based state and global locks create sequential queues. Stategraph's graph-backed approach of subgraph execution, resource-level locking, and dependency-aware parallel execution targets the bottleneck so disjoint changes can be executed simultaneously without lock waiting.

If you want to dig into how Velocity works and what "parallel operations" means in practice, explore the Stategraph Docs.
