Terraform parallelism: a practical guide to faster applies
Terraform's default parallelism of 10 concurrent operations is a sensible starting point, but understanding when and how to change it is what separates teams that fight slow or flaky applies from teams that don't.
Large Terraform estates have a way of making waiting feel normal.
You kick off a terraform apply, watch providers churn through reads, creates, updates, and deletes, and accept that a long stream of Creation complete output is just the price of managing real infrastructure.
In plenty of teams, though, the slowness is not only about the number of resources in the state. It’s also about how many of those operations Terraform is allowed to run at once. Terraform plans and applies are driven by a dependency graph, and Terraform uses provider APIs to carry out the resulting actions, meaning concurrency is built into the tool rather than bolted on later.
That is where Terraform parallelism comes into the equation.
What is Terraform parallelism?
The parallelism setting specifies how many independent operations Terraform can execute concurrently while walking the graph. The default value of 10 is sensible enough that many engineers never touch it, but sensible defaults are not the same thing as optimal defaults.
A single Terraform configuration that manages a small module in a quiet account behaves very differently from a sprawling production deployment that’s brushing up against provider quotas or CI time limits.
Understanding that distinction is what separates teams that treat slow or flaky applies as an unavoidable tax from teams that can actually tune their workflows. Keep the default when it fits, increase parallelism when independent resources dominate your graph, and lower it aggressively when provider throttling or noisy remote execution makes terraform apply unreliable. The setting is small, but the operational difference can be large.
Terraform parallelism is a cap on concurrency
When Terraform runs a plan or apply, it does not simply march through your resources in file order. It builds a dependency graph from the configuration, adds edges for explicit dependency declarations such as depends_on, infers more edges from references and interpolations, and then walks that graph once the dependencies for a node are satisfied.
Graph walking is parallel by design, so resources that don’t depend on each other can be created, updated, refreshed, or destroyed at the same time. In practice, that makes parallelism the cap on concurrent operations while Terraform processes resources, not the total scope of the run.
A higher number does not somehow tell Terraform to do more work; it only lets Terraform start independent work sooner. If your plan contains 500 resources but only 6 of them are actually independent at a given point in the graph, a parallelism setting of 50 will not magically manufacture more concurrency.
Terraform’s standard default value
By default, Terraform processes up to 10 operations in parallel while walking the graph. That same default applies to terraform plan, terraform apply, and terraform destroy.
Both plan and apply expose this limit as a -parallelism=n option that defaults to 10, and the same number governs graph walking during destroy as well.
Ten is a pragmatic middle ground, high enough that independent resources don’t sit idle in most everyday configurations, but low enough that Terraform is less likely to overwhelm the machine running the workload or the provider APIs behind it.
Changing it is, according to HashiCorp, an advanced operation, which is another way of saying the default is meant to be the starting point, not a number you second-guess on every module. Knowing that default is only useful, though, if you can override it cleanly when your environment stops behaving like the median case.
You set Terraform apply parallelism with a flag or an environment variable
The direct way to set terraform apply parallelism is the CLI flag. Both terraform apply and terraform plan support -parallelism=n, and it’s the limit on concurrent operations while Terraform walks the graph. In the example below, the value is raised from Terraform’s default of 10 to 20 concurrent operations.
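A minimal sketch of that invocation; it assumes a working directory that has already been initialized with terraform init and has valid provider credentials:

```shell
# Raise the cap from the default of 10 to 20 concurrent operations
# for this run only. The flag works the same way on plan and apply.
terraform plan -parallelism=20
terraform apply -parallelism=20
```

The flag applies to a single invocation, so it is easy to experiment with per run without changing any shared configuration.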
For automation, the cleaner method is usually an environment variable. TF_CLI_ARGS and TF_CLI_ARGS_name let you inject default arguments into specific commands, with command-line flags still taking precedence.
In the example below, TF_CLI_ARGS_plan and TF_CLI_ARGS_apply tell Terraform to use -parallelism=20, which again raises the concurrency limit from the default of 10 to 20.
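A minimal sketch of that setup in a CI shell:

```shell
# Inject -parallelism=20 into every plan and apply in this shell session
# without touching any scripts. Flags passed explicitly on the command
# line still take precedence over these values.
export TF_CLI_ARGS_plan="-parallelism=20"
export TF_CLI_ARGS_apply="-parallelism=20"

echo "$TF_CLI_ARGS_apply"
```

Because the variables are command-specific, you can set different values for plan and apply if one phase is more API-heavy than the other.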
HCP Terraform uses workspace settings and edition limits
For remote runs, the picture is slightly different, which is where people often mix up workspace concurrency with Terraform parallelism.
Concurrency is the number of runs the platform can execute simultaneously across the organization, and that limit depends on your edition.
Parallelism is the number of tasks Terraform performs simultaneously within a single run.
In workspaces, there’s a TFE_PARALLELISM environment variable that sets the -parallelism=<N> flag for terraform plan and terraform apply, with valid values from 1 through 256 and a default of 10. If you are using HCP Terraform agents, use TF_CLI_ARGS_plan or TF_CLI_ARGS_apply instead.
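As a sketch, these are the variable values involved; in practice TFE_PARALLELISM is set as a workspace environment variable through the HCP Terraform UI or API, not in a shell script:

```shell
# Workspace environment variable for non-agent runs (valid range 1-256,
# default 10). HCP Terraform translates it into -parallelism=<N>.
export TFE_PARALLELISM=50

# On HCP Terraform agents, use the CLI-args variables instead:
export TF_CLI_ARGS_plan="-parallelism=50"
export TF_CLI_ARGS_apply="-parallelism=50"

echo "$TFE_PARALLELISM"
```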
Increasing parallelism helps only when the graph can use it
Increasing parallelism makes sense when apply time is a real bottleneck, and your configuration contains enough independent resources to benefit from more concurrency. Think wide deployments rather than tightly chained ones, lots of instances, security groups, queues, or other resources that can be managed in parallel without waiting on an explicit dependency or an implicit reference. In that kind of environment, the default value can become conservative.
Even then, higher is not automatically better. Graph walking only runs in parallel when dependencies allow it, larger graphs consume more worker RAM, and raising parallelism also increases CPU usage during the run.
Changing TFE_PARALLELISM alone may not significantly decrease large-run duration, because the result depends on the mix of resources and the API calls each one requires. In other words, there are diminishing returns, and eventually, the dependency graph, not the concurrency cap, becomes the limiting factor.
Testing matters more than confidence.
Measure one environment, inspect provider behavior, and increase parallelism in deliberate steps rather than treating 20 or 30 as inherently better than 10. Most teams discover this setting from the other direction, though, when too much concurrency starts breaking otherwise valid applies.
Lowering parallelism is often the fastest fix
Here is a scenario engineers face in real life. A plan looks fine, the configuration is valid, and then terraform apply starts failing mid-run with throttling, connection resets, or provider-side instability that only appears under load.
The problem is rarely that Terraform forgot how to manage infrastructure; it’s usually that the surrounding APIs do not appreciate the burst pattern you just created.
In AWS, EC2 API requests are throttled on a per-account, per-Region basis. In Google Cloud, quotas and rate limits can block requests when they exceed what a project or API allows. In Azure, Azure Resource Manager throttles requests when limits are reached and returns HTTP 429 responses.
In those cases, lowering parallelism is often the fastest path back to a stable workflow. Adjust TFE_PARALLELISM when providers produce errors on concurrent operations or enforce non-standard rate limiting.
Values in the 3 to 5 range have been effective for mitigating intermittent connection resets in HCP Terraform. That lines up with a lot of day-to-day operator experience. When applies start failing in noisy ways, dropping from 10 to 5, or even to 3, is usually faster than trying to out-argue a provider quota. Setting it to 1 is also a useful debugging tool when you need to determine whether a failure is order-dependent.
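Stepping down is just a flag change on the next run (a sketch; it assumes an initialized working directory):

```shell
# Drop the cap well below the default when providers throttle under load.
terraform apply -parallelism=5

# Fully serialize operations to check whether a failure is order-dependent.
terraform apply -parallelism=1
```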
There’s a bigger point here, and it’s not really about one flag. Terraform still uses a state file as the unit of coordination, so unrelated changes can end up serialized by workflow and locking even when their resource subgraphs are disjoint.
Stategraph is built for that exact pain. It persists the dependency graph and moves coordination to the resource level, which lets disjoint subgraphs from different teams proceed concurrently instead of forcing everything through one shared file.
Book a demo if you’re interested in seeing Stategraph in action.
The dependency graph sets the real ceiling
Parallelism only affects operations that are truly independent. If resources depend on each other through depends_on, through ordinary references that create implicit dependencies, or through data lookups that must resolve before downstream resources can be planned and applied, they cannot run in parallel, no matter how high you set the flag.
Terraform creates edges for both explicit dependencies and inferred references, and a node is walked only once all of its dependencies have been walked.
As a result, some configurations barely change behavior when you increase parallelism.
A tightly coupled VPC module, or a single Terraform configuration whose modules depend heavily on shared outputs and data sources, may simply not have enough width in the graph to benefit.
Before you assume more concurrency will fix performance, inspect the graph with terraform graph and look at the structure you have actually built. That command enables you to generate a visual representation of the dependency ordering, which is often more revealing than another round of guesswork.
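One way to do that inspection, assuming Graphviz's dot is installed for rendering:

```shell
# Emit the dependency graph in DOT format and render it for inspection.
# Wide, shallow graphs benefit from more parallelism; long chains do not.
terraform graph | dot -Tsvg > graph.svg
```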
Stategraph fits neatly here, too, because persisting and querying that graph is a lot more operationally useful than treating it as a temporary artifact you only inspect when something is already broken.
More parallelism increases risk as well as speed
Greater API pressure
The first risk is obvious. More concurrent operations mean more API pressure, which means a greater chance of rate limit errors, retries, and partially completed runs when a provider gives up before Terraform does. Some providers, AWS included, handle API rate limiting with lower-level backoff and retry behavior, which is why Terraform does not position the parallelism feature itself as a direct rate-limit solution.
That’s useful nuance. Increasing the flag might improve speed, but it can also just move the bottleneck downstream into provider retries and flaky behavior.
Terraform only respects dependencies it can see
The second risk is subtler and, in some ways, more dangerous. Terraform can only respect dependencies it can see. Some real dependencies are invisible to Terraform unless you declare them explicitly. When that happens, higher parallelism can expose race conditions that were already present but hidden by slower execution.
A lower value is almost always the more conservative choice for production, not because concurrency is bad, but because hidden dependency bugs get more expensive as you speed them up.
Operational blast radius
The third risk is operational blast radius. If a run fails while more resources are changing at once, there is usually more in-flight work to inspect, more provider-side retries to understand, and more recovery context to reconstruct.
That doesn’t mean you should never increase parallelism. It means you should treat it like any other production control, with awareness of trade-offs rather than optimism.
Parallelism is one lever, not the whole solution
The best default advice is boring because it works:
Keep 10 until you have a specific reason to change it
Test changes outside production first
Set the behavior through environment variables in CI or remote workspaces rather than hard-coding one-off tweaks into scripts
HashiCorp documents TF_CLI_ARGS_name specifically for this kind of automation, and their HCP Terraform guidance also points to workspace-level variables for remote runs.
Beyond that, monitor the provider side, not just Terraform’s terminal output. If you are experimenting with higher parallelism in AWS, GCP, or Azure, watch API usage, throttling, and retries where the provider exposes them.
Don’t overlook the structural fix of splitting oversized configurations into smaller, logically scoped workspaces or state files when that reflects the natural shape of the infrastructure. Large graphs consume more memory, so narrowing workspace scope reduces both run time and resource pressure, and restructuring configurations often helps more than changing parallelism alone. In many estates, that is the better long-term answer.
Terraform parallelism is a real performance lever, but it isn’t magic. The right value depends on the width of your dependency graph, the behavior of your providers, and the amount of operational risk you are willing to absorb.
If you want better visibility into how those graphs behave in practice, and better control over how concurrent work is coordinated across teams, start with the Stategraph docs.
Terraform parallelism FAQs
What is the default parallelism in Terraform?
Terraform processes up to 10 operations in parallel by default while walking the dependency graph, which is the default for terraform plan, terraform apply, and terraform destroy.
How do I change the parallelism for Terraform apply?
Use terraform apply -parallelism=<N> or terraform plan -parallelism=<N> on the CLI.
In automation, use TF_CLI_ARGS_apply and TF_CLI_ARGS_plan, which are named environment variables for injecting default flags into specific Terraform commands.
Does increasing parallelism always make Terraform faster?
It only helps when the dependency graph has enough independent resources to run in parallel.
If your resources depend heavily on shared outputs, data sources, or explicit dependency edges, the graph itself becomes the limiting factor and higher values produce little benefit.
Can high parallelism cause Terraform apply to fail?
Yes. Higher concurrency can trigger API rate limits, connection resets, or provider instability. It’s recommended to reduce parallelism when concurrent operations or non-standard rate limiting cause errors. Values between 3 and 5 have helped mitigate certain HCP Terraform run failures.
How does Terraform decide which resources to run in parallel?
Terraform builds a dependency graph from explicit dependencies and inferred references, then walks that graph in parallel only when the dependencies for a node have already been satisfied. Execution order follows the graph structure, not file order.
Is parallelism configurable in Terraform Cloud?
Yes, but there is an important distinction. Organization-level run concurrency is separate from per-run Terraform parallelism. For workspace runs, you can set TFE_PARALLELISM in supported environments, while for agent-based runs, you can use TF_CLI_ARGS_plan or TF_CLI_ARGS_apply.