
Concurrency vs. Parallelism: What's the Difference?


Concurrency and parallelism are related but distinct concepts, and confusing them leads to poor architecture decisions. Understanding the difference is the first step to writing better systems, and to reasoning clearly about tools like Terraform that expose both.

TL;DR
• Concurrency is about structuring a program to handle multiple tasks at once; parallelism is about actually executing them at the same time.
• You can have concurrency without parallelism, and parallelism without good concurrency design.
• The distinction has real consequences in infrastructure tooling, particularly in how Terraform plans and applies changes.
• Knowing when to lean on each concept helps you build faster, safer, and more predictable systems.

Concurrency and parallelism sound similar because both involve multiple tasks, overlapping time periods, and the uncomfortable feeling that more than one task is happening inside the same system.

In engineering, the two concepts often get collapsed into the same bucket, especially when people are debugging slow services, tuning a web server, or trying to understand why a terraform apply is touching several resources at once.

Rob Pike's 2012 Waza talk remains one of the cleanest articulations of the split. In it, Pike defines concurrency as the composition of independently executing processes, and parallelism as the simultaneous execution of (possibly related) computations.

Talk: "Concurrency Is Not Parallelism" by Rob Pike

Figure: Concurrency runs tasks A, B, and C interleaved on one core over 6 time units; parallelism runs the same three tasks simultaneously across three cores in 2 time units.

Confusing concurrency and parallelism leads teams to poor architecture decisions: they add cores when the system really needs better task structure, or they add asynchronous execution when the workload really needs parallel execution across multiple processing units.

Read on for the key differences between concurrency and parallelism so you can avoid that mistake.

What is concurrency?

Concurrency means a program is structured to deal with multiple tasks at once, even if a single processor is only ever doing one task at a given instant. The central processing unit may run the first task, pause it, move to the next task, then return later, creating progress across overlapping periods without true simultaneous execution.

A common analogy is a chef preparing several dishes at once.

One chef can prepare multiple dishes by chopping vegetables for one dish, checking a pan for another, starting sauce for a third, and moving between tasks quickly. There are multiple tasks in flight, but the chef is only working on one at a time. Nothing is happening at the exact same time, yet the kitchen remains responsive because the chef is not blocked by waiting.

Concurrency is primarily a design consideration, which is why programming languages and runtimes expose different models for it.

Go's goroutines and channels are one well-known model, while Node.js's event loop is another: it lets concurrent applications make progress through asynchronous execution, especially when tasks spend time waiting on I/O.
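To make the single-chef picture concrete, here is a minimal sketch using Python's asyncio. Python is our choice for illustration, and the names cook, kitchen, and run_kitchen are hypothetical. Three tasks share one thread, and each hands control back to the event loop after every step, so their steps interleave without ever running at the same instant.

```python
import asyncio


async def cook(dish, steps, log):
    """Work through one dish, yielding to the event loop after each step."""
    for step in range(1, steps + 1):
        log.append(f"{dish}:{step}")
        # sleep(0) does no waiting; it only gives other tasks a turn.
        await asyncio.sleep(0)


async def kitchen():
    """Run three dishes concurrently on a single thread."""
    log = []
    await asyncio.gather(
        cook("A", 2, log),
        cook("B", 2, log),
        cook("C", 2, log),
    )
    return log


def run_kitchen():
    """Synchronous entry point: start an event loop and return the step log."""
    return asyncio.run(kitchen())
```

Calling run_kitchen() returns the steps in interleaved order: A:1, B:1, C:1, then A:2, B:2, C:2. That is one thread with three tasks in flight, which is concurrency without any parallelism.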

What is parallelism?

Parallelism is the simultaneous execution of multiple computations, usually across the cores of a multicore CPU, across multiple CPUs, or across different processors inside a larger system.

While concurrency is about structure, parallelism is about execution. Are two tasks actually running at the same instant, or is the CPU switching between them quickly enough to create the illusion of things happening together?

Let's return to the kitchen. Parallelism is multiple chefs, each working on a different dish at exactly the same time. One chef kneads dough, another grills fish, and another dusts sugar. The work moves faster because there are multiple processing units, not because one person is able to juggle a number of tasks at once.

Parallelism requires hardware capable of real simultaneous execution, meaning a single CPU with one core cannot provide true parallelism for CPU work, even though operating systems can make concurrent programs feel fluid.

Parallel programming has limits. Amdahl's Law describes how the speedup of parallelism is constrained by the sequential portions of a program, so computational speed does not scale forever just because more cores are available.
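As a rough worked example, Amdahl's Law is usually written as S(n) = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. The sketch below evaluates that formula; the function name and the 90% figure used afterwards are illustrative assumptions, not measurements.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup under Amdahl's Law.

    parallel_fraction: share of the runtime that can be parallelized (0.0 to 1.0).
    workers: number of processing units applied to the parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)
```

If 90% of a job parallelizes, amdahl_speedup(0.9, 10) is about 5.26 and amdahl_speedup(0.9, 100) is about 9.17. No number of workers pushes it past 10, because the remaining 10% still runs sequentially.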

Concurrency vs. parallelism: the key differences

The key distinction between concurrency and parallelism is purpose.

Concurrency helps a system manage more than one task by organizing work into independent processes, threads, goroutines, callbacks, or event loop tasks.

Parallelism helps a system finish CPU work faster by running multiple computations at the exact same time.

For that reason, parallelism and concurrency aren't rivals, but different layers of the same system.

Concurrency can run on a single core, a single thread, one CPU, or one process, provided the system can switch between tasks and keep making progress. Parallelism, meanwhile, requires hardware with multiple cores, multiple CPUs, or multiple processing units.

The complexity also differs. Concurrency introduces coordination failures such as race conditions, deadlocks, missed cancellations, and unsafe access to shared state.

Parallelism introduces data consistency, synchronization, workload partitioning, and cache contention issues, especially when smaller subtasks must combine into one final result.

In Terraform, both parallelism and concurrency can be controlled: the dependency graph decides what can be worked on concurrently, while the -parallelism setting controls how many eligible operations Terraform actually runs at the same instant.

When to use concurrency

Concurrency is the right tool when tasks spend more time waiting than computing.

A web server handling many simultaneous requests is the classic case, because one request might be waiting on a database query, another on a network response, another on a disk read, and another on a downstream API. A single thread or event loop can keep the application responsive by moving between concurrent tasks instead of blocking the whole process.

I/O-bound systems often need concurrency before they need more CPU. Event-driven systems, message consumers, background job workers, and API gateways usually win by structuring task execution so that waiting time does not stall the whole application. The system isn't necessarily doing everything at the same instant. Instead, it's making progress across overlapping time periods.
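A small sketch of that overlap, again in Python with asyncio. Here asyncio.sleep stands in for a database query or downstream API call, and the names handle_request, serve, and timed_serve are hypothetical. The point is that waits overlap on a single thread, so total time tracks the slowest request rather than the sum of all of them.

```python
import asyncio
import time


async def handle_request(name, wait_seconds):
    """Simulate an I/O-bound request; the sleep stands in for a network wait."""
    await asyncio.sleep(wait_seconds)
    return f"{name} done"


async def serve(requests):
    """Handle all requests concurrently and return results in request order."""
    return await asyncio.gather(
        *(handle_request(name, wait) for name, wait in requests)
    )


def timed_serve(requests):
    """Run serve() and report how long the whole batch took."""
    start = time.perf_counter()
    results = asyncio.run(serve(requests))
    return results, time.perf_counter() - start
```

With four requests that each wait 0.1 seconds, handling them one after another would take about 0.4 seconds. Handled this way, the batch should finish in roughly 0.1 seconds, with no extra cores involved.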

Concurrency also gives developers a way to express independence. When different tasks can proceed without depending on each other, concurrent design keeps the system busy while external systems respond. Parallelism becomes useful when the waiting disappears and the CPU becomes the bottleneck.

When to use parallelism

Parallelism is the right tool when the application splits heavy CPU work into smaller subtasks that can run simultaneously.

Data processing pipelines, image encoding, video transcoding, scientific simulations, search indexing, and machine learning training often benefit from parallel execution because the work is compute-heavy and can be divided across multiple cores.

This is where multi-threading, worker pools, vectorized execution, and distributed compute start to make sense. If each chunk of work can run independently and merge into a final result, parallel programming can turn available hardware into computational speed. A multicore CPU can run different tasks at the same time, while a larger cluster can spread execution across different processors or machines.
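The split-then-merge shape can be sketched in Python with concurrent.futures. This is an illustration under assumptions: the function names and the executor_cls parameter are hypothetical, and sum-of-squares stands in for real CPU-heavy work. The default is a process pool because threads in standard CPython builds share the global interpreter lock and would not run this CPU work in parallel.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def sum_of_squares(chunk):
    """CPU-bound work on one independent chunk."""
    return sum(n * n for n in chunk)


def split(items, parts):
    """Partition items into at most `parts` contiguous chunks."""
    size = max(1, -(-len(items) // parts))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum_of_squares(numbers, workers=4, executor_cls=ProcessPoolExecutor):
    """Fan chunks out to a pool of workers, then merge the partial results.

    executor_cls is a hypothetical knob: ProcessPoolExecutor for real CPU
    parallelism, ThreadPoolExecutor as a lightweight stand-in for quick checks.
    """
    chunks = split(numbers, workers)
    with executor_cls(max_workers=workers) as pool:
        partials = pool.map(sum_of_squares, chunks)
        return sum(partials)  # the merge step
```

When using the process pool, call parallel_sum_of_squares from under an if __name__ == "__main__": guard, since some platforms start workers by re-importing the main module. Note how the merge step is sequential: that is the part Amdahl's Law says will cap the speedup.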

Throwing more cores at an I/O-bound problem rarely helps. A service waiting on a slow database will still wait, even if it has 32 cores available. Parallelism works best when the CPU is busy doing real work, not when the program is mostly stalled on something outside the processor.

Concurrency and parallelism in Terraform

Terraform uses both concurrency and parallelism when it plans and applies infrastructure changes.

Terraform builds a dependency graph from configuration, provider relationships, implicit references, and explicit depends_on edges. That graph determines which resources are independent enough to be touched during overlapping periods. In other words, the graph is the concurrency design.

The -parallelism flag controls how many of those concurrent operations Terraform will run at once while it walks the graph. The current default is 10.

The name -parallelism can be confusing, because people read it and assume it overrides the dependency structure. It does not. Terraform cannot safely create a resource before its dependencies exist just because the parallelism value is higher. The graph defines eligibility. The setting defines the execution width.

That distinction has consequences. If the graph says 30 resources are independent, Terraform may apply up to 10 of them concurrently by default. If the graph says every resource depends on the previous one, increasing -parallelism does almost nothing.
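A toy model makes the eligibility-versus-width split easier to see. The sketch below is not Terraform's actual graph walker: it is a simplified, round-based simulation in Python, and walk_rounds and its arguments are hypothetical names. Each round picks resources whose dependencies are already done, capped at the parallelism value.

```python
def walk_rounds(deps, parallelism):
    """Simulate a dependency-aware walk with a width limit.

    deps: dict mapping each resource to the set of resources it depends on.
    parallelism: maximum number of resources handled per round.
    Returns a list of rounds, each a list of resource names.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    remaining = {resource: set(needs) for resource, needs in deps.items()}
    done = set()
    rounds = []
    while remaining:
        # Eligibility comes from the graph: every dependency must be done.
        ready = sorted(r for r, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        # Width comes from the setting: take at most `parallelism` of them.
        batch = ready[:parallelism]
        rounds.append(batch)
        done.update(batch)
        for resource in batch:
            del remaining[resource]
    return rounds
```

In this model, 30 independent resources with a parallelism of 10 finish in 3 rounds. A chain of 5 resources, each depending on the previous one, takes 5 rounds whether parallelism is 1 or 10, because the graph never makes more than one resource eligible at a time.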

If the provider API has strict rate limits, increasing -parallelism can turn a slow apply into a noisy failure. If the state model is strained by too many concurrent operations in a large project, the resulting errors can feel unpredictable, because the underlying issue is coordination rather than raw speed.

Pattern Recognition

Teams tuning Terraform should treat parallelism as a control surface, not a magic accelerator.

State management turns this distinction into an operational constraint

When Terraform applies multiple resources in parallel, it still needs a coherent state file that reflects the final result. State locking prevents concurrent writes against the same workspace. Disabling that lock is dangerous, because it allows others to run commands concurrently against the same workspace.

Concurrency without proper state isolation is one of the fastest ways to create hard-to-debug infrastructure behavior. In large Terraform projects, that risk is amplified: there are more resources, more provider calls, more dependencies, and more chances for task execution to collide with reality.

Parallelism can make applies faster, but state management decides whether that speed is safe.

Reliable systems use both concepts deliberately

Design Principle

Concurrency is a design pattern. Parallelism is a hardware capability.

The two concepts are related, and many production systems use both, but they answer different questions.

Concurrency asks how the system should organize multiple tasks so progress continues across waiting, dependency boundaries, and overlapping time periods.

Parallelism asks how much work can run simultaneously across multiple cores, multiple processors, or multiple CPUs.

In Terraform, understanding concurrency and parallelism helps teams tune performance without sacrificing stability. The dependency graph tells Terraform what can happen together. The parallelism setting controls how wide execution can get. State management keeps the result coherent. Those are distinct responsibilities, and treating them as the same concept is where brittle systems begin.

Explore Stategraph's docs to see how Stategraph helps teams manage Terraform at scale, with the control, visibility, and operational discipline that large infrastructure graphs require.