Blast Radius Analysis

Blast radius analysis shows all resources that would be affected if a specific resource changes or is destroyed.

What is Blast Radius?

When you modify a resource in Terraform, dependent resources may also need to be updated or recreated. The "blast radius" is the set of all resources affected by a change.

Change to aws_vpc.main
         │
         ▼
┌────────────────────────────────────────┐
│           Blast Radius                 │
│                                        │
│  aws_subnet.public                     │
│  aws_subnet.private                    │
│  aws_security_group.web                │
│  aws_instance.web                      │
│  aws_db_subnet_group.main              │
│  aws_rds_cluster.database              │
│                                        │
└────────────────────────────────────────┘

Accessing Blast Radius

Via UI

  1. Navigate to a state
  2. Select a resource (click on it in the list or graph)
  3. Click Blast Radius or Impact Analysis
  4. View all affected resources

Via CLI

First, get your tenant and state IDs:

# List your tenants
stategraph user tenants list

Output:

550e8400-e29b-41d4-a716-446655440000    my-org

# List states for a tenant
stategraph states list --tenant 550e8400-e29b-41d4-a716-446655440000

Output:

{
  "results": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "name": "networking",
      "workspace": "production"
    }
  ]
}

Then query blast radius:

# For aws_instance.web
stategraph states instances blast-radius \
  --state a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  "aws_instance.web"

# For module.vpc.aws_subnet.public[0]
stategraph states instances blast-radius \
  --state a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  "module.vpc.aws_subnet.public[0]"

Output:

aws_instance.web    aws_instance.web    0    1
aws_eip.web         aws_eip.web         0    2

Understanding Results

Distance

The distance field indicates how many hops away a resource is in the dependency chain:

Distance  Meaning
1         Directly depends on the target resource
2         Depends on something that depends on the target
3+        Further downstream in the chain
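The hop count can be illustrated with a breadth-first walk over a dependency graph. The following is a minimal sketch, not Stategraph's implementation: the `DEPENDENTS` mapping and the `blast_radius` function are hypothetical, and the edges are invented for illustration using resource names from this page.

```python
from collections import deque

# Hypothetical edges: each key is a resource, each value lists the
# resources that depend on it directly.
DEPENDENTS = {
    "aws_vpc.main": ["aws_subnet.public", "aws_security_group.web"],
    "aws_subnet.public": ["aws_instance.web"],
    "aws_security_group.web": ["aws_instance.web"],
    "aws_instance.web": ["aws_eip.web"],
}


def blast_radius(target, dependents):
    """Return {resource: distance} for everything downstream of target."""
    distances = {}
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        resource, dist = queue.popleft()
        for child in dependents.get(resource, []):
            if child not in seen:
                # First visit in a breadth-first walk is the shortest path.
                seen.add(child)
                distances[child] = dist + 1
                queue.append((child, dist + 1))
    return distances


radius = blast_radius("aws_vpc.main", DEPENDENTS)
for resource, distance in sorted(radius.items(), key=lambda item: item[1]):
    print(distance, resource)
```

Because the walk is breadth-first, a resource reachable by several paths is recorded at its shortest distance from the target.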

Impact Levels

Resources closer in distance are more likely to be affected:

  • Distance 1: Most likely to be affected
  • Distance 2-3: Likely to be affected
  • Distance 4+: May be affected depending on change type
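If you post-process blast radius results, the distance bands above can be turned into a simple triage label. This is a rough sketch: the `impact_level` function, its label names, and the sample distances are all hypothetical and only mirror the three bands listed here.

```python
def impact_level(distance):
    """Map a hop distance to a coarse likelihood band."""
    if distance < 1:
        raise ValueError("distance must be 1 or greater")
    if distance == 1:
        return "high"
    if distance <= 3:
        return "medium"
    return "low"


# Hypothetical distances, for illustration only.
distances = {
    "aws_subnet.public": 1,
    "aws_instance.web": 2,
    "aws_eip.web": 3,
    "aws_route53_record.web": 4,
}

levels = {resource: impact_level(d) for resource, d in distances.items()}
for resource, level in levels.items():
    print(f"{resource}: {level}")
```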

Use Cases

Pre-Change Assessment

Before modifying critical infrastructure:

  1. Select the resource you plan to change
  2. View blast radius
  3. Identify all affected resources
  4. Plan change window based on impact

Risk Assessment

Identify high-risk resources:

  1. Compare blast radius sizes across resources
  2. Resources with a large blast radius are higher risk
  3. Prioritize extra caution for these changes
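The comparison in these steps amounts to ranking resources by how many others sit downstream of them. A minimal sketch, assuming you already have the dependency edges in memory; the `DEPENDENTS` graph and the `downstream` helper are hypothetical and not part of Stategraph.

```python
# Hypothetical edges: resource -> resources that depend on it directly.
DEPENDENTS = {
    "aws_vpc.main": ["aws_subnet.public", "aws_security_group.web"],
    "aws_subnet.public": ["aws_instance.web"],
    "aws_security_group.web": ["aws_instance.web"],
    "aws_instance.web": ["aws_eip.web"],
}


def downstream(target, dependents):
    """Collect every resource reachable from target, excluding target."""
    found = set()
    stack = [target]
    while stack:
        for child in dependents.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    found.discard(target)
    return found


# Every resource that appears as a key or as a dependent.
resources = set(DEPENDENTS) | {c for kids in DEPENDENTS.values() for c in kids}
sizes = {name: len(downstream(name, DEPENDENTS)) for name in resources}

# Largest blast radius first; ties broken alphabetically for stable output.
ranked = sorted(sizes.items(), key=lambda item: (-item[1], item[0]))
for name, size in ranked:
    print(f"{size:3d}  {name}")
```

The resources at the top of the ranking are the ones that warrant the extra caution described above.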

Outage Planning

When planning maintenance:

  1. Check blast radius of each component
  2. Identify whether changes cascade to critical services
  3. Plan communication and rollback procedures
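Checking for cascades into critical services is a set intersection once the blast radius of each component is known. The sketch below assumes the results were collected beforehand (for example, from the CLI); the `BLAST_RADIUS` data, the `CRITICAL` set, and the `critical_impact` function are hypothetical.

```python
# Hypothetical blast radius results per component under maintenance.
BLAST_RADIUS = {
    "aws_security_group.web": ["aws_instance.web", "aws_eip.web"],
    "aws_db_subnet_group.main": ["aws_rds_cluster.database"],
    "aws_iam_role.ci": ["aws_codebuild_project.build"],
}

# Hypothetical list of resources the team treats as critical.
CRITICAL = {"aws_instance.web", "aws_rds_cluster.database"}


def critical_impact(blast_radius, critical):
    """Map each component to the critical resources inside its blast radius."""
    return {
        component: sorted(critical.intersection(affected))
        for component, affected in blast_radius.items()
    }


impact = critical_impact(BLAST_RADIUS, CRITICAL)
for component, hits in impact.items():
    status = ", ".join(hits) if hits else "no critical resources"
    print(f"{component}: {status}")
```

Components with a non-empty result are the ones that need communication and rollback plans.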

High-Risk Patterns

Network Foundation Resources

VPCs, subnets, and security groups often have a large blast radius:

aws_vpc.main
├── aws_subnet.public (20+ resources)
├── aws_subnet.private (30+ resources)
├── aws_security_group.web (15+ resources)
└── aws_internet_gateway.main (10+ resources)

IAM Roles

Roles used by many services:

aws_iam_role.application
├── aws_lambda_function.api (Distance 1)
├── aws_ecs_task_definition.worker (Distance 1)
├── aws_codebuild_project.build (Distance 1)
└── ... (many more)

Data Resources

Databases and storage with dependent services:

aws_rds_cluster.main
├── aws_rds_cluster_instance.primary
├── aws_rds_cluster_instance.replica
├── aws_secretsmanager_secret.db_credentials
└── aws_lambda_function.processor

Minimizing Blast Radius

Design Patterns

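Loose Coupling: Use data sources instead of direct references. This removes the edge from the dependency graph, although the instance still needs the subnet at runtime: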

# Higher coupling (larger blast radius)
resource "aws_instance" "web" {
  subnet_id = aws_subnet.main.id
}

# Lower coupling
data "aws_subnet" "main" {
  tags = { Name = "main" }
}

resource "aws_instance" "web" {
  subnet_id = data.aws_subnet.main.id
}

Module Boundaries: Encapsulate related resources:

module "networking" {
  source = "./modules/networking"
}

module "compute" {
  source    = "./modules/compute"
  subnet_id = module.networking.subnet_id  # Single connection point
}

Smaller States: Split large states into smaller, focused states:

networking/  → VPCs, subnets, security groups
compute/     → EC2 instances, ASGs
data/        → RDS, DynamoDB

Comparing Blast Radius

Between Resources

Compare blast radius to prioritize changes:

Resource                Blast Radius Size  Risk Level
aws_vpc.main            45 resources       High
aws_security_group.web  12 resources       Medium
aws_instance.worker     2 resources        Low

Between Workspaces

The same resource may have a different blast radius in each environment:

Workspace   aws_vpc.main Blast Radius
Production  150 resources
Staging     50 resources
Dev         10 resources

MQL Queries

Find Resources with Large Blast Radius

Blast radius currently requires one CLI call per resource, so finding the largest ones means scripting a loop:

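#!/bin/bash
STATE_ID="a1b2c3d4-e5f6-7890-abcd-ef1234567890"

# Get all instance addresses, then count the blast-radius rows for each
stategraph states instances query --state "$STATE_ID" -i "true" | \
  jq -r '.results[].address' | while IFS= read -r instance; do
    # </dev/null keeps the inner command from consuming the address list on stdin
    count=$(stategraph states instances blast-radius --state "$STATE_ID" "$instance" 2>/dev/null </dev/null | wc -l)
    # Print the count first so addresses containing ":" cannot confuse sort
    printf '%d\t%s\n' "$count" "$instance"
done | sort -rn | head -20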

Find Highly Connected Resources

-- Resources with many dependencies (they depend on many other resources)
SELECT address, array_length(dependencies, 1) as dep_count
FROM instances
WHERE dependencies IS NOT NULL
ORDER BY dep_count DESC
LIMIT 20
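The query above measures the length of each instance's own dependencies list. To count how many instances depend on a given address, the lists have to be inverted. The sketch below does that in memory; it assumes `dependencies` holds the addresses an instance depends on, and the `INSTANCES` rows and the `count_dependents` function are hypothetical.

```python
from collections import Counter

# Hypothetical rows: an address plus the addresses that instance depends on.
INSTANCES = [
    {"address": "aws_vpc.main", "dependencies": None},
    {"address": "aws_subnet.public", "dependencies": ["aws_vpc.main"]},
    {"address": "aws_security_group.web", "dependencies": ["aws_vpc.main"]},
    {
        "address": "aws_instance.web",
        "dependencies": ["aws_subnet.public", "aws_security_group.web"],
    },
]


def count_dependents(instances):
    """Count, for each address, how many instances list it as a dependency."""
    counts = Counter()
    for row in instances:
        # A missing list is treated as empty; duplicates count once per row.
        for dependency in set(row["dependencies"] or []):
            counts[dependency] += 1
    return counts


dependents = count_dependents(INSTANCES)
for address, count in dependents.most_common():
    print(count, address)
```

This counts direct dependents only; the blast radius also includes everything further downstream.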

Best Practices

  1. Check before apply - Always check blast radius for production changes
  2. Prefer small changes - Multiple small changes are safer than one large change
  3. Document high-risk resources - Mark resources with a large blast radius
  4. Test in staging - Verify changes don't cascade unexpectedly
  5. Plan rollback - Know how to recover if blast radius was underestimated

Limitations

  • Blast radius shows structural dependencies, not runtime dependencies
  • External dependencies (DNS, external APIs) are not tracked
  • Changes that don't affect the dependency graph may still cause issues

Next Steps