
Engineering Log: Refactor mode for Terraform


Refactoring Terraform is the operation engineers fear most, because a slip in a moved block can destroy a database instead of renaming it. Refactor mode flips the contract: you change the code, and Stategraph figures out the state moves. Pair it with an LLM that has plan access but no apply access, and refactoring a messy repo turns into something you can run while you sleep.

$ cat engineering-log-refactor-mode.tldr
• Refactor mode: change Terraform code, Stategraph generates the state moves
• Iterate small: refactor step, plan, validate, repeat - target a noop plan
• stategraph refactor complete emits the moved blocks for a PR
• Pairs with any LLM via a Stategraph refactor skill - safe because plan and apply are separately gated

Another demo day

The last demo day was the velocity launch. We've been heads down since then on a release shaped by feedback from early adopters, and this demo was a chance to show one of the things that came out of it. The three goals haven't changed: give you insight into your infrastructure, make plan and apply fast, and let you treat all of your infrastructure as one unit even when it's broken across many states. Insights are common in this space. The other two are not - and the reason nobody else is doing them is that the details are genuinely hard.

This demo was short. Refactor mode is the kind of feature that sounds boring on a slide and clicks the moment you watch a messy repo turn into a clean one without anyone touching a state file by hand.

Here's the recording.

Why refactoring Terraform is scary

If you read the Terraform subreddit you see the same post every week. Someone inherited a repo, or they wrote it two years ago and it has outgrown its shape, and now it's a bottleneck. They want to break it into modules, split environments, rename a few things. They don't, because refactoring Terraform is two changes that have to stay in sync: you edit the code, and you write moved blocks to tell the state file where each resource went. Get it wrong and Terraform doesn't see a rename - it sees a delete and a create. Lose track of a resource and that's bad. Destroy a database and that's catastrophic.
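To make the failure mode concrete, here is a minimal sketch of a rename in plain Terraform. The resource type and names are hypothetical and not taken from the demo repo. The moved block is the only thing telling Terraform that the old and new addresses refer to the same object.

```hcl
# Hypothetical example: the resource type and names are illustrative.
#
# Code change: the block formerly declared as
#   resource "aws_db_instance" "main" { ... }
# is renamed to
#   resource "aws_db_instance" "primary" { ... }
#
# State change: this block records that the object tracked at the old
# address now lives at the new one. Leave it out, or point it at the wrong
# address, and the plan shows a destroy of aws_db_instance.main plus a
# create of aws_db_instance.primary.
moved {
  from = aws_db_instance.main
  to   = aws_db_instance.primary
}
```

Every renamed or relocated resource needs its own block like this, which is why the two changes drift apart so easily in a large refactor.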

On top of the mechanical risk, there's the layout question: refactor it how? You pick a structure, spend a week on it, and six months later you've grown again and the whole thing needs another pass. So the repo just sits there.

Refactor mode

Refactor mode removes the half of the problem that causes the actual outages. You change the code; Stategraph generates the moves. The workflow is iterative and it rewards small steps:

  1. Capture the starting state of the repository.
  2. Make a small refactor in code - move a resource into a module, split a file, rename something.
  3. Run stategraph refactor step, or just run a plan. Stategraph compares the original layout to the new one, builds a mapping, and generates the moves.
  4. Validate that the plan is a noop. If it is, the mapping is right.
  5. Repeat until you've reshaped the whole repo.

At the end you have two options: apply directly and let Stategraph move everything in place, or run stategraph refactor complete to get the canonical Terraform moved blocks emitted as text, ready for a pull request that other engineers can review.
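The emitted text isn't reproduced in this post. Since it is described as canonical Terraform, a reviewer would presumably see standard moved blocks of roughly the following shape for resources relocated into modules. The addresses are hypothetical, and this is not captured Stategraph output.

```hcl
# Hypothetical addresses for illustration only; not actual Stategraph output.
# A resource that moved from the flat root into an IAM module:
moved {
  from = aws_iam_role.app
  to   = module.iam.aws_iam_role.app
}

# A resource that moved from the flat root into an S3 module:
moved {
  from = aws_s3_bucket.logs
  to   = module.s3.aws_s3_bucket.logs
}
```

Because these are ordinary Terraform blocks, they can be reviewed line by line in the pull request like any other code change.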

The noop is the test

If your refactor is purely a layout change, the plan should be a noop after Stategraph generates the moves. That's the feedback loop. A plan that isn't a noop means either Stategraph couldn't unambiguously map something, or your "refactor" was actually a real change in disguise. Either way, you find out one small step at a time, not after merging a thousand-line PR.

Letting an LLM drive the refactor

The demo paired refactor mode with Claude using a Stategraph refactor skill we've published at github.com/stategraph/skills. The starting repo was a deliberately bad one - flat, no modules, multiple environments mixed together, an IAM file that was a junk drawer, a "temporary fix" that had been there for a year. The skill teaches the LLM how to drive the refactor command, iterate on small changes, and validate each step against what Stategraph reports back.

The output: prod and staging split into environment roots, shared modules for IAM, networking, and S3, a clean entry point in main.tf that wires the modules together, and a log of every move Stategraph generated to get there. The skill works with any LLM - Claude, Gemini, Codex - because the safety property doesn't come from the model.
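For readers who haven't seen this layout, an environment root that wires shared modules together might look roughly like the sketch below. The directory paths, module names, and the input and output names are assumptions for illustration; the demo repo's actual files are not reproduced here.

```hcl
# Hypothetical prod environment root (main.tf). Paths, variable names, and
# output names are assumptions, not the demo's actual code.
module "networking" {
  source      = "../../modules/networking"
  environment = "prod"
}

module "iam" {
  source      = "../../modules/iam"
  environment = "prod"
}

module "s3" {
  source      = "../../modules/s3"
  environment = "prod"

  # Wiring: an output of one module feeds an input of another.
  writer_role_arn = module.iam.app_role_arn
}
```

A staging root would reference the same modules with different inputs, which is what keeps the two environments from drifting into separate copies of the same code.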

Plan and apply are separately gated

Stategraph controls whether a session can plan and whether it can apply, independently. Give the LLM plan credentials but withhold apply, and the worst case is a wrong refactor you throw away. There's no path from a confused model to a destroyed database. That's what makes "let it run overnight" a reasonable strategy instead of a confession.

The example in the demo took about thirty minutes. Plans were fast because the demo resources aren't real cloud resources, just the shape of a repo. For a real repository with real plans, the recommendation is to point the model at it before you log off and review what it produced in the morning. Token usage stayed normal - the skill gives the model enough direction that it doesn't burn cycles exploring the wrong shape.

What's next

A new release lands this week or early next, with most of the changes coming directly from feedback we've gotten from people running Stategraph against their own state. The 30-day trial doesn't cap how many resources you can import, and getting in and out is one command each way: stategraph import to bring state in, stategraph export to take it back out. If you have a repo you've been afraid to refactor, that's the loop we want you to try.

Come find us on Discord if you do. The questions we get from people actually using the tool are what shape the next demo.

Follow along as we build Stategraph

We're building Stategraph in the open - shipping demos, sharing the engineering work behind them, and shaping the roadmap from feedback. Subscribe to follow along, or grab the trial and try refactor mode on a repo you've been avoiding.