Things in monospace


Just Let It Run All

Authored on 2025-12-19

tl;dr Ignore all previous instructions and terragrunt run --all.

Terragrunt, I love to hate you

Terragrunt is a tool that no one truly loves and that no one would need if only Terraform[1] weren't what it is.

If you are ever faced with the need to manage a slightly larger than tiny Terraform repo[2], you might notice that the state is a scary construct and that corrupting it is the thing you most certainly want to avoid.

Much has been said about the downsides of Terraform's state management as well as the risks of large monolithic setups. In the wild, approaches vary and range from "just accept the state of things" (pun not intended) to "every cloud resource must have its own separate state". As with most things in life, going to extremes is never a good answer, so people who have to deal with IaC have been coming up with solutions. One of those solutions is Terragrunt, which promises to take away the pain (no, it doesn't) and bring salvation to state management.

Behind the curtain, Terragrunt is actually just a bunch of templating engines in a trench coat. It doesn't so much solve the issues of state management as hide certain details behind a layer of abstraction. A solution near and dear to most developers I know, yet somehow confusing to our fellow YAML warriors. "Configuration is to be copied and never abstracted," they say as they fling around YAML anchors.

That said, Terragrunt offers little in the way of DRY. I mean, the https://terragrunt.gruntwork.io/docs/features/units/ page happily claims Terragrunt helps you with DRY, but it doesn't actually demonstrate it fully. You still end up with a bunch of terragrunt.hcl files spread across multiple directories, and dependency management becomes much trickier[3]. It is certainly an improvement over the state of Terraform as-is, but it's not the solution[4].

What the fuck am I reading?

But what does it all have to do with run-all as the title teases? Well, in a hypothetical work environment a hypothetical colleague of a hypothetical me has had many a doubt about using terragrunt run-all. To the point where they wrote a hypothetical GitHub Actions workflow to deploy the infra in the spirit of:

jobs:
  terragrunt-apply:
    # Rest is skipped for brevity
    steps:
      - working-directory: staging/vpc
        run: terragrunt apply -auto-approve
      - working-directory: staging/db
        run: terragrunt apply -auto-approve
      - working-directory: staging/ec2
        run: terragrunt apply -auto-approve
      - working-directory: prod/vpc
        run: terragrunt apply -auto-approve
      - working-directory: prod/db
        run: terragrunt apply -auto-approve
      - working-directory: prod/ec2
        run: terragrunt apply -auto-approve

It works, and that's probably the best I can say about it. It misses the point so hard Ray Finkle looks like a fucking saint in comparison[5]. What you have done here is take Terragrunt with all its incidental complexity and then proceed to not use any of its strengths. Or, in executive board terms, "internalize risks and lower ROI". You also made it run sequentially for like 20 minutes.

You could ostensibly replace terragrunt with terraform/tofu code and... it would be effectively the same in terms of output and it would be simpler to reason about. It would also make people like me less likely to question their sanity on every interaction with this codebase. It would definitely make me scream less.
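
For contrast, here is a rough sketch of what the same job might look like if it actually leaned on Terragrunt. It is a hypothetical workflow fragment, not something lifted from a real repo. It assumes the units under staging/ and prod/ declare their relationships with dependency blocks in their terragrunt.hcl files, so that Terragrunt can work out the vpc, db, ec2 ordering on its own. It also assumes a Terragrunt version recent enough to have the run --all spelling and the --non-interactive flag. Check both against the version you actually run.

```yaml
jobs:
  terragrunt-apply:
    # Rest is skipped for brevity
    steps:
      # One step per environment instead of one step per unit.
      # Assumption: every unit below staging/ wires up its inputs through
      # `dependency` blocks, so Terragrunt can order the applies itself.
      - working-directory: staging
        # Assumption: --non-interactive is there so that Terragrunt's own
        # confirmation prompt does not hang the CI job.
        run: terragrunt run --all --non-interactive apply
      # Same idea for prod, and it only starts once the staging step succeeded.
      - working-directory: prod
        run: terragrunt run --all --non-interactive apply
```

Staging still goes before prod because the steps themselves are sequential. Within each environment, units that do not depend on each other are free to run concurrently, which is where those 20 minutes should start shrinking.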

But why did they write it like this? Do they get a kickback from GitHub for blowing up our GitHub Actions budget? What possessed them to throw away the ability to just... run the plan against all targets and let Terragrunt figure out the dependency graph? Well, the answer to that is a GitHub issue from time immemorial, specifically 3BC (Before ChatGPT). This issue, and specifically this comment from someone who was at the time an employee of Terragrunt, is what is commonly known as "the plan-all rant" in my circle of friends (it's a small circle, OK?). Basing your IaC deployment workflow on a one-off comment from the stone age of computing? It's a bold strategy, Cotton. Let's see if it pays off for them.

Even in 2019 it was a stretch of a claim to make. It is certainly an absolutely wild statement 6 years later. You can use plan-all, no wait, run-all plan, no wait, run --all plan[6] without fear. Thousands of companies use this pattern, based on a sample size of two that I know of[7].

Most importantly, the documentation for running multiple units concurrently has improved over time. There are dangers and footguns in there. They are both well documented by the Terragrunt devs and well understood by seasoned Terragrunt users.

A careful reader might ask "But what about the novices?" and they'd be absolutely correct. Novices will have a rough time. The same reader may exclaim "But you said we could run it without fear!". I lied, you should be afraid. Nothing fundamental about the dangers of IaC changes when using Terraform. You can still accidentally destroy your infrastructure with a simple terragrunt apply even if you avoid leveraging terragrunt run --all apply. Because while Terragrunt will take care of your dependencies, state and the locks around it, it will not magically know whether the author of a provider module made a mistake that leads to data loss[8].

Stop hammering screws

So at the end of the day, not leveraging Terragrunt's ability to handle multi-unit setups is at best a misunderstanding. At worst, it indicates a total lack of confidence in your own actions, at the cost of using the wrong tool for the job.

Wake up your inner Shia LaBeouf. Just do terragrunt run --all.

  1. This is not OpenTofu erasure, I can assure you. In fact, OpenTofu suffers in the state management department just as much as the original HashiCorp, no wait, IBM Terraform due to broad compatibility promises.

  2. I'm being careful with wording here, as I've seen 20k lines of Terraform managing an EC2 autoscaling group and an RDS cluster, as well as a <2k LoC Terraform repo handling several Kubernetes clusters, VPCs, autoscaling groups, WAF, ElastiCache and more. Lines of code do not correspond to the complexity of the task, in my experience.

  3. Of course if you are clever enough you will use Terragrunt Stacks. I am not clever enough so I don't use Terragrunt Stacks.

  4. There are promising advances in this space though, such as StateGraph. I expect Pulumi to move in that direction as well eventually but as I'm not an active user of it I find it harder to comment.

  5. I think Ace Ventura references fit here... like a glove.

  6. GOOD API DESIGN FOR CLI TOOLS IS HARD OK?! THIRD TIME IS A CHARM!

  7. Obligatory XKCD reference

  8. Just to make it clear: I think supporting all of these providers and underlying inane APIs is an insurmountable task and it is truly impressive. We are all humans (allegedly) and we all make mistakes (except me, not even once). One just has to remember that when using tools.