Terraform requires a DAG. AWS allows cycles. Here's how I map the difference.

7•davidlu1001•5h ago
Error: Cycle: aws_security_group.app -> aws_security_group.db -> aws_security_group.app

If you've ever seen this error while importing AWS infrastructure to Terraform, you know the pain.

Terraform's core engine relies on a Directed Acyclic Graph (DAG). It needs to know: "Create A first, then B."

But AWS is eventually consistent and happily allows cycles.

The Deadlock

The most common culprit is Security Groups. Imagine two microservices:

- SG-App allows outbound traffic to SG-DB
- SG-DB allows inbound traffic from SG-App

If you write this with inline rules (which is what terraform import defaults to), you create a cycle:

  resource "aws_security_group" "app" {
    egress {
      security_groups = [aws_security_group.db.id]
    }
  }

  resource "aws_security_group" "db" {
    ingress {
      security_groups = [aws_security_group.app.id]
    }
  }

Terraform cannot apply this. It can't create app without db's ID, and vice versa.

The Graph Theory View

When building an infrastructure reverse-engineering tool, I realized I couldn't just dump API responses to HCL. We model AWS as a graph: Nodes are Resources, Edges are Dependencies.

In a healthy config, dependencies are a DAG: [VPC] --> [Subnet] --> [EC2]

But Security Groups often form cycles:

      ┌──────────────┐
      ▼              │
  [SG-App]        [SG-DB]
      │              ▲
      └──────────────┘
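To make the DAG requirement concrete, here is a minimal sketch using Python's standard-library `graphlib` (not Terraform's actual engine). The resource names are illustrative, and `creation_order` is a hypothetical helper: it returns a valid creation order when one exists and `None` when the dependencies form a cycle.

```python
from graphlib import TopologicalSorter, CycleError

# Each key depends on the resources in its set (those must be created first).
healthy = {
    "aws_subnet.main": {"aws_vpc.main"},
    "aws_instance.web": {"aws_subnet.main"},
}

# Two security groups whose inline rules reference each other.
cyclic = {
    "aws_security_group.app": {"aws_security_group.db"},
    "aws_security_group.db": {"aws_security_group.app"},
}


def creation_order(deps):
    """Return a valid creation order, or None if the graph contains a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


print(creation_order(healthy))
print(creation_order(cyclic))
```

The first graph sorts cleanly; the second has no valid order at all, which is the situation behind the `Error: Cycle` message.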

Finding the Knots

To solve this for thousands of resources, we use Tarjan's algorithm to find Strongly Connected Components (SCCs). It identifies "knots" — clusters of nodes that are circularly dependent — and flags them for surgery.

In our testing, a typical enterprise AWS account with 500+ SGs contains 3-7 of these clusters.
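For readers who want to see the SCC step, below is a small pure-Python sketch of Tarjan's algorithm. It is not RepliMap's code: the function names (`tarjan_scc`, `knots`) and the sample graph are assumptions for illustration. It is recursive for readability, so very deep graphs would need an iterative variant or a library routine such as NetworkX's `strongly_connected_components`.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph.

    graph maps each node to a list of the nodes it points to.
    """
    index = {}        # discovery order of each visited node
    lowlink = {}      # smallest index reachable from the node's DFS subtree
    stack = []        # nodes of the components still being built
    on_stack = set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)

        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])

        # node is the root of a component: pop the whole component off the stack
        if lowlink[node] == index[node]:
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


def knots(graph):
    """Keep only components that contain a cycle: size > 1, or a self-reference."""
    return [
        c for c in tarjan_scc(graph)
        if len(c) > 1 or any(n in graph.get(n, ()) for n in c)
    ]


sg_graph = {
    "sg-app": ["sg-db"],
    "sg-db": ["sg-app"],
    "sg-web": ["sg-app"],
    "sg-self": ["sg-self"],   # a group that references itself
}
print(knots(sg_graph))
```

Here `sg-web` points into the knot but is not part of it, so only the `sg-app`/`sg-db` pair and the self-referencing group are flagged.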

The Fix: "Shell & Fill"

We use a strategy to break the cycle:

1. Create Empty Shells: Generate SGs with no rules. Terraform creates these instantly.
2. Fill with Rules: Extract rules into separate aws_security_group_rule resources that reference the shells.

  Step 1: Create Shells
    [SG-App (empty)]      [SG-DB (empty)]

  Step 2: Create Rules
          ▲                     ▲
          │                     │
    [Rule: egress->DB]    [Rule: ingress<-App]

The graph is now acyclic.
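The same transformation can be sketched on the dependency graph itself. This is an illustration under assumptions, not the tool's implementation: `shell_and_fill` and the `rule:owner->target` naming are hypothetical. Every group becomes a node with no prerequisites, and every SG-to-SG reference becomes a separate rule node that depends on the two shells it touches.

```python
from graphlib import TopologicalSorter


def shell_and_fill(inline_refs):
    """Rewrite SG-to-SG references as standalone rule nodes.

    inline_refs maps a security group to the groups its inline rules reference.
    Returns a map of node -> set of prerequisites in which every group is an
    empty shell and each rule depends on the shells it connects.
    """
    deps = {}
    for owner, targets in inline_refs.items():
        deps.setdefault(owner, set())            # shell: no prerequisites
        for target in targets:
            deps.setdefault(target, set())       # referenced group is a shell too
            rule = f"rule:{owner}->{target}"     # hypothetical naming scheme
            deps[rule] = {owner, target}
    return deps


inline = {"sg_app": ["sg_db"], "sg_db": ["sg_app"]}
deps = shell_and_fill(inline)

# This would raise CycleError on the original inline graph; here it succeeds.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Both shells come out first and both rules afterwards, since a rule cannot be scheduled until the groups it references exist.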

"Why not just always use separate rules?"

Fair question. The problem is:

1. terraform import often generates inline rules.
2. Many existing codebases prefer inline rules for readability.
3. The AWS API presents the "logical" view (rules bundled inside).

The tool needs to detect cycles and surgically convert only the problematic ones.
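A rough way to pick out only the groups that need converting is sketched below. It swaps Tarjan's algorithm for a plain reachability check (a group is on a cycle if it can reach itself), which is simpler to read but costs O(V·(V+E)) instead of O(V+E). The helper names and sample data are assumptions.

```python
def reachable_from(graph, start):
    """All nodes reachable from start by following at least one edge."""
    seen = set()
    frontier = list(graph.get(start, ()))
    while frontier:
        node = frontier.pop()
        if node not in seen:
            seen.add(node)
            frontier.extend(graph.get(node, ()))
    return seen


def groups_needing_split(graph):
    """Return the groups that sit on a cycle; the rest can keep inline rules."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    return {n for n in nodes if n in reachable_from(graph, n)}


refs = {
    "sg_app": ["sg_db"],
    "sg_db": ["sg_app"],
    "sg_web": ["sg_app"],   # points into the cycle but is not part of it
    "sg_cache": [],
}
print(groups_needing_split(refs))
```

Only `sg_app` and `sg_db` are selected; `sg_web` and `sg_cache` would be left with their inline rules.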

Why terraform import isn't enough

Standard import reads state as-is. It doesn't build a global dependency graph or perform topological sorting before generating code. It places the burden of refactoring on the human. For brownfield migrations with 2,000+ resources, that's not feasible.

---

I've implemented this graph engine in a tool called RepliMap. I've open-sourced the documentation and IAM policies needed to run read-only scans safely.

If you're interested in edge cases like this (or the root_block_device trap), the repo is here:

https://github.com/RepliMap/replimap-community

Happy to answer questions.

Comments

davidlu1001•2h ago
Author here. A few implementation notes:

1. We use NetworkX for the graph operations. Tarjan's SCC detection is O(V+E), so it scales well even for large accounts.

2. The trickiest part isn't the algorithm — it's mapping AWS API responses to graph edges. AWS APIs are... inconsistent. Some resources return IDs, some ARNs, some Names. Security Groups can reference themselves, reference by ID or by name, and have rules scattered across inline blocks and separate resources. Normalizing this soup into a clean adjacency matrix is where 80% of the engineering work lives.

3. For those wondering about the "Shell & Fill" naming: it's essentially forcing Terraform's create_before_destroy lifecycle behavior manually, by decoupling the resource identity from its configuration.
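The normalization in note 2 might look roughly like the sketch below. Everything here is an assumption for illustration: the helper names, the `name_to_id` lookup table, and the sample IDs. It produces a de-duplicated edge set rather than a full adjacency matrix, and it only covers the three reference styles mentioned above (ID, ARN, name).

```python
import re

# A bare security group ID, e.g. sg-0a1
SG_ID = re.compile(r"^sg-[0-9a-f]+$")
# An EC2 security group ARN; the ID is the part after "security-group/"
SG_ARN = re.compile(r"^arn:aws[\w-]*:ec2:[^:]*:[^:]*:security-group/(sg-[0-9a-f]+)$")


def normalize_sg_ref(ref, name_to_id):
    """Resolve an ID, ARN, or group name to a canonical security group ID."""
    if SG_ID.match(ref):
        return ref
    arn = SG_ARN.match(ref)
    if arn:
        return arn.group(1)
    return name_to_id.get(ref)   # None when the name is unknown


def build_edges(raw_refs, name_to_id):
    """Turn (owner, referenced group) pairs into a set of ID-to-ID edges."""
    edges = set()
    for owner, ref in raw_refs:
        src = normalize_sg_ref(owner, name_to_id)
        dst = normalize_sg_ref(ref, name_to_id)
        if src and dst:
            edges.add((src, dst))
    return edges


name_to_id = {"db-sg": "sg-0b2"}
raw = [
    ("sg-0a1", "arn:aws:ec2:us-east-1:123456789012:security-group/sg-0b2"),
    ("sg-0a1", "sg-0b2"),    # same edge, referenced by ID this time
    ("db-sg", "sg-0a1"),     # owner referenced by name
]
print(build_edges(raw, name_to_id))
```

The first two references collapse into a single edge once both are expressed as IDs, which is what lets the cycle detection see the real graph.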

Would love to hear if others have hit similar graph problems with other IaC tools (Pulumi, CDK, CloudFormation).

talolard•1h ago
Not IaC, but I've been doing a similar trick to sequence adding type annotations to Python code.

E.g., take the module graph, break the SCCs in a similar manner, then take a reverse topological sort of the imports (now a DAG by construction).

davidlu1001•39m ago
That's a spot-on parallel! Python circular imports (especially for type hinting) are basically the software equivalent of this infrastructure deadlock.

Do you use string-based forward references ("ClassName") to break the cycles? That's essentially our "empty shell" trick — decoupling the resource identity from its configuration to satisfy the graph.

Did you stick with Tarjan's for the SCC detection on the module graph?

andyjohnson0•52m ago
Please don't do this. Ask HN isn't your blogging platform. Per the guidelines, it's for asking questions of the community.

davidlu1001•46m ago
Appreciate the feedback. To be transparent: I originally submitted this as a standard text post, but after it hit a spam filter, the HN moderators kindly restored it and moved it to /ask themselves to help with visibility.

I'm definitely here for the dialogue, specifically looking to compare notes on graph algorithms with other IaC engineers.
