Deploying Temporal on AWS ECS with Terraform

https://papnori.github.io/posts/temporal-ecs-terraform/

28•norapap•3mo ago

Comments

DoofWarrior•3mo ago

Why go with Fargate instead of EC2?

norapap•3mo ago

We went with Fargate because it keeps things lean: no servers to manage, no patching, no scaling headaches. It’s perfect for our bursty workloads, since we only pay when containers actually run. Plus autoscaling just works.

The GitHub repo has comments showing how to switch to EC2 if your workload needs it.

leetrout•2mo ago

Just as an aside for the Fargate convo... we switched from Fargate to an EC2 Auto Scaling group so we could run a custom AMI with our container images (which are larger than I'd like) pre-baked. Startup went from ~3 minutes on Fargate to ~30 seconds with the ASG when it isn't triggering a scaling action.

We're using Prefect, not Temporal, and each Prefect flow launches in a discrete ECS task, so the waiting added up.

jarboot•2mo ago

> Autoscaling is configured via CloudWatch alarms on CPU usage:
> Scale-out policy adds workers when CPU > 30%.
> Scale-in policy removes idle workers when CPU < 20%.

Does this handle the case where there are longer-running activities with low CPU usage? Couldn't these be canceled during scale-in?

Temporal would retry them, but it would make some workflow runs take longer, which could be annoying for some user-interactive workflows.

Otherwise, I've seen setups that hit the metrics endpoint and query things like `worker_task_slots_available` to scale up, or query pending activities, pending workflows, etc. to scale down per worker.
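
A minimal sketch of that slot-based approach, in Python, under stated assumptions: each worker's `worker_task_slots_available` value has already been scraped from its metrics endpoint, and the configured slot total per worker is known. `WorkerSnapshot`, `desired_worker_count`, `safe_to_stop`, and the 0.7 target utilization are hypothetical names and values picked for illustration, not anything from the article.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class WorkerSnapshot:
    """One worker's slot usage at scrape time (hypothetical structure)."""
    slots_total: int      # assumed: configured max concurrent tasks for the worker
    slots_available: int  # assumed: scraped worker_task_slots_available value

    @property
    def slots_busy(self) -> int:
        return self.slots_total - self.slots_available


def desired_worker_count(workers: List[WorkerSnapshot],
                         target_utilization: float = 0.7,
                         min_workers: int = 1,
                         max_workers: int = 10) -> int:
    """Return a worker count that keeps busy slots near target_utilization."""
    if not workers:
        return min_workers
    busy = sum(w.slots_busy for w in workers)
    slots_per_worker = sum(w.slots_total for w in workers) / len(workers)
    if slots_per_worker <= 0:
        return min_workers
    # Workers needed so that the busy slots fill roughly target_utilization of capacity.
    needed = math.ceil(busy / (slots_per_worker * target_utilization))
    return max(min_workers, min(max_workers, needed))


def safe_to_stop(worker: WorkerSnapshot) -> bool:
    """A worker is a scale-in candidate only when none of its slots are in use."""
    return worker.slots_busy == 0
```

Unlike a CPU alarm, this treats a long-running activity that sits mostly idle as still occupying a slot, so `safe_to_stop` returns `False` for its worker. There is still a window between the scrape and the actual stop in which a new task can be picked up, so this reduces interrupted activities rather than ruling them out.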

norapap•2mo ago

They can be cancelled if CPU drops below the scale-in threshold. In my case the activities were CPU-heavy, batch-style, and not client-facing, so I preferred occasional retries and slightly longer runtimes over blowing up the AWS bill. For that workload, CPU-based autoscaling was perfectly fine.

I originally ran this setup on Temporal Cloud, and pulling detailed worker/queue metrics directly from Cloud can be tricky... you need to expose custom worker metrics yourself, then pipe them into CloudWatch. If you host Temporal yourself, it is easier :)
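
A rough sketch of the "pipe them into CloudWatch" step, assuming boto3's CloudWatch client and its `put_metric_data` call. The namespace `Temporal/Workers`, the dimension names, and both helper functions are hypothetical; how the slot count is read from the worker is left out.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_metric_data(task_queue: str, worker_id: str,
                      slots_available: int) -> List[Dict[str, Any]]:
    """Build a MetricData payload in the shape put_metric_data expects."""
    return [{
        "MetricName": "worker_task_slots_available",
        "Dimensions": [
            {"Name": "TaskQueue", "Value": task_queue},  # hypothetical dimension
            {"Name": "WorkerId", "Value": worker_id},    # hypothetical dimension
        ],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(slots_available),
        "Unit": "Count",
    }]


def publish(cloudwatch_client: Any, metric_data: List[Dict[str, Any]],
            namespace: str = "Temporal/Workers") -> None:
    """Send the payload; cloudwatch_client is e.g. boto3.client("cloudwatch")."""
    cloudwatch_client.put_metric_data(Namespace=namespace, MetricData=metric_data)
```

A CloudWatch alarm on this custom metric could then drive the scaling policies in place of the CPU alarms.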

llmslave•2mo ago

What's funny is that, in some sense, Temporal replaces a lot of the AWS stack. You don't really need queues, Step Functions, Lambdas, and the rest. I personally think it's a better compute model than the wildly complicated AWS infra. Deploying Temporal on compute primitives is simply better, and allows you to be cloud agnostic.

causal•2mo ago

I sometimes suspect AWS deliberately looks for ways to extract low-overhead tasks into dedicated services for the simple reason that many people will pay for the service without thinking about whether they really need it.

llmslave•2mo ago

It's very easy to add AWS services, but after building them into a stack over a few years, it's basically impossible to remove them.

bithavoc•2mo ago

yes, one word: IAM

swyx•2mo ago

same guys worked on temporal as aws step functions. they just learned over time.

jen20•2mo ago

This article is really about hosting Temporal _workers_ in ECS - which is the "easy" part - not running the Temporal service itself. That would be a valuable follow-up!

whalesalad•2mo ago

99.9% sure the entire article was written by Claude or ChatGPT - so you can probably direct that question at the source. Make sure to end your prompt with, "no emojis"

norapap•2mo ago

I found a lot of guides and articles on how to host Temporal; it's covered in the official docs too :)
