It seems easier for the cloud provider to implement the equivalent of a dry-run flag in API calls, one that validates that the call would succeed (even if it's a best-effort determination), which tools like Terraform could use during planning and dependency tree generation.
Instead, you have platform providers like AzureRM that squint at the supplied objects and make a guess as to whether that looks valid, which causes a ton of failures upon actual application. For instance, if you try to create storage with a redundancy level not supported by the region you're adding it to, Terraform will pass a plan stage, but the actual application of the resource will fail because the region doesn't support that level of redundancy.
There are countless other examples in a similar vein, all of which could be resolved if API providers had a dry-run flag.
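To make the idea concrete, here is a minimal sketch of a create call that accepts a dry-run flag and runs the same validation either way. Everything in it is hypothetical: `DryRunCloud`, `SUPPORTED_REDUNDANCY`, the region names and the redundancy table are invented for illustration and are not any real provider's API.

```python
from dataclasses import dataclass

# Hypothetical capability table: which redundancy levels each region supports.
SUPPORTED_REDUNDANCY = {
    "region-a": {"LRS", "ZRS", "GRS"},
    "region-b": {"LRS"},
}


@dataclass
class Result:
    ok: bool
    message: str


class DryRunCloud:
    """In-memory stand-in for a provider API (an assumption, not a real SDK)."""

    def __init__(self):
        self.storage = {}

    def create_storage(self, name, region, redundancy, dry_run=False):
        # The validation path is identical for dry runs and real calls,
        # so a plan step sees the same errors an apply step would.
        allowed = SUPPORTED_REDUNDANCY.get(region)
        if allowed is None:
            return Result(False, f"unknown region {region!r}")
        if redundancy not in allowed:
            return Result(False, f"{redundancy} is not supported in {region}")
        if name in self.storage:
            return Result(False, f"{name!r} already exists")
        if dry_run:
            # Stop before any side effect.
            return Result(True, "dry run: request would succeed")
        self.storage[name] = (region, redundancy)
        return Result(True, "created")


cloud = DryRunCloud()
print(cloud.create_storage("logs", "region-b", "GRS", dry_run=True).message)
print(cloud.create_storage("logs", "region-b", "LRS", dry_run=True).message)
```

A planning tool could issue the dry-run call for every resource and surface the region/redundancy mismatch at plan time instead of halfway through an apply.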
Work a few years in Ops and you learn that spinning up things is not a big part of your work. It's maintenance, such as deleting stuff.
Unfortunately this process is the hardest, and there's very little to help you do it right. Many tools, frameworks and vendors don't even have proper support for it.
Some even recommend 'rinse and repeat' instead of adjusting what you have - and this method is not great if you value uptime, nor if you have state that you want to preserve, such as customer data :-)
Deleting stuff, shutting services down, turning off servers - those are hard tasks in IT.
I don’t love how unreliable providers are, even for creating resources. Clouds like DigitalOcean will 429 throttle me for making too many plans in a row with only 100+ resources. Sometimes the plan goes through, but the apply fails. Sometimes halfway through.
I’d rather use a cloud-specific API, unless I’m certain of the quality of the specific terraform provider.
akersten•1h ago
Why don't cloud providers have a nice way for tools like TF to query the current state of the infra? Maybe they do and I'm doing IaC wrong?
mooreds•1h ago
These folks also have an article about that: https://newsletter.masterpoint.io/p/how-to-bootstrap-your-st...
bigstrat2003•1h ago
catlifeonmars•52m ago
When you have a hammer… as the expression goes. It’s crazy how many times, even knowing this, I have to catch myself and step back. IaC is a contextually different way of thinking and it’s easy to get lost.
colechristensen•1h ago
* Your terraform code
* The state terraform holds which is what it thinks your infrastructure state is
* The actual state of your infrastructure
>Why don't cloud providers have a nice way for tools like TF to query the current state of the infra?
A terraform provider is code that queries the targeted resources through whatever APIs they provide. I guess you could argue these APIs could be better, faster, or more tuned towards infrastructure management... but gathering state from whatever resources it manages is one of the core things terraform does. I'm not sure what you're asking for.
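As a rough illustration of how code, recorded state and actual infrastructure relate, here is a simplified three-way comparison. It is a toy model under stated assumptions, not Terraform's actual planning algorithm: the `classify` function, the action names and the sample resources are all made up.

```python
def classify(desired, recorded, actual):
    """Compare three views of the infrastructure, keyed by resource name.

    desired  -- what the code asks for
    recorded -- what the tool's state file believes exists
    actual   -- what the provider API reports right now
    """
    report = {}
    for name in sorted(set(desired) | set(recorded) | set(actual)):
        want = desired.get(name)
        known = recorded.get(name)
        real = actual.get(name)
        if want is None:
            if real is None:
                report[name] = "forget"      # stale entry that exists only in state
            elif known is None:
                report[name] = "unmanaged"   # exists, but nobody declared or tracked it
            else:
                report[name] = "delete"      # tracked and present, removed from code
        elif real is None:
            report[name] = "create"
        elif real != want:
            report[name] = "update"          # drift, or a change in the code
        elif known != real:
            report[name] = "refresh state"   # reality matches code, state is stale
        else:
            report[name] = "in sync"
    return report


desired = {"vm": {"size": "small"}, "db": {"size": "large"}, "bucket": {"tier": "hot"}}
recorded = {"vm": {"size": "small"}, "db": {"size": "large"}, "old-lb": {"port": 80}}
actual = {"vm": {"size": "small"}, "db": {"size": "medium"}, "old-lb": {"port": 80},
          "manual-vm": {"size": "small"}}

for name, action in classify(desired, recorded, actual).items():
    print(f"{name}: {action}")
```

The "unmanaged" case is the one a recorded state makes visible: without it, a tool cannot tell a hand-made resource from one it created and should now delete.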
fragmede•1h ago
colechristensen•45m ago
It's not an API issue but a terraform provider issue having missing or incomplete code (i.e. https://github.com/hashicorp/terraform-provider-aws )
cobolexpert•1h ago
The state however is always stored in a _separate AWS account_ that only the devops team can manage. I find this to be a reasonable way of working with TF. I agree that it is confusing though, because one is using $PROVIDER to both create things and manage those things at the same time, but conceptually from TF’s perspective they are very different things.
don-code•1h ago
This is technically how Ansible works. Here's an extensive list of modules that deploy resources in various public clouds: https://docs.ansible.com/projects/ansible/2.9/modules/list_o...
That said, it looks like Ansible has deprecated those modules, and that seems fair - I haven't actually heard of anyone deploying infrastructure in a public cloud with Ansible in years. It found its niche in image generation and systems management. Almost all modern tools like Terraform, Pulumi, and even CloudFormation (albeit under the hood) keep a state file.
knowhy•51m ago
At work we use Ansible to set up Route53 records for infrastructure hosted elsewhere. Not sure if that counts as infrastructure.
cyberax•1h ago
They do! In fact, this is my greatest pet peeve with TF, it adds state when it's not needed.
I was doing infra-as-code without TF on AWS a long time ago. It went like this:
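A minimal sketch of what tag-based, stateless reconciliation can look like. This is an illustration, not the commenter's code: `TaggedCloud` is a hypothetical in-memory stand-in for a cloud API that applies tags atomically at creation, and `reconcile`, the tag keys and the ID format are assumptions.

```python
class TaggedCloud:
    """Hypothetical in-memory cloud API with tag-on-create semantics."""

    def __init__(self):
        self._instances = {}
        self._next_id = 1

    def create_instance(self, tags):
        # Tags are attached in the same call that creates the instance,
        # so there is never an untagged instance to lose track of.
        instance_id = f"i-{self._next_id:04d}"
        self._next_id += 1
        self._instances[instance_id] = dict(tags)
        return instance_id

    def find_instances(self, tags):
        # Return IDs of instances whose tags include every requested pair.
        return sorted(
            iid for iid, have in self._instances.items()
            if all(have.get(k) == v for k, v in tags.items())
        )

    def delete_instance(self, instance_id):
        del self._instances[instance_id]


def reconcile(cloud, stack, role, count):
    """Converge on `count` instances, using tags as the only source of truth."""
    tags = {"stack": stack, "role": role}
    existing = cloud.find_instances(tags)
    for _ in range(count - len(existing)):   # no-op when already at or above count
        cloud.create_instance(tags)
    for iid in existing[count:]:             # no-op when at or below count
        cloud.delete_instance(iid)
    return cloud.find_instances(tags)


cloud = TaggedCloud()
print(reconcile(cloud, "prod", "web", 3))
print(reconcile(cloud, "prod", "web", 3))  # second run changes nothing
```

Because every run starts by querying the provider, there is no local state to drift; the trade-off is that anything that cannot be found by tag is invisible to the tool.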
AWS has tag-on-create now, making this sort of code reliable. Before that, you could do the same with instance idempotency tokens. GCP also has tags.
raffraffraff•56m ago
These days people store the state in Terraform Cloud or Spacelift or env0 or whatever. Doesn't have to be the same infra you deployed.
If you were a lunatic you could not use a state backend and just let it create state files in the terraform code directory, check the file into git with all those secrets and unique ids etc.