Abstraction, not syntax

https://ruudvanasseldonk.com/2025/abstraction-not-syntax

102•unripe_syntax•3mo ago

Comments

JohnMakin•3mo ago

As someone who's spent most of their career in cloud IAC, and likes to think they are pretty read up on the latest going on in that world, if you didn't know better you'd think YAML is one of the greatest threats facing mankind. There are plenty of things I certainly hate about it, but every configuration syntax I've ever used I have similar gripes about. It's like once a month this kind of "The world is growing tired of yaml" claim is just thrown out there like everyone just agrees with it. Choose something that works for you. This author repeatedly mentions TOML but there are plenty of issues with that one I could point out too. Syntax is one very small part of what makes an ecosystem great or not so great. Most of my exposure to yaml is helm chart templates, which admittedly is not pure YAML, but it works fine enough for me, at least to where I don't feel like writing lengthy blog posts about how much I hate it. I even wrote a library that converts yaml templates to HCL for internal use because I got so sick of people having this exact same argument like it deeply mattered. And guess what? They hate the HCL too.

compyman•3mo ago

I also think that a lot of the problems with yaml specifically are overblown, but this post is actually not about that!

It is specifically saying the same problem exists in JSON/YAML/TOML, etc., which is that all these configuration languages don't have any real means of abstraction, and ultimately aren't expressive enough to do the job we require of them.

as soon as you are templating config files with other configs, I agree, I have sorely felt this limitation with helm charts

kmoser•3mo ago

Serious question: do people who work with these config files frequently, or on large such files, use simple text editors, or are there "smart" editors that do things like prevent you from making typos or inserting the wrong data type, similar to an HTML form that does basic validation or a DB schema that rejects bad data?

There is no single cure-all, of course, but surely we should be relying on computers to do much of the heavy lifting when it comes to validation and verification of these files, not just as linters after the fact but in realtime while we're editing, and with some sort of knowledge (even if derived programmatically) of what is right and what is wrong so we no longer have to worry about simple footguns.

kokada•3mo ago

I think one of the problems of those "configuration languages" is that you can't extract semantic information without knowing the target, e.g., `with` has a specific meaning in GitHub Actions but it is otherwise an unremarkable word in the YAML specification.

But when working with real programming languages it is completely different, you can take semantic information from the current code, and you can have things like types to give you safety.

JohnMakin•3mo ago

The problem is most configuration languages are declarative vs imperative like most “real” languages are. You could probably levy the same complaint against declarative languages in general - it’s just a different way of thinking

kokada•3mo ago

Nix as used in NixOS is a declarative language, and it has none of the issues I cited because it is a "real" programming language (or, as the article puts it, has "abstractions" like builtins.map). You can pretty easily set up an LSP to get code completion (even between different projects, like NixOS vs Home-Manager). There is no proper type system in Nix but the module system does supplement it well.

taeric•3mo ago

I'm trying to remember the phrase. Something like, "there is nothing as vicious as low stakes fights."

Trying that on Google gets me https://en.wikipedia.org/wiki/Sayre%27s_law. Is about right. :D

jdmichal•3mo ago

It's like bike shedding. It's a side effect of mixed expertise (and confidence) working together on things that are only partially understood by all. When something is clearly outside one's expertise, they are content to leave it to others. But then you'll get minor questions with low stakes like "what color to paint the shed". And how everyone feels like they can participate, so suddenly there's a huge discussion / debate / argument about a very, very minor thing.

skywhopper•3mo ago

HCL is generally great, but has some issues with clarity of transformation to the underlying data structures.

But anytime someone suggests TOML I have to double check to be sure they are serious because the TOML syntax for anything more complicated than single-layer maps is mind-bogglingly confusing to me. This is not a serious alternative to YAML.

qezz•3mo ago

> And guess what? They hate the HCL too.

Don't want to sound too harsh, but to me HCL is even worse than plain YAML.

By expressiveness, HCL is somewhat similar to Ansible-flavoured YAML - in both you need to use magic keywords to create any kind of abstraction (e.g. a loop).

HCL is worse than regular YAML because there's only one "true" parser for it, that is official Hashicorp's HCL parser. So if you are locked into Golang ecosystem, then sure it can work for you, otherwise you are out of luck.

There are a couple of tools that convert HCL into JSON. I tried both, and they somewhat work, but at the end of the day it's a big hack. At that point I just gave up on using HCL and started using something else that generates JSON.

Hope you find some configuration layer that fits your users more than HCL or YAML.

JohnMakin•3mo ago

Not exactly true in terms of being the only parser - OpenTofu is a good project I’m highly supportive of

Defletter•3mo ago

Really wish people would just bite the bullet and do configuration as code instead of trying to make all these config petlangs.

patrickmay•3mo ago

Exactly. Emacs Lisp is an existence proof that this can be done well.

taeric•3mo ago

You beat me to it!

And for those that haven't taken a look at it, the "customize" menu and everything it supports is silly impressive. And it just writes the results out, like a boss.*

* Obviously, it has a lot of "don't edit below this line" complexity to it. But that doesn't change that it is right there.

skydhash•3mo ago

My preference is towards simpler formats like:

  option value

Easy to edit and manipulate. JSON and YAML are always a nightmare if they're user facing. As for Ansible, I'd love to see some Scheme/Lisp variants.

mrheosuper•3mo ago

Then how do you distinguish between the string "52" and the number 52?

Keep adding more edge cases and you have something that resembles JSON.

skydhash•3mo ago

Why do you need to differentiate between the two as an input? It’s config, not random data. If you have

  email test@example.com
  logo-size 100
  background #adadaf
  modules auth
  modules db
  modules files

The only reason to have a special token here is if you have multi-line values. Types are not a concern.
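The "option value" format above is simple enough that a reader for it fits in a few lines. The following is a minimal sketch, assuming one option per line, the first space as separator, and repeated options (like `modules`) accumulating into a list; `parse_options` and `SAMPLE` are hypothetical names, not from any existing tool.

```python
from collections import defaultdict

SAMPLE = """\
email test@example.com
logo-size 100
background #adadaf
modules auth
modules db
modules files
"""

def parse_options(text):
    """Parse 'option value' lines; repeated options accumulate into a list."""
    options = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first space: option name, then the raw value.
        name, _, value = line.partition(" ")
        options[name].append(value.strip())
    return dict(options)

config = parse_options(SAMPLE)
# The reader, not the file format, decides how each value is interpreted.
logo_size = int(config["logo-size"][0])
modules = config["modules"]
```

Every value stays a string in the file; the program that consumes `logo-size` is the one that calls `int()` on it, which is the sense in which types are not a concern at the input level.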

bandie91•3mo ago

I don't, and neither do my Perl-based programs. At the configuration level the user interacts with, it should not be possible for a given parameter to hold either a string or a numeric value, as the "real-world analogy" programming paradigm suggests. JSON and the like still have their place, but in a lower, machine-to-machine layer.

horsawlarway•3mo ago

I appreciate that the ts/js ecosystem seems to be moving in this general direction.

Lots of config.json is being replaced by the nicer config.ts.

nikeee•3mo ago

I really dislike it when a turing-complete language is used for configuration. It almost always breaks every possibility to programmatically process or analyze the config. You can't just JSON.parse the file and check it.

Also I've been in projects where I had to debug the config multiple levels deep, tracking side-effects someone made in some constructor trying to DRY out the code. We already have these issues in the application itself. Lets not also do that in configurations.

crdrost•3mo ago

This is what's nice about Pkl, you define a schema as a Pkl file, you define a value of that schema as a Pkl file that imports the schema, `pkl eval my file.pkl` will do the type check and output yaml for visual inspection or programmatic processing, but keeping it to one file per module means that I almost never obsessively D-R-Y my Pkl configs.

Actually that's not the biggest benefit (which is tests for schemas) but it's nice to have the “.ts” file actually log the actual config as JSON and then the app consumes it as JSON, rather than importing the .ts file and all its dependencies and having weird things like “this configuration property expects a lambda.”

echelon•3mo ago

That's why Starlark exists.

You need something between JSON/YAML and Python/JavaScript.

A config language makes the possibility space small.

It also makes it deterministic for CI and repeatable builds.

It also makes it parallelizable and cacheable.

Don't use your language for config. People will abuse it. Use a config language like Starlark or RCL.

skydhash•3mo ago

I still have to see a JS project where the config for each tool could not be something simple like `.toolrc`. We could have some markers to delineate plugins config.

Instead, there’s another piece of software in the configuration of sample projects, rather than just good code organization and sensible conventions.

your_fin•3mo ago

> It almost always breaks every possibility to programmatically process or analyze the config. You can't just JSON.parse the file and check it.

Counterpoint: 95% of config-readers are or could be checked in with all the config they ever read.

I have yet to come across a programming language where it is easier to read + parse + type/structure validate a json/whatever file than it is to import a thing. Imports are also /much/ less fragile to e.g. the current working directory. And you get autocomplete! As for checks, you can use unit tests. And types, if you've got them.

I try to frame these guys as "data values" rather than configuration though. People tend to have less funny ideas about making their data 'clean'.

The only time where JSON.parse is actually easier is when you can't use a normal import. This boils down to when users write the data and have practical barriers to checking in to your source code. IME such cases are rare, and most are bad UX.

> Side effects in constructors

Putting such things in configuration files will not save you from people DRYing out the config files indirectly with effectful config processing logic. I recently spent the better part of a month ripping out one such chimera because changing the data model was intractable.
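The "data values" framing above can be sketched as an ordinary module that is imported rather than parsed, with checks that could run as unit tests. This is only an illustration under assumptions: the `Bucket` class, its fields, and `check_buckets` are hypothetical, and the bucket names are borrowed from the article's example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bucket:
    name: str
    region: str
    delete_after_days: int

# Plain data values, checked in next to the code that reads them.
BUCKETS = [
    Bucket("alpha-hourly", "eu-west", 4),
    Bucket("alpha-daily", "eu-west", 30),
    Bucket("alpha-monthly", "eu-west", 365),
]

def check_buckets(buckets):
    """Return a list of problems; an empty list means the data passed."""
    problems = []
    names = [b.name for b in buckets]
    if len(names) != len(set(names)):
        problems.append("duplicate bucket name")
    for b in buckets:
        if b.delete_after_days <= 0:
            problems.append(f"{b.name}: retention must be positive")
    return problems
```

A reader imports `BUCKETS` directly, so there is no file path to resolve and editors can autocomplete the field names; `check_buckets(BUCKETS)` is the kind of assertion a test suite would make.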

frou_dh•3mo ago

When Python projects used that approach (setup.py files) that meant to just know what a package's dependencies were, arbitrary code had to be run. Now it's pyproject.toml

nickelpro•3mo ago

pyproject.toml calls into a build backend which is... Python.

It is good to have a simple, declarative entry point to the build system which records declarative elements of the build. The non-declarative elements of the system are configuration-as-code.

lenkite•3mo ago

Yes, though languages need to develop and provide restricted execution modes for "configuration as code" for security enforcement.

tekbruh9000•3mo ago

This year I started using an SQLite file specifically for config values

Have used everything from Json to Cue and in-between. Tired of the context switch. Need to use SQL anyway. Fewer dependencies overall required.
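For readers wondering what config in SQLite might look like, here is a minimal sketch using Python's standard `sqlite3` module. The table layout (a single key/value table named `config`) and the `get_config` helper are assumptions made for illustration; the comment above does not say how the values are actually stored.

```python
import sqlite3

# In-memory database for the sketch; a real setup would open a file path instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO config (key, value) VALUES (?, ?)",
    [("region", "eu-west"), ("retention_days", "30")],
)
conn.commit()

def get_config(conn, key, default=None):
    """Look up one config value by key, returning default if it is absent."""
    row = conn.execute("SELECT value FROM config WHERE key = ?", (key,)).fetchone()
    return row[0] if row else default
```

The `PRIMARY KEY` constraint means a duplicated key is rejected at insert time, which is one check a flat text file would not give for free.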

mrmrcoleman•3mo ago

Curious - how do you version the config?

stirfish•3mo ago

I'm guessing they version a SQL file

tekbruh9000•3mo ago

Yes. Git log is a handy thing for versioning.

I never relied on it for developer notes. Just arguing semantics in those cases.

theknarf•3mo ago

Config as code suffers from two big problems:

- Turing completeness means that you have to deal with the halting problem, meaning you can't statically ensure that a program ever completes. This is really shit when dealing with config, one buggy while loop or infinite recursive function and stuff just grinds to a halt with no good way of debugging it. Having this problem at the config level might mean that your program never even gets to properly start up, so you never get to setup the logging / otel or whatever you usually use to catch those problems.

- Normal programming languages have side effects and are therefore insecure! They can usually read and write files anywhere, open sockets, send traffic over the internet, etc. These are all properties you don't want of a config language! Especially if you can import code from other modules, a single import statement in a "config file" is now a huge security risk! This is why "npm" keeps having security nightmares again and again and again.

So what you want from a config language is not the same thing as from a programming language, you want as much power as you can get without "Turing completeness" and without any "side effects". That's the reason we have stuff like HCL and whatever the article used as an example.
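One end of the spectrum described above (no loops, no side effects, no imports) can be illustrated with Python's `ast.literal_eval`, which accepts only literal expressions. This is a sketch of the restriction itself, not of how HCL or RCL work; `CONFIG_TEXT`, `load_literal_config`, and `is_rejected` are hypothetical names. Note that `literal_eval` is not hardened against very large or deeply nested input.

```python
import ast

CONFIG_TEXT = """
{
    "region": "eu-west",
    "retention_days": {"hourly": 4, "daily": 30, "monthly": 365},
}
"""

def load_literal_config(text):
    """Parse config text that may contain only Python literals."""
    return ast.literal_eval(text.strip())

config = load_literal_config(CONFIG_TEXT)

def is_rejected(text):
    """Return True if the text contains anything beyond plain literals."""
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return True
    return False
```

Function calls, comprehensions, and imports are all rejected, so loading the file cannot loop forever or touch the filesystem; the cost is that there is no abstraction at all, which is the trade-off the thread is arguing about.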

apalmer•3mo ago

I don't think the title and the article really communicate their case well. I did not understand the goal until 90% through the article, when they showed the source code of RCL with the loops.

This isn't syntax vs abstraction. This is how much programming language power you want to enable in your configuration language. This is a big difference, and I think we miss the interesting part of that discussion because we dip into this 'abstraction' angle.

dapperdrake•3mo ago

The "abstraction angle" seems to be a different encoding for "power of configuration language."

jasaldivara•3mo ago

I think if people want a more powerful, programmable config language, maybe they should use something like Lua or Scheme instead of reinventing the wheel with those new niche languages.

dvrp•3mo ago

Should be titled “the power spectrum of data” or something similar

the article talks about the trade off between plain data structure versus abstract ones and that’s the main issue

nickelpro•3mo ago

Yes, one more DSL on your Tower of Babel tech stack will save you.

If you want configuration-as-code use Python. Please. Or Tcl if you must. Do not invent N+1 DSL for your engineers to waste time learning.

qezz•3mo ago

Luckily, the main point of the article is that syntax doesn't matter, but abstractions do. In other words, you can use your favorite DSL, that being Python or TCL or something else.

kgen•3mo ago

Code as configuration loses one primary benefit which is being able to read the actual config and know exactly what will apply. In the example in the article you would get the same benefit by disallowing users to edit the config directly and instead require the config to be generated via a cmdline app/service that encodes the same policy?

Splizard•3mo ago

The problem with configuration formats is not syntax or abstraction; it's the lack of consistent language server integration. It's a problem when I can't look up the definition for a key or the expected type, or quickly jump to definitions that clearly show which keys are available to configure.

atoav•3mo ago

To be frank, the clear problem with configuration format is that people have configurations so complex they probably should use something else.

Example: We are programming a backend for a blog. If we were to not use templates, but instead try to get that functionality via the webservice configuration we would have to "invent" some format that gives us the flexibility of templates within let's say a YAML file.

Needless to say that would be a horrible idea. Maybe I am being naive here, but I have yet to be convinced of the fact that it is really configuration formats that are the problem and not what people try to abuse them for. I have yet to work on a project where TOML wasn't enough for actual configuration.

Usually when I need something more complex than what can be done with TOML it is a sign that this needs to be handled differently. Via templates or with a database or making a special DSL-like script file. E.g. if you're using python nothing (except security considerations) stops you from allowing users to use a python file for configuration. If your configuration needs are really that complex, why not use a real programming language for it?

skydhash•3mo ago

Two examples of complex user facing configurations I can think of are pretty trivial to implement:

- Decision tree, where you only need comparison operators. The leaves are a specified list of actions.

- actions list with macros (variables). You can be fancy and add some conventions for arrays.

Anything more than that should just be a programming language. And if the relationship is adversarial (SaaS), you should really think hard about needing something that complex.

weavejester•3mo ago

To me the solution seems like it's adding complexity that could cause more issues further down the line.

The specific problems in the example could be solved by changing how the data is represented. Consider the following alternative representation, written in edn:

    {:aws.s3/buckets
     {:aws.region/eu-west
      {:alpha-hourly  {:lifecycle/policy {:delete-after #interval/days 4}}
       :alpha-daily   {:lifecycle/policy {:delete-after #interval/days 30}}
       :alpha-monthly {:lifecycle/policy {:delete-after #interval/days 365}}

       :bravo-hourly  {:lifecycle/policy {:delete-after #interval/days 4}}
       :bravo-daily   {:lifecycle/policy {:delete-after #interval/days 30}}
       :bravo-monthly {:lifecycle/policy {:delete-after #interval/days 365}}}}}

This prevents issues where the region is mistyped for a single bucket, makes the interval more readable by using a custom tag, and as a bonus prevents duplicate bucket names via the use of a map.

Obviously this doesn't prevent all errors, but it does prevent the specific errors that the RCL example solves, all without introducing a Turing-complete language.
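To make the alternative representation concrete, here is a rough sketch of how a consumer might expand a region-keyed mapping into flat bucket records. It mirrors the edn example in plain Python dicts; the `flatten` helper and the `delete_after_days` field name are assumptions for illustration. One caveat: a Python dict literal silently keeps the last value for a repeated key, so this sketch does not itself demonstrate duplicate detection.

```python
# Days per bucket; values taken from the edn example above.
NESTED = {
    "eu-west": {
        "alpha-hourly": 4,
        "alpha-daily": 30,
        "alpha-monthly": 365,
        "bravo-hourly": 4,
        "bravo-daily": 30,
        "bravo-monthly": 365,
    }
}

def flatten(nested):
    """Expand the region -> bucket -> days mapping into flat bucket records."""
    return [
        {"name": name, "region": region, "delete_after_days": days}
        for region, buckets in nested.items()
        for name, days in buckets.items()
    ]

buckets = flatten(NESTED)
```

Because the region appears once as a key, there is no per-bucket region string to mistype.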

MathMonkeyMan•3mo ago

> The specific problems in the example could be solved by changing how the data is represented.

Finding the "right" representation for a given set of data is an interesting problem, but most (all) of the time the representation is specified by someone/something else.

In the past I've written a [preprocessor][1] that adds some power to the representation while avoiding general purpose computation. For example,

    (buckets
      (let ([(regional region (name policy) ...)
             ((['name name] ['region region] ['lifecycle_policy policy]) ...)])
        (regional us-west
          (alpha-hourly (delete_after_seconds 345600))
          (alpha-daily (delete_after_seconds 2592000))
          (alpha-monthly (delete_after_seconds 31536000))
          (bravo-hourly (delete_after_seconds 345600))
          (bravo-daily (delete_after_seconds 2592000))
          (bravo-monthly (delete_after_seconds 31536000)))))

Macros, basically. Arithmetic would help there, but that might be too much.

[1]: https://github.com/dgoffredo/llama

yawnxyz•3mo ago

the answer seems to be the worst of both worlds - if you're going to do for loops why not just use python?

the answer is both a faux programming language, and really bad ux / really hard to read / scan

maybe what they need is a program that generates better readable text; and somehow you can flip between the determinism of code and ux of readable text?!

(is that possible)

transfire•3mo ago

FOR loops?

YAML has a merge key <<:, which might be helpful.

The merge key is a clever little trick, but it depends on the special hash key, so lists can’t be merged.

Syntax does matter, which is why YAML matters — even if imperfect.

rattyJ2•3mo ago

Merges and anchors are some of the least maintainable and most error-prone config I've seen.

It doesn't help that every YAML parser has its own opinion on what a merge or an anchor should do, exactly.

rednafi•3mo ago

The intention behind configuration languages is sound but there’s just too many of them. Not having a universally accepted one makes picking one harder. Also, migrating from one to another isn’t as straightforward. Plus,

    for database in ["alpha", "bravo"]:
        for period, days in period_retention_days:

At this point, you’re better off writing a Python script that spits out some JSON. I’m aware that even the blog mentions it. The benefit is having a more expressive language at your fingertips and not having to fight your peers while trying to add yet another idiosyncratic dependency.
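A script along the lines suggested above might look like the following sketch. The retention values (4, 30, and 365 days) and the field names `name`, `region`, `lifecycle_policy`, and `delete_after_seconds` come from examples elsewhere in this thread; the function name, the top-level `buckets` key, and the choice of a dict for `period_retention_days` are assumptions.

```python
import json

# Retention per period in days, as in the thread's bucket example.
period_retention_days = {"hourly": 4, "daily": 30, "monthly": 365}

def generate_buckets(region="eu-west"):
    """Build one bucket record per (database, period) pair."""
    buckets = []
    for database in ["alpha", "bravo"]:
        for period, days in period_retention_days.items():
            buckets.append({
                "name": f"{database}-{period}",
                "region": region,
                "lifecycle_policy": {"delete_after_seconds": days * 24 * 3600},
            })
    return buckets

if __name__ == "__main__":
    # The generated JSON is the artifact to review, diff, and deploy.
    print(json.dumps({"buckets": generate_buckets()}, indent=2))
```

Checking the generated JSON into version control alongside the script keeps the "read exactly what will apply" property that other commenters ask for.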

sam_bristow•3mo ago

Stoke Space[1] uses a similar system, letting people write arbitrary code to generate a static configuration for their launch vehicle. It means you get all the power of something like Python during development but also a deterministic, bounded config for the critical flight systems. I think their config files are just TOML that is consumed by Rust.

I'll try dig out a link to the talk one of their Flight Software Engineers did on the concept.

[1] https://www.stokespace.com/

lwhsiao•3mo ago

I'd be curious what the author thinks of KSON, which was also recently featured on HN [1].

[1]: https://news.ycombinator.com/item?id=45291858

ruuda•3mo ago

I mention it in the first paragraph, and what I think of it in the second paragraph.

gorgoiler•3mo ago

Somewhere along the way we got lost into thinking config files were remote procedure calls, and that YAML was the only RPC interface available to us. The caller generates YAML, the receiver parses it, but more often than not the two are on the same host and use the same language*:

  import yaml

  def say(message):
    o = dict(f="say", v=message)
    call(yaml.dump(o))  # call() stands in for whatever sends the request

…meanwhile elsewhere…

  import yaml

  def eval(request):
    o = yaml.safe_load(request.body)
    match o["f"]:
      case "say":
        print(o["v"])

Yes, referring to this as “madness” skips over the fact that you can now scale up your hello world printer across the internet, have auth, rate limit, etc etc. but for so many things the RPC just isn’t needed at all.

As this article gets at, config files are one of those things, and they benefit hugely from the rampant abstraction violation of skipping an intermediary text format between the program doing the configuration and the program doing the work.

*or have internal versions of their APIs available in matching languages.

radarsat1•3mo ago

I got confused reading this because I wasn't sure how I, as a reader with no knowledge of the system under discussion, was supposed to know that all the buckets should be in the same region.

Nor is it clear to me how the "for loop" version would handle the case where exceptionally one bucket is different. Which is a more interesting discussion imho, that's the whole point of having it as a configuration field, after all.

Joker_vD•3mo ago

Like this:

    region = if name == "bravo-hourly" then "us-west" else "eu-west",

radarsat1•3mo ago

I see, not so clean imho but I guess it works.

qezz•3mo ago

The example in the article shows a need for a cartesian product of (bucket name) and (lifetime policy), with a fixed location. Nothing stops you from defining a separate category for the exceptions and just appending them to the resulting list. Or you may want to assign each bucket its own location; then you define it alongside the bucket name.
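The cartesian product with exceptions can be sketched in Python as follows. This is an illustration under assumptions, not the article's RCL: the `REGION_OVERRIDES` table and `build_buckets` are hypothetical, and the `bravo-hourly` exception is the one used in the reply above.

```python
from itertools import product

DATABASES = ["alpha", "bravo"]
RETENTION_DAYS = {"hourly": 4, "daily": 30, "monthly": 365}
DEFAULT_REGION = "eu-west"

# Per-bucket exceptions, kept as data rather than as an inline conditional.
REGION_OVERRIDES = {"bravo-hourly": "us-west"}

def build_buckets():
    """Build the (database x period) product, applying region overrides."""
    buckets = []
    for database, (period, days) in product(DATABASES, RETENTION_DAYS.items()):
        name = f"{database}-{period}"
        buckets.append({
            "name": name,
            "region": REGION_OVERRIDES.get(name, DEFAULT_REGION),
            "delete_after_days": days,
        })
    return buckets
```

Keeping the exceptions in one small table makes the unusual bucket visible at a glance, which addresses the concern that a reader cannot tell which values are meant to be uniform.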

teo_zero•3mo ago

Before going all in with Turing-complete configuration languages, why not simply augment existing declarative formats with parameter & arithmetic expansion à la bash? It's a syntax already familiar to many, and it would mitigate the issues highlighted in TFA.
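As a rough sketch of what such expansion could look like, the following handles `${name}` parameters and `$(( expr ))` arithmetic over integers inside a single config value. It is an assumption-laden illustration, not bash's actual semantics: expressions are parsed with Python's grammar, only `+`, `-`, and `*` are allowed, and an expression that ends in its own closing parenthesis directly before `))` is not handled. The `expand` function is hypothetical.

```python
import ast
import operator
import re

# Only these binary operators are allowed inside $(( ... )).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def _eval_arith(node, params):
    """Evaluate a tiny arithmetic AST: integers, parameter names, + - *."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.Name) and node.id in params:
        return int(params[node.id])
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _eval_arith(node.left, params)
        right = _eval_arith(node.right, params)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")

def expand(text, params):
    """Expand $(( expr )) first, then ${name}, in a single config value."""
    def arith(match):
        tree = ast.parse(match.group(1).strip(), mode="eval")
        return str(_eval_arith(tree.body, params))
    text = re.sub(r"\$\(\((.*?)\)\)", arith, text)
    return re.sub(r"\$\{(\w+)\}", lambda m: str(params[m.group(1)]), text)
```

For instance, `expand("delete_after_seconds: $(( days * 24 * 3600 ))", {"days": 30})` replaces the expression with the computed number of seconds. There are no loops or function definitions, so every expansion terminates.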

texuf•3mo ago

Yeah no. The reason we don't write code in config files is that, if we did, in order to know what you are deploying you have to run the code in your head. Not just this code, the code that Steve wrote six months ago before people noticed he was grossly incompetent and fired. Also worth mentioning, I can run this code in my head. The author can run this code in their head, but not everyone can. And finally, if the author thinks copy pasting yaml files cause bugs, wait till the llms start copy pasting the for loops.

saurik•3mo ago

Or you could run the code on the computer, instead of your head, and then view it on the computer as well... the same computer you were going to view the file in the first place? These files aren't being printed on paper... you have access to a computer: use it!

vivzkestrel•3mo ago

The world is growing tired of YAML so someone invented MAML yesterday on HN https://news.ycombinator.com/item?id=45562056

theknarf•3mo ago

Big fan of HCL as the configuration language to rule them all; being able to abstract stuff into reusable modules at the configuration language level is great. And there are implementations of HCL in multiple programming languages.

dapperdrake•3mo ago

A "universal language" is a language with (a) sequences of instructions, (b) conditionals, ifs, branches, and (c) loops, repetition, iteration, y-combinator (the math one), recursion.

They are Turing complete. Leaving off any of the three features (difficult for the y-combinator) yields a non-Turing complete language, for example Sieve Script for filtering email lacks (c).

What the definition doesn’t cover is parameterization. Some people call this abstraction. Technically, lisp macros also fall under parameterization.

With parameterization it really seems like any and all functions, mappings, and operators that fail to be injective and composable are counter-productive in practice. The math term is "generative effects".

There seems to be this continuous contention between one side that wants its configuration files to basically be CSV and another side that, effectively, wants a full-blown programming language as their "configuration language".

Both sides have a point. Just happen to land in the "essentially CSV" camp myself. Even a lisp macro can generate CSV.

(Yes the phrase "universal language" is very difficult to search for if the specific academic term above is under consideration.)

Twey•3mo ago

When Kubernetes exposed its APIs as declarative configuration objects, I get the impression that they didn't originally mean for people to write the configuration by hand. The YAML/JSON/… is a conveniently universal interchange format for interfacing with Kubernetes from bindings, and representing target state as documents is just a good way to encode idempotence in the API.

I'd be interested to hear from someone involved with early Borg/K8s development what the original intention was.

dizlexic•3mo ago

That's why I exclusively write my config files in PHP and output JSON /s

xg15•3mo ago

Friends don't let friends write turing complete config languages.
