aliases:
  common-env: &common-env
    key1: value1
    key2: value2
tasks:
  - key: some-task
    run: ...
    env:
      <<: *common-env
https://www.rwx.com/docs/mint/aliases
One day we might even see a for-loop in CSS...
TBH it's getting a bit exhausting watching us go through this hamster wheel again and again and again.
I hypothesize that a Turing-complete language for something like CSS, one that included deep tracking under the hood of where values come from and where they go, would be very useful. That is, you could point at a particular part of the final output and the language runtime could give you a complete accounting of where it came from and what went into the decisions that produced it. That could give us the auditability we really want from "declarative" languages along with the full power of the programming languages we clearly want. However, I don't have the time to try to manifest such a thing myself, and I don't know of any existing language that does what I'm thinking of. Some of the more powerful languages could theoretically do it as a library. It's not entirely unlike the auditing monad I mention towards the end of https://www.jerf.org/iri/post/2958/ . It's not something I'd expect a general-purpose language to do by default, since it would have bad general-purpose performance, but I think for the specialized case of a Turing-complete configuration language it could have value, and one could always run it as a debugging option and have an optimized code path that didn't track the sources of everything.
Now if only they supported a paths filter for the `workflow_call` [2] event in addition to push/pull_request, my life would be a lot easier. Nontrivial repos have an unfortunate habit of building some sort of broken version of change detection themselves.
The limit of 20 unique workflow calls is quite low too, but I searched the docs for a source and maybe they have removed it? It used to say:
> You can call a maximum of 20 unique reusable workflows from a single workflow file.
but now it's a max of 4 nested workflows without loops, which gives a lot of power for the more complex repos [3]. Ooh. Need to go test this.
[1] https://docs.github.com/en/actions/reference/workflows-and-a...
[2] https://docs.github.com/en/actions/reference/workflows-and-a...
[3] https://docs.github.com/en/actions/how-tos/reuse-automations...
Generate it from Dhall, or cue, or python, or some real language that supports actual abstractions.
If your problem is you want to DRY out yaml, and you use more yaml features to do it, you now have more problems, not fewer.
Above a certain level of complexity, sure. But having nothing in between is an annoying state of affairs. I use anchors in Gitlab pipelines and I hardly curse their names.
Asking the team to add a new build dependency, learn a new language, and add a new build step would create considerably more problems, not fewer. Used sparingly and as needed, YAML anchors are quite easy to read. A good editor will even allow you to jump to the source definition just as it would any other variable.
Being self-contained without any additional dependencies is a huge advantage, particularly for open source projects, IMHO. I'd wager very few people are going to learn Dhall in order to fix an issue with an open source project's CI.
I find it an absolute shame that languages like Dhall did not become more popular earlier. Now everything in devops is YAML, and I think many developers pick YAML configs not for good reasons but because they treat its ubiquity as reason enough.
yaml 1.2 was released in 2009, and it fixed this problem. this is an implementation issue.
If you have two workflows... one to handle a PR creation/update and another to address the merge operation, it is like pulling teeth to get the final commit properly identified so you can grab any uploaded artifacts from the PR workflow.
This is terrible advice from a security standpoint: given that env variables are often used for secret data, you really _don't_ want to set them at the top level. Secrets should be scoped as narrowly as possible!
For example, if you have a few jobs, and some of them need to download some data in their first step (which needs a secret), then your choices are (a) copy-paste the "env" block 3 times, once into each step, (b) use the new YAML anchor, or (c) set the secret at top-level scope. It is pretty clear to me that (c) is the worst idea, security-wise: it makes the secret available to every step in the workflow, making it much easier for malware to exfiltrate.
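To make option (b) concrete, here is a minimal sketch of what it could look like. The job names, step names, the `DATA_TOKEN` secret, and `./download.sh` are all made up for illustration, and it assumes the runner accepts plain anchors and aliases (no merge keys needed):

```yaml
# Hypothetical workflow: names, secret, and script are illustrative only.
on: push
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Download data
        # Define the env block once and give it an anchor.
        env: &download-env
          DATA_TOKEN: ${{ secrets.DATA_TOKEN }}
        run: ./download.sh
      - name: Build
        # No env here, so this step never sees the secret.
        run: make
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Download data
        # Reuse the same block via an alias; still scoped to this one step.
        env: *download-env
        run: ./download.sh
      - name: Test
        run: make test
```

The secret stays scoped to the two download steps, and there is only one place to edit if the env block changes.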
Language implementations for yaml vary _wildly_.
What does the following parse as:
some_map:
  key: value
  no: cap
If I google "yaml online" and paste it in, one gives me:
{'some_map': {False: 'cap', 'key': 'value'}}
The other gives me:
{'some_map': {'false': 'cap', 'key': 'value'}}
... and neither gives what a human probably intended, huh?
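For what it's worth, the usual defensive workaround is to quote the key. A minimal sketch, reusing the same map as above:

```yaml
some_map:
  key: value
  "no": cap   # quoted scalars are never resolved to booleans, so this is the string "no"
```

That should parse the same way regardless of whether the implementation follows YAML 1.1 or 1.2 boolean rules, at the cost of having to remember to do it.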
Plus it has exactly enough convenience-feature-related sharp edges to be risky to hand to a newbie, while wearing the dress of something that should be too bog-simple to have that problem. I, too, enjoy languages that arbitrarily decide the Norwegian TLD is actually a Boolean "false."
GitHub Actions has a lot of rules, logic, and multiple sublanguages in lots of places (e.g. conditions, shell scripts, etc.). YAML is completely superficial; XML would be an improvement due to less whitespace sensitivity alone.
That is the key function any serious CI platform needs to tackle to get me interested. FORCE me to write something that can run locally. I'll accept using containers, or maybe even VMs, but make sure that whatever I build for your server ALSO runs on my machine.
I absolutely detest working on GitHub Actions because all too often it ends up requiring that I create a new repo where I can commit to master (because for some reason everybody loves writing actions that only work on master). Which means I have to move all the fucking secrets too.
Solve that for me PLEASE. Don't give me more YAML features.
We write Bash or Python, and our tool will produce the YAML pipeline reflecting it.
So we don't need to maintain YAML in an over-complicated format.
The resulting YAML is not meant to be read by an actual human since it's absolute garbage, but the code we want to run runs when we want, without our having to maintain the YAML.
And we can easily test it locally.
I really enjoyed working with the Earthfile format[1] used for Earthly CI, which unfortunately seems like a dead end now. It's a mix of Dockerfile and Makefile, which made it very familiar to read and write. Best of all, it allowed running the pipeline locally exactly as it would run remotely, which made development and troubleshooting so much easier. The fact GH Actions doesn't have something equivalent is awful UX[2].
Honestly, I wish the industry hadn't settled on GitHub and GH Actions. We need better tooling and better stewards of open source than a giant corporation who has historically been hostile to open source.
[1]: https://earthly.dev/earthfile
[2]: Yes, I'm aware of `act`, but I've had nothing but issues with it.
Although, I think it is generally accepted practice to prefer declarative configuration over imperative configuration? Maybe that is, in part, what the article is getting at?
I'm certainly willing to believe that yaml is not the ideal answer but unless we're comparing it to a concrete alternative, I feel like this is just a "grass is always greener" type take.
I am not sure you can do this whilst having the granular job reporting (i.e. either you need one YAML block per job or you have all your jobs in one single 'status' item?) Is it actually doable?
Some do just that: dagger.io. It is not all roses but debugging is certainly easier.
Once you allow setting and reading of variables in a configuration file, you lose the safety that makes the format useful. You might as well be using a bash script at that point.
Give me a proper platform that I can run locally on my development machine.
But if those anchors are a blessed, standard YAML feature that YAML tools can make real assertions about, unlike the ${{}} stuff, where you're basically stuck in a commit-push-run-wait loop without any proper debug tools besides prints?
Then yes, they should use them.
Custom YAML anchors with custom support and surprise corner cases: bad.
- The complaint is Github using a non-standard, custom fork of yaml
- This makes it harder to develop linters/security tools (as those have to explicitly deal with all features available)
- The author of this blogpost is also the author of zizmor, the most well-known Github Actions security linter (also the only one I'm aware of)
Using anchors would have improved the security of this, as well as the maintenance. The examples cited don't remotely demonstrate the cases where anchors would have been useful in GA.
I agree that YAML is a poor choice of format regardless but still, anchor support would have benefitted a number of projects ages ago.
I think they should be supported because it's surprising and confusing if you start saying 'actually, it's a proprietary subset of YAML', no more reason needed than that.
(Also, as a personal bias, merge keys are really bad because they are ambiguous, and I haven't implemented them in my C++ yaml library (yaml-cpp) because of that.)
[1]: https://ktomk.github.io/writing/yaml-anchor-alias-and-merge-...
First, he can just not use the feature, not advocate for its removal.
Second, his example alternative is wrong: it would set variables for all steps, not just those 2, he didn't think of a scenario where there are 3 steps and you need to have common envs in just 2 of them.
> First, he can just not use the feature, not advocate for its removal.
I maintain a tool that ~thousands of projects use to analyze their workflows and actions. I can avoid using anchors, but I can't avoid downstreams using them. That's why the post focuses on static analysis challenges.
> Second, his example alternative is wrong: it would set variables for all steps, not just those 2, he didn't think of a scenario where there are 3 steps and you need to have common envs in just 2 of them.
This is explicitly addressed immediately below the example.
baobun•1h ago
OP's main argument seems to be "I don't have a use for it and find it hard to read so it should be removed".
woodruffw•1h ago
I don't think this is a fair characterization: it's not that I don't have a use for it, but that I think the uses are redundant with existing functionality while also making static and human analysis of workflows harder.
willseth•57m ago
woodruffw•45m ago
Or in other words: if your problem is DRYness, GitHub should be fixing or enhancing the ~dozen other ways in which the components of a workflow shadow and scope with each other. Adding a new cross-cutting form of interaction between components makes the overall experience of using GitHub Actions less consistent (and less secure, per points about static analysis challenges) at the benefit of a small amount of deduplication.
btreecat•37m ago
woodruffw•31m ago
(As the post notes, neither I nor GitHub appears to see full compliance with YAML 1.1 to be an important goal: they still don't support merge keys, and I'm sure they don't support all kinds of minutiae like non-primitive keys that make YAML uniquely annoying to analyze. Conforming to a complex specification is not inherently a good thing; sometimes good engineering taste dictates that only a subset should be implemented.)
btreecat•28m ago
That's a long way to say "yes, actually"
woodruffw•25m ago
"Because I don't like it" makes it sound like I don't have a technical argument here, which I do. Do you think it's polite or charitable to reduce people's technical arguments into "yuck or yum" statements like this?
VGHN7XDuOXPAzol•24m ago
Kind of a hard disagree here; if you don't want to conform to a specification, don't claim that you're accepting documents from that specification. Call it github-flavored YAML (GFY) or something and accept a different file extension.
https://github.com/actions/runner/issues/1182
> YAML 1.1 to be an important goal: they still don't support merge keys
right, they don't do merge keys because it's not in YAML 1.2 anymore. Anchors are, however. They haven't said that noncompliance with YAML 1.2 spec is intentional
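For anyone unsure of the distinction, a minimal sketch (the key names are made up for illustration):

```yaml
defaults: &defaults
  retries: 3
  timeout: 30

# Plain alias: reuses the anchored node as-is. Anchors and aliases are core YAML.
job_a: *defaults

# Merge key: pulls in the anchored mapping's keys, and keys written here win.
# "<<" comes from a YAML 1.1 type extension and is not in the YAML 1.2 core schema.
job_b:
  <<: *defaults
  timeout: 60
```

A parser without merge-key support would still accept `job_a`, but would treat `<<` in `job_b` as an ordinary string key (or reject it).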
woodruffw•18m ago
Sure, I wouldn't be upset if they did this.
To be clear: there aren't many fully conforming YAML 1.1 and 1.2 parsers out there: virtually all YAML parsers accept some subset of one or the other (sometimes a subset of both), and virtually all of them emit the JSON object model instead of the internal YAML one.
GuinansEyebrows•53m ago
woodruffw•52m ago
(This post is written from my perspective as a static analysis tool author. It's my opinion from that perspective that the benefits of anchors are not worth their costs in the specific context of GitHub Actions, for the reasons mentioned in the post.)
detaro•46m ago
woodruffw•40m ago
That in turn means that there's no way to construct a source span back to the anchor itself, because the parsed representation doesn't know where the anchor came from (only that it was flattened).
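A rough illustration of the flattening problem, as a sketch with hypothetical key names: the two documents below come out as the same object model once aliases are expanded.

```yaml
# As written: one anchored definition, one alias.
build:
  env: &shared
    TOKEN: abc
test:
  env: *shared
---
# What a parser that flattens to the JSON object model hands back:
# two independent copies, with no record that test.env was an alias.
build:
  env:
    TOKEN: abc
test:
  env:
    TOKEN: abc
```

A finding on `test.env` can then only be reported against the expanded copy, not against the line where `&shared` was actually defined.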
VGHN7XDuOXPAzol•28m ago
I think it makes way more sense for GitHub to support YAML anchors given they are after all part of the YAML spec. Otherwise, don't call it YAML! (This was a criticism of mine for many years, I'm very glad they finally saw the light and rectified this bug)
woodruffw•21m ago
Yes, it's just difficult. The point made in the post isn't that it's impossible, but that it significantly changes the amount of "ground work" that static analysis tools have to do to produce useful results for GitHub Actions.
> I think it makes way more sense for GitHub to support YAML anchors given they are after all part of the YAML spec. Otherwise, don't call it YAML! (This was a criticism of mine for many years, I'm very glad they finally saw the light and rectified this bug)
It's worth noting that GitHub doesn't support other parts of the YAML spec: they intentionally use their own bespoke YAML parser, and they don't have the "Norway" problem because they intentionally don't apply the boolean value rules from YAML.
All in all, I think conformance with YAML is a red herring here: GitHub Actions is already its own thing, and that thing should be easy to analyze. Adding anchors makes it harder to analyze.
VGHN7XDuOXPAzol•1m ago
maybe, but not entirely sure. 'Two wrongs don't make a right' kind of thinking on my side here.
But if they call it GFY and do what they want, then that would probably be better for everyone involved.
> they don't have the "Norway" problem because they intentionally don't apply the boolean value rules from YAML.
I think this is YAML 1.2. I have not done or seen a breakdown to see if GitHub is aiming for YAML 1.2 or not but they appear to think that way, given the discussion around merge keys
nirvdrum•1h ago
Half the argument against supporting YAML anchors appears to boil down to some level of tool breakage. While you can rely on simplifying assumptions, you take a risk that your software breaks when that assumption is invalidated. I don't think that's a reason to stop evolving software.
I've never seen a project use any of the tools the author listed, but I have seen duplicated config. That's not to say the tools have no value, but rather I don't want to be artificially restricted to better support tools I don't use. I'll grant that the inability to merge keys isn't ideal, but I'll take what I can get.