What Kubernetes is missing most is a 10-year track record of simplicity/stability. What it needs most to thrive is a better reputation for being hard to foot-gun yourself with.
It's just not a compelling business case to say "Look at what you can do with Kubernetes, and you only need a full-time team of 3 engineers dedicated to this technology, at the cost of a million a year, to get bin-packing to the tune of $40k."
For the most part Kubernetes is becoming the common tongue, despite all the chaotic plugins and customizations that interact with each other in a combinatoric explosion of complexity/risk/overhead. A 2.0 is what I'd propose if I were trying to kill Kubernetes.
Kubernetes 2.0 should be a boring pod scheduler with some RBAC around it. Let folks swap out the abstractions if they need it instead of having everything so tightly coupled within the core platform.
Kubernetes clusters shouldn't be bespoke and weird, with behaviors that change based on what flavor of plugins you added. That is antithetical to the principles behind the workloads you're trying to manage. You should be able to headshot the whole thing with ease.
Service discovery is just one of many things that should be a different layer.
Hard agree. It's like Jenkins: good idea, but it's not portable.
Having regretfully operated both k8s and Jenkins, I fully agree with this, they have some deep DNA in common.
Swiss Army Buggy Whips for Everyone!
I get that it's not for everyone, I'd not recommend it for everyone. But once you start getting a pretty diverse ecosystem of services, k8s solves a lot of problems while being pretty cheap.
Storage is a mess, though, and something that really needs to be addressed. I typically recommend people wanting persistence to not use k8s.
I have actually come to wonder if this is an AWS problem, and not a Kubernetes problem. I mention this because the CSI controllers seem to behave sanely, but they are only as good as the requests being fulfilled by the IaaS control plane. I secretly suspect that EBS just wasn't designed for such a hot-swap world.
Now, I posit this because I haven't had to run clusters in Azure or GCP, so I don't know if my theory has legs.
I guess the counter-experiment would be to forgo the AWS storage layer and try Ceph or Longhorn, but no company I've ever worked at wants to blaze that trail, so they just build up institutional tribal knowledge about treating PVCs with kid gloves.
Or is your assertion that Kubernetes should be its own storage provider and EBS can eat dirt?
The problem is k8s is both an orchestration system and a service provider.
Grid/batch/tractor/cube are all much, much simpler to run at scale. Moreover, they can support complex dependencies (but mapping storage is harder).
But k8s fucks around with DNS and networking, and disables swap.
Making a simple deployment is fairly simple.
But if you want _any_ kind of CI/CD you need Flux; for any kind of config management you need Helm.
Absurdly wrong on both counts.
Sure, but then one of those third party products (say, X) will catch up, and everyone will start using it. Then job ads will start requiring "10 years of experience in X". Then X will replace the core orchestrator (K8s) with their own implementation. Then we'll start seeing comments like "X is a horribly complex, bloated platform which should have been just a boring orchestrator" on HN.
Right now, running K8s on anything other than cloud providers and toys (k3s/minikube) is a disaster waiting to happen unless you're a really seasoned infrastructure engineer.
Storage/state is decidedly not a solved problem, debugging performance issues in your longhorn/ceph deployment is just pain.
Also, I don't think we should be removing YAML; we should instead get better at using it as an ILR (intermediate language representation) and generating the YAML that we want, instead of trying to do some weird in-place generation (Argo/Helm templating). Kubernetes sacrificed a lot of simplicity to be eventually consistent with manifests, and our response was to ensure we use manifests as little as possible, which feels incredibly bizarre.
Also, the design of k8s networking feels like it fits ipv6 really well, but it seems like nobody has noticed somehow.
* Uses local-only storage provider by default for PVC
* Requires entire cluster to be managed by k3s, meaning no freebsd/macos/windows node support
* Master TLS/SSL Certs not rotated (and not talked about).
k3s is very much a toy - a nice toy though, very fun to play with.
My main pain point is, and always has been, helm templating. It's not aware of YAML or k8s schemas and puts the onus of managing whitespace and syntax onto the chart developer. It's pure insanity.
At one point I used a local Ansible playbook for some templating. It was great: it could load resource template YAMLs into a dict, read separately defined resource configs, and then set deeply nested keys in said templates and spit them out as valid YAML. No helm `indent` required.
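A rough sketch of that same dict-first approach, here in Python rather than Ansible (the file name and field paths are just illustrative, and it assumes PyYAML is installed):

    import yaml

    # Load the resource template as data, not text.
    with open("deployment-template.yaml") as f:   # hypothetical template file
        resource = yaml.safe_load(f)

    def set_path(obj, dotted_path, value):
        """Set a deeply nested key like 'metadata.labels.app'."""
        keys = dotted_path.split(".")
        for key in keys[:-1]:
            obj = obj.setdefault(key, {})
        obj[keys[-1]] = value

    set_path(resource, "metadata.labels.app", "my-service")
    set_path(resource, "spec.replicas", 3)

    # The dumper guarantees valid YAML; no `indent` filters, no whitespace counting.
    print(yaml.safe_dump(resource, sort_keys=False))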
https://artifacthub.io/docs/topics/repositories/
You can do the same with just about any K8s related artifact. We always encourage projects to go through the process but sometimes they need help understanding that it exists in the first place.
Artifacthub is itself an incubating project in the CNCF, ideas around making this easier for everyone are always welcome, thanks!
(Disclaimer: CNCF Staff)
Including ingress-nginx? Per OP, it's not marked as verified. If even the official components don't bother, it's hard to recommend it to third parties.
For me, the only thing that really changed was LLMs - ChatGPT is exceptional at understanding and generating valid k8s configs (much more accurately than it can do coding). It's still complex, but it feels like I have a second brain to look at it now.
I'd agree that YAML isn't a good choice, but neither is HCL. Ever tried reading Terraform? Yeah, that's bad too. Inherently we need a better way to configure Kubernetes clusters, and changing out the language only does so much.
IPv6, YES, absolutely. Everything Docker, container and Kubernetes should have been IPv6-only internally from the start. Want IPv4? That should be handled by a special-case ingress controller.
The longer I look at k8s, the more I see it as "batteries not included" around storage, networking, etc., with the result being that the batteries come with a bill attached from AWS, GCP, etc. K8s is less of an open source project and more a way to encourage dependency on these extremely lucrative gap-filler services from the cloud providers.
Is that true? Is no one really using it?
I think one thing k8s would need is some obvious answer for stateful systems (at scale, not MySQL at a startup). I think there are some ways to do it? Where I work, basically everything is on k8s, and then all the databases are on their own crazy special systems to support, because they insist it's impossible and costs too much. I work in the worst of all worlds now, supporting this.
Re: comments that k8s should just schedule pods: Mesos with Aurora or Marathon was basically that. If people wanted that, those would have done better. The biggest users of Mesos switched to k8s.
1. etcd did an fsync on every write and required all nodes to complete a write to report a write as successful. This was not configurable and far higher a guarantee than most use cases actually need - most Kubernetes users are fine with snapshot + restore an older version of the data. But it really severely impacts performance.
2. At the time, etcd had a hard limit of 8GB. Not sure if this is still there.
3. Vanilla etcd was overly cautious about what to do if a majority of nodes went down. I ended up writing a wrapper program to automatically recover from this in most cases, which worked well in practice.
In conclusion, there was no situation where I saw etcd used where I wouldn't have preferred a highly available SQL DB. Indeed, k3s got it right by using SQLite for small deployments.
Of course configurability is good (e.g. for automated fast tests you don't need it), but safe is a good default here, and if somebody sets up a Kubernetes cluster, they can and should afford enterprise SSDs where fsync of small data is fast and reliable (e.g. 1000 fsyncs/second).
I didn't! Our business DR plan only called for us to restore to an older version with short downtime, so fsync on every write on every node was a reduction in performance for no actual business purpose or benefit. IIRC we modified our database to run off ramdisk and snapshot every few minutes which ran way better and had no impact on our production recovery strategy.
> if somebody sets up a Kubernetes cluster, they can and should afford enterprise SSDs where fsync of small data is fast and reliable
At the time one of the problems I ran into was that public cloud regions in southeast asia had significantly worse SSDs that couldn't keep up. This was on one of the big three cloud providers.
1000 fsyncs/second is a tiny fraction of the real world production load we required. An API that only accepts 1000 writes a second is very slow!
Also, plenty of people run k8s clusters on commodity hardware. I ran one on an old gaming PC with a budget SSD for a while in my basement. Great use case for k3s.
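If you want to see where your own disk lands relative to the ~1000 fsyncs/second figure above, here is a rough probe (numbers vary wildly between consumer and enterprise SSDs, and real etcd batches WAL writes, so treat it as order-of-magnitude only):

    import os, tempfile, time

    def fsync_rate(n=200, path="."):
        """Count how many small synced writes per second this disk sustains."""
        fd, name = tempfile.mkstemp(dir=path)
        try:
            start = time.perf_counter()
            for _ in range(n):
                os.write(fd, b"x" * 128)  # tiny write, roughly a WAL entry
                os.fsync(fd)              # force it to stable storage
            return n / (time.perf_counter() - start)
        finally:
            os.close(fd)
            os.unlink(name)

    print(f"~{fsync_rate():.0f} fsyncs/second")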
So, to circle back to your original point: rke2 (Apache 2) is a fantastic, airgap-friendly, intelligence-community-approved distribution, and pairs fantastically with Rancher Desktop (also Apache 2). It's not the Kubernetes part of that story which is hard, it's the "yes, but" part of the Lego build.
- https://github.com/rancher/rke2/tree/v1.33.1%2Brke2r1#quick-...
- https://github.com/rancher-sandbox/rancher-desktop/releases
Last time I tried loading the OpenAPI schema in the Swagger UI on my work laptop (this was ~3-4 years ago, and I had an 8th-gen Core i7 with 16GB RAM) it hung my browser, leading to a tab crash.
Points 4 and 5 probably don't require 2.0 - they can be easily added within the existing API via a KEP (CRI-O already does userns configuration based on annotations).
Why not use protobuf, or similar interface definition languages? Then let users specify the config in whatever language they are comfortable with.
The idea that someone who works with JavaScript all day might find HCL confusing seems hard to imagine to me.
To be clear, I am talking about the syntax and data types in HCL, not necessarily the way Terraform processes it, which I admit can be confusing/frustrating. But Kubernetes wouldn’t have those pitfalls.
    outer {
      example {
        foo = "bar"
      }
      example {
        foo = "baz"
      }
    }
it reminds me of the insanity of toml [lol]
    [[whut]]
    foo = "bar"

    [[whut]]
    foo = "baz"
only at least with TOML I can $(python3.13 -c 'import tomllib, sys; print(tomllib.loads(sys.stdin.read()))') to find out, but with HCL, too bad.
I see some value instead. Lately I've been working on Terraform code to bring up a whole platform in half a day (AWS sub-account, EKS cluster, a managed node group for Karpenter, a Karpenter deployment, ingress controllers, an LGTM stack, public/private DNS zones, cert-manager and a lot more) and I did everything in Terraform, including Kubernetes resources.
What I appreciated about creating Kubernetes resources (and Helm deployments) in HCL is that it's typed and has a schema, so any IDE capable of talking to an LSP (language server protocol - I'm using GNU Emacs with terraform-ls) can provide meaningful auto-completion as well as proper syntax checking (I don't need to apply something to see it fail; Emacs, via the language server, can already tell me what I'm writing is wrong).
I really don't miss having to switch between my IDE and the Kubernetes API reference to make sure I'm filling each field correctly.
Absolutely loathsome syntax IMO
I use Terranix to make config.tf.json which means I have the NixOS module system that's composable enough to build a Linux distro at my fingertips to compose a great Terraform "state"/project/whatever.
It's great to be able to run some Python to fetch some data, dump it in JSON, read it with Terranix, generate config.tf.json and then apply :)
I couldn't tell you exactly, but modules always end up either not exposing enough or exposing too much. If I were to write my module with Terranix, I could easily replace any value in any resource from the module I'm importing using 'resource.type.name.parameter = lib.mkForce "overriddenValue";' without having to expose that parameter in the module "API".
The nice thing is that it generates "Terraform"(config.tf.json) so the supremely awesome state engine and all API domain knowledge bound in providers work just the same and I don't have to reach for something as involved as Pulumi.
You can even mix Terranix with normal HCL, since config.tf.json is valid in the same project as HCL. A great way to get started is to generate your provider config and other things where you'd reach for Terragrunt and friends. Then you can start making options that make resources at your own pace.
The Terraform LSP sadly doesn't read config.tf.json yet, so you'll get warnings regarding undeclared locals and such, but for me it's worth it. I generally write tf/tfnix with the provider docs open, and the languages (Nix and HCL) are easy enough to write without a full LSP.
https://terranix.org/ says it better than me, but by doing it with Nix you get programmatic access to the biggest package library in the world to use at your discretion (build scripts to fetch values from weird places, run impure scripts with null_resource or its replacements) and an expressive functional programming language where you can do recursion and stuff; you can use derivations to run any language to transform strings with ANY tool.
It's like Terraform "unleashed" :) Forget "dynamic" blocks, bad module APIs and hacks (While still being able to use existing modules too if you feel the urge).
If the documentation and IDE story for kustomize was better, I'd be its biggest champion
And since it's Terraform you can create resources using any provider in the registry to create resources according to your Kubernetes objects too, it can technically replace things like external-dns and similar controllers that create stuff in other clouds, but in a more "static configuration" way.
Edit: This works nicely with Gitlab Terraform state hosting thingy as well.
The setup with Terranix sounds cool! I am pretty interested in build system type things myself, I recently wrote a plan/apply system too that I use to manage SQL migrations.
I want to learn Nix, but I think that, like Rust, it's just a bit too wide/deep for me to approach on my own time without a tutor/co-worker or a forcing function like a work project to push me through the initial barrier.
Try using something like devenv.sh initially just to bring tools into $PATH in a distro agnostic & mostly-ish MacOS compatible way (so you can guarantee everyone has the same versions of EVERYTHING you need to build your thing).
Learn the language basics after it brings you value already, then learn about derivations and then the module system which is this crazy composable multilayer recursive magic merging type system implemented on top of Nix, don't be afraid to clone nixpkgs and look inside.
Nix derivations are essentially Dockerfiles on steroids, but Nix language brings /nix/store paths into the container, sets environment variables for you and runs some scripts, and all these things are hashed so if any input changes it triggers automatic cascading rebuilds, but also means you can use a binary cache as a kind of "memoization" caching thingy which is nice.
It's a very useful tool, it's very non-invasive on your system (other than disk space if you're not managing garbage collection) and you can use it in combination with other tools.
Makes it very easy to guarantee your DevOps scripts run exactly your versions of all CLI tools and build systems and whatever, even if the final piece isn't through Nix.
Look at "pgroll" for Postgres migrations :)
I'm not familiar with HCL so I'm struggling to find much here that would be conclusive, but a lot of this thread sounds like "HCL's features that YAML does not have are sub-par and not sufficient to let me only use HCL" and... yeah, you usually can't use YAML that way either, so I'm not sure why that's all that much of a downside?
I've been idly exploring config langs for a while now, and personally I tend to just lean towards JSON5 because comments are absolutely required... but support isn't anywhere near as good or automatic as YAML :/ HCL has been on my interest-list for a while, but I haven't gone deep enough into it to figure out any real opinion.
From your lips to God's ears. And, as they correctly pointed out, this work is already done, so I just do not understand the holdup. Folks can continue using etcd if it's their favorite, but mandating it is weird. And I can already hear the butwhataboutism yet there is already a CNCF certification process and a whole subproject just for testing Kubernetes itself, so do they believe in the tests or not?
> The Go templates are tricky to debug, often containing complex logic that results in really confusing error scenarios. The error messages you get from those scenarios are often gibberish
And they left off that it is crazypants to use a textual templating language for a whitespace sensitive, structured file format. But, just like the rest of the complaints, it's not like we don't already have replacements, but the network effect is very real and very hard to overcome
That barrier of "we have nicer things, but inertia is real" applies to so many domains, it just so happens that helm impacts a much larger audience
* kpt is still not 1.0
* both kustomize and kpt require complex setups to programmatically generate configs (even for simple things like replicas = replicas * 2; see the sketch below)
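By contrast, once manifests are treated as plain data, that kind of transform is a few lines. A rough sketch, assuming PyYAML and manifests piped through stdin (e.g. the output of a kustomize build):

    import sys
    import yaml

    # Read a multi-document manifest stream, double Deployment replicas, re-emit.
    docs = list(yaml.safe_load_all(sys.stdin))
    for doc in docs:
        if doc and doc.get("kind") == "Deployment":
            spec = doc.setdefault("spec", {})
            spec["replicas"] = spec.get("replicas", 1) * 2
    yaml.safe_dump_all(docs, sys.stdout, sort_keys=False)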
- I get HCL, types, resource dependencies, data structure manipulation for free
- I use a single `tf apply` to create the cluster, its underlying compute nodes, related cloud stuff like S3 buckets, etc; as well as all the stuff running on the cluster
- We use terraform modules for re-use and de-duplication, including integration with non-K8s infrastructure. For example, we have a module that sets up a Cloudflare ZeroTrust tunnel to a K8s service, so with 5 lines of code I can get a unique public HTTPS endpoint protected by SSO for whatever running in K8s. The module creates a Deployment running cloudflared as well as configures the tunnel in the Cloudflare API.
- Many infrastructure providers ship signed well documented Terraform modules, and Terraform does reasonable dependency management for the modules & providers themselves with lockfiles.
- I can compose Helm charts just fine via the Helm Terraform provider if necessary. Many times I see Helm charts that are just "create namespace, create foo-operator deployment, create custom resource from chart values" (like Datadog). For these I opt to just install the operator & manage the CRD from Terraform directly, or via a thin Helm pass-through chart that just echoes whatever HCL/YAML I put in from Terraform values.
Terraform’s main weakness is orchestrating the apply process itself, similar to k8s with YAML or whatever else. We use Spacelift for this.
& get automatic scaling out of the box, etc. A more simplified flow rather than wrangling YAML or HCL.
In short, imagine if k8s was a 2-3 (max 5) line docker-compose-like file.
I'd like to see more emphasis on UX for v2 for the most common operations, like deploying an app and exposing it, then doing things like changing service accounts or images without having to drop into kubectl edit.
Given that LLMs are it right now, this probably won't happen, but no harm in dreaming, right?
Even Terraform seems to live on just a single layer and was relatively straightforward to learn.
Yes, I am in the middle of learning K8s so I know exactly how steep the curve is.
Much of the complexity then comes from the enormous amount of resource types - including all the custom ones. But the basic idea is really pretty small.
I find Terraform much more confusing - there's a spec, and the real world... and then an opaque blob of something I don't understand that Terraform sticks in S3 or your file system, and then... presumably something similar to a one-shot reconciler that wires that all together each time you plan and apply?
There's the state file (base commit: what the system looked like the last time Terraform successfully executed), the current system (the main branch, which might have changed since you "branched off"), and the Terraform files (your branch).
Running terraform then merges your branch into main.
Now that I’m writing this down, I realize I never really checked if this is accurate, tf apply works regardless of course.
You skipped the { while true; do tofu plan; tofu apply; echo "well shit"; patch; done; } part since the providers do fuck-all about actually, no kidding, saying whether the plan could succeed
I agree with you on a human level. Operators and controllers remind me of COM and CORBA, in a sense. They are highly abstract things that are intrinsically so flexible that they allow judgement (and misjudgement) in design.
For simple implementations, I'd want k8s-lite, something more opinionated and less flexible, which doesn't allow for as much shooting oneself in the foot. For very complex implementations, though, I've felt existing abstractions to be limiting. There is a reason why a single cluster is sometimes the basis for cell boundaries in cellular architectures.
I sometimes wonder if one single system - kubernetes 2.0 or anything else - can encompass the full complexity of the problem space while being tractable to work with by human architects and programmers.
You seem to want something like https://skateco.github.io/ (still compatible to k8s manifests).
Or maybe even something like https://uncloud.run/
Or if you still want real certified Kubernetes, but small, there is https://k3s.io/
This whole thing started scratching my own itch of wanting an orchestrator that I can confidently stand up, deploy to, then forget about.
Things we are aiming to improve:
* Globally distributed
* Lightweight: can easily run as a single binary on your laptop while still scaling to thousands of nodes in the cloud
* Tailnet as the default network stack
* Bittorrent as the default storage stack
* Multi-tenant from the ground up
* Live migration as a first-class citizen
Most of these needs were born out of building modern machine learning products, and the subsequent GPU scarcity. With ML taking over the world though this may be the norm soon.
Non-requirement?
> * Tailnet as the default network stack
That would probably be the first thing I look to rip out if I ever was to use that.
Kubernetes assuming the underlying host only has a single NIC has been a plague for the industry, setting it back ~20 years and penalizing everyone that's not running on the cloud. Thank god there are multiple CNI implementations.
Only recently, with Multus (https://www.redhat.com/en/blog/demystifying-multus), has some sense seemed to be coming back into that part of the infrastructure.
> * Multi-tenant from the ground up
How would this be any different from kubernetes?
> * Bittorrent as the default storage stack
Might be interesting, unless you also mean seeding public container images. Egress traffic is crazy expensive.
> the first thing I look to rip out
This only shows how varied the requirements are across the industry. One size does not fit all, hence multiple materially different solutions spring up. This is only good.
Since k8s manifests are a language, there can be multiple implementations of it, and multiple dialects will necessarily spring up.
Also, ohgawd please never ever do this ever ohgawd https://github.com/agentsea/nebulous/blob/v0.1.88/deploy/cha...
As someone who has the same experience you described with janky sidecars blowing up normal workloads, I'm violently anti service-mesh. But, cert expiry and subjectAltName management is already hard enough, and you would want that to happen for every pod? To say nothing of the TLS handshake for every connection?
1. Helm: make it official, ditch the text templating. The Helm workflow is okay, but templating text is cumbersome and error-prone. What we should be doing instead is patching objects. I don't know how, but I should be setting fields, not making sure my values contain text that is correctly indented (how many spaces? 8? 12? 16?); see the sketch after this list.
2. Can we get a rootless kubernetes already, as a first-class citizen? This opens a whole world of possibilities. I'd love to have a physical machine at home where I'm dedicating only an unprivileged user to it. It would have limitations, but I'd be okay with it. Maybe some setuid-binaries could be used to handle some limited privileged things.
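On point 1, a minimal sketch of "set fields, don't template text" (assumes PyYAML; the file name and override fields are hypothetical):

    import yaml

    def deep_merge(base, override):
        """Recursively merge override values into a base manifest."""
        out = dict(base)
        for key, value in override.items():
            if isinstance(value, dict) and isinstance(out.get(key), dict):
                out[key] = deep_merge(out[key], value)
            else:
                out[key] = value
        return out

    with open("deployment.yaml") as f:           # hypothetical chart default
        base = yaml.safe_load(f)

    override = {"spec": {"replicas": 3,
                         "template": {"spec": {"serviceAccountName": "app"}}}}

    # Output is always correctly indented YAML, by construction.
    print(yaml.safe_dump(deep_merge(base, override), sort_keys=False))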
While I agree type safety in HCL beats that of YAML (a low bar), it still leaves a LOT to be desired. If you're going to go through the trouble of considering a different configuration language anyway, let's do ourselves a favor and consider things like CUE[1] or Starlark[2] that offer either better type safety or much richer methods of composition.
1. https://cuelang.org/docs/introduction/#philosophy-and-princi...
2. https://github.com/bazelbuild/starlark?tab=readme-ov-file#de...
Every JSON Schema aware tool in the universe will instantly know this PodSpec is wrong:
    kind: 123
    metadata: [ {you: wish} ]
I think what is very likely happening is that folks are -- rightfully! -- angry about using a text templating language to try and produce structured files. If they picked Jinja2 they'd have the same problem -- it does not consider any textual output as "invalid", so Jinja2 thinks this is a-ok: jinja2.Template("kind: {{ youbet }}").render(youbet=True)
I am aware that helm does *YAML* sanity checking, so one cannot just emit whatever crazypants YAML they wish, but it does not then go one step further to say "uh, your JSON Schema is fubar, friend".
I personally use TypeScript since it has unions and structural typing with native JSON support, but really anything can work.
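As a concrete illustration of the JSON Schema point above, a toy sketch (a hand-written mini schema, not the real Kubernetes OpenAPI document; assumes PyYAML and the jsonschema package are installed):

    import yaml
    from jsonschema import ValidationError, validate

    # A tiny stand-in for the PodSpec schema: kind must be a string,
    # metadata must be an object.
    toy_schema = {
        "type": "object",
        "properties": {
            "kind": {"type": "string"},
            "metadata": {"type": "object"},
        },
        "required": ["kind"],
    }

    doc = yaml.safe_load("kind: 123\nmetadata: [ {you: wish} ]\n")

    try:
        validate(instance=doc, schema=toy_schema)
    except ValidationError as err:
        print("schema error:", err.message)  # e.g. "123 is not of type 'string'"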
In response, there's been a wave of "serverless" startups because the idea of running anything yourself has become understood as (a) a time sink, (b) incredibly error prone, and (c) very likely to fail in production.
I think a Kubernetes 2.0 should consider what it would look like to have a deployment platform that engineers can easily adopt and feel confident running themselves – while still maintaining itself as a small-ish core orchestrator with strong primitives.
I've been spending a lot of time building Rivet to itch my own itch of an orchestrator & deployment platform that I can self-host and scale trivially: https://github.com/rivet-gg/rivet
We currently advertise as the "open-source serverless platform," but I often think of the problem as "what does Kubernetes 2.0 look like." People are already adopting it to push the limits into things that Kubernetes would traditionally be good at. We've found the biggest strong point is that you're able to build roughly the equivalent of a Kubernetes controller trivially. This unlocks features like more complex workload orchestration (game servers, per-tenant deploys), multitenancy (vibe-coded per-tenant backends, LLM code interpreters), metered billing per tenant, more powerful operators, etc.
Someone decides X technology is too heavy-weight and wants to just run things simply on their laptop because "I don't need all that cruft". They spend time and resources inventing technology Y to suit their needs. Technology Y gets popular and people add to it so it can scale, because no one runs shit in production off their laptops. Someone else comes along and says, "damn, technology Y is too heavyweight, I don't need all this cruft..."
"There are neither beginnings nor endings to the Wheel of Time. But it was a beginning.”
Just because something’s complex doesn’t necessarily mean it has to be that complex.
- it requires too much RAM to run on small machines (1GB RAM). I want to start small but not have to worry about scalability. Docker Swarm was nice in this regard.
- use KCL lang or CUE lang to manage templates