Inventing on Principle: https://www.youtube.com/watch?v=PUv66718DII
I mean, even for something that is in theory fully understandable, like the Linux kernel, it is not feasible to actually read the source before using it.
To me this really makes no sense. Even in traditional programming, we only have such powerful systems because we use a layered approach. You can look into these layers and understand them individually, but understanding the whole stack is totally out of scope for a single human being.
I believe this is the crux of what the author is getting at: LLMs are, by their very nature, a black box that cannot ever be understood. You will never understand how an LLM reached its output, because their innate design prohibits that possibility from ever manifesting. These are token-prediction machines whose underlying logic would, by design, take mathematicians decades to reverse engineer for even a single query.
I believe that’s what the author was getting at. As we can never understand how LLMs reached their output, we cannot rely on them as trustworthy agents of compute or knowledge. Just as we would not trust a human who gives a correct answer much of the time but can never explain how they knew that answer or how they reached that conclusion, so should we not trust LLMs in that same capacity.
Unless they have a lot of knowledge in electrical engineering/optics, the average user of this isn't going to understand how the camera or projector works except at a very high level.
I feel like the problem with LLMs here is more that they are not very predictable in their output and can fail in unexpected ways that are hard to resolve. You can rely on the camera to output some bits corresponding to whatever you're pointing it at even if you don't know anything about its internals.
But the point is to make users understand the system enough to instill the confidence to change things and explore further. This is important because the true power of computer systems comes from their flexibility and malleability.
You can never build that level of confidence with LLMs.
Fwiw I personally describe them as white boxes, not black boxes. For we know, and can trace, every single bit of the output back to the input. That does not help us as much as we'd like, though. When drilling down into "why did the model wrongly answer 1, and not rightly 2", it comes down to "well, it added one trillion small numbers, and the sum came close to 1, but didn't reach 2". Which is unsatisfactory, and your "understanding" vs. "comprehension" distinction delineates that nicely.
Maybe it's more productive to think of them as "artefacts" rather than "mechanical contraptions". We shape them in many ways, but we are not in complete control of their making. We don't make them explicitly with our hands: we make a maker algorithm, and that algorithm then makes them. Or even think of them as "biological", grown artefacts, given that we don't fully control the end result. Yes, we know and apply the algorithm that builds them, but we don't know the end result, the final set of weights, beforehand. Unlike, say, when we are making a coffee machine: we know all the parts to a millimetre in advance and have it all worked out and pre-planned before embarking on the making of the machine.
2. There is a lot of ongoing work on mechanistic interpretability, by e.g. Anthropic, that shows we can understand LLMs better than we initially thought.
Now, whether that's possible is up for debate, but we could definitely use a push in that direction: a system that we can genuinely understand top to bottom, so that we can modify it to suit our needs. That's where agency comes into his argument.
One of the challenges I found when I played with RealTalk is interoperability. The aim is to use the "spatial layer" to bootstrap people's intuitions about how programs should work and interact with the world. It's really cool when this works. But key intuitions about how things interact when combined with each other only work if the objects have been programmed to be compatible. A balloon wants to "pop if it comes into contact with anything sharp". A cactus wants to say "I am sharp". But if someone else has programmed a needle card to say "I am pointy", then it won't interact with the balloon in a satisfying way. Or, to use one of Dynamicland's favorite examples: say I have an interactive chart which shows populations of different countries when I place the "Mexico card" into the filter spot. What do you think should happen if I put a card showing the Mexican flag in that same spot, or some other card which just says the string "Mexico" on it? Wouldn't it be better if their interaction "just works"?
Visual LLMs can aid with this. Even a thin layer which can assign tags or answer binary questions about objects could be used to make programs massively more interoperable.
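A minimal sketch of what that thin layer could look like, in Python; `vision_llm_yes_no` is a hypothetical stand-in for whatever visual model you'd actually call, not any real API:

    # Hypothetical stand-in for a real vision-model call.
    def vision_llm_yes_no(image, question: str) -> bool:
        """Ask a visual LLM a yes/no question about an object."""
        raise NotImplementedError("wire up a real vision model here")

    def matches_claim(image, claim: str) -> bool:
        # "sharp", "pointy", a flag, the bare string "Mexico": the model
        # normalizes all of them into the answer to one binary question.
        return vision_llm_yes_no(image, f"Is this object {claim}?")

    def on_contact(balloon, other_object_image):
        # The balloon no longer needs the needle's author to have written
        # "I am sharp" verbatim; it just asks about whatever it touched.
        if matches_claim(other_object_image, "sharp"):
            balloon.pop()

The point being that compatibility stops depending on two authors having picked the same word.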
For Dynamicland, though, I get the issue: putting the whole thing through an LLM to make "pointy" and "sharp" both trigger the same effects on another card would just hide the interaction entirely. It could work, or not, for reasons completely opaque to both designer and user.
It's still at the cool demo level, though. How do you scale this thing?
The typical “scale” mindset is almost the opposite of that — the people doing the scaling are the ones with agency, and the rest get served slop they didn’t choose!
Even if the system is an unreliable demo, it can still promote agency, in the same way that you could fix your car 40 years ago but can't now, because of scaled corporate processes.
You can fix your car just fine - just not the electronics. And those were to a large degree added for safety reasons. It is their complexity that makes them difficult or impossible to fix.
This isn't about the cost; they already pay the cost to write the documentation or software for their own dealerships. It isn't about other carmakers; any company large enough to actually make a car would have no trouble getting a copy of it from one of the dealers. The only reason it's not published on their websites is that they don't want the vehicle owners and independent mechanics to have it, which is spiteful and obnoxious.
Not necessarily "scaling", but the point stands. If the goal is to enable agency, then obviously you want that to happen in multiple places!
I love the project, but it's nearly a decade old and still lives in one location or in places Bret has directly collaborated with, like the biolab. [0]
[0] https://dynamicland.org/2023/Improvising_cellular_playground...
If you really wanted to play around with similar ideas, it doesn't take a full reimplementation of the reactive engine.
https://www.youtube.com/watch?v=kr1O917o4jI
> How do you scale this thing?
You release a tablet and call it Dynamicland? I think that's what Microsoft did, but don't quote me on that.
Occlusion is definitely a problem.
You do still need to keep your hands out of the light to see everything, but that can also be part of the interaction. If we ever get ubiquitous AR glasses or holograms, I'm sure Bret will integrate them into DL.
[0] Which leads to a bit of a catch-22: you want a surface that looks dark but perfectly reflects all the colors of your projector, so you need a white screen, which means you ideally want zero light other than the projector's to make it act the most like a screen.
I've seen systems like this that use multiple projectors from different angles, calibrated for the space and the angle. They're very effective at preventing occlusion, and it takes fewer than you'd think (also see Valve's Lighthouse tech for motion tracking).
Unfortunately, doing that is expensive, big, and requires recalibrating whenever it's moved.
I've made a lot of progress recently working on my own homebrew version, running it in the browser in order to share it with people. Planning to take some time soon to take another stab at the real (physical) thing.
Progress so far: https://deosjr.github.io/dynamicland/
Under visibility they say:
>To empower people to understand and have full agency over the systems they are involved in, we aim for a computing system that is fully visible and understandable top-to-bottom — as simple, transparent, trustable, and non-magical as possible
But the programming behind the projector-camera system feels like it would be pretty impenetrable to the average person, right? What is so different about AI?
I think the vision is neat but hampered by the projector tech and the cost of setting up a version of your own. Since it's so physically tied, and Bret is (imo stubbornly) dedicated to the concept, there's no community building on this outside the local area that can make it to DL in person. It'd be neat to have a version for VR, for example, and maybe some day AR becomes ubiquitous enough to make it work anywhere.
[0] Annoyingly, it's not open sourced, so you can't really build your own version easily or examine it. There have been a few attempts at making similar systems, but they haven't lasted as long or been as successful as Bret's Dynamicland.
I'm reading more about the "OS" Realtalk
>Some operating system engineers might not call Realtalk an operating system, because it’s currently bootstrapped on a kernel which is not (yet) in Realtalk.
You definitely couldn't fit the code for an LLM on the wall, so that makes sense. But I still have so many questions.
Are they really intending to have a whole kernel written down? How does this work in practice? If you make a change to Realtalk which breaks it, how do you fix it? Do you need a backup version of it running somewhere? You can't boot a computer from paper (unless you're using punch cards or something) so at some level it must exist in a solely digital format, right?
I think even if you could squeeze down an LLM and get it to run in Realtalk, it wouldn't fit with the radical simplicity model they're going for. LLMs are fundamentally opaque: we have no idea why they output what they do, and as users we can only twiddle the prompt knobs. That's the complete opposite direction for a project that refuses even to provide the tools to build your own version, on the grounds that packaging it up would put the program back into the box instead of filleting it out into its physical instantiation.
I wish he'd relent and package it up in a way that could be replicated more simply than reimplementing entirely from scratch.
I'm not sure where to draw the line between Realtalk and the underlying operating system. I'm willing to give it some credit; it's interesting without being written entirely from scratch. IIRC most of the logic that defines how things interact IS written in Realtalk and physically accessible within the conceptual system, instead of only through traditional computing.
Also, if you haven't heard of Folk Computer[1] as a viable alternative, I'd highly suggest checking it out! I'm one of the contributors, and it's definitely not dead (unlike all the other Dynamicland spin-offs I've seen). The head programmers, Omar and Andreas, both worked at Dynamicland for a couple of months, so they've been able to carry over the good parts while also open sourcing it. The implementations have definitely diverged, but imho in a good way: Folk Computer is working on multithreading and is much more realtime-safe (you'll see in the later Dynamicland videos that it pauses every second or so).
You probably could fit the code for an LLM on a wall. Usually the code for an LLM is no more than a couple hundred lines.
Of course the weights wouldn't fit on a wall.
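To make that concrete, here's a toy sketch (assuming nothing beyond numpy) of one causal self-attention layer, the core operation inside an LLM. The code stays this short no matter how big the model is; the part that wouldn't fit on a wall is the trained weight matrices, which run to billions of numbers in a real model:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def causal_attention(x, wq, wk, wv):
        # One attention layer: a handful of lines, whatever the model size.
        q, k, v = x @ wq, x @ wk, x @ wv
        scores = q @ k.T / np.sqrt(k.shape[-1])
        mask = np.triu(np.full_like(scores, -1e9), k=1)  # look backwards only
        return softmax(scores + mask) @ v

    rng = np.random.default_rng(0)
    d = 16                                # real models: thousands of dimensions
    x = rng.normal(size=(8, d))           # embeddings for 8 tokens
    wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
    print(causal_attention(x, wq, wk, wv).shape)  # (8, 16)

Stack a few dozen of these layers and you have most of the architecture; the wall-sized part is the numbers you'd load into wq, wk, wv.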
Like, you can write a script that talks to functionality that may or may not exist yet.
Programming by moving pieces of paper around deservedly gets attention, but there's a lot more to it.
I’m an artist who’s always struggled to learn how to code. I can pick up computer science concepts, but when I try to sit down and write actual code, my brain just pretends it doesn’t exist.
Over like 20 years, despite numerous attempts, I could never get past a few beginner exercises. I viscerally can’t stand the headspace that coding puts me in.
Last night I managed to build a custom CDN to deliver cool fonts to my site a la Google Fonts, and create a gorgeous site with custom code-injected CSS and Java (while grokking most of it), and the best part … it was FUN! I have never remotely done anything like that in my entire life, and with ChatGPT’s help I managed to do it in like 3 hours. It’s bonkers.
AI is truly what you make of it, and I think it’s an incredible tool that allows you to learn things in a way that fits how your brain works.
I think schools should have a curriculum that teaches people how to use AI effectively. It’s truly a force multiplier for creativity.
Computers haven’t felt this fun for a long time.
From the context, it's not Java, but Javascript.
My takeaway, as an AI skeptic, is that AI as human augmentation may really have potential?
I feel like AI makes learning way more accessible. At least it did for me: it evoked a childlike sense of curiosity and joy for learning new things.
I’m also working on a Trading Card Game, where I feed it my drawings and it renders them into a final polished form, based on a visual style that I spent some time building in ChatGPT. It’s like an amplifier/accelerator.
I feel like, yes, while it can augment us, at the end of the day it depends on our desire to grow and learn. Otherwise, you will end up with the same result as everybody else.
But basically, I wanted a way to have a custom repository of fonts a la Google Fonts (I found their selection kinda boring) that I could pull from.
Ran the fonts through Transfonter to convert them to .woff2, set up a GitHub repository (which is not designed for people like me), set up an instance on Netlify, then wrote custom CSS tags for my ghost.org site.
The thing that amazes me is that, aside from a vague whiff of GitHub, I had absolutely no idea how to do this. Zilch. Nada. ChatGPT gave me a clear step-by-step plan and exposed me to Netlify, how to write CSS injections, and how ghost.org tagging works from the styling side of things. And I’m able to have a back-and-forth dialogue with it, not only to figure out how to do it, but to understand how it works.
A Content Delivery Network (CDN) is a collection of geographically scattered servers that speeds up delivery of web content by being closer to users. Most video/image services use CDNs to efficiently serve up content to users around the world. For example, someone watching Netflix in California will connect to a different server than someone watching the same show in London.
Those are probably near the top of the list of things you don't want to blindly trust an LLM with building.
Think of a version that is even more fun, won't teach your kids wrong stuff, won't need a datacenter full of expensive chips, and won't hit the news with sensationalist headlines.
You're an artist, I'm sure you understand what art that comes from tradition tries to accomplish.
Have you tried using AI to make further changes to any of these projects down the line?
Since I’ve literally been working on this project for two days, here’s a somewhat related answer to your question: I’ve been using ChatGPT to build art for the TCG. Initially I was resistant, and upset that AI companies were hoovering up people’s work wholesale for training data (which is why I think now is an excellent time to have a serious conversation about UBI, but I digress).
But I finally realized that I could develop my own distinctive 3D visual style by feeding GPT my drawings and having it iterate in interesting directions. It’s fun to refine the style by having GPT simulate an actual camera lens and lighting setup.
But yes, I’ve used AI to make numerous stylistic tweaks to my site, including building out a tagging system that allows me to customize the look of individual pages when I write a post.
Hope I’ll be able to learn how to build an actual complex app one day, or games.
Enjoy the ride.
We are never ready for seismic changes. But we will have to adapt one way or another; might as well find a good use for it and develop awareness, as a child would around handling knives.
Yes, AI currently has limitations and isn't a panacea for cognitive tasks. But in many specific use cases it is enormously useful, and the rapid growth of ChatGPT, AI startups, etc. is evidence of that. Many will argue that it's all fake, that it's all artificial hype to prop up VC valuations, etc. They literally will see the billions in revenue as not real, same with all the real people upskilled via LLMs in ways that are entirely unique to the utility of AI.
I would trust many peoples' evaluations on the impacts of AI if they could at least engage with reality first.
Stop underestimating the amount of internalized knowledge people can have about projects in the real world; it's so annoying.
An LLM can't ever possibly get close to it. There's some guy in a team in another building who knows why a certain weird piece of critical business logic was put there 6 years ago. The LLM will never know this, and won't understand it even if it consumed the whole repository, because it would have to work there for years to understand how the business works.
These notable ways may also not be commonly known or put into words, but they persist nevertheless.
This obviously was a temporary tool we'd never let touch our GitHub repo, but it still very much worked and solved a niche problem. It even looked like our app, because the LLM could consume screenshots to copy our designs.
I'm on board with vibe coding = non-maintainable, non-tested, mostly useless code by non-devs. But on the plus side, it will expose many, many people to basic programming and fill many tiny gaps not solved by bigger, more serious pieces of code. Especially once people start building infrastructure and tooling around these non-devs: hosting, deployment, webhook integrations, etc.
In my experience, working with agents helps eliminate that crap, because you have to bring the agent along as it reads your code (or process or whatever) for it to be effective. Just like human co-workers need to be brought along, so it’s not all on poor Bob.
If anything, HN is in general very much on the LLM hype train. The contrarian takes tend to be from more experienced folks working on difficult problems that very much see the fundamental flaws in how we're talking about AI.
> Many will argue that it's all fake, that it's all artificial hype to prop up VC valuations, etc. They literally will see the billions in revenue as not real
That's not what people are saying. They're noting that revenue is meaningless in the absence of looking at cost. And it's true: investor money is propping up extremely costly ventures in AI. These services operate at a substantial loss. The only way they can hope to survive is by promising future pricing power, i.e. promising they can one day (the proverbial next week) replace human labor.
> same with all the real people upskilled via LLMs in ways that are entirely unique to the utility of AI.
Again, no one really denies that LLMs can be useful in learning.
This all feels like a strawman; it's important to approach these topics with nuance.
This is very basic stuff, not rewriting a codebase, creating a video game from a text prompt, or generating imagery.
Simply, I would like to be able to verbally prompt my phone something like "make sure the lights and AC are set so I will be comfortable when I get home, follow up with that plumber if they haven't gotten back to us, place my usual grocery order plus some berries plus anything my wife put on our shared grocery list, and schedule a haircut for the end of next week, some time after 5pm".
Basically 15-30min of daily stupid personal time sucks that can all be accomplished via smartphone.
Given the promise of IoT, smart home, LLMs, voice assistants, etc.. this should be possible.
This would require it having access to my calendar, my location, the ability to navigate apps on my phone, read/send email and texts, and spend money. Given the current state of the tools, if there is even a 0.1% chance it changes my contact card photo to Hitler, replies to an email from my boss with an insult, purchases $100,000 in bananas, or sets the thermostats to 99F, then I couldn't imagine giving an LLM access to all those things.
Are we 3 months, 5 years, or never away from that being achievable? These feel like the kind of things previous voice assistants promised 10 years ago.
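For what it's worth, the missing piece feels less like model quality and more like a permission layer. A minimal sketch of the idea in Python (the tool names are made up, not any real assistant API): every action the assistant can take is allowlisted, and anything that spends money or messages people is gated on explicit confirmation.

    # Sketch only: tool names and dispatch are hypothetical.
    ALLOWED = {"set_thermostat", "add_to_grocery_list", "send_email", "place_order"}
    NEEDS_CONFIRMATION = {"send_email", "place_order"}  # money or messaging

    def run_tool(name: str, args: dict, confirm) -> str:
        if name not in ALLOWED:
            return f"blocked: {name} is not an allowed tool"
        if name in NEEDS_CONFIRMATION and not confirm(name, args):
            return f"blocked: user declined {name}"
        return f"ok: {name}({args})"  # dispatch to the real integration here

    # Auto-approve nothing: every risky call surfaces to the user first.
    print(run_tool("place_order", {"items": ["berries"]}, lambda n, a: False))

That doesn't fix the 0.1% failure rate, but it caps the blast radius at an ignorable prompt rather than $100,000 in bananas.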
One person told me the other day that for the rest of time people will see using an AI as equivalent to crossing a picket line.
It works better than you at UI prototypes when you don’t know how to do UI (and maybe faster even if you do). It doesn’t work at all on problems it hasn’t seen. I literally just saw a coworker stare at code for hours and get completely off track trying to correct AI output, versus stepping through the problem step by step, the way we thought the algorithm should work.
There’s a very real difference between where it could be useful in the future and what you can do with it today in a useful way, and you have to be very careful about utilizing it correctly. If you don’t know what you’re doing and AI helps you get it done, cool. But also keep in mind that you won’t know if it has catastrophic bugs, because you don’t understand the problem and the conceptual idea of the solution well enough to know whether what it did is correct. For most people there’s not much difference, but for those of us who care it’s a huge problem.
This is actually what I'm most excited about: in the reasonably near future, productivity will be related to who is most creative and who has the most interesting problems rather than who's spent the most hours behind a specific toolchain/compiler/language. Solutions to practical problems won't be required to go through a layer of software engineer. It's going to be amazing, and I'm going to be without a job.
Consumer apps may see fewer sales as people opt to just clone an app using AI for their own personal use, customized to their preferences.
But there’s a lot of engineering being done out there that people don’t even know exists, and that has to be done by people who know exactly what they’re doing, not just weekend warriors shouting stuff at an LLM.
Why stop at software? AI will do this to pretty much every discipline and artform, from music and painting, to law and medicine. Learning, mastery, expertise, and craftsmanship are obsolete; there's no need to expend 10,000 hours developing a skill when the AI has already spent billions of hours in the cloud training in its hyperbolic time chamber. Academia and advanced degrees are worthless; you can compress four years of study into a prompt the size of a tweet.
The idea guy will become the most important role in the coming aeon of AI.
What you're describing is a dead simple hobby project that could be completed by a complete novice in less than a week before the advent of LLMs.
It's like saying "I'm absolutely blown away by microwaves, I can have a meal hot and ready in just a few minutes with no effort or understanding. I think all culinary schools should have a curriculum that teaches people how to use microwaves effectively."
Maybe the goal of education should be giving people a foundation that they can build on, not making them an expert in something that has a low skill ceiling and diminishing returns.