Inventing on Principle: https://www.youtube.com/watch?v=PUv66718DII
I mean, even for something that is in theory fully understandable, like the Linux kernel, it is not feasible to actually read the source before using it.
To me this really makes no sense. Even for traditional programming we only have such powerful systems because we use a layered approach. You can look into these layers and understand them, but understanding all of them is totally out of scope for a single human being.
I believe this is the crux of what the author is getting at: LLMs are, by their very nature, a black box that cannot ever be understood. You will never understand how an LLM reached its output, because their innate design prohibits that possibility from ever manifesting. These are token prediction machines whose underlying logic would take mathematicians decades to reverse engineer for even a single query, by design.
I believe that's what the author was getting at. Since we can never understand how LLMs reached their output, we cannot rely on them as trustworthy agents of compute or knowledge. Just as we would not trust a human who gives a correct answer much of the time but can never explain how they knew that answer or how they reached that conclusion, so we should not trust LLMs in that same capacity.
Unless they have a lot of knowledge in electrical engineering/optics, the average user of this isn't going to understand how the camera or projector work except at a very high level.
I feel like the problem with LLMs here is more that they are not very predictable in their output and can fail in unexpected ways that are hard to resolve. You can rely on the camera to output some bits corresponding to whatever you're pointing it at even if you don't know anything about its internals.
Fwiw I personally describe them as white boxes, not black boxes. After all, we know, and can trace, every single bit of the output back to the input. That doesn't help us as much as we'd like, though. When drilling down into "why did the model wrongly answer 1, and not rightly 2", it comes down to "well, it added one trillion small numbers, and the sum came close to 1 but didn't reach 2". Which is unsatisfactory, and your "understanding" vs. "comprehension" delineation captures that nicely.
Maybe it's more productive to think of them as "artefacts" rather than "mechanical contraptions". We shape them in many ways, but we are not in complete control of their making. We don't make them explicitly with our hands: we make a maker algorithm, and that algorithm then makes them. Or even "biological", grown artefacts, given that we don't fully control the end result. Yes, we know and apply the algorithm that builds them, but we don't know the end result beforehand, the final set of weights. Unlike, say, when we are making a coffee machine: we know all the parts to a millimetre in advance and have it all worked out and pre-planned before embarking on the making of the machine.
One of the challenges I found when I played with RealTalk is interoperability. The aim is to use the "spatial layer" to bootstrap people's intuitions on how programs should work and interact with the world. It's really cool when this works. But key intuitions about how things interact when combined with each other only work if the objects have been programmed to be compatible. A balloon wants to "pop if it comes into contact with anything sharp". A cactus wants to say "I am sharp". But if someone else has programmed a needle card to say "I am pointy", then it won't interact with the balloon in a satisfying way. Or, to use one of Dynamicland's favorite examples: say I have an interactive chart which shows populations of different countries when I place the "Mexico card" into the filter spot. What do you think should happen if I put a card showing the Mexican flag in that same spot, or some other card which just says the string "Mexico" on it? Wouldn't it be better if their interaction "just works"?
Visual LLMs can aid with this. Even a thin layer which can assign tags or answer binary questions about objects could be used to make programs massively more interoperable.
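As a rough illustration (none of this is Dynamicland's actual API; the model call and tag names are made up), such a thin layer might just normalize whatever an object says about itself onto a small shared vocabulary:

```python
# Minimal sketch, not Dynamicland's real API: a thin tagging layer that maps
# whatever an object "says about itself" onto a small shared vocabulary.
# ask_vision_model() stands in for a vision-language model answering a binary
# question; here it is faked with string matching so the example runs.

CANONICAL_TAGS = {"sharp", "fragile", "heavy"}

def ask_vision_model(object_description: str, tag: str) -> bool:
    synonyms = {"sharp": {"sharp", "pointy", "spiky"}}
    words = synonyms.get(tag, {tag})
    return any(word in object_description.lower() for word in words)

def tags_for(object_description: str) -> set[str]:
    """Assign canonical tags by asking one binary question per tag."""
    return {tag for tag in CANONICAL_TAGS
            if ask_vision_model(object_description, tag)}

# The balloon's rule only needs the canonical tag, not whichever word
# the needle card's author happened to write:
if "sharp" in tags_for("a needle card that says 'I am pointy'"):
    print("pop!")
```

The tag vocabulary itself could still live on a visible card, even if the model answering the questions is opaque.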
For Dynamicland I get the issue, though: putting the whole thing through an LLM so that "pointy" and "sharp" both trigger the same effects on another card would just hide the interaction entirely. It could work, or not, for reasons completely opaque to both designer and user.
It's still at the cool demo level, though. How do you scale this thing?
The typical “scale” mindset is almost the opposite of that — the people doing the scaling are the ones with agency, and the rest get served slop they didn’t choose!
If the system is an unreliable demo, then that can promote agency. In the same way that you could fix your car 40 years ago, but you can’t now, because of scaled corporate processes.
You can fix your car just fine - just not the electronics. And those were to a large degree added for safety reasons. It is due to the complexity that they are difficult or impossible to fix.
I love the project, but it's nearly a decade old and still lives in one location, or in places Bret has directly collaborated with, like the biolab. [0]
[0] https://dynamicland.org/2023/Improvising_cellular_playground...
If you really wanted to play around with similar ideas, it doesn't require doing a full reimplementation of the reactive engine.
Occlusion is definitely a problem.
You do still need to keep your hands out of the light to see everything, but that can also be part of the interaction. If we ever get ubiquitous AR glasses or holograms, I'm sure Bret will integrate them into DL.
[0] Which leads to a bit of a catch-22: you want a surface that looks dark but perfectly reflects all the colors of your projector, so you need a white screen, which means you ideally want zero light other than the projector in order to make the projector act most like a screen.
I've seen systems like this that use multiple projectors from different angles, calibrated for the space and the angle. They're very effective at preventing occlusion, and it takes fewer than you'd think (also see Valve's Lighthouse tech for motion tracking).
Unfortunately, doing that is expensive, big, and requires recalibrating whenever it's moved.
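For a sense of what calibrating even one projector-camera pair involves, here's a rough sketch assuming OpenCV. The correspondence points are made-up placeholders; a real rig would detect projected fiducials and repeat this per projector, blending the overlaps:

```python
import numpy as np
import cv2

# Sketch of calibrating one projector-camera pair: display markers at known
# projector coordinates, find where the camera sees them, then fit a
# homography mapping camera pixels -> projector pixels. The coordinates
# below are made-up placeholders; a real setup would detect them (e.g. via
# ArUco fiducials) rather than hard-code them.
projector_pts = np.array([[100, 100], [1820, 100], [1820, 980], [100, 980]],
                         dtype=np.float32)
camera_pts = np.array([[212, 143], [1705, 168], [1688, 921], [198, 904]],
                      dtype=np.float32)

H, _ = cv2.findHomography(camera_pts, projector_pts)

# Anything tracked in the camera image (say, the center of a paper card)
# can now be mapped into projector space so graphics land on top of it.
card_in_camera = np.array([[[950.0, 540.0]]], dtype=np.float32)
card_in_projector = cv2.perspectiveTransform(card_in_camera, H)
print(card_in_projector)  # projector pixel to draw at
```

Recalibration after moving the rig is essentially redoing this fit, which is why fixed installations are so much easier.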
I've made a lot of progress recently working on my own homebrew version, running it in the browser in order to share it with people. Planning to take some time soon to take another stab at the real (physical) thing.
Progress so far: https://deosjr.github.io/dynamicland/
Under visibility they say:
>To empower people to understand and have full agency over the systems they are involved in, we aim for a computing system that is fully visible and understandable top-to-bottom — as simple, transparent, trustable, and non-magical as possible
But the programming behind the projector-camera system feels like it would be pretty impenetrable to the average person, right? What is so different about AI?
I think the vision is neat but hampered by the projector tech and the cost of setting up a version of your own. Since it's so physically tied, and Bret is (imo stubbornly) dedicated to the concept, there's not a community building on this outside the local area that can make it to DL in person. It'd be neat to have a version for VR, for example, and maybe some day AR becomes ubiquitous enough to make it work anywhere.
[0] Annoyingly it's not open sourced so you can't really build your own version easily or examine it. There have been a few attempts at making similar systems but they haven't lasted as long or been as successful as Bret's Dynamicland.
I'm reading more about the "OS" Realtalk
>Some operating system engineers might not call Realtalk an operating system, because it’s currently bootstrapped on a kernel which is not (yet) in Realtalk.
You definitely couldn't fit the code for an LLM on the wall, so that makes sense. But I still have so many questions.
Are they really intending to have a whole kernel written down? How does this work in practice? If you make a change to Realtalk which breaks it, how do you fix it? Do you need a backup version of it running somewhere? You can't boot a computer from paper (unless you're using punch cards or something) so at some level it must exist in a solely digital format, right?
I think even if you could squeeze down an LLM and get it to run in Realtalk, it wouldn't fit with the radical simplicity model they're going for. LLMs are fundamentally opaque; we have no idea why they output what they do, and as users we can only twiddle the prompt knobs. That's the complete opposite direction for a project that refuses to provide the tools to build a packaged version precisely because that would put the program back into the box instead of filleting it out into a physical instantiation.
I wish he'd relent and package it up in a way that could be replicated more simply than reimplementing entirely from scratch.
I'm not sure where to draw the line between Realtalk and the underlying operating system. I'm willing to give it some credit; it's interesting without being written entirely from scratch. IIRC most of the logic that defines how things interact IS written in Realtalk and physically accessible within the conceptual system instead of only through traditional computing.
Like, you can write a script that talks to functionality that may or may not exist yet.
Programming by moving pieces of paper around deservedly gets attention, but there's a lot more to it.
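A toy sketch of that idea (not Realtalk's actual syntax or API): rules react to claims by name, so a page can reference behavior that nothing on the table supplies yet:

```python
# Toy sketch, not Realtalk's real syntax: a shared blackboard of claims plus
# "when" rules that fire on matching claims. A rule can name a claim that no
# page on the table makes yet; it simply stays dormant until one does.

claims: set[tuple] = set()
rules: list[tuple[tuple, object]] = []

def claim(*fact):
    """A page states a fact about itself or the world."""
    claims.add(fact)
    run_rules()

def when(*pattern, do):
    """A page registers behavior keyed on a claim it hopes someone makes."""
    rules.append((pattern, do))
    run_rules()

def run_rules():
    for pattern, action in rules:
        if pattern in claims:
            action()

# "Balloon" page: written against a claim nothing makes yet.
when("nearby object", "is sharp", do=lambda: print("balloon pops!"))

# Later, a "cactus" page is placed on the table and makes the claim:
claim("nearby object", "is sharp")   # -> balloon pops!
```

The dormant rule is the "functionality that may or may not exist yet": the moment some other page makes the matching claim, the interaction appears.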