Anyone melding GPT-level intelligence with the physical world?

2•iamnnk•5mo ago

The current state of LLMs (ChatGPT, Gemini) gives the impression of having 'solved digital experience' completely. They are self-contained to the extent that the 2023 technique of building wrappers on top of them to customise experiences seems redundant.

I intuitively sense scope for a meld of such intelligence with the physical world.

Are there startups that are building anything cool in this space?

Comments

ai_critic•5mo ago

What on earth ever gave you that impression?

gtirloni•5mo ago

That's an interesting question, but the "AI wrappers" aren't going away because the LLMs 1) aren't totally deterministic and 2) feeding them the correct prompts and context is still very valuable. In other words, one-shotting doesn't work for every use case (which is essentially what you're saying when you say they are "self-contained", right? Unfortunately, they aren't/can't be).

Regarding the physical world, that's a deeper question. You have people who say LLMs "understand", that they are "intelligent" and that this is an "emergent behavior" of all their weights. You also have people who say they are nothing more than stochastic parrots or auto-complete on steroids.

I'm in neither camp, but let's do a thought exercise. Multi-modal LLMs are trained on text, video, and sound. They can know what a chair looks like, what sound it makes if you drag it over a wooden floor, and what it would look like when you do that (from this mysterious PoV somewhere). Now take that "knowledge" and ask it to give you 3D coordinates to move a chair right now in the room you're standing in: it simply can't. It's lacking a lot of information about the actual measurements of the room, its own movement capabilities (or those of the human carrying out the task), etc.

There are AI systems that can do this, but they aren't good at text. We have self-driving cars and factory robots doing things constrained to those domains.

If you say "meld" as in "let's combine a bunch of different AI technologies together with each one doing what it does best", I'm sure people are working on this already. But LLMs are but a small part of solving that problem.

EDIT: if you still can, please add "Ask HN: " to your title here.

iamnnk•5mo ago

That's insightful.

Yes, I had the moving-the-chair-in-physical-space class of capabilities in mind: robots guided by multimodal intelligence, cars 'surprising me' on a day I'm idle, etc. The challenge here may be what can be achieved at the edge: the feedback control system that corrects successive prompts.
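
To make the feedback-control idea concrete, here is a rough sketch of a closed loop in Python. Everything in it is an assumption made for illustration: `plan_step` is a hypothetical stand-in for whatever model proposes the next correction, `actuate` is a hypothetical imperfect actuator that only carries out part of each command, and the gain, slip, and tolerance values are arbitrary. It is not how any real robot stack works, only the observe, correct, repeat shape of the loop.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    x: float
    y: float


def plan_step(current: Pose, target: Pose, gain: float = 0.5) -> Pose:
    """Hypothetical planner: propose a move proportional to the observed error."""
    return Pose(gain * (target.x - current.x), gain * (target.y - current.y))


def actuate(current: Pose, command: Pose, slip: float = 0.8) -> Pose:
    """Hypothetical actuator: only a fraction (slip) of each command is realised."""
    return Pose(current.x + slip * command.x, current.y + slip * command.y)


def move_to(start: Pose, target: Pose, tolerance: float = 0.01, max_steps: int = 50):
    """Re-observe the pose after every move and correct until within tolerance."""
    pose = start
    for step in range(max_steps):
        error = ((target.x - pose.x) ** 2 + (target.y - pose.y) ** 2) ** 0.5
        if error <= tolerance:
            return pose, step
        pose = actuate(pose, plan_step(pose, target))
    return pose, max_steps


if __name__ == "__main__":
    final_pose, steps = move_to(Pose(0.0, 0.0), Pose(2.0, 1.0))
    print(final_pose, steps)
```

The point of the sketch is that no single command has to be right: each step only shrinks the remaining error, so the loop tolerates an imprecise planner and an imprecise actuator as long as it can keep measuring where the chair actually is.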
