Similar to Rod of Iron Ministries (The Church of the AR-15): taking what it says, fine-tuning it, testing it, feeding it back in, and mostly waiting as LLMs improve.
LLMs will never be smarter than humans, but they can be a meeting place where people congregate to work on goals and worship.
Like QAnon, that's where the collective IQ and power comes from, something to believe in. At the micro level this is also mostly how LLMs are used in practical ways.
If you look to the Middle East there is a lot of work on rockets but a limited community working together.
Think of the absurdity of trying to understand the number Pi by looking at its first billion digits and trying to predict the next digit. And think of what it takes to advance from memorizing digits of such numbers and predicting continuations with astrology-style logic to understanding the math behind the digits of Pi.
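For concreteness, the "math behind the digits" fits in a few lines. A minimal sketch, assuming nothing beyond Machin's formula pi = 16*arctan(1/5) - 4*arctan(1/239) and the arctan Taylor series, done in integer arithmetic:

    def arctan_inv(x, digits):
        # arctan(1/x) scaled by 10**(digits + 10), summed term by term
        # from its Taylor series; the extra 10 are guard digits
        scale = 10 ** (digits + 10)
        term = scale // x
        total, n, sign = term, 1, 1
        while term:
            term //= x * x
            n, sign = n + 2, -sign
            total += sign * (term // n)
        return total

    def pi_digits(digits):
        # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
        pi = 16 * arctan_inv(5, digits) - 4 * arctan_inv(239, digits)
        return pi // 10 ** 10            # drop the guard digits

    print(pi_digits(30))   # 3141592653589793238462643383279

Every digit after that comes from the formula, not from a statistical guess at the continuation.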
I'm prepared to believe that a sufficiently advanced LLM around today will have some "neural" representation of a generalization of a Taylor Series, thus allowing it to "natively predict" digits of Pi.
This is the opposite of engineering/science. This is animism.
Nothing against this sim in particular, but all such simulations that attempt to model any non-trivial system are imperfect. Nature is just too complex to model precisely and accurately. The LLM (or other DL network architecture) will only learn information that is presented to it. When trained on simulation, the network cannot help but infer incorrectly about messy reality.
For example, if RocketPy lacks any model of cross breezes, the network would never learn to design against them. Or, if it does model variable winds but does so with the wrong mean, variance, or skew (of intensity, period, etc.), the network cannot learn properly and the design will not be optimal. The design will fail when it meets a reality that differs from the model.
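As a toy illustration of that failure mode (all numbers hypothetical, and nothing here is RocketPy's actual wind model): tune a launch-rail tilt against the simulator's wind, then fly it in the real one.

    import math

    def landing_offset(tilt_deg, wind_mps, v0=100.0, g=9.81):
        # lateral landing offset (m) of a ballistic point mass;
        # tilt_deg is launch tilt from vertical, wind adds constant drift
        theta = math.radians(90 - tilt_deg)       # angle from horizontal
        t_flight = 2 * v0 * math.sin(theta) / g
        return v0 * math.cos(theta) * t_flight + wind_mps * t_flight

    sim_wind, real_wind = -2.0, -6.0    # modelled wind vs actual wind, m/s

    # pick the tilt that minimizes the offset under the *simulated* wind
    best = min((abs(landing_offset(t, sim_wind)), t)
               for t in [i / 10 for i in range(-50, 51)])[1]

    print(landing_offset(best, sim_wind))    # about -2 m: near-perfect in sim
    print(landing_offset(best, real_wind))   # about -83 m: badly off in reality

The optimizer converges happily either way; only the mismatch between the two wind models decides whether the result survives contact with the launch pad.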
Replace "rocket" with any other thing and you have AI/ML applied to science and engineering - fundamentally flawed, at least at some level of precision/accuracy.
At the least, real learning on reality is required. Once we can back-propagate through nature, then perhaps DL networks can begin to be actually trustworthy for science and engineering.
I believe the future of such simulation is to start from the lowest level, i.e. Schrödinger's equation, and get the simulator to derive all the higher-level stuff.
Obviously the higher level models are imperfect, but then it's the AI's job to decide if a pile of soil needs to be simulated as a load of grains of sand, or as crystals of quartz, or as atoms of silicon, or as quarks...
The AI can always check its answer by redoing a lower level simulation of a tiny part of the result, and check it is consistent with a higher level/cheaper simulation.
I do hate to burst your bubble here but I've been doing real-time simulation (in the form of games, 2D, 3D, VR) for enough decades to know this is only a pipe-dream.
Maybe at the point when we have a Dyson sphere and have all universally agreed on the principles that cause an airfoil to generate lift, this would be possible; otherwise it's orders of magnitude beyond all of the terrestrial compute that we have now.
To quote Han Solo, the way we do effective and convincing science and simulation now is ... "a lot of simple tricks and nonsense."
Any competent person can simulate 100 atoms in a crystal of some material and say "whoa, it seems the bulk of this material behaves like a spring with f = kx; let's replace the individual-atom simulation with a bulk simulation that is computationally far cheaper", and then we can simulate trillions of atoms.
I don't see why AI couldn't do the same.
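That replace-the-atoms-with-a-spring step is mechanical enough to sketch. Assuming a Lennard-Jones pair potential purely for illustration, probe the force near equilibrium, fit k = -dF/dx, and hand the bulk model a plain spring:

    def lj_force(r, eps=1.0, sigma=1.0):
        # Lennard-Jones force between two atoms at separation r
        return 24 * eps * (2 * (sigma / r) ** 12 - (sigma / r) ** 6) / r

    r0 = 2 ** (1 / 6)    # equilibrium separation, where the force crosses zero
    dx = 1e-6
    k = -(lj_force(r0 + dx) - lj_force(r0 - dx)) / (2 * dx)   # k = -dF/dx

    # cheap bulk model: N bonds in series act like one spring of stiffness k/N
    print(f"per-bond k = {k:.2f}")                # ~57.15 in reduced LJ units
    print(f"10^12-bond chain: k_bulk = {k / 1e12:.2e}")

The caveat from upthread still applies, though: the spring is only valid near the regime that was actually probed.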
Really I think it would be cool to explore -- I've been working on a procedural game engine (conceptually at least) for a long time and want to incorporate even "basic" things like chemistry. I think it's still decades away for that, not even considering quantum phenomena.
Considering how fast you can go with simulations vs real launches, I'm not surprised they took the first approach.
Depends on what your goal is. If you are trying to solve the narrow problem of rocketry or whatever, sure. But maybe not if your goal is making models smarter.
The broader context is that we need new oracles beyond math and programming in order to exercise CoT models on longer planning-horizon tasks.
In this case, if working with a toy world model lets you learn generalizable strategies (I bet it does, as video games do too) then this sort of eval can be a useful addition.
We need innovative disruptors to train LLMs to do engineering from the ground up and to make calls to simulation software/routines when they need specialized/unique data points.
I have seen some demos of Claude being connected to Blender etc. But when I dug into the code, it was using another LLM to generate the objects rather than building the objects from fundamental shapes.
Workaccount2•2mo ago
Software engineering lends itself greatly to LLMs because it fits so nicely into tokenization, whereas mechanical drawings or electronic schematics are more like a visual language: image art, but with very exacting and important pixel placement and a precise underlying logical structure.
In my experience so far, only O3 can kind of understand an electronic schematic, but really only at a "Hello World!" level difficulty. I don't know how easy it will be to get to the point where it can render a proper schematic or edit one it is given to meet some specified electronic characteristics.
There are programming languages that are used to define drawings, but the training data would be orders of magnitude less than what is written for humans to learn from.
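To make the contrast concrete, the textual form already exists for electronics. A small sketch (component values arbitrary; the netlist syntax is plain SPICE):

    # the same circuit two ways: pixels on a schematic, or plain text
    netlist = """
    * resistive divider: Vout = Vin * R2 / (R1 + R2)
    V1 in 0 DC 5
    R1 in out 10k
    R2 out 0 10k
    .end
    """

    # sanity-check the voltage the netlist implies, no simulator needed
    vin, r1, r2 = 5.0, 10e3, 10e3
    print(vin * r2 / (r1 + r2))   # 2.5 V

A model that can read and emit this reliably seems like a much nearer target than one that reads pixel-perfect schematics.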
nyrikki•2mo ago
It works explicitly because it doesn't hit the often counter-intuitive limitations with generalization in pure math.
Remember that Boolean circuit satisfiability is NP-complete, which puts it beyond the expressibility of UHATs + polynomial-length CoT, capped at PTIME.
Even integer logic with Boolean circuits is in PSPACE.
When you start to deal with values, you are going to have to add in heuristics and/or find reductions, which will cost you generalizability.
Even if you model analog circuits as finite directed graphs with labelled vertices, similar to what Shannon used (setting aside some of the real-world electrical effects and treating them as computational units), the complexity can get crazy fast.
Those circuits, with specific constraints (IIRC local feedback, etc.), can be simulated by a Turing machine, but require ELEMENTARY space or time, and despite its name ELEMENTARY means iterated exponentials: 2^2^...^2^n, a tower of k 2s.
Also note that P/poly, viewed as the problems solvable by small circuits, is not a practical class: it contains all unary languages, including ones we know are unsolvable by real computers in the general case.
That apparent paradox, that P/poly has small Boolean circuits yet contains undecidable unary languages, is a good way into that rat hole.
While we will have tools and models that are better at math logic, the constraints are actually limits on computation in the general case. Generalization often has these kinds of costs, and IMHO the RL benefits in this case demonstrate exactly that.
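For a concrete anchor on the NP-completeness point: no known method beats exponential worst-case behavior for arbitrary circuits, so the fully general approach is exhaustive search over assignments, Theta(2^n) in the number of inputs. A toy version:

    from itertools import product

    def circuit(a, b, c):
        # an arbitrary small Boolean circuit
        return (a or b) and (not a or c) and (b != c)

    # brute-force Circuit-SAT: try all 2^n input assignments
    sat = [bits for bits in product([False, True], repeat=3) if circuit(*bits)]
    print(sat)   # [(False, True, False), (True, False, True)]

SAT solvers and abstractions win on typical instances, which is exactly the generalizability trade described above.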
heisenzombie•2mo ago
As for the drawings themselves, I have found them pretty unreliable at reading even quite simple things (e.g. what's the ID of the thru-hole?), even when they're specifically dimensioned. As soon as spatial reasoning is required (e.g. there's a dimension from A to B and from A to C and one asks for the dimension from B to C), they basically never get it right.
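To spell out the arithmetic being fumbled (values hypothetical):

    a_to_b = 12.5               # mm, dimensioned on the drawing
    a_to_c = 30.0               # mm, dimensioned on the drawing
    b_to_c = a_to_c - a_to_b    # 17.5 mm: the step they basically never get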
This is a place where there's a LOT of room for improvement.
tintor•2mo ago
Problem #2 is low control over outputs of text-to-image models. Models don't follow prompts well.
kurthr•2mo ago
There are good reasons not to vibecode Verilog, but a lot of test cases are already being written by LLMs and the big EDA vendors (Cadence, Synopsys, Siemens) all tout their new AI capabilities.
It's like saying it can't read handwritten mathematical formulas, when it solves most math problems in markup (and if you aren't using it you're asking for trouble).
discordance•2mo ago
If you look at the data structure of a Gerber or DWG file, it's vectors and metadata. These happen to be great for LLMs.
My hypothesis is that we haven’t done the work on that yet because the market is more interested in things like Ghibli imagery.
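A hint at how tokenizer-friendly that is: a few Gerber draw commands and a toy parser (coordinates assumed to be in 1/1000 mm; real files also carry format and aperture headers omitted here):

    import re

    gerber = "X0Y0D02*\nX5000Y0D01*\nX5000Y5000D01*"

    # D02 = move, D01 = draw; the whole format is short text records
    for m in re.finditer(r"X(-?\d+)Y(-?\d+)D0([12])\*", gerber):
        x, y, op = int(m[1]), int(m[2]), m[3]
        print("move to" if op == "2" else "draw to", x / 1000, y / 1000)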
rjsw•2mo ago
Someone could try training a LLM on a combination of a STEP AP242 [1] data model and sample exchange files, or do the same for the Building Information Model [2].
[1] http://www.ap242.org/ [2] https://en.wikipedia.org/wiki/Industry_Foundation_Classes