Continuous Thought Machines

https://pub.sakana.ai/ctm/
187•hardmaru•9h ago

Comments

robwwilliams•7h ago
Great to refocus on this important topic. So cool to see this bridge being built across fields.

In wet-ware it is hard not to think of “time” as linear Newtonian time driven by a clock. But in the context of brain-and-body, what really is critical is generating well-ordered sequences of acts and operations that are embedded in thicker or thinner slices of “now”, which can range from the 300 msec of the “specious present” down to the 50 microseconds of cells that evaluate the sources of sound (the medial superior olivary nucleus).

For more context on contingent temporality, see the interview with RW Williams in this recent publication by John Bickle in The European Journal of Neuroscience:

https://pubmed.ncbi.nlm.nih.gov/40176364/

ttoinou•6h ago
Ironically, this webpage continuously refreshes itself in Firefox on iOS :P
tonyhart7•5h ago
it literally never loads for me
rvz•5h ago
> The Continuous Thought Machine (CTM) is a neural network architecture that enables a novel approach to thinking about data. It departs from conventional feed-forward models by explicitly incorporating the concept of Neural Dynamics as the central component to its functionality.

Still going through the paper, but it is very exciting to actually see the internal visual recurrence in action when confronting a task (such as the 2D Puzzle) - making it easier to interpret neural networks across several tasks involving 'time'.

(This internal recurrence may not be new, but applying neural synchronization as described in this paper is).
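In case it helps anyone else skimming: my rough reading of "synchronization as representation" is that the latent isn't a single activation vector but pairwise statistics over each neuron's activation history across internal ticks. A toy NumPy sketch of that reading (my own illustration, not the paper's code; the plain dot product stands in for the paper's learned, decay-weighted version):

    import numpy as np

    rng = np.random.default_rng(0)
    n_neurons, n_ticks = 8, 20
    # Stand-in for the post-activation history each neuron accumulates
    # over the model's internal ticks.
    z_history = rng.standard_normal((n_neurons, n_ticks))

    # Pairwise "synchronization": inner products of activation traces over time.
    sync = z_history @ z_history.T / n_ticks      # (n_neurons, n_neurons)

    # A readout could then project a flattened subset of these pairs
    # instead of any single tick's activations.
    iu = np.triu_indices(n_neurons, k=1)
    latent = sync[iu]                             # one value per neuron pair
    print(latent.shape)                           # (28,)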

> Indeed, we observe the emergence of interpretable and intuitive problem-solving strategies, suggesting that leveraging neural timing can lead to more emergent benefits and potentially more effective AI systems

Exactly. Would like to see more applications of this in existing or new architectures that can also give us additional transparency into the thought process on many tasks.

Another great paper from Sakana.

omneity•4h ago
Is it the same Sakana from the cheating AI coder tribulations? There were some fundamental mistakes in that work that made me question the team.

https://www.hackster.io/news/sakana-ai-claims-its-ai-cuda-en...

https://techcrunch.com/2025/02/21/sakana-walks-back-claims-t...

doall•3h ago
They admitted it, apologized, and are in the process of revising the paper. Mistakes always happen, whether small or big. What is more important is to be transparent, learn from it, and make sure the same mistake doesn't happen again.
coolcase•5h ago
I love the ML diagrams that hybridize maths and architecture. It is much less dry than purely formal math.
dcrimp•3h ago
I'm quite enthusiastic about reading this. Watching the progress by the larger LLM labs, I've noted that they're not making the material changes in model configuration that I think are necessary to proceed toward more refined and capable intelligence. They're adding tools and widgets to things we know don't think like a biological brain. These are really useful things from a commercial perspective, but I think LLMs won't be an enduring paradigm, at least wrt genuine stabs at artificial intelligence. I've been surprised that there hasn't been more effort toward transformative work like that in the linked article.

The things that hang me up about current progress toward intelligence are that:

- there don't seem to be models which possess continuous thought. Models are alive during a forward pass on their way to producing a token and brain-dead at any other time

- there don't seem to be many models that have neural memory

- there doesn't seem to be any form of continuous learning. To be fair, the whole online training thing is pretty uncommon as I understand it.

Reasoning in token space is handy for evals, but it is lossy - you throw away all the rest of the info when you sample. I think Meta had a paper on continuous thought in latent space, but I don't think that effort has continued into anything commercialised.
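To make the contrast concrete, here's a toy sketch of the two feedback loops (PyTorch, entirely illustrative: the GRU cell stands in for a transformer step, and none of this is Meta's actual code):

    import torch
    import torch.nn as nn

    vocab, d = 100, 32
    embed = nn.Embedding(vocab, d)
    cell = nn.GRUCell(d, d)            # stand-in for one decoder step
    head = nn.Linear(d, vocab)

    # Token-space reasoning: collapse the state into one discrete symbol
    # each step and re-embed it. The sampling step is where the loss happens.
    h = torch.zeros(1, d)
    x = embed(torch.tensor([0]))
    for _ in range(5):
        h = cell(x, h)
        tok = head(h).argmax(dim=-1)   # or a stochastic sample
        x = embed(tok)

    # Latent-space ("continuous thought") reasoning: feed the hidden state
    # straight back in, so nothing is discarded between steps.
    h2 = torch.zeros(1, d)
    x2 = embed(torch.tensor([0]))
    for _ in range(5):
        h2 = cell(x2, h2)
        x2 = h2                        # reuse the latent instead of a token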

Somehow, our biological brains are capable of super efficiently doing very intelligent stuff. We have a known-good example, but research toward mimicking that example is weirdly lacking?

All the magic happens in the neural net, right? But we keep wrapping nets with tools we've designed with our own inductive biases, rather than expanding the horizon of what a net can do and empowering it to do that.

Recently I've been looking into SNNs, which feel like a bit of a tech demo, as well as neuromorphic computing, which I think holds some promise for this sort of thing, but doesn't get much press (or, presumably, budget?)

(Apologies for ramble, writing on my phone)

liamwire•3h ago
Seems really interesting, and the in-browser demo and model were a really great hook to get people interested in the rest of the research. I'm only partway through it, but the idea itself is compelling.
erewhile•3h ago
The ideas behind these machines aren't entirely new. There's some research from 2002 where Liquid State Machines (LSM) are introduced[1]. These are networks that generally feed continuous inputs into spiking neural networks, which are then read by a dense layer connected to all the neurons in the network, reading out what is called the liquid state.

These LSMs have also been used for other tasks, like playing Atari games in a paper from 2019[2], which shows that while these networks can sometimes outperform humans, they don't always, and they tend to fail at the same things more conventional neural networks failed at at the time. They don't outperform those conventional networks, though.

Honestly, I'd be excited to see more research going into continuous processing of inputs (e.g., audio) with continuous outputs, and training full spiking neural networks based on neurons on that idea. We understand some of the ideas of plasticity, and they have been applied in this kind of research, but I'm not aware of anyone creating networks like this with just the kinds of plasticity we see in the brain, with no back propagation or similar algorithms. I've tried this myself, but I think I either have a misunderstanding of how things work in our brains, or we just don't have the full picture yet.

[1] doi.org/10.1162/089976602760407955
[2] doi.org/10.3389/fnins.2019.00883
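For anyone who wants to see the moving parts, the basic reservoir-plus-readout setup from [1] fits in a toy NumPy sketch (my own illustration, not the papers' code; sizes and constants are arbitrary, and only the linear readout at the end is "trained"):

    import numpy as np

    rng = np.random.default_rng(1)
    n_in, n_res, T = 4, 100, 500
    W_in = rng.standard_normal((n_res, n_in)) * 0.5
    W_rec = rng.standard_normal((n_res, n_res)) * 0.1 * (rng.random((n_res, n_res)) < 0.1)

    v = np.zeros(n_res)          # membrane potentials
    spikes = np.zeros(n_res)
    trace = np.zeros(n_res)      # low-pass filtered spikes = the "liquid state"
    states = []

    inputs = (rng.random((T, n_in)) < 0.05).astype(float)   # random input spike trains
    for t in range(T):
        v = 0.9 * v + W_in @ inputs[t] + W_rec @ spikes     # leaky integration
        spikes = (v > 1.0).astype(float)                     # threshold crossing
        v[spikes > 0] = 0.0                                  # reset fired neurons
        trace = 0.95 * trace + spikes
        states.append(trace.copy())

    X = np.array(states)         # (T, n_res) liquid states over time
    y = inputs[:, 0]             # toy target: reconstruct input channel 0
    # Only this readout is trained (ridge regression); the reservoir stays fixed.
    W_out = np.linalg.solve(X.T @ X + 1e-3 * np.eye(n_res), X.T @ y)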

AIorNot•2h ago
Can someone explain this paper in the context of LLM architectures? It seems this cannot be combined with LLM deep learning - or can it?
davedx•2h ago
So this weekend we have:

- Continuous thought machines: temporally encoding neural networks (more like how biological brains work)

- Zero data reasoning: (coding) AI that learns from doing, instead of by being trained on giant data sets

- Intellect-2: a globally distributed RL architecture

I am not an expert in the field but this feels like we just bunny hopped a little closer to the singularity...

spiderfarmer•1h ago
Also not an expert, but I think this is like saying robots will dominate the world because we invented cameras, actuators, and batteries.

In other words, baby steps, not bunny hops.

iandanforth•2h ago
This paper is concerning. Though divorced from the standard ML literature, there is a lot of work on biologically plausible spiking, timing-dependent artificial neural networks. The nomenclature here doesn't seem to acknowledge that body of work. Instead, it appears as a step toward that bulk of research, coming from the ML/LLM field without a clear appreciation of the ground already well traveled there.*

In addition, some of the terminology is likely to cause confusion. By calling a synaptic integration step "thinking", the authors are going to confuse a lot of people. Instead of the process of forming an idea, evaluating that idea, potentially modifying it, and repeating (what a layman would call thinking), they are trying to ascribe "thinking" to single-unit processes! That's a pretty radical departure from both the ML and ANN literature. Pattern recognition/signal discrimination is well known at the level of synaptic integration and firing, but "thinking"? No, that wording is not helpful.

*I have not reviewed all the citations and am reacting to the plain language of the text as someone familiar with both lines of research.

bob1029•1h ago
> Emulating these mechanisms, particularly the temporal coding inherent in spike timing and synchrony, presents a significant challenge. Consequently, modern neural networks do not rely on temporal dynamics to perform compute, but rather prioritize simplicity and computational efficiency.

Simulating a proper time domain is a very difficult thing to do with practical hardware. It's not that we can't do it - it's that all this timing magic requires additional hyperparameter dimensions that need to be searched over. Finding a set of valid parameters when the space is this vast seems very unlikely. You want to eliminate parameters, not introduce new ones.

Also, computational substrates that are efficient to execute can be searched over much more quickly than those that are not. Anything where we need to model a spike that is delivered at a future time immediately chops a few orders of magnitude off the top because you have to keep things like priority queue structures around to serialize events.
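Concretely, even the simplest event-driven scheme ends up with something like this at its core (a toy sketch, not any particular simulator's API):

    import heapq

    events = []                              # (delivery_time, target_neuron, weight)
    heapq.heappush(events, (1.5, 3, 0.2))    # spike arriving at t = 1.5 ms
    heapq.heappush(events, (0.7, 1, -0.1))
    heapq.heappush(events, (1.5, 7, 0.05))

    now, potentials = 0.0, [0.0] * 16
    while events:
        t, target, w = heapq.heappop(events)  # always the earliest pending spike
        now = t                               # time jumps from event to event
        potentials[target] += w               # deliver the delayed spike
        # a threshold check here would push new future events (axonal delay included)

Each of those pushes and pops is the serialization overhead I'm referring to.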

Unless hard real time interaction is an actual design goal, I don't know if chasing this rabbit is worth it on the engineering/product side.

The elegance of STDP and how it could enable online, unsupervised learning is still highly alluring to me. I just don't see a path with silicon right now or on the horizon. Purpose-built hardware could work, but that is like taking a really big leap of faith by setting some of the hyperparameters to const in code. The chances of getting this right before running out of money seem low to me.
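For what it's worth, the pair-based STDP rule itself is tiny - the hard part is everything around it, not the update. A toy version with typical textbook-style constants (my own numbers, not tuned for anything):

    import math

    A_plus, A_minus = 0.01, 0.012
    tau_plus, tau_minus = 20.0, 20.0   # ms

    def stdp_dw(t_pre, t_post):
        """Weight change for a single pre/post spike pair."""
        dt = t_post - t_pre
        if dt >= 0:                                   # pre before post: potentiate
            return A_plus * math.exp(-dt / tau_plus)
        return -A_minus * math.exp(dt / tau_minus)    # post before pre: depress

    print(stdp_dw(10.0, 15.0))   # causal pair -> positive change
    print(stdp_dw(15.0, 10.0))   # anti-causal pair -> negative change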

I ruined my vacation by reverse engineering WSC

https://blog.es3n1n.eu/posts/how-i-ruined-my-vacation/
206•todsacerdoti•7h ago•95 comments

Plain Vanilla Web

https://plainvanillaweb.com/index.html
1111•andrewrn•19h ago•516 comments

Continuous Thought Machines

https://pub.sakana.ai/ctm/
188•hardmaru•9h ago•16 comments

Armbian Updates: OMV support, boot improvements, Rockchip optimizations

https://www.armbian.com/newsflash/armbian-updates-nas-support-lands-boot-systems-improve-and-rockchip-optimizations-arrive/
25•transpute•3h ago•1 comment

Intellect-2 Release: The First 32B Model Trained Through Globally Distributed RL

https://www.primeintellect.ai/blog/intellect-2-release
136•Philpax•9h ago•39 comments

Making PyPI's test suite 81% faster – The Trail of Bits Blog

https://blog.trailofbits.com/2025/05/01/making-pypis-test-suite-81-faster/
70•rbanffy•3d ago•19 comments

Dart added support for cross-compilation

https://dart.dev/tools/dart-compile#cross-compilation-exe
31•Alifatisk•3d ago•24 comments

Why Bell Labs Worked

https://1517.substack.com/p/why-bell-labs-worked
228•areoform•14h ago•166 comments

Car companies are in a billion-dollar software war

https://insideevs.com/features/759153/car-companies-software-companies/
356•rntn•17h ago•610 comments

Show HN: Vom Decision Platform (Cursor for Decision Analyst)

https://www.vomdecision.com
7•davidreisbr•3d ago•3 comments

Absolute Zero Reasoner

https://andrewzh112.github.io/absolute-zero-reasoner/
83•jonbaer•4d ago•16 comments

High-school shop students attract skilled-trades job offers

https://www.wsj.com/lifestyle/careers/skilled-trades-high-school-recruitment-fd9f8257
196•lxm•20h ago•313 comments

Ask HN: Cursor or Windsurf?

159•skarat•6h ago•205 comments

Scraperr – A Self Hosted Webscraper

https://github.com/jaypyles/Scraperr
195•jpyles•17h ago•68 comments

The Academic Pipeline Stall: Why Industry Must Stand for Academia

https://www.sigarch.org/the-academic-pipeline-stall-why-industry-must-stand-for-academia/
104•MaysonL•8h ago•79 comments

Writing an LLM from scratch, part 13 – attention heads are dumb

https://www.gilesthomas.com/2025/05/llm-from-scratch-13-taking-stock-part-1-attention-heads-are-dumb
286•gpjt•3d ago•57 comments

Title of work deciphered in sealed Herculaneum scroll via digital unwrapping

https://www.finebooksmagazine.com/fine-books-news/title-work-deciphered-sealed-herculaneum-scroll-digital-unwrapping
215•namanyayg•21h ago•96 comments

One-Click RCE in Asus's Preinstalled Driver Software

https://mrbruh.com/asusdriverhub/
472•MrBruh•1d ago•224 comments

LSP client in Clojure in 200 lines of code

https://vlaaad.github.io/lsp-client-in-200-lines-of-code
147•vlaaad•17h ago•18 comments

How friction is being redistributed in today's economy

https://kyla.substack.com/p/the-most-valuable-commodity-in-the
215•walterbell•3d ago•97 comments

ToyDB rewritten: a distributed SQL database in Rust, for education

https://github.com/erikgrinaker/toydb
97•erikgrinaker•15h ago•13 comments

A formatter for your kdl files

https://github.com/hougesen/kdlfmt
3•riegerj•3d ago•1 comment

Burrito Now, Pay Later

https://enterprisevalue.substack.com/p/burrito-now-pay-later
137•gwintrob•15h ago•235 comments

Show HN: Codigo – The Programming Language Repository

https://codigolangs.com
43•adamjhf•2d ago•13 comments

Why alien languages could be far stranger than we imagine

https://aeon.co/essays/why-alien-languages-could-be-far-stranger-than-we-imagine
8•rbanffy•1h ago•11 comments

A simple 16x16 dot animation from simple math rules

https://tixy.land
460•andrewrn•2d ago•91 comments

Lazarus Release 4.0

https://forum.lazarus.freepascal.org/index.php?topic=71050.0
244•proxysna•5d ago•138 comments

Avoiding AI is hard – but our freedom to opt out must be protected

https://theconversation.com/avoiding-ai-is-hard-but-our-freedom-to-opt-out-must-be-protected-255873
179•gnabgib•11h ago•105 comments

The Epochalypse Project

https://epochalypse-project.org/
187•maxeda•1d ago•81 comments

3D printing in vivo for non-surgical implants and drug delivery

https://www.science.org/doi/10.1126/science.adt0293
22•Phreaker00•1d ago•5 comments