Questioning Representational Optimism in Deep Learning

https://arxiv.org/abs/2505.11581
1•publicdaniel•1d ago

Comments

publicdaniel•1d ago
From the author's tweet (https://x.com/kenneth0stanley/status/1924650124829196370):

Could a major opportunity to improve representation in deep learning be hiding in plain sight? Check out our new position paper: Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis. The idea stems from a little-known observation about networks trained to output a single image: when they are discovered through an unconventional open-ended search process, their representations are incredibly elegant and exhibit astonishing modular decomposition. In contrast, when SGD (successfully) learns to output the same image its underlying representation is fractured, entangled - an absolute mess!

This stark difference in the underlying representation of the same "good" output behavior carries deep lessons for deep learning. It shows you cannot judge a book by its cover - an LLM with all the right responses could similarly be a mess under the hood. But also, surprisingly, it shows us that it doesn't have to be this way! Without the unique examples in this paper that were discovered through open-ended search, we might assume neural representation has to be a mess. These results show that is clearly untrue. We can now imagine something better because we can actually see it is possible.

We give several reasons why this matters: generalization, creativity, and learning are all potentially impacted. The paper shows examples to back up these concerns, but in brief, there is a key insight: Representation is not only important for what you're able to do now, but for where you can go from there. The ability to imagine something new (and where your next step in weight space can bring you) depends entirely upon how you represent the world. Generalization, creativity, and learning itself depend upon this critical relationship. Notice the difference in appearance between the nearby images to the skull in weight space shown in the top-left and top-right image strips of the attached graphic. The difference in semantics is stark.

The insight that representation could be better opens up a lot of new paths and opportunities for investigation. It raises new urgency to understand the representation underlying foundation models and LLMs while exposing all kinds of novel avenues for potentially improving them, from making learning processes more open-ended to manipulating architectures and algorithms.

Don't mistake this paper as providing comfort for AI pessimists. By exposing a novel set of stark and explicit differences between conventional learning and something different, it can act as an accelerator of progress as opposed to a tool of pessimism. At the least, the discussion it provokes should be quite illuminating.

Fredkin•1d ago
What does it mean to train using an 'open ended' process? Is it like using a genetic algorithm to explore / generate _any_ image resembling something from the training set, instead of adjusting weights according to gradients on a case-by-case or batch-by-batch basis?
publicdaniel•1d ago
Here's my really amateur understanding of this:

- Conventional SGD: Fixed target (e.g. "make an exact replica of this butterfly image"), and it follows a greedy path to minimize the error.

- Open-ended search process: No predetermined goal; it explores based on what's "interesting" or novel. In Picbreeder, humans would see several generated images, pick the "interesting" ones, and the system would mutate/evolve from there. If you were evolving an image that looked like an egg and it mutated toward a teapot-like shape, you could pivot and pursue that direction instead.

This is kinda the catch: there is a human element here, where individuals choose what's "interesting" to explore, so it's not a purely algorithmic process. That said, yes, it does use a genetic algorithm (NEAT) under the hood, but I think what the authors are suggesting is that the key difference isn't whether the optimization is genetic or gradient-based. They're getting at the difference between objective-driven and open-ended search.

I think the main position / takeaway from the paper is that something about conventional SGD training produces these "fractured entangled representations" that work but are not well structured internally, so they're hard to build on top of. They look at things like the curriculum (the order in which things are learned), objective-driven vs. open-ended search, and so on.
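
To make the objective-driven vs. open-ended distinction concrete, here is a rough toy sketch. It is not Picbreeder, NEAT, or anything from the paper: the "genome" is just a hypothetical list of floats, the objective-driven side is plain greedy hill climbing standing in for SGD, and an automatic novelty score stands in for the human picking what looks "interesting". All function names and parameters are made up for illustration.

```python
import random

def mutate(genome, rng, scale=0.1):
    """Return a copy of the genome with small Gaussian noise added to every gene."""
    return [g + rng.gauss(0.0, scale) for g in genome]

def distance(a, b):
    """Euclidean distance between two genomes."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def objective_search(target, steps, rng):
    """Greedy hill climbing: keep a mutation only if it moves closer to a fixed target."""
    current = [0.0] * len(target)
    for _ in range(steps):
        candidate = mutate(current, rng)
        if distance(candidate, target) < distance(current, target):
            current = candidate
    return current

def novelty(candidate, archive, k=3):
    """Mean distance to the k nearest archive members (higher means more novel)."""
    nearest = sorted(distance(candidate, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)

def open_ended_search(dim, steps, rng, brood=8):
    """No target at all: each step keeps the most novel child of the latest pick.

    Assumption: the novelty score replaces the human selection step.
    """
    archive = [[0.0] * dim]
    for _ in range(steps):
        parent = archive[-1]
        children = [mutate(parent, rng) for _ in range(brood)]
        archive.append(max(children, key=lambda c: novelty(c, archive)))
    return archive

rng = random.Random(0)
target = [1.0, -1.0, 0.5, 0.0]  # hypothetical stand-in for "this exact butterfly image"
found = objective_search(target, steps=500, rng=rng)
archive = open_ended_search(dim=4, steps=50, rng=rng)
print("objective-driven error:", distance(found, target))
print("open-ended archive size:", len(archive))
```

The first loop only ever measures progress against one fixed target, while the second never sees a target and just keeps a trail of stepping stones that differ from what came before. The toy says nothing about which one yields better internal representations; it only shows the difference in what drives selection.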
