Subquadratic – Introducing SubQ 1.1 Small

https://subq.ai/subq-1-1-small-technical-report
38•EDM115•1h ago

Comments

EDM115•1h ago
https://subq.ai/docs/subq-1-1-small-model-card.pdf
giancarlostoro•1h ago
This one's interesting. I think the next frontier for LLMs should really just be: how can we get something like Opus 4.6 to cost drastically less for the same output? I say 4.6 because from 4.6 onwards it's been pretty darn good, at least for me. It always feels like someone hates every model upgrade; heck, even 4.5 was fine.
robmccoll•1h ago
Yes - I want that and dramatically faster. Newer models don't seem to need any more or less guidance and iteration, so let's make the time-to-wrong-answer as short as possible.
giancarlostoro•48m ago
I'm not as crazy about speed, as long as it's reasonably as "quick" as Opus, which is faster than most developers can spit out code. I do get annoyed with Claude Code because it looks like it chooses to be as slow as possible, but maybe that's by design so it's not pounding their backend every millisecond? That would probably be bad.

Local inference is insanely fast on my M4 Pro MBP though, so I can understand where you're coming from, but I don't need it too much faster. I still need time to review, test, review and provide feedback to the model. Fast is okay I guess for true vibe coding.

robmccoll•27m ago
I just don't want to have to keep a pipeline going in order to fully occupy my time. I don't want to wait on the model to review the prompt, read the parts of the codebase indicated, do its own research in the codebase and documentation, plan, run agents ... actually write the code, and only NOW can I start reading and reviewing it. That means I either need to run a lot of operations in parallel so that I always have something to do and the agent(s) are highly utilized, or I'm writing something on my own that keeps getting interrupted. It's the constant context switching that kills me. I want to work on one problem at a time and really focus on it - even if I'm not writing every line myself.
aesthesia•59m ago
Disappointing they don't actually say how their sparse attention mechanism works.
cmogni1•58m ago
I don’t understand why this lab is allergic to providing details on what they actually made, especially when Chinese labs are more than willing to share architectural specs/code/kernels (e.g. NSA/FSA, RAMBa, HISA, DSA LightningIndexer, etc.). I don’t doubt that they’ve done something here, but the lack of details makes me default to not trusting this, particularly when this is the second time that they’ve released a “technical report” that just waxes poetic about the concept.
famouswaffles•42m ago
Business-wise, it would make sense to hold off on details till they're at least ready to serve. Look at what happened with OpenAI and reasoning models. Everyone struggled with getting RL to work with LLMs for a good while. OpenAI figured it out, and a few months later everyone had their prototypes out in short order. Don't forget who these labs employ. They're some of the brightest people around. Sub-q aren't really in a position for that lol. If they'd shared details at the first announcement, for instance, the big labs might have had something out by now while they're still pulling resources to scale, and then what?
supern0va•31m ago
You don't understand why the thing their entire company is valued upon is...not being given away freely? They literally are taking an open source model and then adapting it with this technique. If they disclose it, the frontier labs will immediately copy it and outperform them.

My guess is that they're angling for an acquisition.

embedding-shape•56m ago
> SubQ 1.1 Small scores near-perfect at 1M, 2M, 6M, and 12M tokens. The model was trained predominantly at 1M tokens yet the retrieval held near perfectly at 12x that length, despite compressing attention to just 0.13% of relationships. This generalization is a direct consequence of SSA routing attention based on content relevance rather than fixed positional patterns.

If the results persist from 1M to 12M, why not 24M or 48M? Sounds almost too good to be true.

With back-of-the-napkin math from inside my head, that'd be like 0.5 to 1 million LOC, depending on language/code density. You could just fold the entire codebase into one prompt if it's a small one, that'd be neat :)
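A rough sketch of that napkin math, assuming the estimate refers to the 12M-token figure quoted above. The tokens-per-line values (12 and 24) are not from the report; they are illustrative guesses picked to bracket dense and verbose code, and `estimate_loc` is a hypothetical helper name.

```python
def estimate_loc(context_tokens: int, tokens_per_line: float) -> int:
    """Rough number of source lines that fit in a context window."""
    if tokens_per_line <= 0:
        raise ValueError("tokens_per_line must be positive")
    return int(context_tokens / tokens_per_line)


# Assumption: the 12M-token context length quoted from the report.
CONTEXT_TOKENS = 12_000_000

# Assumption: average tokens per line of code; real values vary a lot
# by language, formatting, and tokenizer.
for tokens_per_line in (12, 24):
    loc = estimate_loc(CONTEXT_TOKENS, tokens_per_line)
    print(f"{tokens_per_line} tokens/line -> about {loc:,} lines of code")
```

Under those guesses the window holds roughly half a million to a million lines, which is the range mentioned above. A different tokenizer or a comment-heavy codebase would shift it.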

monster_truck•9m ago
It likely falls off very steeply after that. 8 to 1 (which I am assuming based on the 0.13% figure) is a pretty common ratio for sparse matrix stuff.
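For reference, a small sketch of how a "kept fraction" and an "N to 1" ratio convert into each other. It only does the arithmetic and assumes that "0.13% of relationships" means the fraction of query-key pairs that are kept; the report does not spell that out, and both function names are hypothetical.

```python
def ratio_from_kept_fraction(kept_fraction: float) -> float:
    """Convert a kept fraction (e.g. 0.0013 for 0.13%) into an 'N to 1' ratio."""
    if not 0 < kept_fraction <= 1:
        raise ValueError("kept_fraction must be in (0, 1]")
    return 1.0 / kept_fraction


def kept_fraction_from_ratio(ratio: float) -> float:
    """Convert an 'N to 1' ratio into the fraction of entries kept."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return 1.0 / ratio


# 0.13% expressed as a fraction is 0.0013.
print(f"0.13% kept -> about {ratio_from_kept_fraction(0.0013):.0f} to 1")

# An 8 to 1 ratio keeps 1/8 of the entries.
print(f"8 to 1 -> {kept_fraction_from_ratio(8):.1%} kept")
```

Read this way, 0.13% kept is on the order of hundreds to one, while 8 to 1 corresponds to 12.5% kept, so the two figures only line up if the 0.13% is read as a fraction (0.13) rather than a percentage.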
chrsw•50m ago
There was, let's say, significant skepticism the last time they announced something. What's changed?
supern0va•28m ago
I have no idea if the evaluator itself is trustworthy, but it was supposedly independently evaluated by Appen: https://www.appen.com/whitepapers/benchmarking-subquadratics...
wxw•39m ago
> SSA replaces the O(n²) dense attention pass with a learned sparse formulation that scales linearly with context length.

> At 1M tokens, SubQ 1.1 Small requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2.

Awesome stuff. Solving context at the model architecture layer rather than trying to bolt on extra memory is the right direction IMO.
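To make the quadratic-versus-linear claim concrete, here is a toy count of query-key pairs. It is not SSA: the report does not disclose the mechanism, so this assumes a generic scheme where each query attends to a fixed budget of keys. `KEYS_PER_QUERY` is a made-up number, and pair counts ignore everything else that goes into real compute, so the output is not expected to reproduce the 64.5x or 56x figures.

```python
def dense_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by full (non-causal) dense attention: n^2."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Pairs scored when each query attends to a fixed budget of keys: n * k."""
    return n_tokens * min(keys_per_query, n_tokens)


# Hypothetical per-query budget, chosen only for illustration.
KEYS_PER_QUERY = 1024

for n in (1_000_000, 2_000_000, 12_000_000):
    dense = dense_pairs(n)
    sparse = sparse_pairs(n, KEYS_PER_QUERY)
    print(f"n={n:>10,}  dense={dense:.2e}  sparse={sparse:.2e}  "
          f"dense/sparse={dense / sparse:,.0f}x")
```

Doubling the context quadruples the dense count but only doubles the fixed-budget count, so the gap between the two keeps widening as the context grows.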

satyarohith•31m ago
It's been all talk and no action ever since their first announcement.
maz1b•23m ago
They've done multiple "evaluations" by third parties, but still, it seems that they aren't being fully transparent. I think the approach is quite interesting and novel, but this feels like deja vu.

I get why they aren't disclosing all the details, but it seems more hype-train-esque to me for this moment. I don't disagree that this could be big.

Depurator•17m ago
What kind of hardware would be needed to serve an instance with the full 12M context? And what kind of speeds can one expect at those extremes, at 10M+?
