frontpage.

Show HN: Facetime Influencer AI Avatars Real-Time

https://popclone.io/bre
1•spolanki•1m ago•1 comment

HP and Dell disable HEVC support built into their laptops' CPUs

https://arstechnica.com/gadgets/2025/11/hp-and-dell-disable-hevc-support-built-into-their-laptops...
1•stalfosknight•2m ago•0 comments

CDC website changed to contradict conclusion that vaccines don't cause autism

https://apnews.com/article/cdc-autism-vaccines-7b1890f626dd5921fafd00fdd1e6425a
1•petethomas•3m ago•0 comments

Data Science Weekly – Issue 626

https://datascienceweekly.substack.com/p/data-science-weekly-issue-626
1•sebg•4m ago•0 comments

Show HN: 0Portfolio – AI-powered portfolio builder for everyone

https://0portfolio.com/
1•adityamallah•4m ago•0 comments

Trustworthy Systems Group: secure and performant real-world computer systems

https://trustworthy.systems/
1•doener•5m ago•0 comments

Are cellular towers the next landlines?

https://ssg.dev/are-cellular-towers-the-next-landlines/
1•sedatk•5m ago•0 comments

Show HN: CampaignTree – A visual alternative to spreadsheets for planning ads

https://campaigntree.app
1•advanttage•6m ago•0 comments

RI judge intervenes after ICE mistakenly detains Superior Court intern

https://www.wpri.com/news/local-news/providence/ri-judge-intervenes-after-ice-mistakenly-detains-...
1•chmaynard•6m ago•0 comments

I Let a Brain Organoid Make My Investment Decisions

https://epicquest.bio/brain-organoid.html
2•kemmishtree•7m ago•1 comment

Is C++26 getting destructive move semantics?

https://stackoverflow.com/questions/79817124/is-c26-getting-destructive-move-semantics
1•todsacerdoti•8m ago•0 comments

Federal prosecutors move to dismiss charges against woman shot by Border Patrol

https://apnews.com/article/chicago-immigration-crackdown-woman-shot-border-e58ca635feeb2ef8ddb0b4...
1•petethomas•8m ago•0 comments

The Calvin and Hobbes Search Takedown (2010)

https://www.s-anand.net/blog/the-calvin-and-hobbes-search-takedown/
1•thunderbong•8m ago•0 comments

Mudyla: Multimodal dynamic launcher, a DAG-based bash script orchestrator

https://github.com/7mind/mudyla
1•pshirshov•12m ago•0 comments

PrivateCut – Trim videos 100% in the browser, no upload, works offline

https://privatecut.app
2•privatecutapp•14m ago•1 comment

Putting Down Your Phone May Help You Live Longer (2019)

https://www.nytimes.com/2019/04/24/well/mind/putting-down-your-phone-may-help-you-live-longer.html
1•abixb•14m ago•0 comments

Ask HN: How do you undo or check out changes from Codex CLI and others?

1•elpakal•15m ago•1 comment

Suppression of pair beam instabilities in a laboratory analogue of blazar jets

https://arxiv.org/abs/2509.09040
2•PaulHoule•16m ago•0 comments

Nvidia pushes hotfix after Windows 11 October update tanks gaming performance

https://www.theregister.com/2025/11/20/nvidia_windows_11_hotfix/
2•Bender•17m ago•0 comments

Morgan Stanley Delays Data Center Debt Sale Amid Alibaba Risks

https://www.bloomberg.com/news/articles/2025-11-20/morgan-stanley-delays-data-center-debt-sale-am...
1•petethomas•18m ago•0 comments

Apple Watch's algorithm detects 89% of sleep apnea

https://www.empirical.health/apple-watch-sleep-apnea
2•brandonb•22m ago•0 comments

Humanoid robot Figure 02 helps build over 30k BMW X3s

https://www.heise.de/en/news/Humanoid-robot-Figure-02-helps-build-over-30-000-BMW-X3s-11085687.html
1•thenaturalist•25m ago•0 comments

Abstractive Thinking Model

https://github.com/Jonathan-Monclare/Abstractive-Thinking-Model-ATM-
1•J_Monclare•29m ago•0 comments

Over-Regulation Is Doubling the Cost by Peter Reinhardt

https://rein.pk/over-regulation-is-doubling-the-cost
3•bilsbie•30m ago•0 comments

The Game Awards 2025 Nominations

https://thegameawards.com/nominees
1•mrzool•30m ago•0 comments

France is taking state actions against GrapheneOS

https://grapheneos.social/@GrapheneOS/115584160910016309
74•gabrielgio•31m ago•22 comments

When First Amendment free speech protections came up against the Red Scare

https://theconversation.com/first-amendment-in-flux-when-free-speech-protections-came-up-against-...
2•hn_acker•32m ago•1 comment

Color Palette Pro: A Synthesizer for Color

https://ryanfeigenbaum.com/color-palette-pro/
2•interpol_p•34m ago•0 comments

Spiral Development for Hardware Programs

https://www.asbuilt.pub/p/spiral-development-for-hardware-programs
2•bharbr•35m ago•0 comments

World Bank Publication About Artificial Intelligence in Bulgarian

https://wbginstitute.nouswise.com/c/fcd839f7-c91c-412f-baef-32e4842064f3
1•kaven1234•36m ago•2 comments

AI Is Writing Its Own Kernels, and They Are 17x Faster

https://adrs-ucb.notion.site/autocomp
34•accheng•1h ago

Comments

qat321•1h ago
I wonder if these results extend beyond AWS Trainium?
charleshong•59m ago
Yes! We have also ported Autocomp to an academic accelerator (Gemmini), an RVV dev board (Canaan K230), and a GPU (NVIDIA L40S).

See our paper (https://arxiv.org/abs/2505.18574) and our prior blog posts: https://charleshong3.github.io/blog/

gfhsad•54m ago
Whenever I see '17x faster than experts,' I read 'the experts didn't actually try very hard on the baseline.'
charleshong•52m ago
Well, most of our results are not 17x. But still (IMO) solid across the board!

Also, the 17x came from a pretty obscure fusion optimization that isn't called out anywhere in the documentation (we had to run the profiler to see what was actually going on). Wouldn't be surprised if whoever within AWS wrote the kernel didn't know about that optimization.

snklt•43m ago
17x is a wild improvement regardless of the baseline. Impressive results.
taqpos•1h ago
This post unintentionally highlights exactly why NVIDIA is untouchable. If you need a farm of H100s running GPT-5 just to figure out how to program Amazon's Trainium chip efficiently, the hardware abstraction is fundamentally broken.
CobbledSteel•11m ago
I'd argue the logic goes the other way: if all it takes to get highly performant kernels is to rent a GPU farm, that undercuts the years and millions of engineering hours required to build NVIDIA's software infrastructure. High hopes for smaller players now.
pos456•57m ago
Calling beam search 'AI' is doing a lot of heavy lifting here. This is just superoptimization with a very expensive heuristic function.
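For readers unfamiliar with the term, a minimal sketch of what "superoptimization via beam search with a cost heuristic" means, in toy Python. The kernel is just a list of op names and the cost model is a stand-in function; in a real system the scoring would come from a profiler run or an LLM, and none of these names are from the Autocomp paper.

```python
import heapq

# Toy "kernel": a list of op names. Cost model: each op costs 1, and an
# unfused load->store round trip costs 2 extra (illustrative numbers).
def cost(kernel):
    c = len(kernel)
    for a, b in zip(kernel, kernel[1:]):
        if (a, b) == ("load", "store"):
            c += 2
    return c

# Rewrite rules: fuse a load+store pair into one op, or drop a nop.
def neighbors(kernel):
    for i in range(len(kernel) - 1):
        if (kernel[i], kernel[i + 1]) == ("load", "store"):
            yield kernel[:i] + ["fused_copy"] + kernel[i + 2:]
    for i, op in enumerate(kernel):
        if op == "nop":
            yield kernel[:i] + kernel[i + 1:]

def beam_search(start, width=4, steps=8):
    beam = [start]
    best = min(beam, key=cost)
    for _ in range(steps):
        # Expand every kernel in the beam, dedupe, keep the cheapest few.
        cands = {tuple(n) for k in beam for n in neighbors(k)}
        if not cands:
            break
        beam = [list(t) for t in heapq.nsmallest(width, cands, key=lambda t: cost(list(t)))]
        best = min([best] + beam, key=cost)
    return best

kernel = ["load", "store", "nop", "load", "store", "compute"]
print(beam_search(kernel))  # ['fused_copy', 'fused_copy', 'compute']
```

Whether you call this "AI" or "search" is mostly about where the candidate rewrites and the scoring come from; the skeleton is the same either way.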
igorpcosta•52m ago
Very interesting research; keen to collab with you folks. I've been building a few experiments on old GTX GPUs to extend their lifetime while matching token performance for Smol. igor [] autohand.ai, let's chat.
quc1k•50m ago
I really appreciate the focus on interpretability. Usually, super-optimizers give you a blob of assembly that runs fast but is impossible to debug or maintain. By forcing the model to output a natural language 'Plan' first, you essentially get documentation for free. If the code breaks later, you can look at the plan to understand why the loop was unrolled or why the memory was laid out that way. That makes this actually usable in a production CI/CD pipeline, unlike most black-box ML optimizations.
pakt1•45m ago
Trainium has always been a black box to me compared to GPUs. Seeing an automated tool reverse-engineer the best way to use the VectorEngine vs the TensorEngine is fascinating. It reveals just how much performance is left on the table by standard compilers.
matll•40m ago
As someone who spent the better part of last year trying to hand-tune kernels for a niche accelerator (not Trainium, but similar vibe), this honestly looks like a dream.

The hardest part of this work isn't coming up with the math; it's the mental overhead of managing the scratchpad memory and async DMA calls without stepping on your own toes. You spend 3 days debugging a race condition just to find out you got a 2% speedup.

If this tool can actually handle the 'grunt work' of generating the tiling logic and memory moves based on a high-level plan, that’s a game changer. I don't even care about the 17x number as much as I care about the '0 to 1' speed. Getting any performant kernel running on new hardware usually takes weeks. If this cuts it down to a few hours of LLM churning, that's huge for the industry.
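The scratchpad/DMA bookkeeping being described can be sketched in plain Python. This is not Trainium code; it is a toy model of the double-buffering pattern, where the next tile's "DMA" is issued into the idle buffer before computing on the current one. The function names (`dma_load`, `compute`) are hypothetical stand-ins; real kernels issue async DMA descriptors and wait on completion semaphores.

```python
TILE = 4

def dma_load(src, start):            # stand-in for an async DMA copy into SRAM
    return src[start:start + TILE]

def compute(tile):                   # stand-in for the vector engine
    return sum(x * x for x in tile)

def tiled_sum_of_squares(data):
    assert len(data) % TILE == 0
    n_tiles = len(data) // TILE
    buf = [dma_load(data, 0), None]  # prefetch tile 0 into buffer 0
    total = 0
    for i in range(n_tiles):
        cur = buf[i % 2]
        # Issue the next load into the *other* buffer before computing, so
        # (on real hardware) the DMA overlaps with compute. The race matll
        # describes appears the moment compute reads the in-flight buffer.
        if i + 1 < n_tiles:
            buf[(i + 1) % 2] = dma_load(data, (i + 1) * TILE)
        total += compute(cur)
    return total

print(tiled_sum_of_squares(list(range(8))))  # 140
```

Even in this toy form, the index juggling (`i % 2` vs `(i + 1) % 2`) is exactly the kind of mechanical-but-error-prone work one would happily delegate to a tool.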

chanwutk•34m ago
Very interesting read!
melissapan•33m ago
ADRS <> Compiler: what if your “compiler” could think?
dksgmlwo•27m ago
Fascinating. Having worked as a kernel engineer before, I know how impactful it is to reduce the initial exploration overhead. It can save a huge amount of the grunt work engineers typically have to do.
maltese669•23m ago
Ngl, letting AI fiddle with the kernel sounds scary, but the results are really impressive.
yrh•5m ago
Interesting read. I think the more "whitebox" approach, with a laid-out menu of optimizations to choose from, makes the resulting kernel more trustworthy, although it does raise the question of whether occasionally stepping outside the predefined optimization steps might yield insights.