
Making GPT-2 better at math reasoning with a new attention mechanism

https://github.com/Kim-Ai-gpu/FactorizedAttention
3•umjunsik132•2h ago

Comments

umjunsik132•2h ago
Hi HN, author here.

I built FactorizedAttention - a new attention mechanism based on the GWO framework. Instead of simple QK^T dot products, it uses factorized quadratic forms to model higher-order token interactions.

Testing on GPT-2 small + LoRA fine-tuning:

Math reasoning: 3.4% PPL improvement

Competitive programming: 3.2%

Python code: 1.9%

The bigger gains on reasoning tasks suggest the approach helps with complex relationships. Still early stage (only GPT-2 small), but the results are encouraging. Happy to answer questions! Code + repro steps in the repo.
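To make "factorized quadratic forms" concrete, here is a generic NumPy sketch. It is not taken from the FactorizedAttention repository, which remains the authoritative reference; the function name, the matrices `U` and `V`, and the rank `r` are all assumptions made for illustration. The idea shown is that a quadratic form x^T A x with a rank-1 matrix A = u v^T collapses to the product (x · u)(x · v), so second-order features of a token can be computed with two small projections instead of a full d × d matrix.

```python
import numpy as np

def factorized_quadratic_features(x, U, V):
    """Return r quadratic features per token: f_i(x) = (x . u_i) * (x . v_i).

    Each feature equals the quadratic form x^T (u_i v_i^T) x, so the
    d x d matrix of the form is never materialised.
    x: (n_tokens, d); U, V: (d, r) -> result: (n_tokens, r)
    Hypothetical helper for illustration, not the repository's API.
    """
    return (x @ U) * (x @ V)

rng = np.random.default_rng(0)
n_tokens, d, r = 4, 8, 3          # toy sizes, chosen arbitrarily
x = rng.normal(size=(n_tokens, d))
U = rng.normal(size=(d, r))
V = rng.normal(size=(d, r))

feats = factorized_quadratic_features(x, U, V)

# Check one entry against the explicit rank-1 quadratic form.
A0 = np.outer(U[:, 0], V[:, 0])   # d x d matrix u_0 v_0^T
explicit = x[1] @ A0 @ x[1]
assert np.isclose(feats[1, 0], explicit)
```

The factorized version costs O(d · r) per token, while each explicit form costs O(d^2), which is the usual motivation for the low-rank shortcut.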

mynti•1h ago
Cool idea! I had a look at the code and have been wondering about the sigmoid gating. It is used to add some of q_struct and k_struct into the original query and key, but why is this gating independent of the input? I would have expected it to depend on the input, so that when the model sees something more complex it can draw on more of this information (or something similar). Instead it is just a fixed, learnable parameter per layer, or am I mistaken? What is the intuition behind this?
umjunsik132•52m ago
For this initial version, I kept the gating static to keep the model as simple as possible while validating the core idea. Making the gate dynamic based on the input is a great suggestion for the next step, and I agree it could lead to better performance. I really appreciate the feedback.
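The difference between the static gate described above and the suggested input-dependent gate can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the repository's implementation: `q_struct` is the name used in the thread, while `gate_logit`, `w_gate`, `b_gate`, and both function names are hypothetical. The same mixing would apply to the key side with `k_struct`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def static_gate_mix(q, q_struct, gate_logit):
    """Static gate: one learnable scalar per layer, same weight for every token."""
    return q + sigmoid(gate_logit) * q_struct

def dynamic_gate_mix(q, q_struct, x, w_gate, b_gate=0.0):
    """Input-dependent gate: each token derives its own mixing weight from x."""
    gate = sigmoid(x @ w_gate + b_gate)   # shape (n_tokens, 1), broadcast over features
    return q + gate * q_struct

rng = np.random.default_rng(0)
n_tokens, d = 5, 8                        # toy sizes, chosen arbitrarily
x = rng.normal(size=(n_tokens, d))        # token representations
q = rng.normal(size=(n_tokens, d))        # ordinary queries
q_struct = rng.normal(size=(n_tokens, d)) # structural term to be mixed in

# sigmoid(0) = 0.5, so every token receives half of q_struct.
q_static = static_gate_mix(q, q_struct, gate_logit=0.0)

# Here the weight varies per token because it depends on x.
w_gate = rng.normal(size=(d, 1))
q_dynamic = dynamic_gate_mix(q, q_struct, x, w_gate)
```

The dynamic variant adds d + 1 parameters per gate in this sketch, and it reduces to the static one when `w_gate` is all zeros, so the two could be compared directly in an ablation.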