Show HN: LLM Inference Performance Analytic Tool for MoE Models (DeepSeek/etc.)

https://github.com/kevinyuan/llm-inference-perf-model
1•kevin-2025•1h ago
I built this to answer "what-if" questions about LLM deployment without spinning up expensive infrastructure.

The tool models inference physics (latency, bandwidth saturation, and PCIe bottlenecks) for large MoE models like DeepSeek-V3 (671B), Mixtral 8x7B, Qwen2.5-MoE, and Grok-1.
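To give a feel for the kind of estimate involved, here is a minimal roofline-style sketch of a single decode step. It is not the tool's actual model: the names `Hardware`, `DecodeStep`, and `decodeStepMs` are hypothetical, the hardware figures are round placeholders rather than vendor specs, and it assumes the weights are read once per step while ignoring KV-cache reads, interconnect traffic, and per-token expert routing.

```typescript
// Hypothetical roofline-style estimate; NOT the tool's real math.
interface Hardware {
  memBandwidthGBs: number; // memory bandwidth in GB/s (placeholder value below)
  peakTflops: number;      // peak throughput in TFLOP/s (placeholder value below)
}

interface DecodeStep {
  activeParamsB: number; // parameters touched per token, in billions
  bytesPerParam: number; // e.g. 1 for FP8, 0.5 for INT4
  batchSize: number;     // tokens decoded per step
}

interface Estimate {
  memoryMs: number;
  computeMs: number;
  stepMs: number;
  bound: "memory" | "compute";
}

function decodeStepMs(hw: Hardware, step: DecodeStep): Estimate {
  // Time to stream the active weights from memory once.
  const weightBytes = step.activeParamsB * 1e9 * step.bytesPerParam;
  const memoryMs = (weightBytes / (hw.memBandwidthGBs * 1e9)) * 1e3;

  // Roughly 2 FLOPs per active parameter per token.
  const flops = 2 * step.activeParamsB * 1e9 * step.batchSize;
  const computeMs = (flops / (hw.peakTflops * 1e12)) * 1e3;

  // The slower of the two resources sets the step time.
  const bound = memoryMs >= computeMs ? "memory" : "compute";
  return { memoryMs, computeMs, stepMs: Math.max(memoryMs, computeMs), bound };
}

// Illustrative inputs only: 37B active parameters at 1 byte each, batch size 1.
const estimate = decodeStepMs(
  { memBandwidthGBs: 3000, peakTflops: 1000 },
  { activeParamsB: 37, bytesPerParam: 1, batchSize: 1 },
);
console.log(estimate.bound, estimate.stepMs.toFixed(2));
```

With these placeholder inputs the step is memory-bound, which is the usual motivation for weight quantization and for batching more tokens per step.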

Key features:

- Independent Prefill vs Decode parallelism config (TP/PP/SP/DP)
- Hardware modeling: H100, B200, A100, NVLink topologies, IB vs RoCE
- Optimizations: Paged KV Cache, DualPipe, FP8/INT4 quantization
- Experimental: Memory Pooling (TPP, tiered storage) and Near-Memory Computing - offload cold experts and cold/warm KV-cache to system RAM, node-shared or global-shared memory pool

Live demo: https://llm-inference-performance-calculator-1066033662468.u...

Built with React, TypeScript, Tailwind, and Vite.

Disclaimer: I've calibrated the math models but they're not perfect. Feedback and PRs welcome.

How to Synthesize a House Loop

https://loopmaster.xyz/tutorials/how-to-synthesize-a-house-loop
1•stagas•1m ago•0 comments

Cold Restart Resilience: Why Distributed Systems Fail to Recover Cleanly

https://storiesfromtheedge.substack.com/p/cold-restart-resilience
1•subbukambala•2m ago•0 comments

Gaios

https://www.inkandswitch.com/newsletter/dispatch-014/
1•ferriswil•3m ago•0 comments

ML Assisted Human Name Parser

https://github.com/appeler/parsernaam
1•neehao•3m ago•0 comments

The Iceberg Index: Measuring Skills-Centered Exposure in the AI Economy [pdf]

https://iceberg.mit.edu/report.pdf
1•SquibblesRedux•6m ago•1 comments

Canadian data order risks blowing a hole in EU sovereignty

https://www.theregister.com/2025/11/27/canada_court_ovh/
1•HotGarbage•6m ago•0 comments

Microplastics disrupt gut microbiome and fermentation in farm animals

https://www.helsinki.fi/en/news/animals/microplastics-disrupt-gut-microbiome-and-fermentation-far...
4•robtherobber•8m ago•0 comments

Show HN: GimmeGimme – Finding gifts for hard-to-shop-for people

https://gimmegimme.app/
1•monatron•10m ago•0 comments

Is RCE Just Low Severity?

https://www.0xkato.xyz/Is-RCE-Really-Just-Low-Severity/
4•0xkato•10m ago•0 comments

Can AI Work with You?

https://iceberg.mit.edu/
1•taubek•11m ago•0 comments

How to use ReARM to check if Shai-Hulud 2.0 infiltrated your dependencies [video]

https://www.youtube.com/watch?v=bx9AuvF0zG4
1•taleodor•14m ago•0 comments

Training open source LLMs at ESE Kongress 2025

https://www.collabora.com/news-and-blog/news-and-events/training-open-source-llms-at-ese-kongress...
1•losgehts•15m ago•0 comments

Show HN: Free tool to request indexing from Google

https://www.fastseofix.com/tools/request-indexing
1•certibee•15m ago•0 comments

Swift regrets: a programming language design retrospective

https://belkadan.com/blog/tags/swift-regrets/
1•fanf2•23m ago•0 comments

Vendor Lock-In Lessons from My Internship: Is It Discussed in School?

https://medium.com/datastrato/if-youre-not-all-in-on-databricks-why-metadata-freedom-matters-35cc...
1•birdculture•25m ago•0 comments

Cloud-Init on Raspberry Pi OS

https://www.raspberrypi.com/news/cloud-init-on-raspberry-pi-os/
2•rcarmo•26m ago•1 comments

Fedora Sig Proposed to Improve Production Stability

https://www.phoronix.com/news/Fedora-SIG-Production-Stability
1•Bender•28m ago•0 comments

Replace your boss before they replace you

https://replaceyourboss.ai/
3•_tk_•28m ago•0 comments

Zlib-Ng 2.3.1 Released with More CPU Performance Optimizations

https://www.phoronix.com/news/zlib-ng-2.3.1-Released
1•Bender•29m ago•0 comments

US Patent Office issues new guidelines for AI-assisted inventions

https://www.reuters.com/legal/government/us-patent-office-issues-new-guidelines-ai-assisted-inven...
3•speckx•29m ago•0 comments

What's Your Incentive?

https://newsroom.ucla.edu/magazine/jana-gallus-economist-incentive-design
1•paulpauper•29m ago•0 comments

Adolescence lasts into 30s – new study shows four pivotal ages for your brain

https://www.bbc.com/news/articles/cgl6klez226o
2•abe94•34m ago•0 comments

James Watson tells the inconvenient truth: faces the consequences

https://pubmed.ncbi.nlm.nih.gov/18440722/
6•nullbyte808•37m ago•4 comments

Neural Annealing: Directing Psychedelic Trips Towards Healing (25/30)

https://psychotechnology.substack.com/p/neural-annealing-directing-psychedelic
1•paulpauper•38m ago•0 comments

"The Etiology and Treatment of Childhood", Smoller 1986 [pdf]

https://gwern.net/doc/psychiatry/adhd/1986-smoller.pdf
1•paulpauper•38m ago•0 comments

Command-Enter to Submit in AI Chats – never lose an incomplete prompt again

https://gist.github.com/intellectronica/e9302c17e2b01db9ca9ab17f0bcb8f16
2•intellectronica•40m ago•0 comments

What were Afghan "Zero Units" where the National Guard shooting suspect worked?

https://www.cbsnews.com/news/national-guard-shooting-suspect-afghanistan-zero-units/
2•clanky•41m ago•0 comments

Show HN: EasyDictate – Offline Voice Dictation for Windows

https://github.com/charleslukowski/easydictate
2•chux52•42m ago•0 comments

Skald: Open-Source Production RAG in Your Infrastructure

https://github.com/skaldlabs/skald
2•yakkomajuri•43m ago•0 comments

Cherry gives up German production and wants to sell core division

https://www.heise.de/en/news/Cherry-gives-up-German-production-and-wants-to-sell-core-division-11...
5•jsheard•43m ago•0 comments