Implement Flash Attention Back End in SGLang – Basics and KV Cache

https://hebiao064.github.io/fa3-attn-backend-basic
36•latchkey•9mo ago

Comments

behnamoh•9mo ago
Is SGLang an LLM engine, or does it use vLLM/llama.cpp under the hood? And while we're at it, has anyone done a comparison of LLM engines? I've also heard of Mistral.rs, MLC LLM, and obviously the HF transformers library and its ktransformers alternative.
imtringued•9mo ago
SGLang is a competitor to vLLM.
zacksiri•9mo ago
Here is a list of inference engines I've tried:

- SGLang

- vLLM

- TGI (Hugging Face's)

- llama.cpp

- infinity (great for embedding / reranking models, not for LLMs)

My personal feeling is that SGLang and vLLM have issues that make me not want to use them. Sure, they're fast, but there are reliability issues, and you need lots of flags and tinkering to make them work. There is also the problem of 100% CPU usage on idle, which the core contributors say is 'normal' and 'expected'. You can search the respective repositories on this topic if you don't believe me. People have even submitted PRs to solve these issues, which have not been merged. The mindset behind this software is just to get it to 'work', without much attention to polish and ease of use.

TGI, on the other hand, is in a class of its own. You can just feel the polish that went into it. Things tend to 'just work'. It's the only engine I tried that was able to run a model I wanted in a single try. Then I added the flags to make it fit my hardware (like sharding and max prefill tokens). TGI uses flashinfer by default, which is SOTA when it comes to flash attention backends.

llama.cpp has the widest model support; however, it does not perform as well as TGI / vLLM / SGLang. So if you can accept that you are losing performance (about 30% slower, based on my testing), it's great for testing and development purposes, but if you want to do production-grade stuff I would recommend TGI.

behnamoh•9mo ago
Thanks for sharing your experience. I liked the documentation of SGLang, especially when it comes to structured output: https://docs.sglang.ai/backend/structured_outputs.html

I couldn't find info on TGI constrained generation, though.

ikeashark•9mo ago
SGLang is a fork of vLLM