We released a Cosmos-Reason2-2B W4A16 + FlashHead build optimized for Jetson devices. FlashHead is a drop-in replacement for the LM head that increases token-generation throughput without sacrificing reasoning quality, and it stacks on top of techniques like quantization.
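For context on what is being swapped out (this is background only, not FlashHead's internals, which aren't described here): a standard LM head is a single linear projection from the final hidden state to vocabulary logits, and it is the layer a drop-in replacement like FlashHead targets. A toy numpy sketch with made-up dimensions:

```python
import numpy as np

# Toy dimensions for illustration only; real models use e.g. 2048 x ~150k.
hidden_size, vocab_size = 8, 32
rng = np.random.default_rng(0)
W = rng.standard_normal((vocab_size, hidden_size)).astype(np.float32)

def lm_head(hidden_state: np.ndarray) -> np.ndarray:
    """Standard LM head: one linear projection from hidden state to logits."""
    return W @ hidden_state

h = rng.standard_normal(hidden_size).astype(np.float32)
logits = lm_head(h)                 # one logit per vocabulary entry
next_token = int(np.argmax(logits)) # greedy pick of the next token id
```

Because this projection runs once per generated token over the full vocabulary, it is a natural target for speeding up decoding.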
Try it with vllm-serve:
ssh <your-orin>
docker run --rm -it \
  --network host \
  --runtime=nvidia \
  --name=vllm-serve \
  -e HF_TOKEN=<YOUR_HUGGINGFACE_TOKEN_HERE> \
  embedl/vllm:latest-jetson-orin-flashhead \
  vllm serve "embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead" \
    --gpu-memory-utilization 0.75 \
    --trust-remote-code
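Once the server is up you can also hit its OpenAI-compatible chat endpoint from Python. A minimal stdlib-only sketch (the helper names build_chat_request/query are ours; model name and port are taken from the commands here):

```python
import json
import urllib.request

def build_chat_request(model: str, user_msg: str) -> dict:
    # OpenAI-compatible chat completions payload, as served by vLLM.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
    }

def query(base_url: str, payload: dict) -> dict:
    # POST the payload to the server's /v1/chat/completions endpoint.
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request(
    "embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead", "Hi"
)
# query("http://localhost:8000", payload)  # uncomment with the server running
```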
curl localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead","messages":[{"role":"user","content":"Hi"}]}'

Jetson video inference benchmark (TPS, batch size = 1, 12 frames, 1280×720):
Device      FP16    W4A16   FlashHead
Orin Nano   OOM     43.7    53.5
AGX Orin    39.6    74.4    92.2
AGX Thor    56.2    88.3    128.2

Model: https://huggingface.co/embedl/Cosmos-Reason2-2B-W4A16-Edge2-...
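FlashHead's gain on top of W4A16 alone falls out of the table directly (numbers copied from above):

```python
# TPS numbers from the benchmark table; FP16 OOMs on Orin Nano.
bench = {
    "Orin Nano": {"fp16": None, "w4a16": 43.7, "flashhead": 53.5},
    "AGX Orin":  {"fp16": 39.6, "w4a16": 74.4, "flashhead": 92.2},
    "AGX Thor":  {"fp16": 56.2, "w4a16": 88.3, "flashhead": 128.2},
}

# Speedup of FlashHead over the W4A16-only build on each device.
speedups = {
    dev: round(r["flashhead"] / r["w4a16"], 2) for dev, r in bench.items()
}
# roughly 1.22x on Orin Nano, 1.24x on AGX Orin, 1.45x on AGX Thor
```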
We’re Embedl, a research startup from Gothenburg, Sweden, and the team behind FlashHead. Let us know what other models you’d like to see it applied to.