Launch HN: General Instinct (YC P26) – Frontier models on edge devices

9•guanming0717•1h ago
Hey HN, Guanming and Bill here from General Instinct (https://general-instinct.com/).

After years of working in robotics, we kept running into the same problem: the best models never fit the hardware we actually had available.

The models that performed best were usually designed around datacenter assumptions: large GPUs, lots of memory bandwidth, and reliable network access. But most physical systems have the opposite constraints.

That led us down the path of figuring out how much of a frontier model could be preserved while still making it practical to run on edge hardware.

As part of that work, we recently open-sourced InstinctRazor (https://github.com/General-Instinct/InstinctRazor).

One result we're excited about is compressing Qwen3.5-122B-A10B, a roughly 245 GB BF16 MoE model, into a 48 GiB GGUF. The resulting model is actually smaller than Gemma-4-26B-A4B while outperforming it on benchmarks like MMLU-Pro and GPQA-D.

We preserve the parts that are always active (router, norms, Gated-DeltaNet/SSM layers, vision pathway, etc.) and quantize the routed experts much more aggressively. We then use on-policy distillation to recover capability lost during quantization.
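As a rough sanity check on those sizes, the average bits per weight can be back-computed from the two figures quoted above. This is only a sketch under stated assumptions: that the 245 GB figure is decimal gigabytes at 2 bytes per BF16 parameter, and that the whole 48 GiB file is weights (file metadata is ignored). The function names are illustrative and not part of InstinctRazor.

```python
# Back-of-the-envelope estimate; every assumption is stated in a comment.
BF16_BYTES_PER_PARAM = 2  # BF16 stores each parameter in 16 bits


def approx_param_count(bf16_size_gb: float) -> float:
    """Parameters implied by a BF16 checkpoint size, assuming decimal GB."""
    return bf16_size_gb * 1e9 / BF16_BYTES_PER_PARAM


def average_bits_per_weight(file_size_gib: float, params: float) -> float:
    """Average bits per parameter, treating the whole file as weights."""
    return file_size_gib * 2**30 * 8 / params


params = approx_param_count(245)           # figure quoted in the post
bpw = average_bits_per_weight(48, params)  # figure quoted in the post
print(f"~{params / 1e9:.1f}B params, ~{bpw:.2f} bits/weight on average")
```

Under these assumptions the average lands a little under 3.4 bits per weight. Because the always-active parts are preserved, the routed experts would have to sit below that average.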

The model can also run in a "small GPU" configuration where experts are streamed from system RAM. With an 8k context window, peak VRAM usage is around 7.6–8 GB.
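To illustrate the general idea of keeping only a few experts resident on the GPU while the rest wait in system RAM, here is a minimal sketch of a least-recently-used expert cache. It is not how InstinctRazor or any particular runtime implements streaming (the post does not say). `ExpertCache`, `load_from_ram`, and the string "weights" are hypothetical stand-ins chosen for the example.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class ExpertCache:
    """Hypothetical LRU cache holding at most `capacity` experts 'in VRAM'."""

    def __init__(self, capacity: int, load_from_ram: Callable[[Hashable], object]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.load_from_ram = load_from_ram  # stand-in for a host-to-device copy
        self._resident: OrderedDict = OrderedDict()
        self.misses = 0

    def get(self, expert_id: Hashable) -> object:
        if expert_id in self._resident:
            # Hit: mark this expert as most recently used.
            self._resident.move_to_end(expert_id)
            return self._resident[expert_id]
        # Miss: fetch from system RAM, then evict the least recently used
        # expert if the cache is over capacity.
        self.misses += 1
        weights = self.load_from_ram(expert_id)
        self._resident[expert_id] = weights
        if len(self._resident) > self.capacity:
            self._resident.popitem(last=False)
        return weights


# Usage: eight experts live in "system RAM"; only two fit in the cache.
ram = {i: f"expert-{i}-weights" for i in range(8)}
cache = ExpertCache(capacity=2, load_from_ram=ram.__getitem__)
for expert_id in [0, 1, 0, 2, 1]:
    cache.get(expert_id)
print(cache.misses, list(cache._resident))
```

In a scheme like this, the cache capacity trades peak VRAM against how often experts have to be copied across the bus.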

If you're interested in the technical details, we wrote up the approach here: https://general-instinct.com/blog/frontier-moe-sub-4-bit

We're especially interested in hearing from people deploying models onto robots or other edge devices. What models are you trying to run locally today? What has been the biggest bottleneck in getting them into production?

Comments

VikRubenfeld•44m ago
You've likely heard about this - he'd probably like to talk to you and might potentially give you some good PR.

https://www.youtube.com/watch?v=rAzT5lcezPs&t=467s

guanming0717•36m ago
Thanks for sharing! I'd love to chat with him. Would you be open to introducing us? :)

smokel•6m ago
For those too lazy to watch someone talk on video for ages to make a point:

The link is to a video by PewDiePie, a famous YouTuber, who uses a local LLM to parse his email and save time. The setup includes an autoreply system and notifies him about urgent matters.
