frontpage.
Paperclip The human control plane for AI labor

https://paperclip.ing/
1•swazzy•44s ago•0 comments

After Deaths, Lawsuits Against A.I. Companies Test a New Strategy

https://www.nytimes.com/2026/05/12/technology/chatgpt-lawsuit-wrongful-death.html
1•1vuio0pswjnm7•48s ago•0 comments

The Origins of "Hello, World" [video]

https://www.youtube.com/watch?v=vLer3fRwwxE
1•saikatsg•1m ago•0 comments

AI isn't paying off in the way companies think

https://fortune.com/2026/05/11/ai-automation-layoffs-gartner-study-roi/
2•1vuio0pswjnm7•1m ago•0 comments

U.S. inflation jumps to 3.8% YoY (7.2% MoM, annualized)

https://www.bls.gov/news.release/cpi.nr0.htm
1•JumpCrisscross•4m ago•0 comments

AI in the rare disease news desert

https://www.thekabukipapers.org/articles/36
2•marstall•5m ago•1 comments

CC-Ledger: Claude Code Cost Tracker (Per-Session and Per-PR)

https://github.com/delta-hq/cc-ledger
1•tsv650•7m ago•0 comments

Parent sues Palo Alto Unified after son is accused of using AI on essay

https://www.paloaltoonline.com/palo-alto-schools/2026/05/11/parent-sues-palo-alto-school-district...
3•ua709•9m ago•2 comments

Carmack on starting a video game company today

https://twitter.com/ID_AA_Carmack/status/2054230690242212133
2•tosh•10m ago•0 comments

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL

https://research.nvidia.com/labs/nemotron/nemotron-cascade-2/
1•daureg•10m ago•1 comments

Trump says US FDA Commissioner Makary is out

https://www.reuters.com/business/healthcare-pharmaceuticals/fda-commissioner-makary-is-resigning-...
1•randycupertino•10m ago•1 comments

Did Ancient Civilizations Have Organized Crime?

https://talesoftimesforgotten.com/2026/03/16/did-ancient-civilizations-have-organized-crime/
2•dbrereton•10m ago•1 comments

The Main Path to Creative AI

https://danielmiessler.com/blog/the-main-path-to-truly-creative-ai
2•pplonski86•11m ago•0 comments

Redraw: 2d Primitives for Web and Native

https://wcandillon.github.io/redraw/
1•memalign•13m ago•0 comments

Modeling the US-Europe Paradox

https://paulkrugman.substack.com/p/modeling-the-us-europe-paradox-very
1•vquemener•15m ago•0 comments

Building a Local AI Workspace Inside VS Code

https://jsdevspace.substack.com/p/building-a-fully-local-ai-workspace
1•javatuts•16m ago•0 comments

In-Kernel Broadcast Optimization: Co-Designing Kernels for RecSys Inference

https://pytorch.org/blog/in-kernel-broadcast-optimization-co-designing-kernels-for-recsys-inference/
1•gmays•16m ago•0 comments

ChatGPT adoption broadened in early 2026

https://openai.com/signals/research/2026q1-update/
2•Brajeshwar•16m ago•0 comments

Company behind GLiNER model released open source model for running LLM guardrail

https://pioneer.ai/blog/gliguard-16x-faster-safety-moderation-with-a-small-language-model
12•neon_share1•17m ago•0 comments

Dependencies Are Someone Else's Attack Surface

https://quodeq.ai/blog/supply-chain-attack-surface/
3•VictorPurMar•18m ago•0 comments

AI Is Starting to Build Better AI (Recursive self-improvement)

https://spectrum.ieee.org/recursive-self-improvement
1•marojejian•19m ago•0 comments

AI overlay that stays invisible to screen recorders

1•unviewable•19m ago•0 comments

Are LLM Useful for Solo Founders

1•sinsudo•21m ago•2 comments

US budget watchdog estimates Golden Dome will cost $1.2T

https://www.reuters.com/business/aerospace-defense/us-budget-watchdog-estimates-golden-dome-will-...
2•OutOfHere•21m ago•0 comments

What Is RTSP Streaming and Why It Is Still Relevant in 2026

https://www.red5.net/blog/4-reasons-rtsp-streaming-is-still-relevant/
1•mondainx•21m ago•0 comments

Show HN: Mealplannr – turn YouTube chef videos into weekly meal plans

https://mealplannr.io
1•nullandvoid•22m ago•0 comments

GitLab Outage

https://status.gitlab.com/
2•Sparkle-san•22m ago•2 comments

New Project

https://nebulad-studios.gt.tc
1•dom_kom•22m ago•1 comments

Show HN: Awesome Stars- render github awesome list with live star/fork badges

https://awesome-stars.github.io
1•arashbehmand•24m ago•0 comments

A code (reformatting) conundrum in Python, and heuristics

https://utcc.utoronto.ca/~cks/space/blog/python/CodeFormattingBlockHeuristics
1•speckx•24m ago•0 comments

Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model

https://github.com/cactus-compute/needle
4•HenryNdubuaku•59m ago
Hey HN, Henry here from Cactus. We open-sourced Needle, a 26M parameter function-calling (tool use) model. It runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices.
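To put those throughput numbers in terms of per-request latency, here is a back-of-the-envelope sketch. Only the two throughput figures come from the post; the request size (300 prompt tokens, 30 output tokens) is a hypothetical example, and the estimate ignores model loading, tokenization and any other overhead.

```python
PREFILL_TOK_PER_S = 6000  # prefill throughput quoted in the post
DECODE_TOK_PER_S = 1200   # decode throughput quoted in the post


def estimate_latency_ms(prompt_tokens: int, output_tokens: int) -> float:
    """Rough latency: time to prefill the prompt plus time to decode the output."""
    prefill_s = prompt_tokens / PREFILL_TOK_PER_S
    decode_s = output_tokens / DECODE_TOK_PER_S
    return 1000.0 * (prefill_s + decode_s)


# Hypothetical request: 300 prompt tokens (query plus tool schemas),
# 30 output tokens (a short JSON call).
latency_ms = estimate_latency_ms(300, 30)
print(f"{latency_ms:.0f} ms")  # 75 ms
```

Under these assumptions a single-shot call lands well under 100 ms, split roughly two-to-one between prefill and decode.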

We have long been frustrated by how little effort goes into building agentic models that run on budget phones, so we investigated and arrived at one observation: agentic experiences are built on tool calling, and massive models are overkill for it. Tool calling is fundamentally retrieval-and-assembly (match query to tool name, extract argument values, emit JSON), not reasoning. Cross-attention is the right primitive for this, and FFN parameters are wasted at this scale.
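To make the three steps concrete (match, extract, emit), here is a deliberately non-neural toy in Python. It is not how Needle works, since Needle is a learned model. The tool registry, keywords and regex patterns are all hypothetical and exist only to illustrate the shape of the task.

```python
import json
import re

# Hypothetical tool registry; names, keywords and patterns are illustrative only.
TOOLS = {
    "set_timer": {
        "keywords": {"timer", "countdown", "remind"},
        "args": {"minutes": r"(\d+)\s*min"},
    },
    "send_message": {
        "keywords": {"message", "text", "tell"},
        "args": {"recipient": r"\bto (\w+)", "body": r"saying (.+)$"},
    },
}


def call_tool(query: str) -> str:
    words = set(re.findall(r"[a-z]+", query.lower()))
    # 1. Retrieval: pick the tool whose keywords overlap the query the most.
    name = max(TOOLS, key=lambda t: len(TOOLS[t]["keywords"] & words))
    # 2. Extraction: pull each argument value out of the query text.
    arguments = {}
    for arg, pattern in TOOLS[name]["args"].items():
        match = re.search(pattern, query, flags=re.IGNORECASE)
        if match:
            arguments[arg] = match.group(1)
    # 3. Assembly: emit the call as JSON.
    return json.dumps({"name": name, "arguments": arguments})


print(call_tool("Set a timer for 10 minutes"))
# {"name": "set_timer", "arguments": {"minutes": "10"}}
```

Nothing here requires multi-step reasoning, which is the point of the claim above: the answer is assembled from pieces already present in the query and the tool list.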

Simple Attention Networks: the entire model is just attention and gating, no MLPs anywhere. Needle is an experimental run at single-shot function calling on consumer devices (phones, watches, glasses...).
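For readers who want a picture of "attention and gating, no MLPs", here is a minimal pure-Python sketch of one such block. It is an assumption-heavy illustration, not Needle's actual layer: the single head, the sigmoid gate computed from the input, the residual connection and the missing normalization are all choices made here for brevity. The linked writeup is the authority on the real design.

```python
import math
import random


def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gated_attention_block(x, memory, w):
    """Residual cross-attention with a sigmoid gate and no feed-forward sublayer."""
    d = len(x[0])
    q, k, v = matmul(x, w["q"]), matmul(memory, w["k"]), matmul(memory, w["v"])
    scores = matmul(q, [list(col) for col in zip(*k)])  # q @ k^T
    attn = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    mixed = matmul(matmul(attn, v), w["o"])
    # Assumed gate: elementwise sigmoid of a linear projection of the input.
    gate = [[1 / (1 + math.exp(-g)) for g in row] for row in matmul(x, w["g"])]
    return [[xi + gi * mi for xi, gi, mi in zip(xr, gr, mr)]
            for xr, gr, mr in zip(x, gate, mixed)]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d = 4
w = {name: rand_matrix(d, d, rng) for name in ("q", "k", "v", "o", "g")}
x = rand_matrix(2, d, rng)       # 2 query-side tokens
memory = rand_matrix(3, d, rng)  # 3 tokens being attended to (e.g. tool descriptions)
out = gated_attention_block(x, memory, w)
print(len(out), len(out[0]))  # 2 4
```

The only learned matrices in the block are the attention projections and the gate. There is no expand-and-contract MLP, which is where a standard transformer layer keeps most of its parameters.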

Training:
- Pretrained on 200B tokens across 16 TPU v6e (27 hours)
- Post-trained on 2B tokens of synthesized function-calling data (45 minutes)
- Dataset synthesized via Gemini with 15 tool categories (timers, messaging, navigation, smart home, etc.)
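The training figures above imply some rough throughput numbers, worked out below. This assumes the quoted durations are wall-clock time at steady throughput and that all 16 chips were used for the whole pretraining run. The post does not say how many chips post-training used, so that figure is left as an aggregate.

```python
PARAMS = 26e6            # model size from the post
PRETRAIN_TOKENS = 200e9  # pretraining tokens
PRETRAIN_HOURS = 27
NUM_CHIPS = 16           # TPU v6e chips used for pretraining
POSTTRAIN_TOKENS = 2e9   # post-training tokens
POSTTRAIN_MINUTES = 45

pretrain_tps = PRETRAIN_TOKENS / (PRETRAIN_HOURS * 3600)
per_chip_tps = pretrain_tps / NUM_CHIPS
chip_hours = NUM_CHIPS * PRETRAIN_HOURS
posttrain_tps = POSTTRAIN_TOKENS / (POSTTRAIN_MINUTES * 60)
tokens_per_param = PRETRAIN_TOKENS / PARAMS

print(f"pretraining:   {pretrain_tps / 1e6:.2f}M tok/s total")    # 2.06M
print(f"               {per_chip_tps / 1e3:.1f}k tok/s per chip")  # 128.6k
print(f"               {chip_hours} chip-hours")                   # 432
print(f"post-training: {posttrain_tps / 1e3:.1f}k tok/s total")    # 740.7k
print(f"tokens per parameter: {tokens_per_param:.0f}")             # 7692
```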

You can test it right now and finetune on your Mac/PC: https://github.com/cactus-compute/needle

The full writeup on the architecture is here: https://github.com/cactus-compute/needle/blob/main/docs/simp...

We found that the "no FFN" result generalizes beyond function calling to any task where the model has access to external structured knowledge (retrieval-augmented generation, tool use). The model doesn't need to memorize facts in FFN weights if the facts are provided in the input. Experimental results to be published.

While it beats FunctionGemma-270M, Qwen-0.6B, Granite-350M, LFM2.5-350M on single-shot function calling, those models have more scope/capacity and excel in conversational settings. We encourage you to test on your own tools via the playground and finetune accordingly.

This is part of our broader work on Cactus (https://github.com/cactus-compute/cactus), an inference engine built from scratch for mobile, wearables and custom hardware. We wrote about Cactus here previously: https://news.ycombinator.com/item?id=44524544

Everything is MIT licensed.
Weights: https://huggingface.co/Cactus-Compute/needle
GitHub: https://github.com/cactus-compute/needle