Launch HN: Cactus (YC S25) – AI inference on smartphones

https://github.com/cactus-compute/cactus
123•HenryNdubuaku•4mo ago
Hey HN, Henry & Roman here, we are building Cactus (https://cactuscompute.com/), an AI inference engine specifically designed for phones.

We're seeing a major push towards on-device AI, and for good reason: on-device AI decreases latency from >1sec to <100ms, guarantees privacy by default, works offline, and doesn't rack up a massive API bill at scale.

Also, tools and agentic designs make small models really good beyond benchmarks. This has been corroborated by other papers like https://arxiv.org/abs/2506.02153, and we see model companies like DeepMind aggressively going into smaller models with Gemma3 270m and 308m. We found Qwen3 600m to be great at tool calls for instance.

Some frameworks already try to solve this but in my previous job, they struggled in production compared to research and playgrounds:

- They optimise for modern devices but 70% of phones today are low-mid budget.

- Bloated app bundle sizes and battery drain are serious concerns for users.

- Phone GPU battery drain is unacceptable, NPUs are preferred, but few phones have those for now.

- Some are platform-specific, requiring different models and workflows for different operating systems.

At Cactus, we’ve written kernels and an inference engine from the ground up for running AI locally on any phone.

Cactus is designed for mobile devices and their constraints. Every design choice, including energy efficiency, accelerator support, quantization levels, supported models, weight format, and context management, was determined by this. We also provide minimalist SDKs for app developers to build agentic workflows in 2-5 lines of code.

We made a Show HN post when we started the project to get the community's thoughts (https://news.ycombinator.com/item?id=44524544). Based on your feedback, we built Cactus bottom-up to solve those problems, and are launching the Cactus Kernels, Cactus Graph and Cactus Engine, all designed for phones and tiny devices.

CPU benchmarks for Qwen3-600m-INT8:

- 16-20 toks/sec on Pixel 6a / Galaxy S21 / iPhone 11 Pro

- 50-70 toks/sec on Pixel 9 / Galaxy S25 / iPhone 16.

- Time-to-first-token is as low as 50ms depending on prompt size.

On NPUs, we see Qwen3-4B-INT4 run at 21 toks/sec.
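
For a rough sense of what these figures mean end to end, response time can be approximated as time-to-first-token plus the remaining tokens divided by decode throughput. The sketch below is only an estimate under stated assumptions: the function name and the 200-token reply length are illustrative, and the 50ms TTFT is the best case quoted above, so real prompts may be slower.

```python
def generation_time_s(n_tokens: int, ttft_ms: float, toks_per_sec: float) -> float:
    """Estimate wall-clock seconds to produce n_tokens.

    The first token arrives after ttft_ms; the remaining n_tokens - 1
    tokens are assumed to stream at a steady toks_per_sec.
    """
    if n_tokens < 1:
        raise ValueError("n_tokens must be at least 1")
    if toks_per_sec <= 0:
        raise ValueError("toks_per_sec must be positive")
    return ttft_ms / 1000.0 + (n_tokens - 1) / toks_per_sec


if __name__ == "__main__":
    reply_tokens = 200  # assumed reply length, not a figure from the post
    for label, rate in [("older phones, low end", 16), ("older phones, high end", 20),
                        ("newer phones, low end", 50), ("newer phones, high end", 70)]:
        seconds = generation_time_s(reply_tokens, ttft_ms=50, toks_per_sec=rate)
        print(f"{label}: {seconds:.1f}s for {reply_tokens} tokens")
```

Under these assumptions, a 200-token reply takes roughly 10 seconds at 20 toks/sec and about 3 seconds at 70 toks/sec.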

We are open-source (https://github.com/cactus-compute/cactus). Cactus is free for hobbyists and personal projects, with a paid license required for commercial use.

We have a demo app on the App Store at https://apps.apple.com/gb/app/cactus-chat/id6744444212 and on Google Play at https://play.google.com/store/apps/details?id=com.rshemetsub....

In addition, there are numerous apps using Cactus in production, including AnythingLLM (https://anythingllm.com/mobile) and KinAI (https://mykin.ai/). Collectively they run over 500k weekly inference tasks in production.

While Cactus can be used for all Apple devices including Macbooks due to their design, for computers/AMD/Intel/Nvidia generally, please use HuggingFace, Llama.cpp, Ollama, vLLM, MLX. They're built for those, support x86, and are all great!

Thanks again, please share your thoughts, we’re keen to understand your views.

Comments

VladVladikoff•4mo ago
How does this startup plan to make money?
HenryNdubuaku•4mo ago
Cactus is free for hobbyists and personal projects, but we charge a tiny fee for commercial use which comes with more features that are relevant for enterprises.
binary132•4mo ago
I couldn’t find a pricing page on your site. How tiny is tiny?
HenryNdubuaku•4mo ago
It’s custom for now as we are calibrating to see what works for everyone, wanna get in touch?
binary132•4mo ago
I’m not in the market, just curious.
mritchie712•4mo ago
how many GB does an app packaged with Qwen3 600m + Cactus take up?

e.g. if I built a basic LLM chat app with Qwen3 600m + Cactus, what's the total app size?

HenryNdubuaku•4mo ago
400mb if you ship the model as an asset. However, you can also build the app to download the model post-install, Cactus SDKs support this, as well as agentic workflows you’d need.
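
A download-after-install flow of the kind described here can be sketched generically as "use the cached file if it is intact, otherwise fetch and verify it". This is not the Cactus SDK API: `ensure_model`, the `fetch` callable, and the checksum parameter are hypothetical names for illustration, and a real mobile app would stream to disk rather than hold the weights in memory.

```python
import hashlib
from pathlib import Path
from typing import Callable


def ensure_model(path: Path, fetch: Callable[[], bytes], expected_sha256: str) -> Path:
    """Return a path to verified model weights, downloading them on first use.

    `fetch` is a caller-supplied function that returns the raw weight bytes
    (hypothetical; e.g. an HTTP download from wherever the model is hosted).
    """
    # Reuse the cached copy only if its checksum still matches.
    if path.exists() and hashlib.sha256(path.read_bytes()).hexdigest() == expected_sha256:
        return path

    data = fetch()
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError("downloaded model failed checksum verification")

    # Write to a temporary file, then rename, so a crash mid-write
    # never leaves a truncated file at the final path.
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(path.suffix + ".part")
    tmp.write_bytes(data)
    tmp.replace(path)
    return path
```

Shipping the app without the weights keeps the store bundle small, at the cost of a first-run download that has to handle failure and corruption.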
cco•4mo ago
I've been using Cactus for a few months, great product!

Makes it really easy to plug and play different models on my phone.

If anybody is curious what a Pixel 9 Pro is capable of:

Tokens: 277, TTFT: 1609ms, 9 tok/sec

qwen2.5 1.5b instruct q6_k

Sure, here's a simple implementation of the Bubble Sort algorithm in Python:

    def bubble_sort(arr):
        n = len(arr)
        for i in range(n):
            # Flag to detect any swap in current pass
            swapped = False
            for j in range(0, n-i-1):
                # Swap if the element found is greater than the next element
                if arr[j] > arr[j+1]:
                    arr[j], arr[j+1] = arr[j+1], arr[j]
                    swapped = True
            # If no swap occurs in the inner loop, the array is already sorted
            if not swapped:
                break

    # Example usage:
    arr = [64, 34, 25, 12, 22, 11, 90]
    bubble_sort(arr)
    print("Sorted array is:", arr)

This function sorts the array in ascending order using the Bubble Sort algorithm. The outer loop runs n times, where n is the length of the array. The inner loop runs through the array, comparing adjacent elements and swapping them if they are in the wrong order. The swapped flag is used to detect if any elements were swapped in the current pass, which would indicate that the array is already sorted and can be exited early.

HenryNdubuaku•4mo ago
Thanks for the kind words, we’ve improved performance now actually, follow the instructions on the core repo.

Same model should run 3x faster on the same phone.

These improvements are still being pushed to the SDKs though.

cco•4mo ago
Wow! 3x is huge.

I've had great experiences with gpt-oss20b on my laptop, a genuinely useful local model.

3x probably doesn't get my Pixel 9 Pro to being able to run 20b models, but it's getting close!

HenryNdubuaku•4mo ago
Although GPT OSS 20B has 1.7B activated parameters which will be fast, 20B weights is a lot for developers to bundle or consumers to download. That’s the actual problem.
dcreater•4mo ago
The first picture on the android app store page shows Claude Haiku as the model
HenryNdubuaku•4mo ago
Thanks for noticing! The app is just a demo for the framework, so devs can compare the open-source models against frontier cloud models and make a decision. We removed the comparison now, so those screenshots indeed have to be updated.
dcreater•4mo ago
Does it incorporate web search tool?
HenryNdubuaku•4mo ago
It can incorporate any tool you want at all. This company’s app uses exactly that feature; you can download it and get a sense of it before digging in. https://anythingllm.com/mobile
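
As a generic illustration of how a tool such as web search is usually wired into a local model, the app registers named functions and routes the model's structured tool call to the matching one. This is a sketch of the general pattern, not the Cactus SDK or AnythingLLM code: the registry, the `web_search` stub, and the JSON call shape are all assumptions.

```python
import json
from typing import Any, Callable, Dict

# Registry of tools the model is allowed to call (hypothetical structure).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so it can be called by the model."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def web_search(query: str) -> str:
    # Stub only: a real app would call a search API here.
    return f"(stub) results for {query!r}"


def dispatch(tool_call_json: str) -> Any:
    """Run the tool named in a model-emitted call such as
    {"name": "web_search", "arguments": {"query": "..."}}."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))
```

The result returned by `dispatch` would then be fed back to the model as context for its next turn.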
pzo•4mo ago
FWIW, they changed the license 2 weeks ago from Apache 2.0 to non-commercial. I understand they need to pay the bills, but they lost trust with such a move. I will stick with react-native-ai [0], which is an extension of the Vercel AI SDK but also with local inference on edge devices.

[0] react-native-ai.dev

HenryNdubuaku•4mo ago
Understandable, though to explain, Cactus is still free for personal & small projects if you fall into that category. We’re early and would definitely consider your concerns on license in our next steps, thanks.
mdaniel•4mo ago
For fear of having dang show up and scold me, I'll just add the factual statement that I will never ever believe any open source claim in any Launch HN ever. I can now save myself the trouble of checking, because I can be certain it's untrue

I already knew to avoid "please share your thoughts," although I guess I am kind of violating that one by even commenting

HenryNdubuaku•4mo ago
It’s absolutely fine to share your thoughts, that’s the point of this post, we want to understand where people’s heads are at, it’s what determines our next decisions. What do you really think? I’m genuinely asking so I don’t think mods will react.
trollbridge•4mo ago
Here’s an example of what I want to do: ship our application entirely open source/free (AGPL3), but with options for interested parties who want to pay us for support/consulting to do so. Likewise, we want interested parties who want to build their own proprietary app on top of our stack to be able to do so.

Mixing in a “you have to pay if you’re a corporation” licence makes this difficult if not impossible, particularly if we wanted deep integration with eg Cactus. We don’t want to police a “corporation” who wants to use our open source software.

HenryNdubuaku•4mo ago
Thanks for pointing this out, another factor for us to figure out. We waive the license for such cases, wanna get in touch? I don’t think your consumers have to worry about the license.
theturtletalks•4mo ago
I agree, I've seen so many products start open source to gain traction, get into YC, and then either go closed source or change the license. That's a bait and switch and I appreciate the comment pointing it out.

I downloaded Cactus a couple months back because I saw a comment, but bait and switch like this makes me want to look for an actual open source solution.

HenryNdubuaku•4mo ago
The license change doesn’t affect you based on your explanation actually, the licence has been updated with clearer words. We really appreciate you as a user, please share any more feedback you have, thanks.
theturtletalks•4mo ago
I don’t appreciate you dismissing my claim. When I installed Cactus chat months ago, the company was claiming that Cactus chat would allow users to connect to other apps on their device and allow them to be controlled by AI.

Your license change goes against that. You say it’s free for personal use but how many times do people create something for personal use and monetize it later? What if I use Cactus chat to control a commercial app? Does that make Cactus chat use “commercial”?

observationist•4mo ago
Open source for the PR, then switching to non-open licensing is a cowardly, bullshit move.

https://github.com/cactus-compute/cactus/commit/b1b5650d1132...

Use open source and stick with it, or don't touch it at all, and tell any VC shitheels saying otherwise to pound sand.

If your business is so fragile or unoriginal that it can't survive being open source, then it will fail anyway. If you make it open source, embrace the ethos and build community, then your product or service will be stronger for it. If the big players clone your work, you get instant underdog credibility and notoriety.

HenryNdubuaku•4mo ago
Thanks for sharing your thoughts. Honestly, I’d be annoyed too and it might sound like an excuse, but our circumstance was quite unique, it was a difficult decision at that time being an open-source contributor myself.

It’s still free for the community, just that corporations need a license. Should we make this clearer in the license?

typpilol•4mo ago
Yes.

Just say that in the license.

HenryNdubuaku•4mo ago
Done, thanks, let us know anything else.
typpilol•4mo ago
Nice job on taking feedback.
pzo•4mo ago
They updated it, but not to what they write here, which sugarcoats it as if they were only trying to limit abuse by corporations. It’s not that the paid license is for corporations only; it’s still non-commercial for everyone, including the community.
ls-a•4mo ago
ChatGPT could spit out the same optimizations they're doing in a few minutes. They're very generic optimizations that anyone who wants to work on mobile should do. Looks like they're planning to troll the competition with lawsuits using this license.
nextworddev•4mo ago
Will this drain my battery
HenryNdubuaku•4mo ago
This was one of the issues we set out to solve, so not as much as you’d expect.
apwell23•4mo ago
curious. what are the usecases for <100ms latency ?
HenryNdubuaku•4mo ago
Real-time video and audio inference.
cientifico•4mo ago
Can you clarify the following sentence:

> We are open-source (https://github.com/cactus-compute/cactus). Cactus is free for hobbyists and personal projects, with a paid license required for commercial use.

If it is open-source, one is free to distribute even for commercial use by definition. Which one is correct and what's your business model?

kvakkefly•4mo ago
Why do you believe open source means free to use and distribute commercially?
Cheer2171•4mo ago
Are you joking or just new? This is a foundational, bedrock principle of open source.

https://opensource.org/faq#commercial

cientifico•4mo ago
Because that’s literally the definition of open source:

> Open-source software is software released under a license where the copyright holder grants users the rights to use, study, change, and distribute the software and its source code, for any purpose.

That’s the first result you get on Google—and it’s exactly why so many companies relicensed their projects (Redis, HashiCorp, Elasticsearch, MongoDB…).

If it’s open source, you can sell it, host it, or give it away for free. The only difference is which obligations the license attaches:

GPL → you must keep the license.

AGPL → you must keep it and extend it to hosted services.

BSD/MIT → do almost whatever you want.

But the core right is always the same: distribute, host, and sell. Courts have even confirmed this is the accepted definition of “open source.”

ApolloRising•4mo ago
Would you consider adding a mode where it could go online if the user instructed it to?
HenryNdubuaku•4mo ago
You can add a web_search tool, check out what these guys did with Cactus: https://anythingllm.com/mobile
ApolloRising•4mo ago
Thanks I will take a look.
asdfrgtfhgnjn•4mo ago
Wow made an account just for this! I was using cactus for a paid app i soft launched recently. Does that mean that now i cannot update this dependency? What is your pricing? I do not see that anywhere. If i did not notice this and pulled the updated version, would i be liable to be sued? Also, I implemented cactus on the good faith assumption that i was going to receive the updates in your roadmap, as a proper apache project. I must admit this is quite the move guys
HenryNdubuaku•4mo ago
Apologies for this, but you have nothing to worry about, no one is suing you. We are experimenting with the license and monetisation for corporations not indie developers. Please keep using Cactus the way you want, take this response as explicit permission while we go away and chew on your feedback.
giveita•4mo ago
Tried the Android app but the model download froze. Are you using the same docker-style repositories as Ollama? Because they suck. If you do, I suggest using your own S3 instead.
HenryNdubuaku•4mo ago
We host on HuggingFace, were you able to get it to work eventually?
joseph4521•4mo ago
AI Dungeon should contact you to make an offline mode again.
HenryNdubuaku•4mo ago
Ok, looking forward to it!
ks2048•4mo ago
This paragraph is a bit confusing:

> While Cactus can be used for all Apple devices including Macbooks due to their design, for computers/AMD/Intel/Nvidia generally, please use HuggingFace, Llama.cpp, Ollama, vLLM, MLX. They're built for those, support x86, and are all great!

It reads like you're saying for all Apple devices (which would include iOS), use these other things.(?) For iOS, are you trying to beat performance of other options? If so, it would be helpful to include comparison benchmarks.

HenryNdubuaku•4mo ago
We are focused on phones and we did add some benchmarks and will add more. However, anyone can see performance for themselves with the repo directly.
ivape•4mo ago
Are you saying this would be more performant than Apple’s on device LLM/inferencing?
elpakal•4mo ago
Came here to ask about how they view Apple Foundation Models as a threat.

> guarantees privacy by default, works offline, and doesn't rack up a massive API bill at scale.

I’ve been really interested in on-device ML for most of my career, and now I wonder how valuable these benefits really are. LLM vendor APIs are pretty performant these days, security is security, and with an on-device model you have to provide updates every time a new model comes out.

HenryNdubuaku•4mo ago
You don’t have to bundle the weights as an asset, you can do over-the-air updates, new weights are simply downloaded.
elpakal•4mo ago
Neat, but not really addressing my point. My point is that you still need to roll out changes and LLM APIs just work.
HenryNdubuaku•4mo ago
Valid question. Our perspective is that there can be multiple players, there are 7B devices to power, everyone will get a slice.
MrDrMcCoy•4mo ago
Question: can this utilize multiple forms of compute at once? Many phones have both GPUs that are capable of doing compute as well as NPUs, and that number will only increase. I'm sure it would be challenging, but that's a lot of performance to leave on the table if it can't do so already.

I am very curious what could be done with your impressive optimization on an rk3588, since it has pretty decent bits in all 3 categories, and am now seriously considering a Radxa Orion to play with this on :)

One more if you have a moment: will this be limited to text generation, or will it have audio and image capabilities as well? Would be neat to enable not only image generation, but also explore voice recognition, translation, computer vision, as well as image editing and enhancement features in mobile apps beyond what the big players deign to give us :)

rshemet•4mo ago
Yes! Cactus is optimized for mobile CPU inference and we're finishing internal testing of hybrid kernels that use the NPU, as well as other chips.

We don't advise using GPUs on smartphones, since they're very energy-inefficient. Mobile GPU inference is actually the main driver behind the stereotype that "mobile inference drains your battery and heats up your phone".

Wrt your last question – the short answer is yes, we'll have multimodal support. We currently support voice transcription and image understanding. We'll be expanding these capabilities to add more models, voice synthesis, and much more.

MrDrMcCoy•4mo ago
Very exciting, thanks!
nicktay•4mo ago
I built apps using Flutter and this project seems to make it possible to use models directly in app instead of cloud APIs. Curious about the commercial license here. How is the trade off between pricing and performance?
rshemet•4mo ago
indeed, this is exactly the goal! The license grants rights to commercial use, unlocks additional hardware acceleration, includes cloud telemetry, and offers significant savings over using cloud APIs.

In our deployments, we've seen open source models rival and even outperform lower-tier cloud counterparts. Happy to share some benchmarks if you like.

Our pricing is on a per-monthly-active-device basis, regardless of utilization. For voice-agent workflows, you typically hit savings as soon as you process over ≈2min of daily inference.

potato-peeler•4mo ago
Can it be fine tuned for a specific task?
HenryNdubuaku•4mo ago
Yes, you can fine-tune a model for any task, what do you have in mind?
negro_jj•4mo ago
Up date