Show HN: Shoggoth Mini – A soft tentacle robot powered by GPT-4o and RL

https://www.matthieulc.com/posts/shoggoth-mini
350•cataPhil•9h ago•71 comments

Show HN: Beyond Z²+C, Plot Any Fractal

https://www.juliascope.com/
66•akunzler•6h ago•21 comments

Show HN: I built this to talk Danish to my girlfriend – works with any language

https://menerdu.vercel.app/
167•lil_csom•2d ago•97 comments

Show HN: We made our own inference engine for Apple Silicon

https://github.com/trymirai/uzu
147•darkolorin•13h ago•42 comments

Show HN: I built a tool to sync localStorage between devices

https://htmlsync.io
4•meistertigran•2h ago•4 comments

Show HN: VS Code extension to edit the filesystem like a text buffer

https://github.com/ahrm/voil
59•hexomancer•2d ago•45 comments

Show HN: The Card Caddie, free tool to optimize credit card points

https://www.thecardcaddie.com/
4•hg30•5h ago•4 comments

Show HN: Encode Base64

https://encodebase64.io/
3•artiomyak•8h ago•2 comments

Show HN: Bedrock – An 8-bit computing system for running programs anywhere

https://benbridle.com/projects/bedrock.html
210•benbridle•5d ago•57 comments

Show HN: The HTML Maze – Escape an eerie labyrinth built with HTML pages

https://htmlmaze.com/
52•kyrylo•1d ago•14 comments

Show HN: Ten years of running every day, visualized

https://nodaysoff.run
913•friggeri•5d ago•474 comments

Show HN: ArchGW – An intelligent edge and service proxy for agents

https://github.com/katanemo/archgw/
113•honorable_coder•3d ago•15 comments

Show HN: Cogency – Cognitive Architecture for AI Agents

https://github.com/iteebz/cogency
19•cogencyai•3d ago•4 comments

Show HN: Pagy 2.0, a free drag-and-drop website builder

https://pagy.co
5•hernansartorio•7h ago•4 comments

Show HN: Mochi Invaders – Like Space Invaders but for Practicing Japanese Kana

https://xenodium.com/mochi-invaders-now-on-the-app-store
3•xenodium•7h ago•0 comments

Show HN: FFmpeg in plain English – LLM-assisted FFmpeg in the browser

https://vidmix.app/ffmpeg-in-plain-english/
170•bjano•5d ago•46 comments

Show HN: BotBudget – AI Agent Cost Calculator

https://botbudget.com/calculator
3•jakejohnson•8h ago•0 comments

Show HN: Refine – A Local Alternative to Grammarly

https://refine.sh
394•runjuu•1d ago•204 comments

Show HN: RooAGI's Roo-VectorDB: A New PostgreSQL Extension for Vector Search

https://github.com/RooAGI/Roo-VectorDB
3•rooagi•9h ago•0 comments

Show HN: Minesweeper game I built to be real-time Multiplayer

https://www.minesweeperpro.com/?v=2.1
17•bluelegacy•17h ago•5 comments

Show HN: A Raycast-compatible launcher for Linux

https://github.com/ByteAtATime/raycast-linux
189•ByteAtATime•2d ago•60 comments

Show HN: RAGsplain – What does your RAG model see before it answers?

https://www.ragsplain.com/
4•fredthedeve•11h ago•2 comments

Show HN: ProjectD – Google Drive-based, AES-encrypted C2 in C/C++

https://github.com/BernKing/ProjectD
2•bernking•12h ago•0 comments

Show HN: StartupList EU – A public directory of European startups

https://www.startup-list.eu
11•umbertotancorre•1d ago•4 comments

Show HN: Learn LLMs LeetCode Style

https://github.com/Exorust/TorchLeet
176•Exorust•2d ago•22 comments

Show HN: I built Kanba, open source alternative to Trello, self-hostable PM tool

3•uaghazade•5h ago•0 comments

Show HN: Compare Speech APIs Live (OpenAI, Google, Deepgram, Soniox, etc.)

https://soniox.com/compare/
6•easwee•14h ago•1 comments

Show HN: Notsc – A CLI to Scaffold Node.js and TypeScript API Projects

https://www.npmjs.com/package/notsc
2•cedricahenkorah•16h ago•0 comments

Show HN: I built an LLM chat app because we shouldn't need 10 AI subscriptions

https://prismharmony.com/chat
56•maniknt28•2d ago•64 comments

Show HN: Weekday clock, a clock for people who don't work or go to school

https://weekdayclock.1link.fun
4•wenjian•17h ago•3 comments

Show HN: We made our own inference engine for Apple Silicon

https://github.com/trymirai/uzu
147•darkolorin•13h ago
We wrote our inference engine in Rust. It is faster than llama cpp in all of the use cases. Your feedback is very welcome. It is written from scratch with the idea that you can add support for any kernel and platform.

Comments

sharifulin•10h ago
Wow! Sounds super interesting
slavasmirnov•10h ago
That's exactly what we are looking for, so we don't have to spend on APIs. I wonder how significant the trade-offs are.
TheMagicHorsey•10h ago
Amazing!

How was your experience using Rust on this project? I'm considering a project in an adjacent space and I'm trying to decide between Rust, C, and Zig. Rust seems a bit burdensome with its complexity compared to C and Zig. Reminds me of C++ in its complexity (although not as bad). I find it difficult to walk through and understand a complicated Rust repository. I don't have that problem with C and Zig for the most part.

But I'm wondering if I just need to invest more time in Rust. How was your learning curve with the language?

adastra22•9h ago
You are confusing familiarity with intrinsic complexity. I had 20 years of experience with C/C++ before switching to Rust a few years ago. After the initial hurdle, it is way easier and very simple to follow.
ednevsky•10h ago
nice
ewuhic•10h ago
>faster than llama cpp in all of the use cases

What's your deliberate, well-thought roadmap for achieving adoption similar to llama cpp?

pants2•10h ago
Probably getting acquired by Apple :)
khurs•2h ago
Ollama is the leader isn't it?

Brew stats (downloads last 30 days)

Ollama - 28,232, llama.cpp - 7,826

mintflow•10h ago
Just curious: will it be supported on iOS? It would be great to build a local LLM app with this project.
AlekseiSavin•10h ago
Already supported: https://github.com/trymirai/uzu-swift
cwlcwlcwlingg•10h ago
Wondering why you used Rust rather than C++.
adastra22•9h ago
Why use C++?
khurs•2h ago
So C++ users don't need to learn something new.
bee_rider•8h ago
I wonder why they didn’t use Fortran.
giancarlostoro•6h ago
...or D? or Go? or Java? C#? Zig? etc they chose what they were most comfortable with. Rust is fine, it's not for everyone clearly, but those who use it produce high quality software, I would argue similar with Go, without all the unnecessary mental overhead of C or C++
outworlder•6h ago
Why use C++ for greenfield projects?
khurs•2h ago
The recommendation from the security agencies is to prefer Rust over C++ as less risk of exploits.

Checked, and llama.cpp uses C++ (obviously) and Ollama uses Go.

greggh•10h ago
"trymirai", every time I hear the word Mirai I think of the large IOT DDoS botnet. Maybe it's just me though.
fnord77•6h ago
I think of the goofy Toyota fuel cell car. I think a grand total of about 6 have been sold (leased) in California.
rnxrx•10h ago
I'm curious about why the performance gains mentioned were so substantial for Qwen vs Llama?
AlekseiSavin•9h ago
it looks like llama.cpp has some performance issues with bf16
homarp•9h ago
Can you explain the type of quantization you support?

would https://docs.unsloth.ai/basics/kimi-k2-how-to-run-locally be faster with mirai?

AlekseiSavin•9h ago
right now, we support AWQ but are currently working on various quantization methods in https://github.com/trymirai/lalamo
smpanaro•9h ago
In practice, how often do the models use the ANE? It sounds like you are optimizing for speed which in my experience always favors GPU.
AlekseiSavin•9h ago
You're right, modern edge devices are powerful enough to run small models, so the real bottleneck for a forward pass is usually memory bandwidth, which defines the upper theoretical limit for inference speed. Right now, we've figured out how to run computations in a granular way on specific processing units, but we expect the real benefits to come later when we add support for VLMs and advanced speculative decoding, where you process more than one token at a time
J_Shelby_J•8h ago
VLMs = very large models?
mmorse1217•8h ago
Probably vision language models.
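
The memory-bandwidth ceiling AlekseiSavin describes above can be sketched as a back-of-the-envelope estimate. This is a minimal sketch, assuming every weight is read from memory exactly once per generated token and ignoring the KV cache, activations, and compute time. The function name and the example numbers (a 0.6B-parameter model, 100 GB/s of bandwidth) are illustrative assumptions, not measurements from uzu or any specific device.

```python
def max_tokens_per_second(params: float, bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound forward pass.

    Assumes (hypothetically) that generating one token requires streaming
    all model weights from memory once, and nothing else.
    """
    model_bytes = params * bytes_per_param           # total weight size in bytes
    bandwidth_bytes_s = bandwidth_gb_s * 1e9         # GB/s -> bytes/s
    return bandwidth_bytes_s / model_bytes           # tokens per second ceiling


if __name__ == "__main__":
    # Illustrative numbers only: 0.6B parameters, 100 GB/s memory bandwidth.
    bf16 = max_tokens_per_second(0.6e9, 2.0, 100.0)   # 2 bytes per weight
    int4 = max_tokens_per_second(0.6e9, 0.5, 100.0)   # 4-bit quantized weights
    print(f"bf16 ceiling:  {bf16:.1f} tokens/s")
    print(f"4-bit ceiling: {int4:.1f} tokens/s")
```

Under these assumptions, shrinking the weights (for example through quantization) raises the ceiling proportionally, and processing several tokens per weight pass, as speculative decoding does, amortizes the same memory traffic over more output.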
skybrian•9h ago
What are the units on the benchmark results? I’m guessing higher is better?
AlekseiSavin•9h ago
yeah, tokens per second
dcreater•9h ago
Somewhat faster on small models. Requires new format.

Not sure what the goal is for this project? Not seeing how this presents adequate benefits to get adopted by the community

koakuma-chan•8h ago
Written in Rust is a big one for me.
worldsavior•7h ago
It's utilizing the Apple ANE and probably other optimization tools provided by Apple's frameworks. Not sure if llama.cpp uses them, but if it doesn't, then the benchmark on GitHub says it all.
zdw•9h ago
How does this benchmark compare to MLX?
jasonjmcghee•8h ago
I use MLX in lmstudio and it doesn't have whatever issues llama cpp is showing here.

Qwen3-0.6B at 5 t/s doesn't make any sense. Something is clearly wrong for that specific model.

giancarlostoro•6h ago
Hoping the author can answer, I'm still learning about how this all works. My understanding is that inference is "using the model" so to speak. How is this faster than established inference engines specifically on Mac? Are models generic enough that if you build e.g. an inference engine focused on AMD GPUs or even Intel GPUs, would they achieve reasonable performance? I always assumed because Nvidia is king of AI that you had to suck it up, or is it just that most inference engines being used are married to Nvidia?

I would love to understand how universal these models can become.

darkolorin•4h ago
Basically “faster” means better performance, e.g. tokens/s, without losing quality (benchmark scores for models). So when we say faster, we mean we provide more tokens per second than llama cpp. That means we effectively utilize the available hardware APIs (for example, we wrote our own kernels) to perform better.
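
The tokens/s metric mentioned above can be measured with a small timing harness. This is a minimal sketch, not how the uzu benchmarks were produced: `tokens_per_second` and `fake_generate` are hypothetical names, and the generator stands in for whatever streaming API a real engine exposes.

```python
import time
from typing import Callable, Iterable


def tokens_per_second(generate: Callable[[], Iterable[int]],
                      clock: Callable[[], float] = time.perf_counter) -> float:
    """Count the tokens yielded by `generate` and divide by wall-clock time.

    `generate` is a hypothetical zero-argument callable that yields token ids;
    `clock` is injectable so the measurement can be tested deterministically.
    """
    start = clock()
    count = sum(1 for _ in generate())   # consume the stream, counting tokens
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


def fake_generate() -> Iterable[int]:
    """Stand-in for a real engine: yields 20 dummy token ids with a delay."""
    for token_id in range(20):
        time.sleep(0.001)
        yield token_id


if __name__ == "__main__":
    print(f"{tokens_per_second(fake_generate):.1f} tokens/s")
```

For a fair comparison between engines, the same model, prompt, output length, and quantization would need to be held fixed, and quality checked separately with benchmark scores.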
nodesocket•6h ago
I just spun up an AWS EC2 g6.xlarge instance to do some LLM work. The GPU is an NVIDIA L4 24GB and it costs $0.8048 per hour. Starting to think about switching to an Apple mac2-m2.metal instance for $0.878 per hour. The big question is that the Mac instance only has 24GB of unified memory.
khurs•2h ago
Unified memory doesn't compare to an Nvidia GPU; the latter is much better.

Just depends on what performance level you need.
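
One way to frame the instance comparison above is cost per million generated tokens rather than cost per hour. The sketch below uses the two hourly prices quoted in the comment. The throughput figures are purely hypothetical placeholders, since the thread gives no measured tokens/s for either instance, so only the formula should be taken from it.

```python
def cost_per_million_tokens(price_per_hour: float, tokens_per_second: float) -> float:
    """Dollar cost to generate one million tokens at a sustained throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return price_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Hourly prices are from the comment above; throughputs are HYPOTHETICAL.
    instances = {
        "g6.xlarge (NVIDIA L4)": (0.8048, 50.0),
        "mac2-m2.metal": (0.878, 50.0),
    }
    for name, (price, tps) in instances.items():
        cost = cost_per_million_tokens(price, tps)
        print(f"{name}: ${cost:.2f} per million tokens at {tps:.0f} tokens/s")
```

Plugging in throughput measured on your own workload for each instance turns this into a direct comparison.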

floam•5h ago
How does this compare to https://github.com/Anemll/Anemll?
zackangelo•4h ago
We also wrote our inference engine in Rust for mixlayer, happy to answer any questions from those trying to do the same.

Looks like this uses ndarray and mpsgraph (which I did not know about!). We opted to use candle instead.

khurs•2h ago
Have you added it to HomeBrew and other package managers yet?

Also, any app deployed to PROD but developed on a Mac needs to be consistent, i.e. work on Linux or in a container.

woadwarrior01•1h ago
Needs an "API key".

https://github.com/trymirai/uzu-swift?tab=readme-ov-file#qui...