frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Show HN: Lemonade: Run LLMs Locally with GPU and NPU Acceleration

https://github.com/lemonade-sdk/lemonade
15•ramkrishna2910•5mo ago
Lemonade is an open-source SDK and local LLM server focused on making it easy to run and experiment with large language models (LLMs) on your own PC, with special acceleration paths for NPUs (Ryzen™ AI) and GPUs (Strix Halo and Radeon™).

Why?

There are three qualities needed in a local LLM serving stack, and none of the market leaders (Ollama, LM Studio, or using llama.cpp by itself) deliver all three: 1. Use the best backend for the user’s hardware, even if it means integrating multiple inference engines (llama.cpp, ONNXRuntime, etc.) or custom builds (e.g., llama.cpp with ROCm betas). 2. Zero friction for both users and developers from onboarding to apps integration to high performance. 3. Commitment to open source principles and collaborating in the community.

Lemonade Overview:

Simple LLM serving: Lemonade is a drop-in local server that presents an OpenAI-compatible API, so any app or tool that talks to OpenAI’s endpoints will “just work” with Lemonade’s local models. Performance focus: Powered by llama.cpp (Vulkan and ROCm for GPUs) and ONNXRuntime (Ryzen AI for NPUs and iGPUs), Lemonade squeezes the best out of your PC, no extra code or hacks needed. Cross-platform: One-click installer for Windows (with GUI), pip/source install for Linux. Bring your own models: Supports GGUFs and ONNX. Use Gemma, Llama, Qwen, Phi and others out-of-the-box. Easily manage, pull, and swap models. Complete SDK: Python API for LLM generation, and CLI for benchmarking/testing. Open source: Apache 2.0 (core server and SDK), no feature gating, no enterprise “gotchas.” All server/API logic and performance code is fully open; some software the NPU depends on is proprietary, but we strive for as much openness as possible (see our GitHub for details). Active collabs with GGML, Hugging Face, and ROCm/TheRock.

Get started:

Windows? Download the latest GUI installer from https://lemonade-server.ai/

Linux? Install with pip or from source (https://lemonade-server.ai/)

Docs: https://lemonade-server.ai/docs/

Discord for banter/support/feedback: https://discord.gg/5xXzkMu8Zk

How do you use it?

Click on lemonade-server from the start menu Open http://localhost:8000 in your browser for a web ui with chat, settings, and model management. Point any OpenAI-compatible app (chatbots, coding assistants, GUIs, etc.) at http://localhost:8000/api/v1 Use the CLI to run/load/manage models, monitor usage, and tweak settings such as temperature, top-p and top-k. Integrate via the Python API for direct access in your own apps or research.

Who is it for?

Developers: Integrate LLMs into your apps with standardized APIs and zero device-specific code, using popular tools and frameworks. LLM Enthusiasts, plug-and-play with: Morphik AI (contextual RAG/PDF Q&A) Open WebUI (modern local chat interfaces) Continue.dev (VS Code AI coding copilot) …and many more integrations in progress! Privacy-focused users: No cloud calls, run everything locally, including advanced multi-modal models if your hardware supports it.

Why does this matter?

Every month, new on-device models (e.g., Qwen3 MOEs and Gemma 3) are getting closer to the capabilities of cloud LLMs. We predict a lot of LLM use will move local for cost reasons alone. Keeping your data and AI workflows on your own hardware is finally practical, fast, and private, no vendor lock-in, no ongoing API fees, and no sending your sensitive info to remote servers. Lemonade lowers friction for running these next-gen models, whether you want to experiment, build, or deploy at the edge. Would love your feedback! Are you running LLMs on AMD hardware? What’s missing, what’s broken, what would you like to see next? Any pain points from Ollama, LM Studio, or others you wish we solved? Share your stories, questions, or rant at us.

Links:

Download & Docs: https://lemonade-server.ai/

GitHub: https://github.com/lemonade-sdk/lemonade

Discord: https://discord.gg/5xXzkMu8Zk

Thanks HN!

NASA now allowing astronauts to bring their smartphones on space missions

https://twitter.com/NASAAdmin/status/2019259382962307393
1•gbugniot•2m ago•0 comments

Claude Code Is the Inflection Point

https://newsletter.semianalysis.com/p/claude-code-is-the-inflection-point
1•throwaw12•3m ago•0 comments

MicroClaw – Agentic AI Assistant for Telegram, Built in Rust

https://github.com/microclaw/microclaw
1•everettjf•4m ago•2 comments

Show HN: Omni-BLAS – 4x faster matrix multiplication via Monte Carlo sampling

https://github.com/AleatorAI/OMNI-BLAS
1•LowSpecEng•4m ago•1 comments

The AI-Ready Software Developer: Conclusion – Same Game, Different Dice

https://codemanship.wordpress.com/2026/01/05/the-ai-ready-software-developer-conclusion-same-game...
1•lifeisstillgood•6m ago•0 comments

AI Agent Automates Google Stock Analysis from Financial Reports

https://pardusai.org/view/54c6646b9e273bbe103b76256a91a7f30da624062a8a6eeb16febfe403efd078
1•JasonHEIN•10m ago•0 comments

Voxtral Realtime 4B Pure C Implementation

https://github.com/antirez/voxtral.c
1•andreabat•12m ago•0 comments

I Was Trapped in Chinese Mafia Crypto Slavery [video]

https://www.youtube.com/watch?v=zOcNaWmmn0A
1•mgh2•18m ago•0 comments

U.S. CBP Reported Employee Arrests (FY2020 – FYTD)

https://www.cbp.gov/newsroom/stats/reported-employee-arrests
1•ludicrousdispla•20m ago•0 comments

Show HN: I built a free UCP checker – see if AI agents can find your store

https://ucphub.ai/ucp-store-check/
2•vladeta•25m ago•1 comments

Show HN: SVGV – A Real-Time Vector Video Format for Budget Hardware

https://github.com/thealidev/VectorVision-SVGV
1•thealidev•27m ago•0 comments

Study of 150 developers shows AI generated code no harder to maintain long term

https://www.youtube.com/watch?v=b9EbCb5A408
1•lifeisstillgood•27m ago•0 comments

Spotify now requires premium accounts for developer mode API access

https://www.neowin.net/news/spotify-now-requires-premium-accounts-for-developer-mode-api-access/
1•bundie•30m ago•0 comments

When Albert Einstein Moved to Princeton

https://twitter.com/Math_files/status/2020017485815456224
1•keepamovin•31m ago•0 comments

Agents.md as a Dark Signal

https://joshmock.com/post/2026-agents-md-as-a-dark-signal/
2•birdculture•33m ago•0 comments

System time, clocks, and their syncing in macOS

https://eclecticlight.co/2025/05/21/system-time-clocks-and-their-syncing-in-macos/
1•fanf2•35m ago•0 comments

McCLIM and 7GUIs – Part 1: The Counter

https://turtleware.eu/posts/McCLIM-and-7GUIs---Part-1-The-Counter.html
2•ramenbytes•37m ago•0 comments

So whats the next word, then? Almost-no-math intro to transformer models

https://matthias-kainer.de/blog/posts/so-whats-the-next-word-then-/
1•oesimania•38m ago•0 comments

Ed Zitron: The Hater's Guide to Microsoft

https://bsky.app/profile/edzitron.com/post/3me7ibeym2c2n
2•vintagedave•41m ago•1 comments

UK infants ill after drinking contaminated baby formula of Nestle and Danone

https://www.bbc.com/news/articles/c931rxnwn3lo
1•__natty__•42m ago•0 comments

Show HN: Android-based audio player for seniors – Homer Audio Player

https://homeraudioplayer.app
3•cinusek•42m ago•1 comments

Starter Template for Ory Kratos

https://github.com/Samuelk0nrad/docker-ory
1•samuel_0xK•44m ago•0 comments

LLMs are powerful, but enterprises are deterministic by nature

2•prateekdalal•47m ago•0 comments

Make your iPad 3 a touchscreen for your computer

https://github.com/lemonjesus/ipad-touch-screen
2•0y•53m ago•1 comments

Internationalization and Localization in the Age of Agents

https://myblog.ru/internationalization-and-localization-in-the-age-of-agents
1•xenator•53m ago•0 comments

Building a Custom Clawdbot Workflow to Automate Website Creation

https://seedance2api.org/
1•pekingzcc•56m ago•1 comments

Why the "Taiwan Dome" won't survive a Chinese attack

https://www.lowyinstitute.org/the-interpreter/why-taiwan-dome-won-t-survive-chinese-attack
2•ryan_j_naughton•56m ago•0 comments

Xkcd: Game AIs

https://xkcd.com/1002/
2•ravenical•57m ago•0 comments

Windows 11 is finally killing off legacy printer drivers in 2026

https://www.windowscentral.com/microsoft/windows-11/windows-11-finally-pulls-the-plug-on-legacy-p...
1•ValdikSS•58m ago•0 comments

From Offloading to Engagement (Study on Generative AI)

https://www.mdpi.com/2306-5729/10/11/172
1•boshomi•1h ago•1 comments