frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

Open in hackernews

Show HN: Lemonade: Run LLMs Locally with GPU and NPU Acceleration

https://github.com/lemonade-sdk/lemonade
12•ramkrishna2910•1h ago
Lemonade is an open-source SDK and local LLM server focused on making it easy to run and experiment with large language models (LLMs) on your own PC, with special acceleration paths for NPUs (Ryzen™ AI) and GPUs (Strix Halo and Radeon™).

Why?

There are three qualities needed in a local LLM serving stack, and none of the market leaders (Ollama, LM Studio, or using llama.cpp by itself) deliver all three: 1. Use the best backend for the user’s hardware, even if it means integrating multiple inference engines (llama.cpp, ONNXRuntime, etc.) or custom builds (e.g., llama.cpp with ROCm betas). 2. Zero friction for both users and developers from onboarding to apps integration to high performance. 3. Commitment to open source principles and collaborating in the community.

Lemonade Overview:

Simple LLM serving: Lemonade is a drop-in local server that presents an OpenAI-compatible API, so any app or tool that talks to OpenAI’s endpoints will “just work” with Lemonade’s local models. Performance focus: Powered by llama.cpp (Vulkan and ROCm for GPUs) and ONNXRuntime (Ryzen AI for NPUs and iGPUs), Lemonade squeezes the best out of your PC, no extra code or hacks needed. Cross-platform: One-click installer for Windows (with GUI), pip/source install for Linux. Bring your own models: Supports GGUFs and ONNX. Use Gemma, Llama, Qwen, Phi and others out-of-the-box. Easily manage, pull, and swap models. Complete SDK: Python API for LLM generation, and CLI for benchmarking/testing. Open source: Apache 2.0 (core server and SDK), no feature gating, no enterprise “gotchas.” All server/API logic and performance code is fully open; some software the NPU depends on is proprietary, but we strive for as much openness as possible (see our GitHub for details). Active collabs with GGML, Hugging Face, and ROCm/TheRock.

Get started:

Windows? Download the latest GUI installer from https://lemonade-server.ai/

Linux? Install with pip or from source (https://lemonade-server.ai/)

Docs: https://lemonade-server.ai/docs/

Discord for banter/support/feedback: https://discord.gg/5xXzkMu8Zk

How do you use it?

Click on lemonade-server from the start menu Open http://localhost:8000 in your browser for a web ui with chat, settings, and model management. Point any OpenAI-compatible app (chatbots, coding assistants, GUIs, etc.) at http://localhost:8000/api/v1 Use the CLI to run/load/manage models, monitor usage, and tweak settings such as temperature, top-p and top-k. Integrate via the Python API for direct access in your own apps or research.

Who is it for?

Developers: Integrate LLMs into your apps with standardized APIs and zero device-specific code, using popular tools and frameworks. LLM Enthusiasts, plug-and-play with: Morphik AI (contextual RAG/PDF Q&A) Open WebUI (modern local chat interfaces) Continue.dev (VS Code AI coding copilot) …and many more integrations in progress! Privacy-focused users: No cloud calls, run everything locally, including advanced multi-modal models if your hardware supports it.

Why does this matter?

Every month, new on-device models (e.g., Qwen3 MOEs and Gemma 3) are getting closer to the capabilities of cloud LLMs. We predict a lot of LLM use will move local for cost reasons alone. Keeping your data and AI workflows on your own hardware is finally practical, fast, and private, no vendor lock-in, no ongoing API fees, and no sending your sensitive info to remote servers. Lemonade lowers friction for running these next-gen models, whether you want to experiment, build, or deploy at the edge. Would love your feedback! Are you running LLMs on AMD hardware? What’s missing, what’s broken, what would you like to see next? Any pain points from Ollama, LM Studio, or others you wish we solved? Share your stories, questions, or rant at us.

Links:

Download & Docs: https://lemonade-server.ai/

GitHub: https://github.com/lemonade-sdk/lemonade

Discord: https://discord.gg/5xXzkMu8Zk

Thanks HN!

Phoenix LiveView Colocated Hooks and JavaScript

https://elixircasts.io/liveview-colocated-hooks
1•alekx•2m ago•0 comments

RAG isn't dead, the bar has gone up

https://www.tensorlake.ai/blog/advanced-rag
1•diptanu•3m ago•0 comments

Aldus Corporation (1984)

https://it.wikipedia.org/wiki/Aldus_Corporation
1•maremmano•3m ago•0 comments

Pinned Device Memory Patches for Intel's Multi-GPU Battlematrix Linux Efforts

https://www.phoronix.com/news/Intel-Pinned-Device-Memory
1•losgehts•4m ago•0 comments

Po-Shen Loh on Building Thoughtfulness, Empathy, and Strong Networks in AI Era

https://toolong.link/v?w=xWYb7tImErI&l=en
1•androng•5m ago•0 comments

Stop Paywalling SSO: It Is a Basic Right, Not an Enterprise Perk

https://oneuptime.com/blog/post/2025-08-19-sso-is-a-security-basic-not-an-enterprise-perk/view
2•ndhandala•6m ago•0 comments

Blinking a Light with Ping at 1HZ (2017)

https://hackaday.com/2017/07/06/blinking-a-light-with-ping/
1•z-mach9•7m ago•0 comments

The Cassette Recorder That Went to the Moon

https://obsoletesony.substack.com/p/the-cassette-recorder-that-went-to
1•Michelangelo11•10m ago•0 comments

American Exceptionalism Acquisition Corp

https://www.sec.gov/ix?doc=/Archives/edgar/data/0002079173/000119312525182758/d38750ds1.htm
2•petethomas•13m ago•1 comments

Just One More Prompt

https://steipete.me/posts/just-one-more-prompt
1•jshchnz•15m ago•0 comments

"Things are a bit bumpy right now" – FreeBSD:15 pkg repo is essentially down

https://lists.freebsd.org/archives/freebsd-current/2025-August/008458.html
1•luckman212•15m ago•1 comments

Docker container for running Claude Code in "dangerously skip permissions" mode

https://github.com/tintinweb/claude-code-container
1•Luc•17m ago•1 comments

UV-light method cuts computer chip manufacturing steps in half

https://techxplore.com/news/2025-07-uv-method-chip.html
1•PaulHoule•19m ago•0 comments

Away from Capitol Hill, a Kentucky lawmaker lives off the grid

https://spectrumnews1.com/ky/louisville/news/2025/08/13/thomas-massie-home-kentucky-
1•rami•19m ago•0 comments

Built a back end service to help companies manage multiple ML models

1•DhirajSinghJr•20m ago•0 comments

Many Are Focused on the Wrong Questions When It Comes to AI

https://www.aclu.org/news/civil-liberties/many-are-focused-on-the-wrong-questions-when-it-comes-to-ai
3•stareatgoats•21m ago•2 comments

Britain's AI strategy: the risk that it is dependency dressed up in digital hype

https://www.theguardian.com/commentisfree/2025/aug/18/the-guardian-view-on-britains-ai-strategy-the-risk-is-that-it-is-dependency-dressed-up-in-digital-hype
1•drankl•22m ago•0 comments

The Hacker's Renaissance: A Manifesto Reborn

https://phrack.org/issues/72/19#article
2•_Microft•24m ago•0 comments

China's Meituan launches in Brazil, taking on iFood and Uber

https://restofworld.org/2025/meituan-brazil-launch-food-delivery/
1•colinprince•25m ago•0 comments

Misago is fully featured modern forum that is fast/scalable/responsive

https://github.com/rafalp/Misago
1•indigodaddy•25m ago•0 comments

From M1 MacBook to Arch Linux: A month-long experiment that became permanenent

https://www.ssp.sh/blog/macbook-to-arch-linux-omarchy/
7•articsputnik•26m ago•2 comments

How can England possibly be running out of water?

https://www.theguardian.com/news/ng-interactive/2025/aug/17/how-can-england-possibly-be-running-out-of-water
1•xrayarx•28m ago•0 comments

End-to-end encryption coming to iOS-Android RCS chats as soon as next month

https://www.phonearena.com/news/end-to-end-encryption-securing-ios-android-rcs-chats-could-be-weeks-away_id173353
2•mikece•30m ago•0 comments

Show HN: Wake word detection with custom phrases without model training

https://github.com/st-matskevich/local-wake
1•st-matskevich•31m ago•0 comments

OSHA: Proposed Rule to Revise Asbestos Respirators Requirements

https://www.regulations.gov/document/OSHA-2025-0024-0002
1•impish9208•33m ago•0 comments

Ask HN: Raising high level vision concerns as a junior, internal org takeover?

1•jamboca•33m ago•2 comments

Fuzzing Hardware Like Software (2021)

https://arxiv.org/abs/2102.02308
1•imakwana•35m ago•0 comments

Blobdrop: Drag and drop files directly out of the terminal

https://github.com/vimpostor/blobdrop
1•LorenDB•36m ago•0 comments

Mexico's welfare policies helped 13.4M people out of poverty

https://www.theguardian.com/world/2025/aug/18/mexico-welfare-policies-amlo
2•worik•36m ago•4 comments

Reddit Backdoor: How Google and ChatGPT's Exclusive Access Is Rigging the Game

https://www.generative-engine.org/the-reddit-backdoor-how-google-and-chatgpt-s-exclusive-data--1755623178833
2•flixing•36m ago•4 comments