frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Show HN: Small hardware box that runs local LLMs and exposes an OpenAI API

https://axis-one-psi.vercel.app/
2•mjupp1•1h ago
I’ve been building a small hardware box that runs local LLMs like Mistral, Qwen and Llama, and exposes an OpenAI compatible API on your local network. There’s no cloud, no login system and no telemetry. I built it because a lot of small firms want ChatGPTbstyle tools but can’t use cloud AI for privacy or compliance reasons, and most don’t want to deal with GPU servers, drivers, Docker or model configs. The aim is to make local AI feel as simple as plugging in a router.

Right now the box boots into a very simple web UI where you choose a model and start using it. The API follows the OpenAI format for chat completions and embeddings. It can run different models depending on the hardware you pick, either a Jetson Orin Nano or an x86 mini-PC with a GPU. It stores data locally, supports basic RAG indexing and only exposes itself on the LAN by default.

A few things still aren’t working. There’s no multi-user rate limiting yet. The RAG quality is basic and I’m still improving chunking and reranking. The Orin runs hot under heavy load, so thermal performance needs work. It’s also still a prototype rather than a finished consumer product.

On the technical side, it runs containerized model servers using Ollama and some custom runners. Models load through GGUF or TensorRT-LLM depending on the hardware. The API layer follows the OpenAI spec. The RAG pipeline uses local embeddings and a vector database. The software stack is a mix of TypeScript and Python.

I’m looking for feedback from anyone who has built or deployed local inference before. I’m trying to understand what thermal and power issues you’ve run into, whether a drop-in OpenAI compatible box is actually useful to small teams, what hardware setups I should consider, and any honest critiques of the idea.

Apakah Tokopedia Punya WhatsApp

1•spegispecial•1m ago•0 comments

Thunderbird Adds Native Microsoft Exchange Email Support

https://blog.thunderbird.net/2025/11/thunderbird-adds-native-microsoft-exchange-email-support/
1•babolivier•2m ago•0 comments

Berapakah Nomor Tokopedia

1•spegispecial•2m ago•0 comments

Show HN: We built an easy way to sync multiple Google Calendars

https://gcalsync.app
1•aggarwalachal•2m ago•0 comments

Unofficial, rules-compliant, browser based Arkham Horror: The Card Game

https://arkhamhorror.app
1•bramadityaw•2m ago•1 comments

Berapakah Wa Tokopedia

1•spegispecial•3m ago•0 comments

ZK-KYC

https://zkp.com/
1•karenkhine•4m ago•1 comments

Show HN: Excel Custom Functions in Zig

https://github.com/AlexJReid/zigxll
1•alexjreid•5m ago•0 comments

Unity and Epic Games Together Advance the Open, Interoperable Future

https://unity.com/news/unity-and-epic-games-together-advance-open-interoperable-future-video-gaming
1•fidotron•7m ago•0 comments

Art Institute of Chicago Guts Video Data Bank Staff, Sparking Outcry

https://news.artnet.com/art-world/school-of-art-institute-of-chicagos-video-data-bank-2714672
1•xbryanx•9m ago•0 comments

AutoSubSync – Synchronize subtitles automatically or manually

https://github.com/denizsafak/AutoSubSync
2•denizsafak•10m ago•1 comments

I created an AI logistics marketplace in Manhattan

https://www.laborhutt.com
1•blessedtrails•15m ago•0 comments

AI adoption needs light, not hope

https://world.hey.com/joaoqalves/ai-adoption-needs-light-not-hope-5d7b4cc4
1•speckx•15m ago•0 comments

Energy resilience key for Taiwan: former Intel CEO

https://www.taipeitimes.com/News/biz/archives/2025/11/19/2003847427
1•keepamovin•16m ago•0 comments

Paola Nagni

https://paolapiseddunagni1.substack.com/p/paola-piseddu-nagni-certified-italian
1•PaolaNagni•18m ago•0 comments

Beyond the Vector API – A Quest for a Lower Level API [JVMLS]

https://inside.java/2025/11/16/jvmls-vector-api/
1•lichtenberger•23m ago•0 comments

New magnetic component discovered in Faraday effect after nearly two centuries

https://phys.org/news/2025-11-magnetic-component-faraday-effect-centuries.html
2•pseudolus•24m ago•0 comments

Open source: what do we think?

https://cal.com/blog/open-source
1•FinnLobsien•25m ago•0 comments

Canada announces massive jump in funding to European Space Agency

https://www.reuters.com/business/canada-announces-massive-jump-funding-european-space-agency-2025...
1•saubeidl•26m ago•0 comments

Learning to Boot from PXE

https://blog.imraniqbal.org/learning-to-boot-from-pxe/
2•speckx•29m ago•1 comments

Rentay

https://www.rentay.dk/
1•bellamoon544•33m ago•0 comments

Show HN: Godantic – JSON Schema and Validation for Go LLM Apps

https://github.com/deepankarm/godantic
2•deepankarm44•38m ago•0 comments

Toon for Oracle: A Token-Efficient Data Format for LLMs

https://hartenfeller.dev/blog/oracle-toon-implementation
1•speckx•41m ago•0 comments

Gemini 3, Winners and Losers, Integration and the Enterprise

https://stratechery.com/2025/gemini-3-winners-and-losers-integration-and-the-enterprise/
1•feross•46m ago•0 comments

JSON to Toon

https://jsontoon.com
2•xbaicai•51m ago•2 comments

Show HN: Synch- an AI dating app with an emotionally intelligent coach

https://synch.coach/
1•emrekuc•51m ago•0 comments

Quadratic Gravity

https://www.quantamagazine.org/old-ghost-theory-of-quantum-gravity-makes-a-comeback-20251117/
2•inshard•53m ago•0 comments

Airbags, and How Mercedes-Benz Hacked Your Hearing

https://hackaday.com/2025/10/06/how-mercedes-benz-hacked-your-hearing/
1•doener•56m ago•0 comments

Loose Wire on Carrier Dali Lead to Blackouts, Contact with Baltimore's Bridge

https://www.ntsb.gov:443/news/press-releases/Pages/NR20251118.aspx
1•jacquesm•56m ago•0 comments

Crypto Mining ASIC Goes Deep Sub-Threshold on 3 Nm

https://www.eetimes.com/crypto-mining-asic-goes-deep-sub-threshold-on-3-nm/
1•JoachimS•56m ago•0 comments