Show HN: LLMOne – Deploy LLMs from bare metal to production in hours

https://github.com/EM-GeekLab/LLMOne
5•pescn•7mo ago
I spent days trying to deploy DeepSeek on a server this year. Install Ubuntu, NVIDIA drivers, CUDA, Docker, configure vLLM, debug memory issues, tune performance settings. Every deployment was different. Every server had its own quirks. Worse still, these issues are more pronounced on non-NVIDIA accelerators, such as Ascend or Intel NPU.

So we built LLMOne, which automates all of this. Point it at a bare-metal machine (via BMC) or, soon, SSH into an existing server, select your models, and it handles everything: OS installation, driver setup, inference engine configuration, model deployment, and deployment of applications such as Open WebUI or Dify.

The code is open source (Mulan PSL v2, a permissive license similar to Apache 2.0). No vendor lock-in.

A user tutorial video is available: https://youtu.be/P4MgIPW5K70

How it works:

1. Uses BMC (Redfish) instead of PXE to install the OS remotely on bare metal, so no DHCP server has to be configured
2. Installs the appropriate drivers (NVIDIA, Huawei Ascend, etc.)
3. Sets up containers and inference engines (vLLM, MindIE or OpenVINO; it picks the right one)
4. Deploys models and runs performance benchmarks
5. Can also deploy apps like Open WebUI and Dify alongside the models
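As a rough illustration of step 1, here is a minimal sketch of the kind of Redfish call sequence that boots a server from a remote installer ISO without PXE. It assumes the installer is attached as virtual media, which the post does not confirm; the resource paths and action names follow the DMTF Redfish schema, while `plan_os_install`, `RedfishRequest`, and the ID values are hypothetical. Some BMCs expose virtual media under the system resource rather than the manager, so the paths may differ per vendor. The sketch only builds the requests and does not send them.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RedfishRequest:
    """One HTTP call against a BMC's Redfish service (not sent here)."""
    method: str
    path: str
    body: dict


def plan_os_install(iso_url: str, system_id: str, manager_id: str,
                    media_id: str) -> list[RedfishRequest]:
    """Return the calls needed to boot a server once from a remote ISO.

    The IDs are vendor-specific and passed in by the caller; nothing here
    is taken from LLMOne's actual implementation.
    """
    system = f"/redfish/v1/Systems/{system_id}"
    media = f"/redfish/v1/Managers/{manager_id}/VirtualMedia/{media_id}"
    return [
        # 1. Attach the installer ISO as a virtual CD.
        RedfishRequest(
            "POST",
            f"{media}/Actions/VirtualMedia.InsertMedia",
            {"Image": iso_url, "Inserted": True, "WriteProtected": True},
        ),
        # 2. Override the boot device for the next boot only.
        RedfishRequest(
            "PATCH",
            system,
            {"Boot": {"BootSourceOverrideTarget": "Cd",
                      "BootSourceOverrideEnabled": "Once"}},
        ),
        # 3. Restart the machine so it boots into the installer.
        RedfishRequest(
            "POST",
            f"{system}/Actions/ComputerSystem.Reset",
            {"ResetType": "ForceRestart"},
        ),
    ]
```

An unattended install would additionally need an answer file (for example autoinstall or kickstart) baked into or served next to the ISO; that part is outside this sketch.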

The whole process runs unattended. What used to take me 2-3 days of tweaking now finishes in 1-2 hours.

Technical bits:

We avoid Kubernetes entirely - found it adds complexity without much benefit for single-node LLM deployments. Everything runs in Docker containers with custom orchestration.
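For a sense of what a single-node, Kubernetes-free deployment can look like, here is a minimal sketch that assembles a `docker run` command for vLLM's OpenAI-compatible server image. This is an assumption about the general shape, not LLMOne's actual orchestration code: `vllm_docker_command`, the container name, and the example model are illustrative, while the Docker flags and the `vllm/vllm-openai` image with its `--model` and `--tensor-parallel-size` options come from the Docker and vLLM documentation.

```python
import shlex


def vllm_docker_command(model: str, port: int = 8000,
                        tensor_parallel: int = 1) -> list[str]:
    """Build the argv for running vLLM in a single Docker container."""
    return [
        "docker", "run", "-d",
        "--name", "vllm",                # hypothetical container name
        "--restart", "unless-stopped",   # let Docker restart it after reboots
        "--gpus", "all",                 # needs the NVIDIA Container Toolkit
        "--ipc=host",                    # shared memory for tensor parallelism
        "-p", f"{port}:8000",            # the server listens on 8000 inside
        "vllm/vllm-openai:latest",
        "--model", model,
        "--tensor-parallel-size", str(tensor_parallel),
    ]


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(vllm_docker_command("deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")))
```

Passing the list to `subprocess.run(..., check=True)` would start the container; keeping the command as an argv list avoids shell quoting problems with model names.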

The BMC integration was tricky. Different servers expose different Redfish capabilities, so we built adapters for some vendors, such as iDRAC from Dell and iBMC from Huawei.
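A per-vendor adapter layer along these lines could be sketched as follows. This is a hypothetical illustration, not the project's code: `BmcAdapter`, `ADAPTERS`, and `select_adapter` are made-up names, and the resource IDs and vendor strings are assumptions based on commonly seen iDRAC and iBMC values that real code should verify by walking the Redfish collections. The idea is to read the service root (`/redfish/v1/`) and choose an adapter from its `Vendor` field or its `Oem` keys.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BmcAdapter:
    """Vendor-specific details that the generic install logic needs."""
    name: str
    system_id: str
    manager_id: str


# Assumed values for illustration only.
ADAPTERS = {
    "Dell": BmcAdapter("iDRAC", "System.Embedded.1", "iDRAC.Embedded.1"),
    "Huawei": BmcAdapter("iBMC", "1", "1"),
}


def select_adapter(service_root: dict) -> BmcAdapter:
    """Pick an adapter from a parsed Redfish service root document."""
    # Newer services report "Vendor"; older ones often only reveal the
    # vendor through the keys of the "Oem" object.
    candidates = [service_root.get("Vendor") or ""]
    candidates += list(service_root.get("Oem") or {})
    for key in candidates:
        if key in ADAPTERS:
            return ADAPTERS[key]
    raise LookupError(f"no adapter for BMC vendor(s): {candidates}")
```

Failing loudly on unknown vendors, rather than guessing, matches the limitation described in the post: each new BMC family needs testing against real hardware or a mock.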

Performance varies by hardware, but we've seen ~2200 tokens/sec on an RTX 4090 with the TensorRT-LLM backend and ~1900 tokens/sec with vLLM. The system runs Evalscope benchmarks automatically so you know what you're getting.
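To put those two numbers side by side, here is a small sketch of the arithmetic. It assumes "tokens/sec" means generated tokens divided by wall-clock time, which the post does not define, and both helper names are hypothetical; it does not use or imitate the Evalscope API.

```python
def throughput(total_output_tokens: int, elapsed_seconds: float) -> float:
    """Generated tokens per second over a benchmark run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_output_tokens / elapsed_seconds


def relative_gain(faster: float, slower: float) -> float:
    """Fractional speedup of `faster` over `slower`."""
    return (faster - slower) / slower


# Using the approximate figures quoted above: 2200 vs 1900 tokens/sec.
gain = relative_gain(2200, 1900)
print(f"TensorRT-LLM is about {gain:.0%} faster than vLLM in this setup")
```

With the quoted figures the gap is roughly 16%, though the post gives no model, batch size, or sequence lengths, so the comparison only holds for that particular setup.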

Why this exists:

We work with chip vendors and AI server resellers. When servers arrive at customer sites, instead of a multi-person support team spending days on deployment and debugging, one person can use this tool to get everything running.

While we focus on LLM deployment, the tech stack can actually deploy anything from bare OS to complex software stacks. The automation layer is generic enough for various workloads.

Current limitations:

BMC support is currently limited to Dell iDRAC and Huawei iBMC. We're working on Supermicro support. We'd love to expand to other server vendors but need hardware access or a Redfish mock for testing and development.

SSH mode and Apple Silicon support are coming soon.

Looking for feedback. Also, if you're a server vendor and can provide BMC access for testing, we'd appreciate the help expanding hardware support.
