Show HN: LLMOne – Deploy LLMs from bare metal to production in hours

https://github.com/EM-GeekLab/LLMOne
5•pescn•7mo ago
I spent days trying to deploy DeepSeek on a server this year. Install Ubuntu, NVIDIA drivers, CUDA, Docker, configure vLLM, debug memory issues, tune performance settings. Every deployment was different. Every server had its own quirks. Worse still, these issues are more pronounced on non-NVIDIA accelerators, such as Ascend or Intel NPU.

So we built LLMOne, which automates this. Point it at a bare-metal machine (via BMC) or, soon, at an existing server over SSH, select your models, and it handles everything: OS installation, driver setup, inference engine configuration, model deployment, and deployment of applications such as Open WebUI or Dify.

The code is open source (Mulan PSL v2, like Apache 2.0). No vendor lock-in.

A tutorial video is available: https://youtu.be/P4MgIPW5K70

How it works:

1. Uses the BMC (Redfish) to install the OS remotely on bare metal, rather than PXE, so no DHCP server has to be configured
2. Installs the appropriate drivers (NVIDIA, Huawei Ascend, etc.)
3. Sets up containers and an inference engine (vLLM, MindIE, or OpenVINO; it picks the right one)
4. Deploys models and runs performance benchmarks
5. Can also deploy apps such as Open WebUI and Dify alongside the models
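To make step 3 concrete, here is a minimal sketch of what "picks the right one" could look like. The mapping is an assumption inferred from the engines and accelerators named in this post (MindIE is Huawei's engine for Ascend, OpenVINO is Intel's); `pick_engine` and the vendor strings are hypothetical names, not LLMOne's actual code.

```python
# Assumed vendor -> engine mapping, inferred from the post; not LLMOne's real table.
ENGINE_FOR_VENDOR = {
    "nvidia": "vLLM",
    "ascend": "MindIE",
    "intel": "OpenVINO",
}


def pick_engine(vendor: str) -> str:
    """Return the inference engine to set up for a detected accelerator vendor."""
    key = vendor.strip().lower()
    if key not in ENGINE_FOR_VENDOR:
        raise ValueError(f"no inference engine configured for vendor {vendor!r}")
    return ENGINE_FOR_VENDOR[key]


if __name__ == "__main__":
    for vendor in ("NVIDIA", "Ascend", "Intel"):
        print(vendor, "->", pick_engine(vendor))
```

A real implementation would presumably detect the vendor from the hardware inventory (for example PCI device IDs) rather than take it as a string.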

The whole process runs unattended. What used to take me 2-3 days of tweaking now finishes in 1-2 hours.

Technical bits:

We avoid Kubernetes entirely; we found it adds complexity without much benefit for single-node LLM deployments. Everything runs in Docker containers with custom orchestration.
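For a sense of what single-node orchestration without Kubernetes can boil down to, here is a rough sketch that assembles a `docker run` command for vLLM's public OpenAI-compatible server image. This is an illustration only: the function name, container name, restart policy, and example model ID are assumptions, and LLMOne's own orchestration may look quite different.

```python
import shlex


def vllm_docker_command(model: str, port: int = 8000, name: str = "llm") -> list[str]:
    """Build the argv for serving one model with the vllm/vllm-openai image."""
    return [
        "docker", "run", "-d",
        "--name", name,                  # hypothetical container name
        "--restart", "unless-stopped",   # let Docker restart the server after reboots
        "--gpus", "all",                 # expose all NVIDIA GPUs to the container
        "--ipc=host",                    # vLLM needs shared memory between processes
        "-p", f"{port}:8000",            # the server listens on 8000 inside the container
        "vllm/vllm-openai:latest",
        "--model", model,
    ]


if __name__ == "__main__":
    # Example model ID chosen for illustration only.
    cmd = vllm_docker_command("deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")
    print(shlex.join(cmd))
```

Returning an argv list rather than a shell string keeps quoting out of the picture; the list can be passed straight to `subprocess.run`.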

The BMC integration was tricky. Different servers expose different Redfish capabilities, so we built adapters for some vendors, such as iDRAC from Dell and iBMC from Huawei.
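A minimal sketch of the adapter idea, assuming the standard Redfish `VirtualMedia.InsertMedia` action and the `Boot` override properties on the system resource. The class, the adapter table, and especially the resource IDs are illustrative assumptions, not verified against either vendor's firmware; some BMCs expose virtual media under a different path or only through OEM actions, which is the kind of difference an adapter has to absorb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BmcAdapter:
    """Holds the vendor-specific resource IDs and builds Redfish requests."""
    vendor: str
    system_id: str
    manager_id: str
    media_id: str

    def insert_media_request(self, iso_url: str):
        # Standard Redfish action for attaching a remote ISO as virtual media.
        path = (
            f"/redfish/v1/Managers/{self.manager_id}/VirtualMedia/"
            f"{self.media_id}/Actions/VirtualMedia.InsertMedia"
        )
        return "POST", path, {"Image": iso_url, "Inserted": True}

    def boot_once_from_cd_request(self):
        # One-time boot override so the next reboot starts the installer.
        path = f"/redfish/v1/Systems/{self.system_id}"
        body = {
            "Boot": {
                "BootSourceOverrideEnabled": "Once",
                "BootSourceOverrideTarget": "Cd",
            }
        }
        return "PATCH", path, body


# Assumed IDs for illustration; real values should be discovered from the
# /redfish/v1/Systems and /redfish/v1/Managers collections.
ADAPTERS = {
    "idrac": BmcAdapter("Dell iDRAC", "System.Embedded.1", "iDRAC.Embedded.1", "CD"),
    "ibmc": BmcAdapter("Huawei iBMC", "1", "1", "CD"),
}

if __name__ == "__main__":
    adapter = ADAPTERS["idrac"]
    print(adapter.insert_media_request("http://example.invalid/ubuntu.iso"))
    print(adapter.boot_once_from_cd_request())
```

The sketch only builds `(method, path, body)` tuples; sending them with an authenticated HTTPS client is left out on purpose.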

Performance varies by hardware, but we've seen ~2200 tokens/sec on RTX 4090 with TensorRT-LLM backend, ~1900 with vLLM. The system runs Evalscope benchmarks automatically so you know what you're getting.
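As a quick back-of-the-envelope check on those two figures (both approximate, so treat the result as rough), the relative gap works out like this. `relative_speedup` is just a helper name for this example.

```python
def relative_speedup(fast: float, slow: float) -> float:
    """Fractional throughput gain of `fast` over `slow`."""
    return (fast - slow) / slow


if __name__ == "__main__":
    tensorrt_llm = 2200  # tokens/sec, approximate figure from the post
    vllm = 1900          # tokens/sec, approximate figure from the post
    print(f"{relative_speedup(tensorrt_llm, vllm):.1%}")  # prints 15.8%
```

So on this hardware the TensorRT-LLM backend is roughly 16% faster than vLLM by these numbers.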

Why this exists:

We work with chip vendors and AI server resellers. When servers arrive at customer sites, instead of a multi-person support team spending days on deployment and debugging, one person can use this tool to get everything running.

While we focus on LLM deployment, the tech stack can actually deploy anything from bare OS to complex software stacks. The automation layer is generic enough for various workloads.

Current limitations:

BMC support currently covers only Dell iDRAC and Huawei iBMC. We're working on Supermicro support. We'd love to expand to other server vendors, but we need hardware access or a Redfish mock for testing and development.

SSH mode and Apple Silicon support are coming soon.

Looking for feedback. Also, if you're a server vendor and can provide BMC access for testing, we'd appreciate the help expanding hardware support.
