Hey, I’m one of the maintainers of RamaLama[1], which is part of the containers ecosystem (podman, buildah, skopeo). It’s a runtime-agnostic tool for coordinating local AI inference with containers.
I put together a Python SDK for programmatic control over local AI, using ramalama under the hood. Because it’s runtime agnostic, you can use ramalama with llama.cpp, vLLM, mlx, etc., as long as the underlying service exposes an OpenAI-compatible endpoint. This is especially powerful for users deploying to edge or other devices with atypical hardware/software configurations that, for example, require custom runtime builds.
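To make the “OpenAI compatible” part concrete: once a model is being served, any stock OpenAI client can talk to it. A minimal sketch, assuming the server is listening on ramalama’s default serve port of 8080 (the model name is just a placeholder; some runtimes ignore it):

```
from openai import OpenAI

# Local server, so the API key is a dummy value; adjust the port if you
# changed it from the default.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

resp = client.chat.completions.create(
    model="gpt-oss-20b",  # placeholder; depends on what the runtime reports
    messages=[{"role": "user", "content": "How tall is Michael Jordan?"}],
)
print(resp.choices[0].message.content)
```

The SDK takes care of getting a server like that running in the first place: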
```
from ramalama_sdk import RamalamaModel

runtime_image = "quay.io/ramalama/ramalama:latest"
model_ref = "huggingface://ggml-org/gpt-oss-20b-GGUF"

with RamalamaModel(model_ref, base_image=runtime_image) as model:
    response = model.chat("How tall is Michael Jordan?")
    print(response["content"])
```

This SDK manages:
- Pulling and verifying runtime images
- Downloading models (HuggingFace, Ollama, ModelScope, OCI registries; example references below)
- Managing the runtime process
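Each of those sources is selected by the prefix on the model reference. The huggingface:// form is the one from the example above; the other prefixes follow ramalama’s transport naming and the specific paths are only illustrative, so double-check the docs for exact spellings:

```
# huggingface:// matches the example above; the other prefixes follow
# ramalama's transport naming, and the model paths are only illustrative.
hf_model = "huggingface://ggml-org/gpt-oss-20b-GGUF"
ollama_model = "ollama://tinyllama"
modelscope_model = "modelscope://Qwen/Qwen2.5-0.5B-Instruct-GGUF"
oci_model = "oci://quay.io/your-org/your-model:latest"
```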
It works with air-gapped deployments and private registries, and it has async support as well.
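To make the air-gapped / private-registry case concrete, here’s a sketch: both the runtime image and the model reference simply point at infrastructure you control. The registry names below are placeholders, and the oci:// prefix follows ramalama’s transport naming, so check the docs for the exact form:

```
from ramalama_sdk import RamalamaModel

# Placeholders: substitute your own mirrored runtime image and internal
# OCI registry for the model.
runtime_image = "registry.internal.example/mirrors/ramalama:latest"
model_ref = "oci://registry.internal.example/models/gpt-oss-20b-gguf:latest"

with RamalamaModel(model_ref, base_image=runtime_image) as model:
    print(model.chat("Are you reachable offline?")["content"])
```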
If you want to learn more, the documentation is available here: https://docs.ramalama.com/sdk/introduction. Otherwise, I hope this is useful to people out there, and I’d appreciate feedback about where to prioritize next, whether that’s support for other languages, additional features (speech to text? RAG? MCP?), or something else.
1. github.com/containers/ramalama