
Ask HN: What's a standard way for apps to request text completion as a service?

50•nvader•1mo ago
If I'm writing a new lightweight application that requires LLM-based text completion to power a feature, is there a standard way to request the user's operating system to provide a completion?

For instance, imagine I'm writing a small TUI that lets you browse jsonl files, and I want to add a natural-language querying feature. Is there an emerging standard for an implementation-agnostic "Translate this natural query to jq {natlang-query}: response here:" request?

If we don't have this yet, what would it take to get this built and broadly available?

Comments

billylo•1mo ago
Windows and macOS do come with a small model for generating text completions. You can write a wrapper for your own TUI to access them platform-agnostically.

For consistent LLM behaviour, you can use the Ollama API with your model of choice to generate: https://docs.ollama.com/api/generate

Chrome has a built-in Gemini Nano too, but there isn't an official way to use it outside Chrome yet.
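A minimal sketch of the Ollama route, applied to the OP's natural-language-to-jq idea. It assumes Ollama is running on its default local port and that the model (the name `llama3.2` here is an arbitrary choice) has already been pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(natlang_query: str, model: str = "llama3.2") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,  # assumes this model has been pulled locally
        "prompt": (
            "Translate this natural-language query into a jq filter. "
            f"Reply with the filter only.\nQuery: {natlang_query}"
        ),
        "stream": False,  # one complete response instead of a token stream
    }

def complete(natlang_query: str) -> str:
    """POST the request and return the model's text completion."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(natlang_query)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Nothing here is tied to a TUI; the same wrapper works for any app that can make a local HTTP call.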

nvader•1mo ago
Is there a Linux-y standard brewing?
billylo•1mo ago
Each distro is doing its own thing. If you are targeting Linux mainly, I would suggest coding it on top of Ollama or LiteLLM.
vintagedave•4w ago
Do you know what it’s called, at least on Windows? I’m struggling to find API docs.

When I asked an AI it said no such built-in model exists (possibly a knowledge-cutoff issue).

bredren•4w ago
Yes. I am not aware of a model shipping with Windows, nor of announced plans to do so. Microsoft has been focused on cloud-based LLM services.
usefulposter•4w ago
This thread is full of hallucinations ;)
billylo•4w ago
https://learn.microsoft.com/en-us/windows/ai/apis/phi-silica
vintagedave•3w ago
Thank you!
tony_cannistra•4w ago
These are the on-device model APIs for apple: https://developer.apple.com/documentation/foundationmodels
1bpp•4w ago
Windows doesn't?
WilcoKruijer•4w ago
MCP has a feature called sampling which does this, but this might not be too useful for your context. [0]

In a project I’m working on I simply present some data and a prompt, the user can then pipe this into a LLM CLI such as Claude Code.

[0] https://modelcontextprotocol.io/specification/2025-06-18/cli...
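For reference, an MCP sampling request is an ordinary JSON-RPC message that the server sends to ask the *client's* model for a completion. A rough sketch of building one, with field names taken from the linked spec (the `maxTokens` value is an arbitrary assumption):

```python
import json

def sampling_request(prompt: str, request_id: int = 1) -> str:
    """Build a JSON-RPC `sampling/createMessage` request: the MCP message a
    server sends to borrow the client's LLM for a completion."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sampling/createMessage",
        "params": {
            "messages": [
                {"role": "user", "content": {"type": "text", "text": prompt}}
            ],
            "maxTokens": 256,  # the client may cap this further
        },
    })
```

The key design point is the inversion of control: the app never holds an API key; the user's client decides which model answers and can show the prompt for approval first.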

brumar•4w ago
Sampling seemed so promising, but do we know if some MCPs managed to leverage this feature successfully?
lurking_swe•3w ago
If I recall correctly, the issue is that most MCP-capable client apps (Cursor, Claude Code, etc.) don't yet support it! VS Code is an exception.

Example: https://github.com/anthropics/claude-code/issues/1785

lcian•4w ago
When I'm writing a script that requires some kind of call to an LLM, I use this: https://github.com/simonw/llm.

This is of course cross-platform and works with both models accessible through an API and local ones.

I'm afraid this might not solve your problem though, as it is not an out-of-the-box solution: it requires the user to either provide their own API key or to install Ollama and wire it up on their own.

kristopolous•4w ago
I've been working on a more unixy version of his tool I call llcat. Composable, stateless, agnostic, and generic:

https://github.com/day50-dev/llcat

It might help things get closer.

It's under two days old and it's already fundamentally changing how I do things.

Also, for running at the edge, look into the LFM 2.5 class of models: https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct

mirror_neuron•4w ago
I love this concept. Looks great, I will definitely check it out.
kristopolous•4w ago
Please use it and give me feedback. I'm going to give a lightning talk on it tonight at sfvlug
nvader•4w ago
I think this is definitely a step in the right direction, and is exactly the kind of answer I was looking for. Thank you!

`llm` gives my tool a standard bin to call to invoke completions, and configuring and managing it is the user's responsibility.

If more tools started expecting something like this, it could become a defacto standard. Then maybe the OS would begin to provide it.
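The "standard bin to call" convention can be sketched in a few lines. The argv shape assumed here (prompt as a single argument, completion on stdout) is modeled on simonw's `llm`; configuring the actual backend stays the user's problem:

```python
import shutil
import subprocess

def complete(prompt: str, binary: str = "llm") -> str:
    """Shell out to an `llm`-style CLI: prompt on argv, completion on stdout.
    Assumes the user has installed and configured the tool themselves."""
    if shutil.which(binary) is None:
        raise RuntimeError(f"{binary} not found on PATH; install and configure it first")
    result = subprocess.run(
        [binary, prompt], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Because the contract is just "text in, text out over a pipe", swapping backends means swapping binaries, which is exactly what makes it a plausible de facto standard.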

cjonas•4w ago
I asked a similar question a while back and didn't get any response. Some type of service is needed for applications that want to be AI-enabled but don't want to deal with the usage-based pricing that comes with it. Right now the only option is for the user to provide a token/endpoint from one of the services. This is fine for local apps, but less ideal for web apps.
netsharc•4w ago
That's interesting: on Linux there's the $EDITOR variable for the terminal text editor (a quick search of three distros, Arch, Ubuntu, and Fedora, shows they respect it).

Maybe you can trailblaze and tell users your application will support the $LLM or $LLM_AUTOCOMPLETE variables (convene a naming committee for better names).
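A sketch of how an app might resolve such a variable, $EDITOR-style. ($LLM is the hypothetical variable proposed above, not an existing convention, and the default binary name is likewise an assumption.)

```python
import os
import shlex

def resolve_llm_command(default: str = "llm") -> list[str]:
    """Resolve the completion command the way editors resolve $EDITOR:
    honor an $LLM environment variable, fall back to a default binary.
    shlex.split lets the variable carry flags, e.g. LLM='llm -m small-model'."""
    value = os.environ.get("LLM", default)
    return shlex.split(value)
```

The resulting argv list can be handed straight to subprocess, so the app stays agnostic about which model, local or remote, actually answers.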

joshribakoff•4w ago
I have been using an open-source program, Handy; it is a cross-platform Rust/Tauri app that does speech recognition and handles inputting text into programs. It works by piggybacking on the OS's text-input or copy-and-paste features.

You could fork this, and shell out to an LLM before finally pasting the response.

TZubiri•4w ago
Not natural language at all, but Linux has readline for exact character matches; it's what powers tab completion in the command line.

Maybe it can be repurposed for natural language in a specific implementation
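For comparison, the readline completer protocol, here via Python's stdlib bindings. The command list is made up, and this is plain prefix matching rather than anything natural-language, which is exactly the gap the parent comment points at:

```python
import readline  # GNU readline bindings from the Python stdlib

COMMANDS = ["parse", "pretty", "filter", "flatten"]  # hypothetical TUI commands

def completer(text: str, state: int):
    """readline completer protocol: called repeatedly with the current prefix
    and a 0-based state index; return the state-th match, or None when done."""
    matches = [c for c in COMMANDS if c.startswith(text)]
    return matches[state] if state < len(matches) else None

readline.set_completer(completer)
readline.parse_and_bind("tab: complete")  # Tab now cycles exact-prefix matches
```

Repurposing this for natural language would mean replacing the prefix match inside `completer` with an LLM call, while keeping the same (text, state) contract.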

Sevii•4w ago
Small models are getting good, but I don't think they are quite there yet for this use case. For OK results we are looking at 12–14 GB of VRAM committed to models. My MacBook with 24 GB of total RAM runs fine with a 14B model loaded, but I don't think most people have quite enough RAM yet. Still, I think it's something we are going to need.

We are also going to want the opposite. A way for an LLM to request tool calls so that it can drive an arbitrary application. MCP exists, but it expects you to preregister all your MCP servers. I am not sure how well preregistering would work at the scale of every application on your PC.

tpae•4w ago
You can check out my project here: https://github.com/dinoki-ai/osaurus

I'm focused on building it for the macOS ecosystem

jiehong•4w ago
This might work through an LSP server?

It’s not exactly the intended use case, but it could be coerced to do that.

I’ve seen something else like that, though: voice transcription software that have access to the context the text is in, and can interact with it and modify it.

Like how some people use super whisper modes [0] to do some actions with their voice in any app.

It works because you can say "rewrite this text, and answer the questions it asks"; the dictation app first transcribes this to text, extracts the whole text from the focused app, sends both to an AI model, gets an answer back, and pastes the output.

[0]: https://superwhisper.com/docs/common-issues/context