Can application layer improve local model output quality?

1•acro-v•1h ago
Hi -

I am building a terminal-native tool for code generation, and one of the recent updates was to package a local model (Qwen 2.5 Coder 7B, downloaded on first use) for users who do not want their code uploaded to third-party servers.

Initial response from users to this addition was favorable - but I have my doubts: the model is fairly basic and does not compare in quality to online offerings.

So - I am planning to improve the RAG capabilities for building a message with relevant source file chunks, add a planning call, add a validation loop, and maybe add multi-sampling with re-ranking, etc.: all the common techniques that, when implemented properly, could improve the quality of the output.
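
A minimal sketch of how a validation loop with multi-sample re-ranking could fit together. Everything here is an assumption for illustration: `generate` is a stand-in for the local model call, the validator only checks that the candidate parses as Python, and the scoring heuristic is hypothetical. It is not how Aye Chat implements this.

```python
import ast
from typing import Callable, List, Optional


def is_valid_python(code: str) -> bool:
    """Cheap validation step: does the candidate at least parse?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def score(code: str) -> float:
    """Hypothetical re-ranking heuristic: valid code first, then shorter code."""
    return (1.0 if is_valid_python(code) else 0.0) - 0.001 * len(code)


def generate_with_validation(
    generate: Callable[[str], str],  # stand-in for the local model call
    prompt: str,
    samples: int = 3,
    max_rounds: int = 2,
) -> Optional[str]:
    """Sample several candidates, re-rank them, and retry with feedback if none validate."""
    for _ in range(max_rounds):
        candidates: List[str] = [generate(prompt) for _ in range(samples)]
        best = max(candidates, key=score)
        if is_valid_python(best):
            return best
        # Feed the failure back so the next round has something to correct.
        prompt += "\n\nThe previous attempt did not parse. Return only valid Python."
    return None
```

A real validator would likely go further (linting, type checks, running tests), but the control flow stays the same: sample, rank, validate, retry.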

So - the question: I believe (hope?) that with all of those things implemented, a 7B model can be bumped to approximately the quality of a 20B model. Do you agree that this is possible, or do you think it would be wasted effort and that kind of improvement would not happen?

The source is here - give it a star if you like what you see: https://github.com/acrotron/aye-chat

Comments

acro-v•1h ago
Someone pointed me to this post from a Cline engineer - below is my response to it.

Post: https://cline.bot/blog/why-cline-doesnt-index-your-codebase-...

That post, however, does not apply to the offline processing use case. Here are the 3 main problems they're trying to solve:

1. Code Doesn't Think in Chunks

But then he describes following semantic links through imports, etc. That technique is still hierarchical chunking, and I am planning to implement it as well: it's straightforward.
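
A minimal sketch of following imports to find related files, assuming Python sources held in memory as a mapping from module name to source text. The function names and the flat module layout are hypothetical; a real project would also need to resolve packages and relative imports.

```python
import ast
from collections import deque
from typing import Dict, List, Set


def imported_modules(source: str) -> Set[str]:
    """Return the top-level module names imported by a Python source string."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])
    return modules


def related_files(start: str, sources: Dict[str, str]) -> List[str]:
    """Breadth-first walk over local imports, starting from one module."""
    seen: List[str] = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for name in sorted(imported_modules(sources[current])):
            # Only follow modules that belong to the project itself.
            if name in sources and name not in seen:
                seen.append(name)
                queue.append(name)
    return seen
```

The resulting list can then be used to decide which files' chunks are pulled into the prompt alongside the file being edited.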

2. Indexes Decay While Code Evolves

This is just not true - there are multiple ways to solve it. One, for example, is continuous indexing at low priority in the background. Another is monitoring for file changes and reindexing only the differences, etc. I have already implemented a first iteration of this: the index remains current.
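
A minimal sketch of the "reindex only the differences" idea, assuming the index keeps a content hash per file path. The function names and the in-memory dictionaries are hypothetical stand-ins for whatever the real index stores.

```python
import hashlib
from typing import Dict, List, Tuple


def content_hash(text: str) -> str:
    """Stable fingerprint of a file's contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_index(indexed: Dict[str, str], files: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Compare stored hashes (path -> hash) with current contents (path -> text).

    Returns (paths to re-embed, paths to drop from the index).
    """
    changed = sorted(p for p, text in files.items() if indexed.get(p) != content_hash(text))
    removed = sorted(p for p in indexed if p not in files)
    return changed, removed
```

Only the paths in the first list need new embeddings, so a background pass over an unchanged project does no embedding work at all.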

3. Security Becomes a Liability (the post then goes on about embeddings having to be stored somewhere)

Not with Aye Chat: we are talking about an offline mode of operation, and the embedding store is implemented locally, with ChromaDB and the ONNXMiniLM_L6_V2 model.

So as you can see - none of his premises apply here.

And then, as part of the solution, he claims that "context window does not matter because Claude and ChatGPT models are now into 1M context window" - but once again, that does not apply to locally hosted models: I am getting a 32K context with Qwen 2.5 Coder 7B on my non-optimized setup with 8 GB of VRAM.
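
A minimal sketch of why retrieval still matters with a 32K window: the application has to choose which chunks fit. The 4-characters-per-token estimate and the greedy selection are assumptions for illustration; a real implementation would use the model's tokenizer and reserve room for the answer.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Rough assumption: about 4 characters per token."""
    return max(1, len(text) // 4)


def pack_chunks(chunks: List[Tuple[float, str]], budget: int) -> List[str]:
    """Greedily keep the highest-scoring (score, text) chunks that fit the token budget."""
    selected: List[str] = []
    used = 0
    for _, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            selected.append(text)
            used += cost
    return selected
```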

The main reason I think it may work is the following: answering a question includes "planning what to do" and then "doing it". Models are good at "doing it" if they are given all the necessary info, so if we offload that "planning" to the application itself - I think it may work.
