frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Can application layer improve local model output quality?

1•acro-v•2mo ago
Hi -

I am building a terminal-native tool for code generation, and one of the recent updates was to package a local model (Qwen 2.5 Coder 7B, downloads on the first try) for those users who do not want their code uploaded to third-party servers.

Initial response from users to this addition was favorable - but I have my doubts: the model is fairly basic and does not compare in quality to online offerings.

So - I am planning to improve RAG capabilities for building a message with relevant source file chunks, add a planning call, add validation loop, maybe have a multi-sample with re-ranking, etc.: all those techniques that are common and when implemented properly - could improve quality of output.

So - the question: I believe (hope?) that with all those things implemented - 7B can be bumped approximately to quality of a 20B, do you agree that's possible or do you think it would be a wasted effort and that kind of improvement would not happen?

The source is here - give it a star if you like what you see: https://github.com/acrotron/aye-chat

Comments

acro-v•2mo ago
Someone pointed me to this post from Cline engineer - below is my response to that

Post: https://cline.bot/blog/why-cline-doesnt-index-your-codebase-...

That post however does not apply to offline processing use case. Here are his 3 main problem points they re trying to solve:

Code Doesn't Think in Chunks

But then he is describing follow semantic links through imports, etc. -> that technique is still hierarchical chunking, and I am planning to implement that as well: it's straightforward.

2. Indexes Decay While Code Evolves

This is just not true - there are multiple ways to solve it. One, for example, is continuous indexing at low priority in the background. Another one - monitoring for file changes and reindexing only differences, etc. I already implemented first iteration for this: index remains current.

3. Security Becomes a Liability (and then goes into embeddings to be stored somewhere)

We are talking about offline mode of operation. Not with Aye Chat: it implements embedding store locally - with ChromaDB and ONNXMiniLM_L6_V2 model.

So as you can see - none of his premises apply here.

And then as part of solution he claims that "context window does not matter because Claude and ChatGPT models are now into 1M context window" - but once again that does not apply to locally hosted models: I am getting 32K context with Qwen 2.5 Coder 7B on my non-optimized setup with 8Gb VRAM.

The main thing why I think it may work is the following: answering a question includes "planning for what to do", and then "doing it". Models are good at "doing it" if they are given all necessary info, so if we unload that "planning" into application itself - I think it may work.

Dell support (and hardware) is so bad, I almost sued them

https://blog.joshattic.us/posts/2026-02-07-dell-support-lawsuit
1•radeeyate•53s ago•0 comments

Project Pterodactyl: Incremental Architecture

https://www.jonmsterling.com/01K7/
1•matt_d•1m ago•0 comments

Styling: Search-Text and Other Highlight-Y Pseudo-Elements

https://css-tricks.com/how-to-style-the-new-search-text-and-other-highlight-pseudo-elements/
1•blenderob•2m ago•0 comments

Crypto firm accidentally sends $40B in Bitcoin to users

https://finance.yahoo.com/news/crypto-firm-accidentally-sends-40-055054321.html
1•CommonGuy•3m ago•0 comments

Magnetic fields can change carbon diffusion in steel

https://www.sciencedaily.com/releases/2026/01/260125083427.htm
1•fanf2•4m ago•0 comments

Fantasy football that celebrates great games

https://www.silvestar.codes/articles/ultigamemate/
1•blenderob•4m ago•0 comments

Show HN: Animalese

https://animalese.barcoloudly.com/
1•noreplica•4m ago•0 comments

StrongDM's AI team build serious software without even looking at the code

https://simonwillison.net/2026/Feb/7/software-factory/
1•simonw•5m ago•0 comments

John Haugeland on the failure of micro-worlds

https://blog.plover.com/tech/gpt/micro-worlds.html
1•blenderob•5m ago•0 comments

Show HN: Velocity - Free/Cheaper Linear Clone but with MCP for agents

https://velocity.quest
1•kevinelliott•6m ago•1 comments

Corning Invented a New Fiber-Optic Cable for AI and Landed a $6B Meta Deal [video]

https://www.youtube.com/watch?v=Y3KLbc5DlRs
1•ksec•7m ago•0 comments

Show HN: XAPIs.dev – Twitter API Alternative at 90% Lower Cost

https://xapis.dev
1•nmfccodes•8m ago•0 comments

Near-Instantly Aborting the Worst Pain Imaginable with Psychedelics

https://psychotechnology.substack.com/p/near-instantly-aborting-the-worst
1•eatitraw•14m ago•0 comments

Show HN: Nginx-defender – realtime abuse blocking for Nginx

https://github.com/Anipaleja/nginx-defender
2•anipaleja•14m ago•0 comments

The Super Sharp Blade

https://netzhansa.com/the-super-sharp-blade/
1•robin_reala•15m ago•0 comments

Smart Homes Are Terrible

https://www.theatlantic.com/ideas/2026/02/smart-homes-technology/685867/
1•tusslewake•17m ago•0 comments

What I haven't figured out

https://macwright.com/2026/01/29/what-i-havent-figured-out
1•stevekrouse•18m ago•0 comments

KPMG pressed its auditor to pass on AI cost savings

https://www.irishtimes.com/business/2026/02/06/kpmg-pressed-its-auditor-to-pass-on-ai-cost-savings/
1•cainxinth•18m ago•0 comments

Open-source Claude skill that optimizes Hinge profiles. Pretty well.

https://twitter.com/b1rdmania/status/2020155122181869666
3•birdmania•18m ago•1 comments

First Proof

https://arxiv.org/abs/2602.05192
3•samasblack•20m ago•1 comments

I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS

https://mohammedeabdelaziz.github.io/articles/trendscope-market-scanner
1•mohammede•21m ago•0 comments

Kagi Translate

https://translate.kagi.com
2•microflash•22m ago•0 comments

Building Interactive C/C++ workflows in Jupyter through Clang-REPL [video]

https://fosdem.org/2026/schedule/event/QX3RPH-building_interactive_cc_workflows_in_jupyter_throug...
1•stabbles•23m ago•0 comments

Tactical tornado is the new default

https://olano.dev/blog/tactical-tornado/
2•facundo_olano•25m ago•0 comments

Full-Circle Test-Driven Firmware Development with OpenClaw

https://blog.adafruit.com/2026/02/07/full-circle-test-driven-firmware-development-with-openclaw/
1•ptorrone•25m ago•0 comments

Automating Myself Out of My Job – Part 2

https://blog.dsa.club/automation-series/automating-myself-out-of-my-job-part-2/
1•funnyfoobar•25m ago•1 comments

Dependency Resolution Methods

https://nesbitt.io/2026/02/06/dependency-resolution-methods.html
1•zdw•26m ago•0 comments

Crypto firm apologises for sending Bitcoin users $40B by mistake

https://www.msn.com/en-ie/money/other/crypto-firm-apologises-for-sending-bitcoin-users-40-billion...
1•Someone•26m ago•0 comments

Show HN: iPlotCSV: CSV Data, Visualized Beautifully for Free

https://www.iplotcsv.com/demo
2•maxmoq•28m ago•0 comments

There's no such thing as "tech" (Ten years later)

https://www.anildash.com/2026/02/06/no-such-thing-as-tech/
2•headalgorithm•28m ago•0 comments