Show HN: Open-source implementation of Stanford's self-learning agent framework

https://github.com/kayba-ai/agentic-context-engine
10•kayba•3mo ago
We implemented Stanford's Agentic Context Engineering paper, which shows that agents can improve their performance just by evolving their own context.

How it works: Agents execute tasks, reflect on what worked/failed, and curate a "playbook" of strategies. All from execution feedback - no training data needed.
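The loop described above can be sketched roughly as follows. This is a toy illustration, not the project's actual API: `Playbook`, `generate`, `execute`, `reflect`, `curate`, and `run_episode` are hypothetical names, and plain functions stand in for what would be LLM calls in a real setup.

```python
from dataclasses import dataclass, field

# Hypothetical lesson text used by the toy reflector below.
LESSON = "remove thousands separators before parsing"


@dataclass
class Playbook:
    """Strategies accumulated from earlier runs (hypothetical structure)."""
    strategies: list[str] = field(default_factory=list)


def generate(task, strategies):
    # Generator stand-in: applies the lesson only once it is in the playbook.
    return task.replace(",", "") if LESSON in strategies else task


def execute(attempt):
    # Execution feedback: did the attempt actually work? No labels involved.
    try:
        int(attempt)
        return True, ""
    except ValueError as err:
        return False, str(err)


def reflect(attempt, ok, feedback):
    # Reflector stand-in: turn a failure into a reusable lesson.
    if not ok and "," in attempt:
        return LESSON
    return None


def curate(playbook, lesson):
    # Curator stand-in: append new lessons, never rewrite existing ones.
    if lesson and lesson not in playbook.strategies:
        playbook.strategies.append(lesson)


def run_episode(task, playbook):
    attempt = generate(task, playbook.strategies)
    ok, feedback = execute(attempt)
    curate(playbook, reflect(attempt, ok, feedback))
    return ok


playbook = Playbook()
first = run_episode("1,000", playbook)   # fails, and a lesson is recorded
second = run_episode("1,000", playbook)  # succeeds using the recorded lesson
```

The second run succeeds only because the first run's failure was written into the playbook, which is the whole idea in miniature.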

Happy to answer questions about the implementation or the research!

Comments

vebgen•3mo ago
This is fascinating! The "evolving playbook" approach resonates with challenges we've been tackling building an AI agent for Django development.

A few questions about your implementation:

1. How do you handle the balance between delta updates and full context rewrites when the playbook grows large? We've found that keeping detailed history helps with debugging but can bloat context quickly.

2. The Generator/Reflector/Curator separation is elegant. Did you implement these as separate LLM calls or different prompting strategies on the same model? We use a similar dual-agent pattern (planner + executor) and the coordination overhead is non-trivial.

3. Most interesting part: "natural execution feedback without labeled supervision." How do you define success/failure signals for the Reflector in ambiguous cases? For code generation, it's easy (tests pass/fail), but for other domains it seems trickier.

The +10.6% improvement on agent tasks is impressive - definitely checking out the paper. The brevity bias problem you mention is real - we've noticed agents dropping important context details when trying to "summarize efficiently."

kayba•3mo ago
Thanks for the great questions! Here's how we're tackling these:

1. Context growth management:

We avoid full context rewrites entirely: they cause context collapse, where the LLM compresses away important details. Instead, we use delta updates as the foundation and are exploring:

- Semantic de-duplication to remove redundancy

- Keeping deltas as the source of truth with optional summarization layers on top

- Pre-filtering the playbook to feed the model a more focused version, with tooling to let it explore further when needed

Delta updates remain our core principle, but we're actively working on preventing context bloat as playbooks scale.
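For readers who want something concrete, here is a minimal sketch of delta updates with a de-duplication check. It is only an assumption about how this could look, not the library's implementation: `is_duplicate` and `apply_delta` are hypothetical names, and string similarity from the standard library's `difflib` stands in for real semantic (embedding-based) de-duplication.

```python
from difflib import SequenceMatcher


def is_duplicate(a, b, threshold=0.85):
    """Crude stand-in for semantic similarity: compare normalized text."""
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    return SequenceMatcher(None, a_norm, b_norm).ratio() >= threshold


def apply_delta(playbook, delta):
    """Apply (op, text) pairs to a copy of the playbook.

    Existing entries are never rewritten; they are only kept or removed.
    """
    result = list(playbook)
    for op, text in delta:
        if op == "add":
            # Skip additions that nearly repeat an entry already present.
            if not any(is_duplicate(text, existing) for existing in result):
                result.append(text)
        elif op == "remove":
            result = [entry for entry in result if entry != text]
    return result


playbook = ["Run the test suite before committing"]
delta = [
    ("add", "run the test suite  before committing."),  # near-duplicate
    ("add", "Pin dependency versions"),                  # genuinely new
]
updated = apply_delta(playbook, delta)
```

Because each delta is a small list of operations, the full history of deltas can be kept as the source of truth and replayed or summarized later.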

2. Role separation:

Our library lets you select different models for each role, with prompts specifically tailored to each function. So far we've mostly used the same model for all three roles, but we're actively exploring model mixing as a promising direction.
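As a rough illustration of per-role model selection (not the library's real configuration API), the sketch below uses a hypothetical `RoleConfig` and `build_roles`; the model names and prompt strings are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleConfig:
    """Model and prompt for one role (hypothetical structure)."""
    model: str
    system_prompt: str


# Placeholder prompts, one per role.
ROLE_PROMPTS = {
    "generator": "Solve the task using the playbook.",
    "reflector": "Explain what worked and what failed.",
    "curator": "Turn the reflection into playbook updates.",
}


def build_roles(default_model, overrides=None):
    """Use one default model for all roles unless a role is overridden."""
    overrides = overrides or {}
    return {
        role: RoleConfig(overrides.get(role, default_model), prompt)
        for role, prompt in ROLE_PROMPTS.items()
    }


# Same model everywhere except the reflector, as an example of model mixing.
roles = build_roles("model-a", {"reflector": "model-b"})
```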

3. Success signals:

The system shows strong self-assessment capabilities using execution feedback (code pass/fail, API responses, and model interactions with the environment). However, you're right that ambiguous domains are trickier; this is still an open challenge for us. Our vision is to pre-seed domain knowledge through curated playbooks or training samples, then let models self-explore and discover their own success patterns over time.

What I'm curious about:

- What feedback signals work for your Django agent?

- How do you handle planner-executor coordination overhead?

- Have you hit similar brevity bias issues?

Would love to continue this conversation on Discord if you're interested: https://discord.com/invite/mqCqH7sTyK

jimmySixDOF•3mo ago
This kind of DSPy-GEPA self-improvement loop keeps popping up and adding a few points, but the cost (API and wall clock) also means you use it where a repeatable task/prompt/context needs optimizing and you can afford to find better templates.

kayba•3mo ago
You're right that cost and latency are important considerations. However, the research shows this isn't just about finding better templates; it's about enabling agentic systems to learn and improve from their previous attempts and failures.

We believe in-context learning is one of the missing pieces to make agentic systems feasible in production. The key is that systems can adapt without expensive fine-tuning or retraining. The paper shows *86.9% lower adaptation latency* and significant reductions in rollout costs compared to existing methods, making this approach more practical than previous optimization techniques.

The real value is in systems that progressively get better at their tasks through experience, not just one-time prompt optimization.

If you want to continue this conversation just hit me up on Discord: https://discord.com/invite/mqCqH7sTyK

jimmySixDOF•3mo ago
I did look into DataRobot's Syftr, which points at the same problem but is a lot heavier. I definitely like that with your approach it's at least easy to get a basic version up and start checking the results right away!