
AI will make formal verification go mainstream

https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html
434•evankhoury•7h ago•210 comments

alpr.watch

https://alpr.watch/
687•theamk•12h ago•338 comments

No Graphics API

https://www.sebastianaaltonen.com/blog/no-graphics-api
489•ryandrake•9h ago•90 comments

Announcing the Beta release of ty

https://astral.sh/blog/ty
409•gavide•8h ago•80 comments

Midjourney is alemwjsl

https://www.aadillpickle.com/blog/midjourney-is-alemwjsl
131•aadillpickle•6d ago•47 comments

GPT Image 1.5

https://openai.com/index/new-chatgpt-images-is-here/
365•charlierguo•10h ago•183 comments

Pricing Changes for GitHub Actions

https://resources.github.com/actions/2026-pricing-changes-for-github-actions/
550•kevin-david•11h ago•627 comments

CS 4973: Introduction to Software Development Tooling – Northeastern Univ (2024)

https://bernsteinbear.com/isdt/
40•vismit2000•3h ago•4 comments

I ported JustHTML from Python to JavaScript with Codex CLI and GPT-5.2 in hours

https://simonwillison.net/2025/Dec/15/porting-justhtml/
100•pbowyer•6h ago•57 comments

40 percent of fMRI signals do not correspond to actual brain activity

https://www.tum.de/en/news-and-events/all-news/press-releases/details/40-percent-of-mri-signals-d...
416•geox•15h ago•179 comments

Mozilla appoints new CEO Anthony Enzor-Demeo

https://blog.mozilla.org/en/mozilla/leadership/mozillas-next-chapter-anthony-enzor-demeo-new-ceo/
456•recvonline•15h ago•713 comments

No AI* Here – A Response to Mozilla's Next Chapter

https://www.waterfox.com/blog/no-ai-here-response-to-mozilla/
177•MrAlex94•6h ago•113 comments

Show HN: Titan – JavaScript-first framework that compiles into a Rust server

https://www.npmjs.com/package/@ezetgalaxy/titan
13•soham_byte•5d ago•6 comments

Sei AI (YC W22) Is Hiring

https://www.ycombinator.com/companies/sei/jobs/TYbKqi0-llm-engineer-mid-senior
1•ramkumarvenkat•4h ago

VA Linux: The biggest dotcom IPO

https://dfarq.homeip.net/va-linux-the-biggest-dotcom-ipo/
5•giuliomagnifico•5d ago•0 comments

Thin desires are eating life

https://www.joanwestenberg.com/thin-desires-are-eating-your-life/
381•mitchbob•1d ago•149 comments

Dafny: Verification-Aware Programming Language

https://dafny.org/
40•handfuloflight•6h ago•21 comments

Testing a cheaper laminar flow hood

https://chillphysicsenjoyer.substack.com/p/testing-a-cheaper-laminar-flow-hood
25•surprisetalk•4d ago•5 comments

Tesla Robotaxis in Austin Crash 12.5x More Frequently Than Humans

https://electrek.co/2025/12/15/tesla-reports-another-robotaxi-crash-even-with-supervisor/
92•hjouneau•2h ago•48 comments

Japan to revise romanization rules for first time in 70 years

https://www.japantimes.co.jp/news/2025/08/21/japan/panel-hepburn-style-romanization/
146•rgovostes•20h ago•128 comments

Show HN: Learn Japanese contextually while browsing

https://lingoku.ai/learn-japanese
37•englishcat•4h ago•17 comments

Sega Channel: VGHF Recovers over 100 Sega Channel ROMs (and More)

https://gamehistory.org/segachannel/
233•wicket•15h ago•38 comments

The World Happiness Report is beset with methodological problems

https://yaschamounk.substack.com/p/the-world-happiness-report-is-a-sham
97•thatoneengineer•1d ago•116 comments

Nvidia Nemotron 3 Family of Models

https://research.nvidia.com/labs/nemotron/Nemotron-3/
164•ewt-nv•1d ago•30 comments

Writing a blatant Telegram clone using Qt, QML and Rust. And C++

https://kemble.net/blog/provoke/
96•tempodox•13h ago•54 comments

Chat-tails: Throwback terminal chat, built on Tailscale

https://tailscale.com/blog/chat-tails-terminal-chat
66•nulbyte•7h ago•12 comments

Locked out: How a gift card purchase destroyed an Apple account

https://appleinsider.com/articles/25/12/13/locked-out-how-a-gift-card-purchase-destroyed-an-apple...
62•nonfamous•3h ago•28 comments

Twin suction turbines and 3-Gs in slow corners? Meet the DRG-Lola

https://arstechnica.com/cars/2025/11/an-electric-car-thats-faster-than-f1-around-monaco-thats-the...
8•PaulHoule•5d ago•3 comments

Meta's new A.I. superstars are chafing against the rest of the company

https://www.nytimes.com/2025/12/10/technology/meta-ai-tbd-lab-friction.html
83•furcyd•6d ago•115 comments

Show HN: Sqlit – A lazygit-style TUI for SQL databases

https://github.com/Maxteabag/sqlit
126•MaxTeabag•1d ago•18 comments

Letta Code

https://www.letta.com/blog/letta-code
63•ascorbic•8h ago

Comments

pacjam•6h ago
Thanks for sharing!! (Charles here from Letta) The original MemGPT (the starting point for Letta) was actually an agent CLI as well, so it's fun to see everything come full circle.

If you're a Claude Code user (I assume much of HN is) some context on Letta Code: it's a fully open source coding harness (#1 model-agnostic OSS on Terminal-Bench, #4 overall).

It's specifically designed to be "memory-first" - the idea is that you use the same coding agents perpetually, and have them build learned context (memory) about you / your codebase / your org over time. There are some built-in memory tools like `/init` and `/remember` to help guide this along (if your agent does something stupid, you can 'whack it' with /remember). There's also a `/clear` command, which resets the message buffer, but keeps the learned context / memory inside the context window.

We built this for ourselves - Letta Code co-authors the majority of PRs on the letta-code GitHub repo. I personally have been using the same agent for 2+ weeks (since the latest stable build), and it's fun to see its memory become more and more valuable over time.

LMK if you have any q's! The entire thing is OSS and designed to be super hackable, and can run completely locally when combined with the Letta docker image.

koakuma-chan•5h ago
Why can't I see Cursor on tbench? Is it that bad that it's not even on the leaderboard? I am trying to figure out if I can pitch your product to my company, and whether it is worth it.
pacjam•5h ago
Not sure why Cursor CLI isn't on the leaderboard... I'm guessing it's because Cursor is focused primarily on their IDE agent, not their CLI agent, and Terminal-Bench is an eval/benchmark for CLI agents exclusively.

If you're asking why Letta Code isn't on the leaderboard, the TBench maintainers said it should be up later today (so refresh in a few hours!). The results are already public - you can see them on our blog (graphs linked in the OP). They're also verifiable: all data for the runs is available, and Letta Code is open source, so you can replicate the results yourself.

koakuma-chan•5h ago
I mean, I understand that this is a terminal benchmark, but the point is to benchmark LLM harnesses, and whether the output is printed to the terminal or displayed in a UI shouldn't matter. Are there alternative benchmarks where I can see how Letta Code performs compared to Cursor?
pacjam•5h ago
Ah gotcha! In that case, I think Terminal-Bench is currently the best proxy for the "how good is this harness+agent combo at coding (quantitatively)" question. It used to be SWE-Bench, but I think T-Bench is a better proxy for this now. Like you said though, unfortunately Cursor isn't listed (probably their choice not to list it, maybe because it doesn't place highly).
koakuma-chan•4h ago
Alright, I will try out Letta Code manually later then.
pacjam•4h ago
Cool, let us know what you think! Would recommend trying w/ Sonnet/Opus 4.5 or GPT-5.2 (those are the daily drivers we use internally w/ Letta Code)
shortlived•2h ago
I'm very interested in trying this out! I run Claude Code in sandbox with `--dangerously-skip-permissions`. Is that possible with Letta?
pacjam•2h ago
Yes! Letta Code also has a "danger" mode, it's `--yolo`. If you're running Claude Code in a sandbox in headless mode, Letta Code has that too, just do something like `letta -p "Do something dangerous (it's just a sandbox, after all)" --yolo`

More on permissions here: https://docs.letta.com/letta-code/permissions

Install is just `npm install -g @letta-ai/letta-code`

ascorbic•6h ago
Void is the greatest ad for Letta. I'm interested to see if it's as good at coding as it is at posting. https://bsky.app/profile/void.comind.network
pacjam•6h ago
I think Cameron (Void's handler) has some experience wiring up production Void to his computer via Letta Code
cpfiffer•6h ago
I do have some experience but haven't deployed Void on actual tasks, mostly because I want to keep Void focused on day-to-day social operations. I have considered giving Void subagents to handle coding tasks, which may be a good use case for Void-2: https://bsky.app/profile/void-2.comind.network
pacjam•6h ago
One cool option is having Void-2 run inside the Letta Code harness (in headless mode) on a sandbox to let it have free access to a computer, just to see what it will do while also connected to Bluesky
jamilton•3h ago
What do you like about Void? It reads like how I would expect a base chat model to post.
Retr0id•2h ago
These kinds of LLM bots can be fun to play with in a "try to make it say/do something silly" way, but beyond that I don't really get the point. The writing style is grating and I don't think I've ever seen one say anything genuinely useful.
tigranbs•6h ago
In my experience, "memory" is really not that helpful in most cases. For all of my projects, I keep the documentation files and feature specs up to date, so that LLMs are always aware of where to find what and which coding style guides the project is based on.

Maintaining the memory is a considerable burden, and you have to make sure that a simple "fix this linting" doesn't end up in memory as "we always fix that type of issue in that particular way." That's also the major problem I have with ChatGPT's memory: it starts to respond from the perspective of "this is correct for this person".

I am curious who sees the benefits of memory in coding. Is it that the agent "learns how to code better", or that it learns "how the project is structured"? Either way, to me, this sounds like an easy project-setup thing.

danieltanfh95•6h ago
context poisoning is a real problem that these memory providers only make worse.
pacjam•6h ago
IMO context poisoning is only fatal when you can't see what's going on (eg black box memory systems like ChatGPT memory). The memory system used in the OP is fully white box - you can see every raw LLM request (and see exactly how the memory influenced the final prompt payload).
handfuloflight•4h ago
That's significant, you can improve it in your own environment then.
pacjam•4h ago
Yeah exactly - it's all just tokens that you have full control over (and can run CRUD operations on). No hidden prompts / hidden memory.
pacjam•6h ago
I think it cuts both ways - for example I've definitely had the experience where when typing into ChatGPT I know ahead of time that whatever "memory" they're storing and injecting is likely going to degrade my answer, so I hop over to incognito mode. I've also had the experience where I've had a loosely related follow-up question to something and I didn't want to dig through my chat history to find the exact convo, so it's nice to know that ChatGPT will probably pull the relevant details into context.

I think similar concepts apply to coding - in some cases, you have all the context you need up front (good coding practices help with this), but in many cases, there's a lot of "tribal knowledge" scattered across various repos that a human vet working in the org would certainly know, but an agent wouldn't (of course, there's somewhat of a circular argument here that if the agent eventually learned this tribal knowledge, it could just write it down into a CLAUDE.md file ;)). I think there's also a clear separation between procedural knowledge and learned preferences, the former is probably better represented as skills committed to a repo, vs I view the latter more as a "system prompt learning" problem.

wooders•6h ago
I think the problem with ChatGPT / other RAG-based memory solutions is that it's not possible to collaborate with the agent on what its memory should look like - so it makes sense that it's much easier to just have a stateless system and message queue, to avoid mysterious pollution. But Letta's memory management is primarily text/file based, so it's very transparent and controllable.

An example of how this kind of memory can help is learned skills https://www.letta.com/blog/skill-learning - if your agent takes the time to reflect on / learn from experience and create a skill, that skill is much more effective at improving the agent next time than just putting the raw trajectory into context.
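The reflect-then-distill idea can be sketched in a few lines. This is an illustrative toy, not Letta's actual skill-learning implementation: `distill_skill` and the trajectory format are invented here, with a plain filter standing in for the LLM reflection step.

```python
# Toy sketch of "skill learning": after a task, distill the raw trajectory
# into a short reusable skill, rather than replaying the whole log next time.
# (Illustrative only - in a real system the distillation is an LLM call.)

def distill_skill(trajectory):
    # Stand-in for LLM reflection: keep only steps that produced a
    # durable lesson, and drop incidental actions.
    lessons = [step["lesson"] for step in trajectory if "lesson" in step]
    return "\n".join(f"- {lesson}" for lesson in lessons)

trajectory = [
    {"action": "ran tests", "lesson": "run `npm test` before every commit"},
    {"action": "edited config"},  # no durable lesson, so it is dropped
]

skill = distill_skill(trajectory)
# `skill` is the compact artifact loaded into context on future runs.
```

The point of the sketch is the compression: the skill file carries the lesson forward without the noise of the original trajectory.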

DrSiemer•5h ago
ChatGPT's implementation of Memory is terrible. It quickly fills up with useless garbage, and sometimes even plainly incorrect statements that are usually only relevant to one obscure conversation I had with it months ago.

A local, project-specific llm.md is absolutely something I require though. Without that, language models kept "fixing" random things in my code that they considered to be incorrect, despite comments on those lines literally telling them to NOT CHANGE THIS LINE OR THIS COMMENT.

My llm.md is structured like this:

- Instructions for the LLM on how to use it

- Examples of a bad and a good note

- LLM editable notes on quirks in the project

It helps a lot with making an LLM understand when things are unusual for a reason.

Besides that file, I wrap every prompt in a project specific intro and outro. I use these to take care of common undesirable LLM behavior, like removing my comments.

I also tell it to use a specific format on its own comments, so I can make it automatically clean those up on the next pass, which takes care of most of the aftercare.
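As a concrete sketch, a file following the three-part structure described above might look like this (all contents are invented for illustration, not taken from the commenter's actual file):

```markdown
<!-- llm.md -->
## How to use this file
Read these notes before editing. When you learn a project quirk, add a
one-line note under "Notes". Never remove a note unless asked.

## Example notes
Bad:  "Fixed a bug."
Good: "`parseDate()` ignores timezones on purpose; see issue tracker."

## Notes (LLM-editable)
- The duplicated flag in build.sh is intentional; DO NOT CHANGE.
- Comments marked KEEP must never be removed or reworded.
```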

pacjam•5h ago
I'm curious - how do you currently manage this `llm.md` in the tooling you use? E.g., do you symlink `AGENTS/CLAUDE.md` to `llm.md`? Also, is there any information you duplicate across your project-specific `llm.md` files that could potentially be shared globally?
jstummbillig•6h ago
I find the long-term memory concepts with regards to AI curiously dubious.

On first glance, of course it's something we want. It's how we do it, after all! Learning on the job is what enables us to do our jobs and so many other things.

On the other hand humans are frustratingly stuck in their ways and not all that happy to change and that is something that societies or orgs fight a lot. Do I want to convince my coding agent to learn new behavior, conflicting with existing memory?

It's not at all obvious to me to what extent memory is a bug or a feature. Does somebody have a clear case on why this is something that we should want and why it's not a problem?

pacjam•6h ago
> Does somebody have a clear case on why this is something that we should want

For coding agents, I think it's clear that nobody wants to repeat the same thing over and over again. If a coding agent makes a mistake once (like `git add .` instead of manually picking files), it should be able to "learn" and never make the same mistake again.

Though I definitely agree w/ you that we shouldn't aspire to 1:1 replicate human memory. We want to be able to make our machines "unlearn" easily when needed, and we also want them to be able to "share" memory with other agents in ways that simply aren't possible with humans (until we all get Neuralinks, I guess)

skybrian•5h ago
There are a variety of possible memory mechanisms, including simple things like recording a transcript (as a chatbot does) or having the LLM update markdown docs in a repo. So having memory isn't interesting by itself. Instead, my question is: what does Letta's memory look like? Memory is a data structure. How is it structured, and why is that structure good?

I'd be interested in hearing about how this approach compares with Beads [1].

[1] https://github.com/steveyegge/beads

pacjam•5h ago
Beads looks cool! I haven't tried it, but as far as I can tell, it's more of a "Linear for agents" (memory as a tool), as opposed to baking long-term memory into the harness itself. In many ways, CLAUDE.md is a weak form of "baking memory into the harness", since AFAIK on bootup of `claude`, the CLAUDE.md gets "absorbed" and pinned in the system prompt.

Letta's memory system is designed off the MemGPT reference architecture, which is intentionally very simple - break the system prompt up into "memory blocks" (all pinned to the context window, since they are injected into the system prompt), which are modifiable via memory tools (the original MemGPT paper is still a good reference for what this looks like at a high level: https://research.memgpt.ai/). So it's more like a "living CLAUDE.md" that follows your agent around wherever it's deployed - ofc, it's also interoperable with CLAUDE.md. For example, when you boot up Letta Code and run `/init`, it will scan for AGENTS.md/CLAUDE.md and ingest the files into its memory blocks.
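That block layout can be sketched roughly like this. All names here are illustrative, not Letta's real API; it only shows the shape of the idea: pinned, tool-editable blocks assembled into the system prompt, with a separate message buffer that clearing doesn't touch.

```python
# Minimal sketch of a MemGPT-style memory layout (invented names, not
# Letta's actual API): the system prompt is rebuilt from labeled "memory
# blocks" on every request, and a /clear-style reset only empties the
# message buffer while the learned blocks survive.

class Agent:
    def __init__(self):
        # Pinned blocks: always injected into the system prompt.
        self.blocks = {
            "persona": "You are a careful coding agent.",
            "project": "",   # learned facts about the codebase
            "human": "",     # learned preferences of the user
        }
        self.messages = []   # ordinary conversation buffer

    def system_prompt(self):
        # Render every block into the system prompt for each request.
        return "\n\n".join(
            f"<{label}>\n{text}\n</{label}>"
            for label, text in self.blocks.items()
        )

    def remember(self, label, note):
        # Memory tool: a CRUD-style edit appending a note to one block.
        self.blocks[label] = (self.blocks[label] + "\n" + note).strip()

    def clear(self):
        # Reset the message buffer; learned blocks stay in context.
        self.messages = []


agent = Agent()
agent.remember("project", "Never run `git add .`; stage files explicitly.")
agent.messages.append({"role": "user", "content": "fix the linter"})
agent.clear()
```

After `clear()`, the buffer is empty but the `git add` lesson still appears in `system_prompt()`, which is the "living CLAUDE.md" behavior described above.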

LMK if you have any other questions about how it works - happy to explain more

handfuloflight•4h ago
Could Beads be additive to Letta's memory? Or could you anticipate conflict or confusion paths?
pacjam•2h ago
I think it's mostly complementary, in the same way a Linear MCP would be complementary to a MemGPT/Letta-style memory system

I guess the main potential point of confusion would arise if it's not clear to the LLM / agent which tool should be used for what. E.g. if you tell your agent to use Letta memory blocks as a scratchpad / TODO list, that functionality overlaps with Beads (I think?), so it's easy to imagine the agent getting confused due to stale data in either location. But as long as the instructions are clear about what context/memory to use for what task, it should be fine / complementary.

handfuloflight•2h ago
Great response, thank you. Will experiment then with projects that have already initialized Beads.
KingMob•4m ago
Bit of a tangent, but what's the codec used in your first video, https://iasi9yhacrkpgiie.public.blob.vercel-storage.com/lett... ?

Firefox says it can't play it.

I'd download and check it with ffprobe, but direct downloads seem blocked.