
Incremental Backups of Gmail Takeouts

https://baecher.dev/stdout/incremental-backups-of-gmail-takeouts/
131•pbhn•1mo ago

Comments

pbhn•1mo ago
Gmail takeouts come in an arbitrarily-ordered mbox file; I wanted something a bit more backup friendly so I created a small tool for that purpose and wrote about it.
venusenvy47•1mo ago
I've been meaning to try this tool for backing up. I'm curious if anyone else has tried this.

https://github.com/rustmailer/bichon

SanjayMehta•1mo ago
Serious question: have you ever needed an email from even 5 years ago?

I only save financial statements and contact information. Everything else gets deleted as soon as possible.

raybb•1mo ago
I've enjoyed digging up an old flight itinerary to see how much I paid back in 2015, or just looking at the replies a company's support sent me and realizing I'm not buying from them again because they didn't fix the problem.
mantra2•1mo ago
Looking up how much things used to cost? I too like being depressed.
xnx•1mo ago
> Everything else gets deleted as soon as possible.

What's the advantage to deleting? It's easy to ignore anything old and disk space is cheap. Do you delete old photos?

SanjayMehta•1mo ago
I've always run my own backup systems, from 150MB QIC tape days. It became a habit to keep critical things to the minimum to reduce the number of tapes required.

As for photos, I print maybe 1 out of 100 and don't bother with the rest.

ifh-hn•1mo ago
My point of view is: what are the advantages of keeping it?

I keep photos, though I don't keep all photos. But emails and messages? Why...

stephenhuey•1mo ago
How strictly do you define need? I've been living as an adult long enough that there have been countless times I've searched for photos and emails from one or two decades ago. I distinctly remember the first time I met an Inbox Zero person. It was so important to her to militantly delete everything she had dealt with, and to me, the disadvantages from that practice far outweigh the advantages.
jawns•1mo ago
Inbox Zero just means to deal with messages as they come in, then move them out of the inbox, generally to an archive section.

If she was hard-deleting everything, she wasn't just Inbox Zero, she was F---s Zero, too.

mantra2•1mo ago
I have, but very rarely. I could count on one hand how often I’ve needed to dig back more than half a decade ago.

Back when I used Gmail I just kept everything personal and work related but when I moved away and started paying for email storage I took a different approach. It didn’t make sense for me to pay considerably more storage for something I almost never use.

I ended up backing up all of my emails outside of the last 5 years and stored them on an offline drive where I can reference them as eml files if I ever need it.

Going forward once a year I’ll export and purge the oldest year in my account.

omoikane•1mo ago
I backed up lots of emails that I deemed precious, but I still search through email first, because sometimes it's just easier to search email than to search my backups.

Also, oftentimes I search email not so much for the content, but to find the timestamp associated with a particular event. I have had to search old email metadata a few times when I get an unexpected question related to time (for example, gmail will ask when you created the account as part of its account recovery process).

viraptor•1mo ago
Yes, looking for old documents proving things for the government.
Brajeshwar•1mo ago
All the time. I read an interesting thing about someone online, and that name strikes me as someone I have interacted with. I search my email archive, then reply to that thread or start a new one to catch up. All of them have been super happy, “wow! You replied to our email from 10 years ago!”

I do have “Clean Inbox”[1] because I don’t see or interact with them, but I keep them. The only emails I see are the actionable “Unread OR Flagged.”

1. https://brajeshwar.com/2024/email/

cosmic_cheese•1mo ago
Maybe not “need” in the strictest sense, but there have been more times than I can count where digging up old mail has either made things much faster and easier or helped me answer a random question that popped into my head about something that happened ages ago.

Old SMS, iMessage, Telegram etc messages have been useful from time to time too for similar reasons.

Both can also serve as exceptional time capsules that provide windows into past “eras” of life. I occasionally kick myself for not having archived mail and messages from a couple of defunct email addresses and chat apps… without them there’s a hole spanning a few years where visibility is limited.

ifh-hn•1mo ago
Seriously? I keep nothing unless I have to. No chat history no emails. Even if I did I'd never look back at them, what's the point?
nijave•1mo ago
At least a few times a year. Usually looking for old orders either model number or "how old is this thing" or how much it cost over time
icedchai•1mo ago
It has happened, but it's rare.

I let it pile up, rarely delete anything except marketing emails. Over 30K emails in my gmail inbox.

Fire-Dragon-DoL•1mo ago
When I was moving to Canada I had to dig stuff back from 6 years before!

That's why I keep all of them

prmoustache•1mo ago
Not on gmail but a company I worked for sent me my pay slips over email. While I also printed them, I also forwarded them to my private email address and kept them to this day on a separate mbox file.

Also when I can't remember the age of my nephews or the postal address of my siblings I just dig the birth / move announcements in my emails.

PunchyHamster•1mo ago
I routinely use it to look up what I paid for a given product I bought, and by extension how old it is.

Also to reminisce how cheap stuff was.

So, yes

xnx•1mo ago
Unfortunately, many ecommerce sites have nerfed their receipt emails to make this information harder for you to find.
ggambetta•1mo ago
I switched to Gmail in 2007 or so. I used to have a gzipped mbox of my previous emails, dating back to maybe 1996 or 1997 when I got my first email account. This file was lost at some point, and I'm really sad about it. In some ways, it's like losing years and years of a journal, conversations I had with people, how I thought about the world at that age, etc. It's a huge loss to me.

About OP's tool, I also back up my Google account to an external disk periodically. Gmail is ~8 GB so it's manageable. But Google Photos is a pain. They recently removed most of the useful APIs, so AFAIK the only way to backup is via Takeout. It's terrible. Pictures in multiple albums are included as copies every time, so I had to make a script to find duplicates and replace them with symlinks. Just downloading the whole thing is a PITA (multiple 50 GB zip files). I get that Google has little incentive to make this better, in fact they might have an incentive to make it as inconvenient as possible, but I really wish they made it easier.
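
A minimal sketch of that kind of deduplication pass, assuming duplicates are byte-identical files. This is not the commenter's script; `file_digest` and `symlink_duplicates` are illustrative names.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def symlink_duplicates(root: Path) -> int:
    """Replace byte-identical files under root with symlinks to the first copy.

    Returns the number of files that were replaced.
    """
    first_seen: dict[str, Path] = {}
    replaced = 0
    # sorted() materializes the listing first, so edits do not disturb the walk.
    for path in sorted(root.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        original = first_seen.setdefault(file_digest(path), path)
        if original is path:
            continue  # first time this content is seen; keep the real file
        # Relative target, so the tree stays valid if it is moved as a whole.
        target = os.path.relpath(original, start=path.parent)
        path.unlink()
        path.symlink_to(target)
        replaced += 1
    return replaced
```

Hashing every file is slow on a multi-hundred-gigabyte export; grouping by file size first and hashing only size collisions would cut most of the work.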

buu709•1mo ago
I'm really hoping you're wrong about the removed APIs as I recently tried doing a takeout and about 1/3rd of every album I checked was missing. Was really hoping I could find another tool to get my photos downloaded and moved out/backed up.
ggambetta•1mo ago
Pretty sure they're gone, that's why gphotos-sync and the like have stopped working (https://github.com/gilesknap/gphotos-sync-discussion/discuss...)
cxr•1mo ago
"Five years ago" was 2020. What you're asking is, "Have you, at some point in 2025, needed an email from 2019, 2018, etc. (i.e. from some time before that)?"

The answer: Yes, of course. (And I don't understand why anyone other than, say, a university undergrad or someone younger should find that answer surprising.)

ifh-hn•1mo ago
Near 50 and I can't say I need any email from 3 years ago; this is conservative, I'm probably ok with not needing emails from a year ago. Just like I don't need chat history either. What are you storing in your emails that you need to keep it beyond this? Genuinely interested. Genuinely have no clue why you'd be storing emails any length of time.
ifh-hn•1mo ago
Despite your down votes I'm completely in agreement. Emails and chat are transient and expire. Unless I need to keep it for, like you say, a good reason, then it's deleted.
SanjayMehta•1mo ago
The downvotes are due to my name. I have a fan club for pointing out hypocrisy in the past.
yooogurt•1mo ago
> if you want to back this file up regularly with something like restic, then you will quickly end up in a world of pain: since new mails are not even appended to the end of the file, each cycle of takeout-then-backup essentially produces a new giant file.

As I'm sure the author is aware, Restic will do hash-based chunking so that similar files can be efficiently backed up.

How similar are two successive Takeout mboxes?

If the order of messages within an mbox is stable, and new emails are inserted somewhere, the delta update might be tiny.

Even if the order of the mbox's messages are ~random, Restic's delta updates will forego large attachments.

It would be great to see empirical figures here: how large is the incremental backup after a month's emails? How does that compare for each backup strategy?

The pro of sticking with restic is simplicity, and also avoiding the risk of your tool managing to screw up the data.

This risk isn't so bad if it's a mature tool that canonicalises mboxes (e.g. order them by time), but seems risky for something handrolled.

Intralexical•1mo ago
> As I'm sure the author is aware, Restic will do hash-based chunking so that similar files can be efficiently backed up.

> Even if the order of the mbox's messages are ~random, Restic's delta updates will forego large attachments.

I forget the exact number, but the rolling hashes for Restic and Borg are tuned to produce chunk sizes on the order of an entire megabyte.

Which means attachment file sizes need to be many megabytes in order for Restic to be much use, since the full chunk has to fall within the attachment. You'd lose 0.5MB at both ends of each attachment on average, so a 5MB file would only be 80% deduped.
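
As a rough check of the estimate above, here is the arithmetic as a small sketch. It assumes, as the comment does, that about half a chunk is lost at each end of an attachment; the function name and the 1 MB default are illustrative and not taken from Restic or Borg.

```python
def deduped_fraction(attachment_mb: float, avg_chunk_mb: float = 1.0) -> float:
    """Estimate the share of an attachment that lands in reusable chunks.

    Half a chunk is assumed lost at each end, so one full chunk in total.
    Attachments no larger than a chunk are treated as not deduplicated at all.
    """
    lost_mb = min(attachment_mb, avg_chunk_mb)
    return (attachment_mb - lost_mb) / attachment_mb


for size_mb in (0.5, 2.0, 5.0, 20.0):
    print(f"{size_mb:>5} MB attachment: {deduped_fraction(size_mb):.0%} deduped")
```

Under the same assumption, shrinking the average chunk raises the fraction for small attachments, which is the motivation for a tool with a tunable fragment size.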

Nothing against Restic, but it's tuned for file-level backup, and I'm sure it wouldn't be as performant if it used chunks that were small enough to pick apart individual e-mails.

I suggested the author check out ZPAQ, which has a user-tunable average fragment size, and is arguably even simpler than Restic.

The ZPAQ file can then itself be efficiently backed up by Restic.

Brajeshwar•1mo ago
For emails, here is my current simple backup setup. Of course, I’m also looking to do this without having to open Thunderbird, or I might have an old laptop running it. So, work-in-progress.

For the email accounts I want a backup, I set it to spew out POP3 without doing anything (don’t mark read or delete). I set up Thunderbird with that POP3. It has a backup copy of all the emails. I’ve had searchable emails since like 2004/2005, and I’ve occasionally replied to people and gotten back in touch with very old friends from the Internet.

I saw an open-source tool sometime back (I think, here on Hacker News) that backs up your IMAP mails with a nicely done interface. That would be nice to have.

Edit: Perhaps Bichon,[1] mentioned somewhere in the other comment threads[2] was the one.

1. https://github.com/rustmailer/bichon

2. https://news.ycombinator.com/item?id=46429250

tehlike•1mo ago
Wouldn't it be nice if Google just dumped the takeout into a sqlite file?
jeffbee•1mo ago
How would that help 99.5% of their user base?
tehlike•1mo ago
As it currently stands, it's still not user friendly.
PunchyHamster•1mo ago
making it even less is not helpful
jeffbee•1mo ago
You literally just open mbox files with Apple Mail.app, it's the easiest action that the (now long-established) WIMP user interface offers.
nothrabannosir•1mo ago
Instead of the native format for e-mail archives? Supported by every MUA ever made? Not really, no.
PunchyHamster•1mo ago
absolutely not
pabs3•1mo ago
Why doesn't Google use zip/tar of a Maildir instead? Much better format than mbox. Converting the mbox to Maildir using standard tools would work too.
Intralexical•1mo ago
`zpaq add archive.zpaq new.mbox -fragment 0 -method 3` is great for this. It splits the input into fragments averaging 1024 bytes in size [0], which catches up to ~90% of redundancy. The remaining ~10% is packed and compressed into 64MB (max) blocks that are added to the .zpaq.

The resulting artifact is a single .zpaq file on disk. This file is only ever appended to, never overwritten, so it plays nice with Restic's own chunked deduplication. Plus it won't flood the filesystem with inodes and it suffers less small files overhead than TFA's solution.

Granted I suspect TFA splitting on the e-mail headers may be chunking more efficiently. Though, unless I skimmed the linked GitHub too fast, it looks like TFA's solution also doesn't use any solid compression to exploit redundancy across chunks. And I trust zpaq as a general purpose tool more than a one-off just for a single use case. The code does look clean, though, nice work.

[0] Average fragment size is 1024*2^N. If most of the data is attachments that don't change, you can probably use a higher `-fragment N` to have less overhead keeping track of hashes. `-method 3` is a good middle ground for backups. `-m5` gets crazy high compression ratios, but also crazy slow speed. Old versions of ingested files are shadowed by default; use `-all` when you want to list/extract them.
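
The footnote's formula can be tabulated with a short sketch. It only restates `1024 * 2^N` from the comment; the helper names are hypothetical and nothing here calls zpaq itself.

```python
def avg_fragment_bytes(n: int) -> int:
    """Average fragment size in bytes for `-fragment n`, per the formula above."""
    return 1024 * 2 ** n


def approx_fragment_count(total_bytes: int, n: int) -> int:
    """Rough number of fragment hashes to track for an input of total_bytes."""
    size = avg_fragment_bytes(n)
    return -(-total_bytes // size)  # ceiling division


mailbox_bytes = 8 * 1024 ** 3  # an 8 GiB mbox, chosen only as an example
for n in (0, 3, 6):
    print(n, avg_fragment_bytes(n), approx_fragment_count(mailbox_bytes, n))
```

Each step of N halves the number of fragments to track, at the cost of coarser deduplication.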

csb6•1mo ago
Have you looked into using a full MIME/mbox parser library, e.g. GMime [0] or MimeKit [1]? Both support parsing mbox files directly, and they should be able to handle the intricacies of parsing any messages/attachments you throw at them. Then you could write out the MIME representation of each message (including any attachments) into its own file and then check for new messages. That way you can be sure each “chunk” represents a single message in its entirety. Not sure if this is any better since your solution seems to work pretty well.

[0] https://github.com/jstedfast/gmime

[1] https://github.com/jstedfast/MimeKit
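
A minimal sketch of the one-file-per-message idea, using Python's standard `mailbox` module in place of GMime or MimeKit (a swap made here for brevity, not what the comment recommends). The function name, the `.eml` suffix, and naming files by SHA-256 of the message bytes are all assumptions.

```python
import hashlib
import mailbox
from pathlib import Path


def export_messages(mbox_path: str, out_dir: str) -> int:
    """Write each message of an mbox to out_dir/<sha256>.eml.

    Files are named by content hash, so a message that already exists from an
    earlier run is skipped. Returns the number of newly written files.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = 0
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for key in box.iterkeys():
            # Raw message bytes, without the mbox "From " separator line.
            raw = box.get_bytes(key)
            target = out / (hashlib.sha256(raw).hexdigest() + ".eml")
            if not target.exists():
                target.write_bytes(raw)
                written += 1
    finally:
        box.close()
    return written
```

Because the file name depends only on the message bytes, the order of messages inside the mbox no longer matters, and a later export adds files only for mail that was not seen before. If Takeout rewrites a header between exports, the hash changes and the message is stored again.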

marwis•1mo ago
Isn't it easier to just backup via IMAP to maildir versioned with git?

Does takeout include any metadata not accessible via IMAP? Does it even include labels?

nutjob2•1mo ago
I've read somewhere that a GMail user was banned or the like for using IMAP for backups, trying to find the reference...
faust201•1mo ago
Total BS. Lots of people use Thunderbird IMAP for Gmail.
actuallyalys•1mo ago
I think you're right that it's very unlikely to be a common thing. However, so many people use Gmail (including with setups like Thunderbird like you note) that it's totally possible someone really did get banned due to a total fluke.
PunchyHamster•1mo ago
You can just unpack it to file-per-email (format used by most sane email clients). There are dozens of programs to do it, one is included in Debian (and so most other distros), called mb2md
jl6•1mo ago
I have the same requirement and I solved it as follows:

Apply a label to emails dated after the last backup, using an “after:YYYY-MM-DD” search. Takeout then offers the option to export only that label. I do an annual backup so the amount of manual effort here is acceptable.

jsrozner•1mo ago
What about using the Gmail API and listening for recent changes? I suppose it wouldn't be in a mailbox format that could be easily exported to another provider, though?
jclulow•1mo ago
This is what I do! The API itself is not particularly amazing (the way it handles batch requests as a MIME multipart formatted POST body where each part is itself a HTTP request is particularly obscene).

The underlying data model is kind of OK though: messages are immutable, they get a long lived unique ID that survives changes to labels, etc. There is a history of changes you can grab incrementally. You can download the full message body that you can then parse as mail, and I save each one into a separate file and then index them in SQLite.
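
A minimal sketch of the save-then-index step described above, assuming the raw message bytes and the stable message ID have already been fetched. The Gmail API calls themselves are left out; the table layout and `store_message` are hypothetical and not taken from the commenter's code.

```python
import sqlite3
from email import message_from_bytes
from email.utils import parsedate_to_datetime
from pathlib import Path

SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id      TEXT PRIMARY KEY,  -- provider's long-lived message ID
    path    TEXT NOT NULL,     -- where the raw message was written
    subject TEXT,
    sent_at TEXT               -- ISO 8601, or NULL if the Date header is unusable
)
"""


def store_message(conn: sqlite3.Connection, out_dir: Path, msg_id: str, raw: bytes) -> bool:
    """Write raw to out_dir/<msg_id>.eml and index it. Returns False if already stored."""
    if conn.execute("SELECT 1 FROM messages WHERE id = ?", (msg_id,)).fetchone():
        return False  # messages are immutable, so a known ID needs no update
    path = out_dir / f"{msg_id}.eml"
    path.write_bytes(raw)

    msg = message_from_bytes(raw)
    subject = msg.get("Subject")
    subject = str(subject) if subject is not None else None
    try:
        sent_at = parsedate_to_datetime(msg.get("Date")).isoformat()
    except (TypeError, ValueError):
        sent_at = None  # missing or malformed Date header

    conn.execute(
        "INSERT INTO messages (id, path, subject, sent_at) VALUES (?, ?, ?, ?)",
        (msg_id, str(path), subject, sent_at),
    )
    conn.commit()
    return True
```

Label changes are not covered here; since labels can change while the message itself does not, they would fit better in a separate table keyed by the same ID.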

rigtorp•1mo ago
I have a tool that saves each mail as a single file using the Gmail API: https://github.com/rigtorp/gmbackup
neilv•1mo ago
That imprecise chunking (with MD5 sums, and distributed across an FS tree, and using hash IDs, then separately storing the assembly information) seems like it would be a headache to restore, or to use with other mail programs.

An alternative idea is to properly parse it and store in smaller mbox files, such as one file per month, with the idea that any month in the past usually will not change. (And if it changes because they are storing frequently changing attributes in a faux header, like an atime, then maybe strip that header.) Then your incremental `restic` backups work fine, and you can also use it easily with a variety of mail programs (MUAs, impromptu IMAP servers for migration, quick text editor, etc.).
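
One way to sketch the per-month split, using Python's standard `mailbox` module. It assumes the `Date` header is a good enough grouping key and that `out_dir` starts empty; the function names and the `undated` bucket are illustrative, and the header stripping mentioned above is not attempted.

```python
import mailbox
from email.utils import parsedate_to_datetime
from pathlib import Path


def month_and_time(msg) -> tuple[str, float]:
    """Return ("YYYY-MM", unix_time) from the Date header, or ("undated", 0.0)."""
    try:
        sent = parsedate_to_datetime(msg["Date"])
        return sent.strftime("%Y-%m"), sent.timestamp()
    except (TypeError, ValueError, OverflowError, OSError):
        return "undated", 0.0


def split_by_month(mbox_path: str, out_dir: str) -> dict[str, int]:
    """Split one mbox into out_dir/YYYY-MM.mbox files; return message counts per month."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    src = mailbox.mbox(mbox_path, create=False)
    boxes: dict[str, mailbox.mbox] = {}
    counts: dict[str, int] = {}
    try:
        # Pass 1: pick a month and a stable sort key for every message, so the
        # output does not depend on the arbitrary order of the input file.
        plan = []
        for key in src.iterkeys():
            msg = src[key]
            month, sent = month_and_time(msg)
            plan.append((month, sent, str(msg.get("Message-ID", "")), key))
        plan.sort()
        # Pass 2: append messages to their month file in sorted order.
        for month, _sent, _message_id, key in plan:
            if month not in boxes:
                boxes[month] = mailbox.mbox(str(out / f"{month}.mbox"))
            boxes[month].add(src[key])
            counts[month] = counts.get(month, 0) + 1
    finally:
        for box in boxes.values():
            box.close()  # close() flushes pending writes
        src.close()
    return counts
```

Sorting inside each month is what keeps a past month's file byte-stable between exports, provided the messages themselves have not changed.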

jinnko•1mo ago
I haven't used it for a while, but imapsync[0] still supports Gmail. With an approach like this you can regularly sync and get your messages in a standard format. Plus you don't have to wrangle those 50GB takeout dumps.

0: https://imapsync.lamiral.info/FAQ.d/FAQ.Gmail.txt

jokoon•1mo ago
I realized my inbox takes a lot of memory, even after a manual cleanup, it was still taking 5GB, despite regularly removing automated things and others.

I tried using takeout to have a more accurate listing. I thought I could open it with thunderbird, I failed, I then tried to open it with some python lib, also failed.

mcny•1mo ago
Memory or storage?

5GB of storage sounds not so bad.

jokoon•1mo ago
after cleaning up? I don't think so

I don't know how large is a single mail without image or attachment, though

hasperdi•1mo ago
I just set up Gmail backup the other day. Using getmail + cron. The emails get stored as maildir (1 mail = 1 file). It's incremental backup friendly
eterps•1mo ago
Maybe this could be helpful? -> https://github.com/pimalaya/neverest
crazygringo•1mo ago
I just use gmvault:

https://github.com/gaubert/gmvault

There's no reason to go through Takeout when IMAP exists.

WhyNotHugo•1mo ago
Extracting the attachment will break signatures. This likely won’t be a problem for 99% of the people out there, but it’s important to be aware of the caveat.

If at some point in the future you need to prove that you received a given message, having the signatures (eg: DKIM) intact makes all the difference.

rigtorp•1mo ago
Better to use the Gmail API to incrementally backup your mail: https://github.com/rigtorp/gmbackup
Cockbrand•1mo ago
Not sure if I'm missing something when I look at the discussion here, but there's Got Your Back [0], which backs up all emails in a Gmail account as .eml files.

This seems to be the easiest and most straightforward way for me.

[0] https://github.com/GAM-team/got-your-back