Show HN: I built a persistent LSM-Tree storage engine in Go from scratch

1•Jyotishmoy•1h ago
GO-LSM: BUILDING A LOG-STRUCTURED MERGE-TREE ENGINE

INTRODUCTION:

I've always been curious about the internals of databases, so I decided to build my own Log-Structured Merge-Tree (LSM-Tree) engine in Go to understand the "magic" behind write-optimized storage.

Go-LSM is a persistent Key-Value engine that manages the full lifecycle of data from volatile RAM to immutable disk storage.

TECHNICAL HIGHLIGHTS:

1. THE DURABILITY LAYER (WAL)

To avoid losing acknowledged writes, I implemented an append-only Write-Ahead Log.

   • Custom binary protocol: [Type][KeyLen][ValLen][Key][Value]
   • File.Sync() (fsync) to flush buffered writes from the kernel to the storage device
   • Writes that have been synced survive a process or system crash
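
As a rough illustration of the record layout above, here is a minimal sketch in Go. The field widths (1-byte Type, 4-byte little-endian KeyLen and ValLen), the type constants, and the helper names encodeRecord, decodeRecord and appendRecord are assumptions made for this example, not taken from the repository.

```go
package main

import (
	"encoding/binary"
	"errors"
	"os"
)

// Record types; the numeric values are assumptions for this sketch.
const (
	recordPut    byte = 1
	recordDelete byte = 2
)

// headerSize is 1 byte of Type plus two uint32 lengths (assumed widths).
const headerSize = 1 + 4 + 4

var errTruncated = errors.New("wal: truncated record")

// encodeRecord lays a record out as [Type][KeyLen][ValLen][Key][Value].
func encodeRecord(typ byte, key, value []byte) []byte {
	buf := make([]byte, headerSize+len(key)+len(value))
	buf[0] = typ
	binary.LittleEndian.PutUint32(buf[1:5], uint32(len(key)))
	binary.LittleEndian.PutUint32(buf[5:9], uint32(len(value)))
	copy(buf[headerSize:], key)
	copy(buf[headerSize+len(key):], value)
	return buf
}

// decodeRecord parses one record from the front of buf and returns the
// unread remainder. A short buffer (for example a torn final write)
// yields errTruncated.
func decodeRecord(buf []byte) (typ byte, key, value, rest []byte, err error) {
	if len(buf) < headerSize {
		return 0, nil, nil, nil, errTruncated
	}
	keyLen := uint64(binary.LittleEndian.Uint32(buf[1:5]))
	valLen := uint64(binary.LittleEndian.Uint32(buf[5:9]))
	end := uint64(headerSize) + keyLen + valLen
	if end > uint64(len(buf)) {
		return 0, nil, nil, nil, errTruncated
	}
	keyEnd := uint64(headerSize) + keyLen
	return buf[0], buf[headerSize:keyEnd], buf[keyEnd:end], buf[end:], nil
}

// appendRecord writes one record and calls Sync before returning, so the
// caller only acknowledges the write after the flush has been requested.
func appendRecord(f *os.File, typ byte, key, value []byte) error {
	if _, err := f.Write(encodeRecord(typ, key, value)); err != nil {
		return err
	}
	return f.Sync()
}
```

On restart, a loop over decodeRecord can replay the log until it hits errTruncated, which in this sketch is treated as the end of the usable log.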

2. SKIPLIST MEMTABLE

Instead of a balanced tree, I used a SkipList for the in-memory layer.

   • Provides expected O(log n) search and insertion without the rebalancing complexity of Red-Black trees
   • Keeps keys lexicographically sorted for the eventual SSTable flush
   • Lets the MemTable be flushed to disk in one ordered pass over the keys
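
To make the idea concrete, here is a minimal single-threaded SkipList sketch in Go. The names (SkipList, Put, Get, Keys), the maxLevel of 16 and the coin-flip level choice are assumptions for illustration. A real MemTable would also need locking and tombstone handling.

```go
package main

import "math/rand"

const maxLevel = 16 // assumed cap on tower height

type node struct {
	key, value string
	next       []*node // next[i] is the successor on level i
}

type SkipList struct {
	head  *node
	level int // number of levels currently in use
}

func NewSkipList() *SkipList {
	return &SkipList{head: &node{next: make([]*node, maxLevel)}, level: 1}
}

// randomLevel flips a fair coin: each extra level has probability 1/2.
func randomLevel() int {
	lvl := 1
	for lvl < maxLevel && rand.Intn(2) == 0 {
		lvl++
	}
	return lvl
}

// Put inserts key or overwrites its value if it already exists.
func (s *SkipList) Put(key, value string) {
	update := make([]*node, maxLevel) // last node before key on each level
	cur := s.head
	for i := s.level - 1; i >= 0; i-- {
		for cur.next[i] != nil && cur.next[i].key < key {
			cur = cur.next[i]
		}
		update[i] = cur
	}
	if n := cur.next[0]; n != nil && n.key == key {
		n.value = value
		return
	}
	lvl := randomLevel()
	for i := s.level; i < lvl; i++ {
		update[i] = s.head // new levels start at the head
	}
	if lvl > s.level {
		s.level = lvl
	}
	n := &node{key: key, value: value, next: make([]*node, lvl)}
	for i := 0; i < lvl; i++ {
		n.next[i] = update[i].next[i]
		update[i].next[i] = n
	}
}

// Get descends from the top level, moving right while keys are smaller.
func (s *SkipList) Get(key string) (string, bool) {
	cur := s.head
	for i := s.level - 1; i >= 0; i-- {
		for cur.next[i] != nil && cur.next[i].key < key {
			cur = cur.next[i]
		}
	}
	if n := cur.next[0]; n != nil && n.key == key {
		return n.value, true
	}
	return "", false
}

// Keys walks level 0, which links every node in sorted order. This is the
// traversal a flush to an SSTable would use.
func (s *SkipList) Keys() []string {
	var out []string
	for n := s.head.next[0]; n != nil; n = n.next[0] {
		out = append(out, n.key)
	}
	return out
}
```

Because level 0 is already sorted, flushing amounts to one pass over Keys (or the nodes themselves) with no extra sort step.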

3. SSTABLE FOOTER-BASED INDEXING

My SSTables are binary-searchable on disk using a tail-first reading strategy:

   • Jump to the last 8 bytes of a file to find the Index Block pointer
   • Seek straight to the Index Block instead of scanning the whole file
   • Execute binary search on sorted keys for O(log n) disk lookups
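
The tail-first read can be sketched as follows. The on-disk layout here is an assumption for the example, not the repository's actual format: data entries as [KeyLen uint32][ValLen uint32][Key][Value], then an index block of uint64 entry offsets, then an 8-byte footer holding the index block's offset. The helpers buildTable, lookup and lookupInBytes are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"io"
	"sort"
)

type entry struct{ key, value string }

// buildTable serializes entries, which must already be sorted by key.
func buildTable(entries []entry) []byte {
	var buf bytes.Buffer
	var tmp [8]byte
	offsets := make([]uint64, 0, len(entries))
	for _, e := range entries {
		offsets = append(offsets, uint64(buf.Len()))
		binary.LittleEndian.PutUint32(tmp[0:4], uint32(len(e.key)))
		binary.LittleEndian.PutUint32(tmp[4:8], uint32(len(e.value)))
		buf.Write(tmp[:])
		buf.WriteString(e.key)
		buf.WriteString(e.value)
	}
	indexOff := uint64(buf.Len())
	for _, off := range offsets { // index block: one fixed-width offset per entry
		binary.LittleEndian.PutUint64(tmp[:], off)
		buf.Write(tmp[:])
	}
	binary.LittleEndian.PutUint64(tmp[:], indexOff) // footer: last 8 bytes
	buf.Write(tmp[:])
	return buf.Bytes()
}

// readFullAt fills p from offset off, tolerating an io.EOF on a full read.
func readFullAt(r io.ReaderAt, p []byte, off int64) error {
	if n, err := r.ReadAt(p, off); n != len(p) {
		return err
	}
	return nil
}

func readEntry(r io.ReaderAt, off int64) (key, value string, err error) {
	var hdr [8]byte
	if err = readFullAt(r, hdr[:], off); err != nil {
		return "", "", err
	}
	keyLen := int(binary.LittleEndian.Uint32(hdr[0:4]))
	valLen := int(binary.LittleEndian.Uint32(hdr[4:8]))
	body := make([]byte, keyLen+valLen)
	if err = readFullAt(r, body, off+8); err != nil {
		return "", "", err
	}
	return string(body[:keyLen]), string(body[keyLen:]), nil
}

// lookup reads the footer, loads the index block, then binary-searches it.
func lookup(r io.ReaderAt, size int64, key string) (string, bool, error) {
	var footer [8]byte
	if size < 8 {
		return "", false, errors.New("sstable: file too small")
	}
	if err := readFullAt(r, footer[:], size-8); err != nil {
		return "", false, err
	}
	indexOff := int64(binary.LittleEndian.Uint64(footer[:]))
	if indexOff < 0 || indexOff > size-8 {
		return "", false, errors.New("sstable: corrupt footer")
	}
	index := make([]byte, size-8-indexOff)
	if err := readFullAt(r, index, indexOff); err != nil {
		return "", false, err
	}
	n := len(index) / 8
	offsetAt := func(i int) int64 { return int64(binary.LittleEndian.Uint64(index[i*8:])) }
	var readErr error
	i := sort.Search(n, func(i int) bool {
		k, _, err := readEntry(r, offsetAt(i))
		if err != nil {
			readErr = err
			return true
		}
		return k >= key
	})
	if readErr != nil || i == n {
		return "", false, readErr
	}
	k, v, err := readEntry(r, offsetAt(i))
	if err != nil || k != key {
		return "", false, err
	}
	return v, true, nil
}

// lookupInBytes runs lookup over an in-memory table; *os.File also
// satisfies io.ReaderAt, so the same lookup works on a real file.
func lookupInBytes(table []byte, key string) (string, bool, error) {
	return lookup(bytes.NewReader(table), int64(len(table)), key)
}
```

Each probe of the binary search costs one or two small reads, so a lookup touches O(log n) entries rather than the whole file.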

4. MAINTENANCE LAYER: COMPACTION

I built a K-Way Merge compaction engine that keeps read cost and disk usage in check:

   • Merges multiple SSTables into a single sorted file, keeping only the newest version of each key
   • Reduces "Read Amplification" by lowering the number of files to check per query
   • Processes "Tombstones" to finalize deletions and reclaim disk space
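
A K-way merge of this kind can be sketched with container/heap. The record and cursor types and the compact function are hypothetical names for this example. The sketch assumes the input runs are passed newest first and that every table that could hold an older version of a key is part of the merge, which is what makes dropping tombstones safe.

```go
package main

import "container/heap"

type record struct {
	key, value string
	tombstone  bool // true marks a deletion
}

// cursor walks one sorted run; age 0 is the newest run.
type cursor struct {
	recs []record
	pos  int
	age  int
}

// mergeHeap orders cursors by current key, then by age (newest first).
type mergeHeap []*cursor

func (h mergeHeap) Len() int { return len(h) }
func (h mergeHeap) Less(i, j int) bool {
	a, b := h[i].recs[h[i].pos], h[j].recs[h[j].pos]
	if a.key != b.key {
		return a.key < b.key
	}
	return h[i].age < h[j].age
}
func (h mergeHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *mergeHeap) Push(x any)   { *h = append(*h, x.(*cursor)) }
func (h *mergeHeap) Pop() any {
	old := *h
	c := old[len(old)-1]
	*h = old[:len(old)-1]
	return c
}

// compact merges the runs into one sorted run, keeping only the newest
// version of each key and dropping keys whose newest version is a tombstone.
func compact(runs [][]record) []record {
	h := &mergeHeap{}
	for age, recs := range runs {
		if len(recs) > 0 {
			*h = append(*h, &cursor{recs: recs, age: age})
		}
	}
	heap.Init(h)

	var out []record
	lastKey, seen := "", false
	for h.Len() > 0 {
		c := (*h)[0] // cursor with the smallest key, newest on ties
		rec := c.recs[c.pos]
		c.pos++
		if c.pos < len(c.recs) {
			heap.Fix(h, 0) // cursor moved on; restore heap order
		} else {
			heap.Pop(h) // run exhausted
		}
		if seen && rec.key == lastKey {
			continue // an older version shadowed by a newer one
		}
		lastKey, seen = rec.key, true
		if !rec.tombstone {
			out = append(out, rec)
		}
	}
	return out
}
```

With k runs and n records in total, the heap keeps each step at O(log k), so the whole merge is O(n log k) and only needs one record per run in memory at a time.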

5. TOOLING & DEBUGGING

Custom binary parsers transform raw binary files into human-readable tables:

   • lsm-dump: View sorted SSTable contents
   • lsm-wal-dump: Inspect unflushed Write-Ahead Log entries
   • Enables deep inspection of internal storage layers

KEY ENGINEERING LESSONS:

Moving from standard application development to systems programming required a fundamental shift in how I think about memory and I/O:

   • ENDIANNESS LOGIC: Handling Big-Endian vs. Little-Endian conversions for cross-platform compatibility
   • FILE OFFSET MANAGEMENT: Manually managing byte offsets and file pointer positioning
   • CONCURRENCY & THREAD SAFETY: Implementing thread-safe mechanisms for MemTable flushing
   • BINARY PROTOCOL DESIGN: Creating efficient, compact data encodings for durability
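
The endianness point can be shown with Go's encoding/binary package: writing every integer with one explicit byte order makes a file readable on any CPU. The helper names below are made up for this illustration.

```go
package main

import "encoding/binary"

// encodeLE writes n as 4 bytes, least significant byte first.
func encodeLE(n uint32) []byte {
	b := make([]byte, 4)
	binary.LittleEndian.PutUint32(b, n)
	return b
}

// encodeBE writes n as 4 bytes, most significant byte first.
func encodeBE(n uint32) []byte {
	b := make([]byte, 4)
	binary.BigEndian.PutUint32(b, n)
	return b
}

// decodeLE reads 4 little-endian bytes back into a uint32.
func decodeLE(b []byte) uint32 {
	return binary.LittleEndian.Uint32(b)
}
```

For 0x01020304, encodeLE yields the bytes 04 03 02 01 and encodeBE yields 01 02 03 04. Decoding with the wrong byte order silently produces 0x04030201, which is why the format has to fix one order and use it everywhere.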

REPOSITORY:

https://github.com/Jyotishmoy12/LSM-Tree-in-Golang
