Show HN: Fixing a single pointer bug unlocked 1M+ row JSON parsing on Windows

4•hilti•2mo ago
I've been building a cross-platform JSONL viewer app that handles multi-GB files. It worked perfectly on macOS (my development machine), but consistently crashed on Windows at exactly 2,650 KB. Here's the debugging journey and the tiny fix that made all the difference.

The Problem

- macOS: Handles 5GB+ files effortlessly
- Windows: Crashes at 2,650 KB every time
- Same codebase, cross-compiled from an Apple Silicon Mac to Windows using MinGW

The Investigation

Added detailed logging to track execution. The crash happened during string interning after successfully parsing ~6,000 rows. Not during parsing, not during file I/O, but during the merge phase.
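A minimal sketch of the kind of timestamped file logging described here. The `FileLogger` class and its method names are hypothetical, not the app's actual logger; the important detail is flushing after every line, so the last message written before a crash is actually on disk.

```cpp
#include <chrono>
#include <cstdio>
#include <ctime>
#include <fstream>
#include <mutex>
#include <string>

// Hypothetical helper for illustration; not the app's actual logger.
class FileLogger {
public:
    explicit FileLogger(const std::string& path) : out_(path, std::ios::app) {}

    // Appends "[YYYY-MM-DD HH:MM:SS.mmm] message" and flushes immediately,
    // so the final line survives even if the process crashes right after.
    void log(const std::string& message) {
        std::lock_guard<std::mutex> lock(mutex_);
        out_ << '[' << timestamp() << "] " << message << '\n';
        out_.flush();
    }

private:
    // Formats the current local time with millisecond precision.
    // Only called while mutex_ is held, because std::localtime uses shared state.
    static std::string timestamp() {
        using namespace std::chrono;
        const auto now = system_clock::now();
        const std::time_t secs = system_clock::to_time_t(now);
        const long long ms =
            duration_cast<milliseconds>(now.time_since_epoch()).count() % 1000;

        char date[32] = "unknown-time";
        if (const std::tm* local = std::localtime(&secs)) {
            std::strftime(date, sizeof date, "%Y-%m-%d %H:%M:%S", local);
        }
        char stamp[48];
        std::snprintf(stamp, sizeof stamp, "%s.%03d", date, static_cast<int>(ms));
        return stamp;
    }

    std::ofstream out_;
    std::mutex mutex_;
};
```

Usage would look like `FileLogger logger("viewer.log"); logger.log("merge: interning row 6000");`. The last line in the file then brackets where the crash happened.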

The Root Cause

My StringPool class used std::unordered_map<std::string_view, uint32_t> to deduplicate strings. The string_views pointed into a std::vector<std::string>.

When the vector grew and reallocated, all the string_view keys became dangling pointers. The hash map was full of invalid references.

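Why did it work on macOS? Reading through a dangling string_view is undefined behavior, so it can appear to work on one platform and crash on another. The two platforms differ in memory allocator behavior, default stack sizes (8MB vs 1MB), and reallocation patterns.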
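The mechanism can be observed without ever dereferencing a stale pointer. The sketch below (the function name `bufferMovesOnReallocation` is made up for this example) records where a string's characters live, forces the vector to reallocate, and checks whether the address changed. It assumes a standard library with the small-string optimization, which the common implementations have: short strings keep their bytes inside the string object, so the bytes move whenever the object moves.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Returns true if the characters of the stored string sit at a different
// address after the vector is forced to reallocate.
bool bufferMovesOnReallocation(const std::string& value) {
    std::vector<std::string> strings;
    strings.reserve(1);
    strings.push_back(value);

    std::string_view view(strings.back());
    // Keep the address as an integer so the stale pointer is never used.
    const auto before = reinterpret_cast<std::uintptr_t>(view.data());

    // Asking for more than the current capacity forces a reallocation,
    // which moves every std::string into a new buffer.
    strings.reserve(strings.capacity() + 1);

    const auto after = reinterpret_cast<std::uintptr_t>(strings.front().data());
    return before != after;  // true means `view` is now dangling
}
```

For a short string such as `"id"` this returns true. For long strings the heap buffer is typically handed over to the moved-to object and the address stays the same, which is one reason a bug like this can stay hidden for a long time: only some of the keys go bad.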

The Fix

Before (broken):

    uint32_t intern(std::string_view str) {
        auto it = indices_.find(str);
        if (it != indices_.end()) return it->second;
        
        uint32_t idx = strings_.size();
        strings_.push_back(std::string(str));
        indices_[std::string_view(strings_.back())] = idx;  // DANGER! This key dangles once strings_ reallocates
        return idx;
    }

After (fixed):

    uint32_t intern(const std::string& str) {
        auto it = indices_.find(std::string_view(str));
        if (it != indices_.end()) return it->second;

        // Preemptively rebuild if we're about to reallocate:
        // reserve() moves the strings, so the existing keys can go stale.
        if (strings_.size() >= strings_.capacity()) {
            strings_.reserve(strings_.capacity() * 2);
            rebuildIndices();  // Fix all string_views!
        }

        auto idx = static_cast<uint32_t>(strings_.size());
        strings_.push_back(str);
        indices_[std::string_view(strings_.back())] = idx;
        return idx;
    }

    // Re-create every key so it points at the strings' current location.
    void rebuildIndices() {
        indices_.clear();
        for (size_t i = 0; i < strings_.size(); i++) {
            indices_[std::string_view(strings_[i])] = static_cast<uint32_t>(i);
        }
    }
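
For comparison, here is a sketch of a different approach that avoids the rebuild entirely. It is not the fix used in the post. It assumes the pool only ever appends and looks up, and stores the strings in a std::deque, whose push_back/emplace_back never invalidates references to existing elements. The class name `StablePool` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical alternative pool: the deque never relocates existing elements
// when appending at the end, so string_view keys stay valid without a rebuild.
class StablePool {
public:
    StablePool() = default;
    // Copying would leave the copy's keys pointing into the original's strings.
    StablePool(const StablePool&) = delete;
    StablePool& operator=(const StablePool&) = delete;

    std::uint32_t intern(std::string_view str) {
        auto it = indices_.find(str);
        if (it != indices_.end()) return it->second;

        const auto idx = static_cast<std::uint32_t>(strings_.size());
        strings_.emplace_back(str);  // owns a copy of the characters
        indices_.emplace(std::string_view(strings_.back()), idx);
        return idx;
    }

    const std::string& get(std::uint32_t idx) const { return strings_[idx]; }
    std::size_t size() const { return strings_.size(); }

private:
    std::deque<std::string> strings_;
    std::unordered_map<std::string_view, std::uint32_t> indices_;
};
```

The trade-off is that a deque is not contiguous, so code that relies on `strings_.data()` or tight sequential scans over the pool would need the vector-plus-rebuild version instead.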
The Result

- 1 million rows: 6 seconds on Windows
- Multi-GB files: No crashes
- ~166,000 rows/second throughput
- Cross-platform stability

Lessons Learned

1. std::string_view is powerful but dangerous - It's a non-owning reference. When the underlying storage moves, you're holding garbage.

2. Cross-platform testing is essential - The bug was invisible on macOS due to different allocator behavior and larger default stack sizes.

3. Structured logging beats debuggers for cross-compilation - I was cross-compiling from Mac to Windows. Adding timestamped logging to a file made the crash point obvious immediately.

4. Small changes, huge impact - One function, ~15 lines of code, turned "crashes at 2MB" into "handles 5GB+ files".

5. Performance stayed excellent - The rebuild only happens during vector reallocation (exponential growth), so amortized cost is negligible.
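
The amortized claim in point 5 can be checked with a small simulation. This sketch only counts work and does not touch real strings. It assumes capacity doubles on every growth and starts at 1; `RebuildCost` and `simulateRebuilds` are names invented for the example.

```cpp
#include <cstddef>

struct RebuildCost {
    std::size_t rebuilds;    // how many times the index would be rebuilt
    std::size_t reinserted;  // total map entries re-inserted across all rebuilds
};

// Simulates n insertions into a pool whose capacity doubles when full
// (assumed starting capacity: 1) and tallies the cost of each rebuild.
RebuildCost simulateRebuilds(std::size_t n) {
    RebuildCost cost{0, 0};
    std::size_t size = 0;
    std::size_t capacity = 1;
    for (std::size_t i = 0; i < n; ++i) {
        if (size >= capacity) {
            capacity *= 2;
            ++cost.rebuilds;
            cost.reinserted += size;  // a rebuild re-inserts every existing entry
        }
        ++size;
    }
    return cost;
}
```

For 1,000,000 insertions this gives 20 rebuilds and 1,048,575 re-inserted entries in total, which is roughly one extra map insertion per string. With doubling, the total always stays below 2n.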

The Tech Stack

- simdjson (v4.2.2) for parsing
- Multi-threaded parsing (20 threads on my test machine)
- Columnar storage for memory efficiency
- C++17, cross-compiled with MinGW-w64
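
The post does not show how the per-thread results are combined, so the following is only a guess at what a merge phase for interned strings could look like. It assumes each parser thread collects its own list of distinct strings, and that a single thread later folds those lists into one global pool, producing a table that maps local ids to global ids. Everything here, including the name `mergeLocalStrings` and the choice of std::deque for stable storage, is an assumption for illustration.

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical merge step: folds one thread's distinct strings into the global
// pool and returns remap[localId] == globalId. Must be called from one thread
// at a time; the global containers are not synchronized here.
std::vector<std::uint32_t> mergeLocalStrings(
    const std::vector<std::string>& localStrings,
    std::deque<std::string>& globalStrings,
    std::unordered_map<std::string_view, std::uint32_t>& globalIndex) {
    std::vector<std::uint32_t> remap;
    remap.reserve(localStrings.size());
    for (const std::string& s : localStrings) {
        auto it = globalIndex.find(std::string_view(s));
        if (it == globalIndex.end()) {
            const auto id = static_cast<std::uint32_t>(globalStrings.size());
            globalStrings.push_back(s);
            // The key views the pool's own copy, never the thread-local one.
            it = globalIndex.emplace(std::string_view(globalStrings.back()), id).first;
        }
        remap.push_back(it->second);
    }
    return remap;
}
```

Columns that stored local ids during parsing can then be rewritten through `remap`, so the columnar data only ever holds small integers instead of string copies.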

This was a humbling reminder that the most critical bugs are often the simplest ones, hiding in plain sight behind platform differences.

Happy to discuss the implementation details, simdjson usage, or cross-platform C++ debugging techniques!
