Show HN: pqry – A fast, lightweight CLI tool to diagnose Parquet datasets

https://github.com/symblic/pqry
4•setzeno•4h ago
Hi HN,

I’ve spent a lot of time debugging large Parquet datasets on S3 where “something is wrong”, but figuring out what is wrong usually means either opening each file individually or even spinning up Spark just to inspect metadata.

In practice, it’s often things like:

- schema drift across partitions

- columns silently disappearing

- timestamp precision changes

- files written by different pipeline versions

- row groups with bad stats or empty data

By the time you notice, the dataset is already messy and hard to reason about.

So I built pqry, a Rust-based CLI tool that scans Parquet metadata at the dataset/prefix level and surfaces issues like schema drift, unstable columns, partition hotspots, and row-group health.

It works entirely from metadata, so you can point it at tens of thousands of files and get results fast.

Example:

- pqry drift s3://bucket/events/

- pqry columns s3://bucket/events/

- pqry quality s3://bucket/events/
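
To make the idea of schema drift concrete, here is a minimal Python sketch of the kind of comparison a metadata-only scan could perform. It is not pqry's implementation (pqry is written in Rust and its internals are not described here). The function name `summarize_drift`, the input shape (file path mapped to a column-name-to-type dictionary), and the sample paths and types are all hypothetical, chosen only for illustration. In practice the schemas would come from each file's Parquet footer rather than being typed in by hand.

```python
from collections import Counter, defaultdict


def summarize_drift(schemas):
    """Summarize schema differences across files.

    `schemas` is a hypothetical input: {file path: {column name: type string}},
    standing in for what would be read from each Parquet footer.
    """
    total = len(schemas)

    # Count distinct schemas (order-insensitive on column names).
    variants = Counter(tuple(sorted(s.items())) for s in schemas.values())

    types_seen = defaultdict(set)  # column -> set of type strings observed
    present_in = Counter()         # column -> number of files containing it
    for schema in schemas.values():
        for column, type_name in schema.items():
            types_seen[column].add(type_name)
            present_in[column] += 1

    # Columns absent from at least one file, with the number of files missing them.
    missing = {c: total - n for c, n in present_in.items() if n < total}
    # Columns that appear with more than one type (e.g. timestamp precision changes).
    conflicts = {c: sorted(t) for c, t in types_seen.items() if len(t) > 1}

    return {
        "files": total,
        "schema_variants": len(variants),
        "missing": missing,
        "type_conflicts": conflicts,
    }


# Illustrative data only: three files from three partitions.
schemas = {
    "events/dt=2024-01-01/a.parquet": {"id": "int64", "ts": "timestamp[ms]", "user": "string"},
    "events/dt=2024-01-02/b.parquet": {"id": "int64", "ts": "timestamp[us]", "user": "string"},
    "events/dt=2024-01-03/c.parquet": {"id": "int64", "ts": "timestamp[us]"},
}
report = summarize_drift(schemas)
print(report)
```

With this sample input, the report flags `user` as missing from one file and `ts` as having two different timestamp precisions, which correspond to the "columns silently disappearing" and "timestamp precision changes" cases listed above.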

Repo: https://github.com/symblic/pqry

I originally built this for debugging production pipelines where writers and schemas evolved over time and problems only showed up weeks later.

Would love feedback from anyone working with large Parquet datasets in production.

Show HN: A MitM proxy to see what your LLM tools are sending

https://github.com/jmuncor/sherlock
47•jmuncor•4h ago•22 comments

Show HN: The HN Arcade

https://andrewgy8.github.io/hnarcade/
292•yuppiepuppie•12h ago•76 comments

Show HN: Cursor for Userscripts

https://github.com/chebykinn/browser-code
27•mifydev•4h ago•10 comments

Show HN: SHDL – A minimal hardware description language built from logic gates

https://github.com/rafa-rrayes/SHDL
25•rafa_rrayes•11h ago•10 comments

Show HN: Frame – Managing projects, tasks, and context for Claude Code

2•kozhan•39m ago•0 comments

Show HN: Pinecone Explorer – Desktop GUI for the Pinecone vector database

https://www.pinecone-explorer.com
7•arsentjev•22h ago•0 comments

Show HN: Dwm.tmux – a dwm-inspired window manager for tmux

https://github.com/saysjonathan/dwm.tmux
85•saysjonathan•4d ago•16 comments

Show HN: Lendy – Keep track of books you have lended

https://lendy.viraat.dev/
7•viraatdas•22h ago•3 comments

Show HN: I built a small browser engine from scratch in C++

https://github.com/beginner-jhj/mini_browser
117•crediblejhj•9h ago•38 comments

Show HN: Config manager for Claude Code (and others) – rules, MCPs, permissions

https://github.com/regression-io/coder-config
9•jtr101•7h ago•0 comments

Show HN: Sandbox Agent SDK – unified API for automating coding agents

https://github.com/rivet-dev/sandbox-agent
17•NathanFlurry•9h ago•0 comments

Show HN: WordRE, Wordle for Real Estate

https://reidsherman.com/wordre/
6•reidjs•18h ago•0 comments

Show HN: Cua-Bench – a benchmark for AI agents in GUI environments

https://github.com/trycua/cua
34•someguy101010•2d ago•6 comments

Show HN: Build Web Automations via Demonstration

https://www.notte.cc/launch-week-i/demonstrate-mode
27•ogandreakiro•1d ago•10 comments

Show HN: I'm building an AI-proof writing tool. How would you defeat it?

https://auth-auth.vercel.app/
7•callmeed•5h ago•8 comments

Show HN: Spar – Built a tool to help improve store conversion rates

https://spar.cuped.ai
2•6farer•2h ago•0 comments

Show HN: Extracting React apps from Figma Make's undocumented binary format

https://albertsikkema.com/ai/development/tools/reverse-engineering/2026/01/23/reverse-engineering...
50•albertsikkema•5d ago•13 comments

Show HN: Record and share your coding sessions with CodeMic

https://codemic.io/#
10•seansh•9h ago•2 comments

Show HN: LemonSlice – Upgrade your voice agents to real-time video

114•lcolucci•1d ago•123 comments

Show HN: One Human + One Agent = One Browser From Scratch in 20K LOC

https://emsh.cat/one-human-one-agent-one-browser/
305•embedding-shape•1d ago•146 comments

Show HN: Fuzzy Studio – Apply live effects to videos/camera

https://fuzzy.ulyssepence.com/
52•ulyssepence•1d ago•19 comments

Show HN: SharpAPI – Real-time sports odds API with +EV and arbitrage detection

https://sharpapi.io
3•MykLaz•4h ago•0 comments

Show HN: We Built the 1. EU-Sovereignty Audit for Websites

https://lightwaves.io/en/eu-audit/
101•cmkr•1d ago•78 comments

Show HN: I wrapped the Zorks with an LLM

https://infocom.tambo.co/
104•alecf•1d ago•57 comments

Show HN: Ghostly: The Ultimate Platform for Ghosting Candidates (Satire)

https://staticfile-25978.wasmer.app/
2•dw1014•5h ago•0 comments

Show HN: PNANA - A TUI Text Editor

https://github.com/Cyxuan0311/PNANA
7•Frameser•10h ago•7 comments

Show HN: A header-only C++20 compile-time assembler for x86/x64 instructions

https://github.com/mahmoudimus/static_asm
2•mahmoudimus•7h ago•0 comments

Show HN: Is this the perfect 404 page? [CSS only]

https://github.com/AntiKippi/errorpages
3•AntiKippi•8h ago•0 comments

Show HN: We built a type-safe Python ORM for RedisGraph/FalkorDB

5•hello-tmst•8h ago•3 comments