Show HN: pqry – A fast, lightweight CLI tool to diagnose Parquet datasets

https://github.com/symblic/pqry
4•setzeno•2h ago
Hi HN,

I’ve spent a lot of time debugging large Parquet datasets on S3 where “something is wrong”, but figuring out what is wrong usually means either opening each file individually or spinning up Spark just to inspect metadata.

In practice, it’s often things like:

- schema drift across partitions

- columns silently disappearing

- timestamp precision changes

- files written by different pipeline versions

- row groups with bad stats or empty data

By the time you notice, the dataset is already messy and hard to reason about.
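To make the first three items concrete, here is a minimal Python sketch of detecting type conflicts and disappearing columns from per-file schemas alone. It is not pqry's implementation: the `schemas` dictionary, the file paths, the type strings, and the `find_drift` helper are all hypothetical, and a real tool would read the schemas from Parquet footers rather than hard-code them.

```python
from collections import Counter, defaultdict

# Hypothetical per-file schemas (column name -> type string), standing in
# for what would normally be read from each Parquet footer.
schemas = {
    "events/date=2026-01-01/part-0.parquet": {"id": "int64", "ts": "timestamp[us]", "user": "string"},
    "events/date=2026-01-02/part-0.parquet": {"id": "int64", "ts": "timestamp[us]", "user": "string"},
    "events/date=2026-01-03/part-0.parquet": {"id": "int64", "ts": "timestamp[ns]"},
}


def find_drift(schemas):
    """Return (type_conflicts, unstable_columns) for a set of file schemas."""
    types_seen = defaultdict(Counter)  # column -> how often each type appears
    for columns in schemas.values():
        for name, dtype in columns.items():
            types_seen[name][dtype] += 1

    total_files = len(schemas)
    # Columns that appear with more than one type (e.g. a precision change).
    type_conflicts = {
        name: dict(counts)
        for name, counts in types_seen.items()
        if len(counts) > 1
    }
    # Columns missing from at least one file, with the number of files lacking them.
    unstable_columns = {
        name: total_files - sum(counts.values())
        for name, counts in types_seen.items()
        if sum(counts.values()) < total_files
    }
    return type_conflicts, unstable_columns


conflicts, unstable = find_drift(schemas)
print(conflicts)
print(unstable)
```

In this made-up dataset the third file changes `ts` from microsecond to nanosecond precision and drops `user`, so `ts` is reported as a type conflict and `user` as missing from one file.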

So I built pqry, a Rust-based CLI tool that scans Parquet metadata at the dataset/prefix level and surfaces issues like schema drift, unstable columns, partition hotspots, and row-group health.

It works entirely from metadata, so you can point it at tens of thousands of files and get results fast.
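Some background on why a metadata-only scan can be cheap: a Parquet file ends with its footer metadata, followed by a 4-byte little-endian footer length and the 4-byte magic `PAR1`. A reader can therefore locate the footer from the last 8 bytes of the file without touching the data pages. The sketch below shows only that layout arithmetic in Python. It says nothing about how pqry itself reads files, the `footer_byte_range` helper is hypothetical, and the "file" is a synthetic byte string rather than real Parquet.

```python
import struct

MAGIC = b"PAR1"  # magic bytes at the start and end of an unencrypted Parquet file


def footer_byte_range(tail: bytes, file_size: int) -> tuple[int, int]:
    """Given the last 8 bytes of a file, return (offset, length) of the footer."""
    if len(tail) != 8 or tail[4:] != MAGIC:
        raise ValueError("not a Parquet file tail")
    (footer_len,) = struct.unpack("<I", tail[:4])  # 4-byte little-endian length
    offset = file_size - 8 - footer_len
    if offset < len(MAGIC):
        raise ValueError("footer length exceeds file size")
    return offset, footer_len


# Synthetic stand-in for a file: magic, 100 bytes of fake data pages,
# a fake 10-byte footer, the footer length, and the trailing magic.
fake_file = MAGIC + b"d" * 100 + b"m" * 10 + struct.pack("<I", 10) + MAGIC

offset, length = footer_byte_range(fake_file[-8:], len(fake_file))
print(offset, length)
```

Against object storage, the same arithmetic would translate into one small ranged read for the tail and one for the footer per file, which is what keeps the cost independent of how much row data each file holds.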

Example commands:

- pqry drift s3://bucket/events/

- pqry columns s3://bucket/events/

- pqry quality s3://bucket/events/

Repo: https://github.com/symblic/pqry

I originally built this for debugging production pipelines where writers and schemas evolved over time and problems only showed up weeks later.

Would love feedback from anyone working with large Parquet datasets in production.

Software Company Bonds Drop as Investors' AI Worries Mount

https://www.bloomberg.com/news/articles/2026-01-28/software-company-bonds-drop-as-investors-ai-wo...
1•koolhead17•47s ago•0 comments

Resilient growth as technology and adaptability offset trade policy headwinds

https://www.imf.org/en/publications/weo/issues/2026/01/19/world-economic-outlook-update-january-2026
1•nis0s•1m ago•0 comments

LiteRT: The Universal Framework for On-Device AI

https://developers.googleblog.com/litert-the-universal-framework-for-on-device-ai//
1•pretext•2m ago•0 comments

A compressed generative PHYSICS framework for AI-Higher accuracy speed 500×

https://www.vms-institute.org/AI/
1•VirgilH2Oss•3m ago•1 comments

We Studied 150 Developers Using AI (Here's What's Changed) [video]

https://www.youtube.com/watch?v=b9EbCb5A408
1•tomwphillips•5m ago•0 comments

Europe must act urgently and stop outsourcing defence, says EU's Kallas

https://www.bbc.com/news/articles/czej2z3zz9jo
1•breve•6m ago•0 comments

Meta Q4 2025 Earnings Call

https://investor.atmeta.com/investor-events/event-details/2026/Q4-2025-Earnings-Call/default.aspx
1•SilverElfin•6m ago•0 comments

A read-only Linux MCP server for safe LLM troubleshooting

https://www.thefactorysystem.ai/blog/building-secure-linux-mcp-server-gemini-cli
1•michael-elias•6m ago•1 comments

Designing programming languages beyond AI comprehension

1•mr_bob_sacamano•7m ago•0 comments

40 years after the Challenger disaster, spaceflight remains far from routine

https://www.space.com/space-exploration/human-spaceflight/40-years-after-the-space-shuttle-challe...
1•1659447091•9m ago•0 comments

Goodbye Perl

https://github.com/dotnet/fsharp/pull/19226
1•DASD•11m ago•0 comments

Show HN: Open-source alternative to Vercel, Render, Netlify

https://www.shorlabs.com/
12•a_cormance•12m ago•0 comments

UAE launches 'sovereign' open AI model to counter Chinese rivals

https://www.ft.com/content/465c717b-af26-48c1-a530-e9e6d313f96a
1•Anon84•12m ago•2 comments

Behind the Scenes of Metropolis (1927): Old Photos from a Cinematic Masterpiece

https://rarehistoricalphotos.com/metropolis-behind-the-scenes/
2•llm_nerd•13m ago•0 comments

New UK ruling makes stealing virtual currency an actual crime

https://metro.co.uk/2026/01/27/new-uk-ruling-makes-stealing-virtual-currency-actual-crime-26554985/
1•Vaslo•13m ago•0 comments

I used Claude to vibe-code my overcomplicated smart home

https://www.theverge.com/report/869318/claude-vibe-coding-home-assistant-smart-home
1•balloob•16m ago•0 comments

Attention Is Not What You Need

https://arxiv.org/abs/2512.19428
1•hnmouse•19m ago•0 comments

Kprotect: Kernel-Level Security Engine

https://github.com/khoinp1012/kprotect
2•sunshine-o•20m ago•0 comments

Show HN: Supercheck.io – Open-Source AI-Powered Test Automation and Monitoring

https://supercheck.io/
1•krish_kant•21m ago•0 comments

Show HN: Ziva – Cursor for Godot Game Engine

https://ziva.sh
1•OsrsNeedsf2P•23m ago•0 comments

Inverse Rendering for High-Genus 3D Surface Meshes from Multi-View Images

https://arxiv.org/abs/2601.12155
1•PaulHoule•24m ago•0 comments

Show HN: President or Asshole?

https://president.alephz.com/
2•ishener•24m ago•0 comments

Meta Reports Q4 and Full Year 2025 Results

https://investor.atmeta.com/investor-news/press-release-details/2026/Meta-Reports-Fourth-Quarter-...
3•mfiguiere•25m ago•0 comments

VP of Eng thinks Vibe Coding is "Cute" [video]

https://www.youtube.com/watch?v=puVtC9SNA2A
1•yummyelephant8•25m ago•1 comments

Alpine: The modern AI-native productivity suite

https://www.alpine.inc/
3•calebmer•26m ago•1 comments

State of Mozilla 2025

https://stateof.mozilla.org/
1•Levitz•27m ago•0 comments

I wrote a minimal PyTorch FSDP to understand how it works (~240 LOC)

https://github.com/0xNaN/edufsdp
1•xnan•29m ago•0 comments

The privacy risks of Google's Personal Intelligence

https://www.washingtonpost.com/technology/2026/01/27/google-personal-intelligence-privacy/
1•bookofjoe•31m ago•1 comments

UPS to cut additional 30k jobs in Amazon unwind, turnaround plan

https://www.cnbc.com/2026/01/27/ups-job-cuts-amazon-unwind-turnaround-plan.html
3•Noaidi•32m ago•0 comments

Apple to Soon Take Up to 30% Cut from All Patreon Creators in iOS App

https://www.macrumors.com/2026/01/28/patreon-apple-tax/
21•pier25•33m ago•6 comments