Show HN: Parqeye – A CLI tool to visualize and inspect Parquet files

https://github.com/kaushiksrini/parqeye
167•kaushiksrini•2mo ago
I built a Rust-based CLI/terminal UI for inspecting Parquet files—data, metadata, and row-group-level structure—right from the terminal. If someone sent me a Parquet file, I used to open DuckDB or Polars just to see what was inside. Now I can do it with one command.

Repo: https://github.com/kaushiksrini/parqeye

Comments

WorldPeas•2mo ago
thank you so much! this was an annoyance of mine for so long. edit: any chance you make a brew package? if you'd like I'd be happy to PR it in.
kaushiksrini•2mo ago
yep! it’s available as a homebrew tap — you can install it with: `brew install kaushiksrini/parqeye/parqeye`
WorldPeas•2mo ago
wondrous.
dacox•2mo ago
awesome! i was just looking at a bucket full of parquet files from last year trying to recall some things about them.

i tried to install with brew, but it told me my cli tools were "too out of date". Never seen that before, and I had just upgraded, too.

Will try again tomorrow

papers1010•2mo ago
It’s crazy how long we’ve gone without a tool like this. This is huge. Thank you for finally building this!
0cf8612b2e1e•2mo ago
It is really incredible how poor the parquet tooling has been for years. The cornerstone of data engineering, yet just inspecting a file is needlessly clunky.
lolive•2mo ago
Apart from some visual glitches, this is an INSTANT BUY!

Note: must the Windows binary really be 78MB?

ch2026•2mo ago
CLIs are bulky
lolive•2mo ago
Can DuckDB be included in the tool, so you can run queries directly from the UI? [that would avoid opening DBeaver whenever you need that kind of feature]
lolive•2mo ago
Hu huuum... https://harlequin.sh/
mrasong•2mo ago
This tool actually feels pretty solid too.
banga•2mo ago
Looks like a nice tool, but failed for me when reading a geoparquet file created using duckdb.
kylebarron•2mo ago
Looks great!

Another seemingly extremely similar project released in the last few days: https://github.com/raulcd/datanomy

kaushiksrini•2mo ago
a growing need to look inside columnar data files!
jspanos2•2mo ago
This is very impressive. Look forward to using this
bigshik•2mo ago
Nice work—this hits a real pain point with Parquet. My main use case is debugging partitioned datasets on S3 with schema drift and skew, where I care about: which files/partitions have schema mismatches, weird row-group stats (all-null, out-of-range, huge skew), and doing that via metadata only.

Right now parqeye looks mainly single-file focused. Do you have plans for a “dataset mode” that takes a dir/S3 prefix and surfaces per-file/row-group summaries (row counts, min/max, null %, schema diffs vs a reference file) using just Parquet stats so it scales to tens of GB? Or do you see parqeye intentionally staying a single-file inspector?
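
A minimal sketch of the metadata-only checks described above, assuming each file's schema has already been reduced to a `{column: type}` dict and each row group to per-column row and null counts (for example, read from the Parquet footer with a library such as PyArrow). `ColumnStats`, `schema_diff`, and `flag_row_group` are hypothetical names for illustration, not part of parqeye:

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Per-column summary for one row group (hypothetical container)."""
    row_count: int   # total rows in the row group, nulls included
    null_count: int  # rows where this column is null


def schema_diff(reference: dict, candidate: dict) -> dict:
    """Compare two {column_name: type_name} schemas against each other."""
    shared = set(reference) & set(candidate)
    return {
        # Columns present in the reference file but absent in the candidate.
        "missing": sorted(set(reference) - set(candidate)),
        # Columns the candidate has that the reference does not.
        "extra": sorted(set(candidate) - set(reference)),
        # Columns present in both whose type differs: (reference, candidate).
        "changed": {
            name: (reference[name], candidate[name])
            for name in sorted(shared)
            if reference[name] != candidate[name]
        },
    }


def flag_row_group(stats: dict) -> list:
    """Return warnings for one row group given {column_name: ColumnStats}."""
    warnings = []
    for name, col in sorted(stats.items()):
        if col.row_count > 0 and col.null_count == col.row_count:
            warnings.append(f"{name}: all values are null")
    return warnings
```

Because both functions only look at small per-file summaries, the cost would scale with the number of files and row groups rather than with the data volume.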

swety101•2mo ago
Such a cool idea!! So helpful
dionian•2mo ago
tried it out. love it.
jasonjmcghee•2mo ago
Yours looks much better for your use case, but fwiw you can do it in a single command with duckdb too (but not interactive etc.):

    duckdb -c "from 'foo.parquet'"

but maybe still useful for other formats or multi-file or remote situations
llimllib•2mo ago
I use a little shell alias that drops me into duckdb with the file loaded into a table for interactive querying:

https://github.com/llimllib/personal_code/blob/c1a74b1b9527f...

joelthelion•2mo ago
What is really missing for parquet's wide adoption is support in Excel.
alentred•2mo ago
Very nice that it can show the metadata. If you'd rather focus on the data itself, a Swiss army knife in the terminal is VisiData [1]. It works with many formats, from CSV to Parquet. I think you'd need to install PyArrow to read Parquet files. VisiData is great not only for peeking into a file but also for filtering, sorting, computing simple metrics, and even plotting a histogram or scatterplot. I avoided a lot of Jupyter notebooks by using VisiData :)

[1] https://www.visidata.org/

nathanscully•2mo ago
I found a similar tool called nail-parquet[1] which has some nice query functions. I packaged[2] it up for nixpkgs but it’s stuck in merge limbo…

[1] https://github.com/Vitruves/nail-parquet

[2] https://github.com/NixOS/nixpkgs/pull/449066

hilti•2mo ago
Similar tool for JSONL files: I built JSONL Viewer Pro after repeatedly crashing VS Code trying to inspect multi-GB training datasets and IoT device logs with nested objects.

Native Mac/Windows app with multi-threaded parsing (simdjson) and automatic nested object flattening; it handles 10M+ rows instantly.

For HN: Use code HN100 for free access

https://iotdatasystems.gumroad.com/

Built with C++ for native performance (~6MB app, not Electron).

Would love feedback from folks working with large JSONL files.

tomtom1337•2mo ago
Super quick feedback: opening that link on my phone shows me two options next to each other, seemingly with the same name / description (followed by …) and the same price tag. I had to turn my phone sideways to see that there is a Windows and a Mac version.

I think you can afford the extra characters to show the whole page in portrait mode. (iPhone 16 Pro, Safari)

https://imgur.com/a/aTxO3sp

hilti•2mo ago
I will change the description. Thank you!
hilti•2mo ago
Quick update: Mac ZIP had a corruption issue that's now fixed. Anyone who downloaded in the last few hours - please re-download!

Also just added a Data Plot feature for visualizing numeric columns.

Thanks to everyone who reported the issue!

el_oni•2mo ago
Beautiful. I'm currently deep into getting our data into Iceberg from Firehose and I'm really curious what metadata is written: are bloom filters being written for the columns I want? Have my compaction and sort jobs helped min-max statistics on those columns?

Will take a look when i get to my laptop!
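
For context on where that metadata lives: a Parquet file starts and ends with the 4-byte magic `PAR1`, and the 4 bytes before the trailing magic give the little-endian length of the footer, which holds the schema and per-column-chunk metadata such as statistics. A minimal stdlib-only sketch that locates the footer without reading any data pages (`footer_length` is a hypothetical helper; decoding the Thrift-encoded footer itself is left to a real Parquet library, and files with an encrypted footer are not handled):

```python
import os
import struct

MAGIC = b"PAR1"


def footer_length(path: str) -> int:
    """Return the size in bytes of the footer metadata of a Parquet file."""
    # Smallest possible layout: leading magic + footer length + trailing magic.
    if os.path.getsize(path) < 12:
        raise ValueError("file too small to be Parquet")
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            raise ValueError("missing leading PAR1 magic")
        # The last 8 bytes are the footer length followed by the magic.
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)
    if tail[4:] != MAGIC:
        raise ValueError("missing trailing PAR1 magic")
    (length,) = struct.unpack("<I", tail[:4])
    return length
```

Only a few bytes at each end of the file are read, which is why metadata inspection can stay fast even on very large files.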

MayeulC•2mo ago
This looks very handy, thank you for working on this and making it open source.

I did submit a feature request for vi keybindings; though I could look into contributing this myself if I find a bit of spare time.

The other thing that surprised me was the size of the binaries: 90MB for a TUI tool (x64 Linux)? I wonder what the bulk of that is? Is there an issue with LTO? Another commenter noticed this as well.

It also looks like you are building against a relatively recent glibc (2.34), which limits compatibility with older systems. Building against an older glibc can be hard to do, so I am not faulting you here, and you do provide a musl fallback, which is appreciated (mandatory notice that the musl allocator can dramatically degrade the performance of Rust programs, just in case you were not aware of this).

A few more ideas for improvements (you probably already have your own laundry list):

- Mouse support?

- Seeing that you do have graphs, it would be fun to see a scatter plot as well as a distribution plot under statistics in the "Row Groups" tab (though you probably pull these from the metadata, so that would require further processing, which may be out of scope).

mgaunard•2mo ago
what was wrong with using a Python REPL with pyarrow/polars/duckdb for this?
pratio•2mo ago
This looks beautiful, but we're heavily invested in S3, so I'll wait for remote support
amelius•2mo ago
Isn't this what we have spreadsheets for?

Also allows you to do computations on the data in place.

fluffet•2mo ago
Great! I worked a lot with parquet like 5 years ago. The frustration and tilt working with the tooling was immense. Thank you for building this, it feels like resolving some old knot in my soul.

Some kind soul made this repository then, and I found it on like the 13th page of Google while in the depths of despair. It is my most treasured GitHub star, a shining beacon that saved me. I see it has saved 17 other people too.

https://github.com/casidiablo/parquet-tools-for-dumb-people-...

otsaloma•2mo ago
It's unfortunate that Python and R don't really have any out-of-the-box means of opening data files from arguments, but if you do this kind of stuff on a daily basis it's something that you can set up. My examples, which are not directly usable as-is, are below.

Python (uv + dataiter, but easy to modify for pandas or polars): https://github.com/otsaloma/dataiter/blob/master/bin/di-open

R (as per the comment, also requires ~/.Rprofile code; nanoparquet in this case): https://github.com/otsaloma/R-tools/blob/master/r-load
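
A rough sketch of the same idea using mostly the standard library, not the linked scripts: pick a reader from the file suffix and bind the result to a name for interactive use. The `load` function and the script name in the usage note are hypothetical, and the Parquet branch assumes PyArrow is installed:

```python
import csv
import json
import sys
from pathlib import Path


def load(path: str) -> list:
    """Load a data file into a list of dicts, choosing a reader by suffix."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        with open(path, newline="") as f:
            return list(csv.DictReader(f))
    if suffix == ".jsonl":
        with open(path) as f:
            # One JSON document per line; blank lines are skipped.
            return [json.loads(line) for line in f if line.strip()]
    if suffix == ".parquet":
        # Assumption: PyArrow is installed; it is not part of the stdlib.
        import pyarrow.parquet as pq
        return pq.read_table(path).to_pylist()
    raise ValueError(f"unsupported file type: {suffix!r}")


if __name__ == "__main__" and len(sys.argv) > 1:
    data = load(sys.argv[1])
    print(f"loaded {len(data)} rows into `data`")
```

Saved as, say, `open_data.py`, running `python -i open_data.py foo.csv` would leave you at a Python prompt with the rows bound to `data`.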

seeg•2mo ago
Nice tool!

BTW, you can use DuckDB with its UI plugin to get an interactive view of your data, and not only for Parquet.
