
Show HN: Fahmatrix – A Lightweight, Pandas-Like DataFrame Library for Java

https://github.com/moustafa-nasr/fahmatrix
46•mousomashakel•10mo ago
Hey HN, I’ve built Fahmatrix, a minimal, fast Java library for working with tabular data — inspired by Python’s pandas, but designed for performance and simplicity on the JVM.

After working extensively with Python’s data stack, I often ran into limitations related to speed, especially in larger or long-running data workflows. So I built Fahmatrix from scratch to offer similar APIs for manipulating CSVs, performing summary statistics, slicing rows/columns, and more — but all in Java.

Features:

Lightweight and dependency-free

CSV/TSV import with auto-headers

Series/DataFrame structures (like pandas)

describe(), mean(), stdDev(), percentile() and more

Fast parallel operations on numeric columns

Java 17+ support
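
To make the statistics in the feature list concrete, here is a minimal plain-Java sketch of mean, standard deviation, and percentile over a single numeric column, using parallel streams. It is not Fahmatrix's implementation or API: the class name `ColumnStats` and its method signatures are hypothetical, and the choice of sample standard deviation (n - 1 denominator, as pandas does by default) and linear-interpolation percentiles is an assumption.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Fahmatrix): summary statistics for one numeric column. */
final class ColumnStats {
    private ColumnStats() {}

    /** Arithmetic mean computed with a parallel stream; NaN for an empty column. */
    static double mean(double[] column) {
        return Arrays.stream(column).parallel().average().orElse(Double.NaN);
    }

    /** Sample standard deviation (n - 1 denominator); NaN for fewer than two values. */
    static double stdDev(double[] column) {
        if (column.length < 2) {
            return Double.NaN;
        }
        double m = mean(column);
        double sumOfSquares = Arrays.stream(column)
                .parallel()
                .map(v -> (v - m) * (v - m))
                .sum();
        return Math.sqrt(sumOfSquares / (column.length - 1));
    }

    /** Percentile for p in [0, 100], interpolating linearly between the two closest ranks. */
    static double percentile(double[] column, double p) {
        if (p < 0 || p > 100) {
            throw new IllegalArgumentException("p must be in [0, 100], got " + p);
        }
        if (column.length == 0) {
            return Double.NaN;
        }
        double[] sorted = column.clone();   // leave the caller's data untouched
        Arrays.parallelSort(sorted);
        double rank = p / 100.0 * (sorted.length - 1);
        int lower = (int) Math.floor(rank);
        int upper = (int) Math.ceil(rank);
        return sorted[lower] + (rank - lower) * (sorted[upper] - sorted[lower]);
    }
}
```

For a column such as `{4.0, 8.0, 15.0, 16.0, 23.0, 42.0}`, calling `ColumnStats.mean(column)` and `ColumnStats.percentile(column, 50)` gives the mean and the median. Parallel streams only tend to pay off on large columns, so a real library would likely switch between sequential and parallel paths based on size.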

Docs: https://moustafa-nasr.github.io/Fahmatrix/

GitHub: https://github.com/moustafa-nasr/fahmatrix

I’d love feedback from the Java and data communities — especially if you’ve ever wanted a simple dataframe utility in Java without needing full-scale ML libraries.

Happy to answer any questions!

Comments

rickette•10mo ago
Congrats on putting this out there. There isn't a de facto pandas-like library in Java like you said. But for Kotlin there is: https://github.com/Kotlin/dataframe
mousomashakel•10mo ago
Thanks so much! Yep, I’ve seen the Kotlin DataFrame lib — very elegant. Fahmatrix is meant for plain Java users who want similar capabilities without switching ecosystems. Appreciate the support!
uwemaurer•10mo ago
Always great to see efforts to make working with data frames easier. Here are some similar data frame libraries for Java:

https://github.com/jtablesaw/tablesaw

https://github.com/dflib/dflib

My preferred way is to just use the DuckDB Java API. I haven't seen anything better in performance/efficiency. Also, a SQL query is often easier to write.

theanonymousone•10mo ago
Yes. It has bothered me for a long time too. Maybe the best mix is a dataframe library with basic operations (column select, non-null etc), which also allows SQL for more complex stuff?
radus•10mo ago
Polars and duckdb interoperate nicely and can enable this flexibility
theanonymousone•10mo ago
Does Polars have a Java library?
mousomashakel•10mo ago
Totally agree that SQL can be the best tool for many jobs. My goal with Fahmatrix is to serve the opposite niche: where devs want something that's Java-native, procedural, and simple without reaching for an external engine. SQL support or DSL might come later though — I see the appeal.
theanonymousone•10mo ago
Sure. So maybe another suggestion would be to make it (particularly the Series class) as compatible with Java Streams as possible.

Next step would likely be compatibility with popular libraries such as Apache Commons Math: https://commons.apache.org/proper/commons-math/userguide/sta...
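
One way to read the Streams suggestion above is a series type that hands out a standard `DoubleStream` and can be rebuilt from one. The sketch below is hypothetical: `NumericSeries` and its methods are illustrative names, not Fahmatrix's actual Series class, and it assumes a series backed by a plain `double[]`.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

/** Hypothetical numeric series (not Fahmatrix's Series class) that round-trips through Java Streams. */
final class NumericSeries {
    private final double[] values;

    NumericSeries(double... values) {
        this.values = values.clone();   // defensive copy keeps the series immutable
    }

    /** Exposes the values as a DoubleStream so callers can filter, map, and reduce as usual. */
    DoubleStream stream() {
        return Arrays.stream(values);
    }

    /** Builds a series from any DoubleStream, closing the round trip. */
    static NumericSeries from(DoubleStream stream) {
        return new NumericSeries(stream.toArray());
    }

    int size() {
        return values.length;
    }

    double get(int index) {
        return values[index];
    }
}
```

With that shape, dropping missing values and rescaling becomes ordinary stream code, for example `NumericSeries.from(series.stream().filter(v -> !Double.isNaN(v)).map(v -> v * 2))`, and `series.stream().summaryStatistics()` gives count, min, max, and average without any extra API.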

mousomashakel•10mo ago
Thanks! I'm aware of those great projects. Fahmatrix aims to offer a lightweight, dependency-free alternative that’s easy to embed in any Java app. DuckDB is super impressive, especially for SQL-heavy tasks — but my goal is more about a native, fluent API for those who prefer direct Java code over SQL.
skanga•10mo ago
What about Tablesaw, Apache Arrow? How does this compare ...
mousomashakel•10mo ago
Good question. I’ll publish benchmarks soon, but the core difference is that Fahmatrix is fully Java, no JNI, and minimalistic — ideal for small projects or environments like Android. Tablesaw and Arrow are more powerful, but heavier. Fahmatrix aims to be the “just enough” middle ground.
owlstuffing•10mo ago
Nice!

I’m currently using manifold-sql with duckdb for this.

mousomashakel•10mo ago
Thanks! That’s a great combo — manifold-sql + duckdb gives you strong typing with powerful SQL under the hood. Fahmatrix is aiming to complement that approach for cases where you want quick, native Java code without SQL — e.g., when building data flows or custom logic inline. Would love to hear if you’ve hit any pain points that a Java-native approach could help with.
