Show HN: Fahmatrix – A Lightweight, Pandas-Like DataFrame Library for Java

https://github.com/moustafa-nasr/fahmatrix
46•mousomashakel•1y ago
Hey HN, I’ve built Fahmatrix, a minimal, fast Java library for working with tabular data — inspired by Python’s pandas, but designed for performance and simplicity on the JVM.

After working extensively with Python’s data stack, I often ran into limitations related to speed, especially in larger or long-running data workflows. So I built Fahmatrix from scratch to offer similar APIs for manipulating CSVs, performing summary statistics, slicing rows/columns, and more — but all in Java.

Features:

Lightweight and dependency-free

CSV/TSV import with auto-headers

Series/DataFrame structures (like pandas)

describe(), mean(), stdDev(), percentile() and more

Fast parallel operations on numeric columns

Java 17+ support
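For readers who want a feel for what the statistics listed above involve, here is a minimal plain-JDK sketch. It is not Fahmatrix's API: the `ColumnStats` class and its method names are hypothetical, and the conventions chosen here (sample standard deviation with an n - 1 denominator, percentile by linear interpolation, matching pandas' defaults) are assumptions rather than anything confirmed about the library.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Fahmatrix.
final class ColumnStats {
    private ColumnStats() {}

    // Arithmetic mean over a numeric column, computed with a parallel stream.
    static double mean(double[] values) {
        return Arrays.stream(values).parallel().average().orElse(Double.NaN);
    }

    // Sample standard deviation (n - 1 denominator). Assumed convention.
    static double stdDev(double[] values) {
        if (values.length < 2) {
            return Double.NaN;
        }
        double m = mean(values);
        double sumSq = Arrays.stream(values)
                .parallel()
                .map(v -> (v - m) * (v - m))
                .sum();
        return Math.sqrt(sumSq / (values.length - 1));
    }

    // Percentile for p in [0, 100], linearly interpolating between neighbours.
    static double percentile(double[] values, double p) {
        if (p < 0 || p > 100) {
            throw new IllegalArgumentException("p must be in [0, 100]");
        }
        if (values.length == 0) {
            return Double.NaN;
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        double rank = p / 100.0 * (sorted.length - 1);
        int lo = (int) Math.floor(rank);
        int hi = (int) Math.ceil(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    public static void main(String[] args) {
        double[] column = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("mean   = " + mean(column));
        System.out.println("stdDev = " + stdDev(column));
        System.out.println("median = " + percentile(column, 50));
    }
}
```

Parallel streams only pay off on large columns; for small arrays the sequential version is usually just as fast.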

Docs: https://moustafa-nasr.github.io/Fahmatrix/ GitHub: https://github.com/moustafa-nasr/fahmatrix

I’d love feedback from the Java and data communities — especially if you’ve ever wanted a simple dataframe utility in Java without needing full-scale ML libraries.

Happy to answer any questions!

Comments

rickette•1y ago
Congrats on putting this out there. There isn't a de facto pandas-like library in Java like you said. But for Kotlin there is: https://github.com/Kotlin/dataframe
mousomashakel•1y ago
Thanks so much! Yep, I’ve seen the Kotlin DataFrame lib — very elegant. Fahmatrix is meant for plain Java users who want similar capabilities without switching ecosystems. Appreciate the support!
uwemaurer•1y ago
Always great to see efforts to make working with data frames easier. Here are some similar data frame libraries for Java:

https://github.com/jtablesaw/tablesaw

https://github.com/dflib/dflib

My preferred way is to just use the DuckDB Java API. I haven't seen anything better in performance/efficiency. Also, a SQL query is often easier to write.

theanonymousone•1y ago
Yes. It has bothered me for a long time too. Maybe the best mix is a dataframe library with basic operations (column select, non-null etc), which also allows SQL for more complex stuff?
radus•1y ago
Polars and duckdb interoperate nicely and can enable this flexibility
theanonymousone•1y ago
Does Polars have a Java library?
mousomashakel•1y ago
Totally agree that SQL can be the best tool for many jobs. My goal with Fahmatrix is to serve the opposite niche: where devs want something that's Java-native, procedural, and simple without reaching for an external engine. SQL support or DSL might come later though — I see the appeal.
skanga•1y ago
What about Tablesaw, Apache Arrow? How does this compare ...
mousomashakel•1y ago
Good question. I’ll publish benchmarks soon, but the core difference is that Fahmatrix is fully Java, no JNI, and minimalistic — ideal for small projects or environments like Android. Tablesaw and Arrow are more powerful, but heavier. Fahmatrix aims to be the “just enough” middle ground.
owlstuffing•1y ago
Nice!

I’m currently using manifold-sql with duckdb for this.

mousomashakel•1y ago
Thanks! That’s a great combo — manifold-sql + duckdb gives you strong typing with powerful SQL under the hood. Fahmatrix is aiming to complement that approach for cases where you want quick, native Java code without SQL — e.g., when building data flows or custom logic inline. Would love to hear if you’ve hit any pain points that a Java-native approach could help with.
theanonymousone•1y ago
Sure. So maybe another suggestion would be to make it (particularly the Series class) as compatible with Java Streams as possible.
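As a rough illustration of what Streams compatibility could mean in practice, the sketch below uses a hypothetical `NumericSeries` class, not Fahmatrix's actual `Series`. It assumes the series is backed by a `double[]`, exposes it as a `DoubleStream`, and can be rebuilt from one.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

// Hypothetical stand-in for a numeric series; Fahmatrix's real Series may differ.
final class NumericSeries {
    private final double[] values;

    NumericSeries(double... values) {
        this.values = values.clone(); // defensive copy keeps the series immutable
    }

    // Expose the data as a standard DoubleStream so any JDK stream operation applies.
    DoubleStream stream() {
        return Arrays.stream(values);
    }

    // Build a new series from the result of a stream pipeline.
    static NumericSeries fromStream(DoubleStream stream) {
        return new NumericSeries(stream.toArray());
    }

    int size() {
        return values.length;
    }

    double get(int index) {
        return values[index];
    }

    public static void main(String[] args) {
        NumericSeries prices = new NumericSeries(1.0, -2.0, 3.0);
        // Filter and transform with ordinary stream operations, then wrap the result again.
        NumericSeries scaled = NumericSeries.fromStream(
                prices.stream().filter(v -> v > 0).map(v -> v * 10));
        System.out.println(scaled.size() + " positive values, sum = " + scaled.stream().sum());
    }
}
```

With this shape, `filter`, `map`, `summaryStatistics()` and collectors come for free from the JDK, and the library only has to provide the round trip to and from a stream.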

Next step would likely be compatibility with popular libraries such as Apache Commons Math: https://commons.apache.org/proper/commons-math/userguide/sta...

mousomashakel•1y ago
Thanks! I'm aware of those great projects. Fahmatrix aims to offer a lightweight, dependency-free alternative that’s easy to embed in any Java app. DuckDB is super impressive, especially for SQL-heavy tasks — but my goal is more about a native, fluent API for those who prefer direct Java code over SQL.
