Show HN: Fahmatrix – A Lightweight, Pandas-Like DataFrame Library for Java

https://github.com/moustafa-nasr/fahmatrix
46•mousomashakel•1y ago
Hey HN, I’ve built Fahmatrix, a minimal, fast Java library for working with tabular data — inspired by Python’s pandas, but designed for performance and simplicity on the JVM.

After working extensively with Python’s data stack, I often ran into limitations related to speed, especially in larger or long-running data workflows. So I built Fahmatrix from scratch to offer similar APIs for manipulating CSVs, performing summary statistics, slicing rows/columns, and more — but all in Java.

Features:

Lightweight and dependency-free

CSV/TSV import with auto-headers

Series/DataFrame structures (like pandas)

describe(), mean(), stdDev(), percentile() and more

Fast parallel operations on numeric columns

Java 17+ support
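
For readers curious what parallel summary statistics on a numeric column can look like with only the JDK, here is a minimal sketch. It is not taken from Fahmatrix: the class name ColumnStats and the method signatures are hypothetical, and the library's real mean(), stdDev() and percentile() may behave differently (for example in how they interpolate percentiles or handle missing values).

```java
import java.util.Arrays;

// Hypothetical helper, NOT Fahmatrix's API: plain-JDK statistics on a double[] column.
public class ColumnStats {

    // Arithmetic mean, computed with a parallel stream; NaN for an empty column.
    public static double mean(double[] column) {
        return Arrays.stream(column).parallel().average().orElse(Double.NaN);
    }

    // Sample standard deviation (n - 1 denominator); NaN when fewer than two values.
    public static double stdDev(double[] column) {
        if (column.length < 2) {
            return Double.NaN;
        }
        double m = mean(column);
        double sumSq = Arrays.stream(column)
                .parallel()
                .map(x -> (x - m) * (x - m))
                .sum();
        return Math.sqrt(sumSq / (column.length - 1));
    }

    // Percentile with linear interpolation between the two closest ranks; p in [0, 100].
    public static double percentile(double[] column, double p) {
        if (column.length == 0) {
            return Double.NaN;
        }
        double[] sorted = column.clone();   // leave the caller's array untouched
        Arrays.parallelSort(sorted);
        double rank = p / 100.0 * (sorted.length - 1);
        int lo = (int) Math.floor(rank);
        int hi = (int) Math.ceil(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    public static void main(String[] args) {
        double[] prices = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        System.out.println("mean   = " + mean(prices));
        System.out.println("stdDev = " + stdDev(prices));
        System.out.println("median = " + percentile(prices, 50));
    }
}
```

Parallel streams only pay off on large columns; for a handful of values the sequential version is usually just as fast.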

Docs: https://moustafa-nasr.github.io/Fahmatrix/

GitHub: https://github.com/moustafa-nasr/fahmatrix

I’d love feedback from the Java and data communities — especially if you’ve ever wanted a simple dataframe utility in Java without needing full-scale ML libraries.

Happy to answer any questions!

Comments

rickette•1y ago
Congrats on putting this out there. There isn't a de facto pandas-like library in Java like you said. But for Kotlin there is: https://github.com/Kotlin/dataframe
mousomashakel•12mo ago
Thanks so much! Yep, I’ve seen the Kotlin DataFrame lib — very elegant. Fahmatrix is meant for plain Java users who want similar capabilities without switching ecosystems. Appreciate the support!
uwemaurer•1y ago
Always great to see efforts to make working with data frames easier. Here are some similar data frame libraries for Java:

https://github.com/jtablesaw/tablesaw

https://github.com/dflib/dflib

My preferred way is to just use the DuckDB Java API. I haven't seen anything better in performance/efficiency. Also, a SQL query is often easier to write.

theanonymousone•1y ago
Yes. It has bothered me for a long time too. Maybe the best mix is a dataframe library with basic operations (column select, non-null etc), which also allows SQL for more complex stuff?
radus•1y ago
Polars and duckdb interoperate nicely and can enable this flexibility
theanonymousone•1y ago
Does Polars have a Java library?
mousomashakel•12mo ago
Totally agree that SQL can be the best tool for many jobs. My goal with Fahmatrix is to serve the opposite niche: where devs want something that's Java-native, procedural, and simple without reaching for an external engine. SQL support or DSL might come later though — I see the appeal.
theanonymousone•12mo ago
Sure. So maybe another comment would be to make it (particularly the Series class) as compatible with Java Streams as possible.

Next step would likely be compatibility with popular libraries such as Apache Commons Math: https://commons.apache.org/proper/commons-math/userguide/sta...
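
To make the Streams suggestion concrete, here is a rough sketch of a column type that converts to and from a standard DoubleStream. Everything in it is an assumption for illustration: StreamableSeries is a made-up class, not Fahmatrix's Series, and the method names are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

// Hypothetical column type for illustration; NOT Fahmatrix's actual Series class.
public final class StreamableSeries {
    private final String name;
    private final double[] values;

    public StreamableSeries(String name, double[] values) {
        this.name = name;
        this.values = values.clone();   // defensive copy keeps the series immutable
    }

    // Collect any DoubleStream pipeline back into a series.
    public static StreamableSeries of(String name, DoubleStream source) {
        return new StreamableSeries(name, source.toArray());
    }

    public String name() {
        return name;
    }

    // Expose the column as a DoubleStream so filter/map/sum etc. work unchanged.
    public DoubleStream stream() {
        return Arrays.stream(values);
    }

    // Copy of the raw values, for libraries that take a plain double[].
    public double[] toArray() {
        return values.clone();
    }

    public static void main(String[] args) {
        StreamableSeries tempsC =
                new StreamableSeries("temp_c", new double[] {18.5, 21.0, 19.5, 25.0});

        // Keep readings above 19 °C and convert them to Fahrenheit.
        StreamableSeries warmF = StreamableSeries.of(
                "warm_f",
                tempsC.stream().filter(c -> c > 19.0).map(c -> c * 9 / 5 + 32));

        System.out.println(warmF.name() + " " + Arrays.toString(warmF.toArray()));
    }
}
```

A toArray() accessor like this would also cover much of the second point, since Commons Math utilities such as StatUtils.mean accept a plain double[].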

mousomashakel•12mo ago
Thanks! I'm aware of those great projects. Fahmatrix aims to offer a lightweight, dependency-free alternative that’s easy to embed in any Java app. DuckDB is super impressive, especially for SQL-heavy tasks — but my goal is more about a native, fluent API for those who prefer direct Java code over SQL.
skanga•1y ago
What about Tablesaw, Apache Arrow? How does this compare ...
mousomashakel•12mo ago
Good question. I’ll publish benchmarks soon, but the core difference is that Fahmatrix is fully Java, no JNI, and minimalistic — ideal for small projects or environments like Android. Tablesaw and Arrow are more powerful, but heavier. Fahmatrix aims to be the “just enough” middle ground.
owlstuffing•1y ago
Nice!

I’m currently using manifold-sql with duckdb for this.

mousomashakel•12mo ago
Thanks! That’s a great combo — manifold-sql + duckdb gives you strong typing with powerful SQL under the hood. Fahmatrix is aiming to complement that approach for cases where you want quick, native Java code without SQL — e.g., when building data flows or custom logic inline. Would love to hear if you’ve hit any pain points that a Java-native approach could help with.
