Apache Arrow is 10 years old

https://arrow.apache.org/blog/2026/02/12/arrow-anniversary/

66•tosh•3h ago

Comments

actionfromafar•1h ago

I had to look up what Arrow actually does, and I might have to run some performance comparisons vs sqlite.

It's very neat for some types of data to have columns contiguous in memory.

nu11ptr•1h ago

If I recall, Arrow is more or less a standardized representation in memory of columnar data. It tends to not be used directly I believe, but as the foundation for higher level libraries (like Polars, etc.). That said, I'm not an expert here so might not have full info.

tormeh•1h ago

You can absolutely use it directly, but it is painful. The USP of Arrow is that you can pass bits of memory between Polars, DataFusion, DuckDB, etc. without copying. It's Parquet but for memory.
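To make "pass bits of memory without copying" concrete, here is a minimal sketch using only Python's standard library. It is not Arrow itself: the `array` object stands in for an Arrow buffer, and the two `memoryview` objects stand in for two libraries that agree on the layout. The names `column`, `view_a` and `view_b` are illustrative assumptions.

```python
from array import array

# A contiguous column of 64-bit integers (a stand-in for an Arrow buffer).
column = array("q", [1, 2, 3, 4, 5])

# Two "consumers" look at the same memory through the buffer protocol.
# Neither call copies the underlying data.
view_a = memoryview(column)
view_b = memoryview(column)[1:4]  # slicing a memoryview is also zero-copy

# A write through the owner is visible in both views,
# which shows they share one block of memory.
column[2] = 30

print(view_a[2], view_b.tolist())
```

The idea behind Arrow is the same, except the layout is a published, language-independent specification, so the consumers can be different engines or different languages rather than two views inside one Python process.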

skeeter2020•59m ago

This is true, and as a result IME the problem space is much smaller than Parquet, but it can be really powerful. The reality is most of us don't work in environments where Arrow is needed.

data_ders•1h ago

yeah not necessarily compute (though it has a kernel)!

it's actually many things: an IPC protocol, a wire protocol, a database connectivity spec, etc.

in reality it's about an in-memory tabular (columnar) representation that enables zero copy operations b/w languages and engines.

and, imho, it all really comes down to standard data types for columns!

skeeter2020•1h ago

>> some performance comparisons vs sqlite.

That's not really the purpose; it's really a language-independent format, so you don't need to convert the data for, say, a dataframe or R. It's columnar because for analytics (where you do lots of aggregations and filtering) this is way more performant; the data is intentionally stored so that each column's values are contiguous in memory. You probably already know, but the analytics equivalent of SQLite is DuckDB. Arrow can also eliminate the need to serialize/de-serialize data when sharing (e.g. in a high-performance data pipeline) because different consumers / tools / operations can use the same memory representation as-is.
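A rough sketch of the row-versus-column difference, again with the standard library only and not the Arrow API. The table contents and the names `rows`, `columns`, `total_qty` and `expensive_qty` are made up for illustration.

```python
from array import array

# Row-oriented layout: each record is a separate object, so an
# aggregation over one field has to touch every record.
rows = [
    {"id": 1, "price": 9.5, "qty": 2},
    {"id": 2, "price": 3.0, "qty": 10},
    {"id": 3, "price": 7.25, "qty": 4},
]
total_qty_rows = sum(r["qty"] for r in rows)

# Column-oriented layout: one contiguous, typed buffer per column.
columns = {
    "id": array("q", (r["id"] for r in rows)),
    "price": array("d", (r["price"] for r in rows)),
    "qty": array("q", (r["qty"] for r in rows)),
}

# An aggregation reads only the one buffer it needs.
total_qty = sum(columns["qty"])

# A filter plus aggregation reads two buffers and ignores the rest.
expensive_qty = sum(
    q for p, q in zip(columns["price"], columns["qty"]) if p > 5.0
)

print(total_qty_rows, total_qty, expensive_qty)
```

In pure Python the speed difference is small; the point is the access pattern. Engines built on a columnar layout can scan those contiguous buffers with vectorized code and skip columns a query never mentions.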

tosh•14m ago

Take a look at parquet.

You can also store arrow on disk but it is mainly used as in-memory representation.

data_ders•1h ago

if I could go back and tell myself in 2015, who had just found the feather library and was using it to power my unhinged topic-modeling-for-PowerPoint-slides work, what feather would become (Arrow) and the impact it would have on the data ecosystem, I would have looked at 2026 me like he was a crazy person.

Yet today I feel it was 2016 dataders who was the crazy one lol

ayhanfuat•1h ago

Indeed. feather was a library to exchange data between R and pandas dataframes. People tend to bash pandas but its creator (Wes McKinney) has changed the data ecosystem for the better with the learnings coming from pandas.

0xcafefood•54m ago

Do people bash pandas? If so, it reminds me of Bjarne's quip that the two types of programming languages are the ones people complain about and the ones nobody uses.

postexitus•45m ago

polars people do - although I wouldn't call polars something that nobody uses.

ayhanfuat•40m ago

I also use polars in new projects. I think Wes McKinney also uses it. If I remember correctly I saw him commenting on some polars memory related issues on GitHub. But a good chunk of polars' success can be attributed to Arrow which McKinney co-created. All the gripes people have with pandas, he had them too and built something powerful to overcome those.

mistrial9•24m ago

I saw Wes speak in the early days of Pandas, in Berkeley. He solved problems that others just worked around for decades. His solutions are quirky but the work was very solid. His career advanced a lot IMHO for substantial reasons.. Wes personally marched through swamps and reached the other side.. others complain and do what they always have done.. I personally agree with the criticisms of the syntax, but Pandas is real and it was not easy to build it.
