Many Let's Encrypt renewals had errors today

https://letsencrypt.status.io/#2026
107•widdakay•1h ago•58 comments

Ice Water Drowning Survival After 147-Minute Submersion and Hypothermic Arrest

https://www.jacc.org/doi/10.1016/j.jaccas.2025.104885
78•js2•2h ago•19 comments

To study how chips work, MIT researchers built their own operating system

https://news.mit.edu/2026/to-study-how-chips-really-work-mit-researchers-built-their-own-operatin...
120•speckx•3d ago•10 comments

DuckDB Internals: Why Is DuckDB Fast? (Part 1)

https://www.greybeam.ai/blog/duckdb-internals-part-1
61•marklit•2d ago•32 comments

Gribouille 0.3.0: A Grammar of Graphics for Typst

https://mickael.canouil.fr/posts/2026-06-15-gribouille-0-3/
24•mcanouil•3d ago•1 comments

Zero-Touch OAuth for MCP

https://blog.modelcontextprotocol.io/posts/enterprise-managed-auth/
165•niyikiza•8h ago•58 comments

I found 10k GitHub repositories distributing Trojan malware

https://orchidfiles.com/github-repositories-distributing-malware/
735•theorchid•18h ago•174 comments

Building a robotics research setup that lives next to my desk

https://dfdxlabs.com/research/2026/robotics-setup/
50•mplappert•15h ago•14 comments

How Japan's railways stayed one while splitting apart

https://arun.is/blog/jr-logo/
77•ddrmaxgt37•1d ago•58 comments

Datasette Apps: Host custom HTML applications inside Datasette

https://simonwillison.net/2026/Jun/18/datasette-apps/
58•lumpa•4h ago•18 comments

Ubiquiti: Enterprise NAS, Built on ZFS

https://blog.ui.com/article/introducing-enterprise-nas
308•ksec•15h ago•269 comments

Cell-based architecture for resilient payment systems

https://americanexpress.io/cell-based-architecture-for-resilient-payment-systems/
114•birdculture•3d ago•43 comments

CS 6120: Advanced Compilers: The Self-Guided Online Course (2020)

https://www.cs.cornell.edu/courses/cs6120/2025fa/self-guided/
346•ibobev•18h ago•49 comments

Show HN: Talos – Open-source WASM interpreter for Lean

https://github.com/cajal-technologies/talos
32•mfornet•16h ago•3 comments

Horizons JPL Solar System Data Demo and NASA DSN Updates: Datastar, Common Lisp

https://horizons.lambda-combine.net/
44•adityaathalye•4d ago•1 comments

Flexport (YC W14) Is Hiring in Indonesia, India, and Thailand

https://www.flexport.com/company/careers/
1•thedogeye•4h ago

Hospitals and universities repurposing drugs at lower cost

https://www.kcl.ac.uk/news/hospitals-and-universities-repurposing-drugs-at-90-lower-cost
302•giuliomagnifico•19h ago•130 comments

.gitignore Isn't the only way to ignore files in Git

https://nelson.cloud/.gitignore-isnt-the-only-way-to-ignore-files-in-git/
357•FergusArgyll•19h ago•116 comments

Show HN: Are You in the Weights?

https://www.intheweights.com/
293•turtlesoup•9h ago•152 comments

I told them forced consent was unlawful. 5 years later it cost Elkjop €1.8M

https://www.thatprivacyguy.com/blog/elkjop-forced-consent-fine/
305•speckx•11h ago•151 comments

If your product is Great, it doesn't need to be Good (2010)

http://paulbuchheit.blogspot.com/2010/02/if-your-product-is-great-it-doesnt-need.html
54•skogstokig•3d ago•33 comments

Zork name origin got an update on Wikipedia

https://www.dpolakovic.space/blogs/zork-part2#update
76•dpola•9h ago•12 comments

Launch HN: TesterArmy (YC P26) – Agents that test web and mobile apps

https://tester.army
112•okwasniewski•15h ago•48 comments

W Social, public institutions and the theater of European digital sovereignty

https://blog.elenarossini.com/w-social-public-institutions-and-the-theater-of-european-digital-so...
192•nemoniac•17h ago•127 comments

Noam Shazeer Joins OpenAI

https://twitter.com/NoamShazeer/status/2067400851438932297
318•lukasgross•1d ago•309 comments

Modos Color Monitor Pushes E-Paper Displays Further

https://spectrum.ieee.org/modos-e-paper-monitor
253•Vinnl•18h ago•67 comments

Swiss parliament lifts ban on new nuclear power plants

https://www.bluewin.ch/en/news/switzerland/parliament-lifts-ban-on-new-nuclear-power-plants-32575...
738•leonidasrup•15h ago•647 comments

How Alberta Eradicated Rats

https://worksinprogress.co/issue/albertas-war-on-rats/
160•tzury•16h ago•113 comments

The Token Compression Illusion: Why I'm Skeptical of RTK

https://mroczek.dev/articles/the-token-compression-illusion-why-im-skeptical-of-rtk/
95•lackoftactics•12h ago•99 comments

Show HN: Gerrymandle - Daily puzzle game where you redraw electoral districts

https://gerrymandle.cc/
162•realmofthemad•15h ago•68 comments
DuckDB Internals: Why Is DuckDB Fast? (Part 1)

https://www.greybeam.ai/blog/duckdb-internals-part-1
61•marklit•2d ago

Comments

steve_adams_86•1h ago
> DuckDB has received widespread adoption because it's just so damn easy to use.

This was a major factor in my initial adoption. Since then it has stuck because it’s also absurdly capable, versatile, and fast.

If it wasn’t so easy to use I suspect I wouldn’t have adopted it when I did. The ergonomics are crazy. It still impresses me regularly.

jkubicek•1h ago
What do you use it for? I’m perpetually interested in using DuckDB, but it doesn’t seem to do anything I need.
edweis•38m ago
I personally find it useful to search logs with AI
steve_adams_86•14m ago
Yes, it’s amazing for giving rails and structure to data so you can be sure an LLM is making more sense than it might with grep and jq. It also allows a little more sanity at scale with jobs like this. You can get pretty crazy with parquet in S3 with an engine like duckdb. And it’s dirt cheap to keep that stuff hanging around for future reference and sanity checking your understanding of things.

For data I reference frequently, and especially data that I know will grow over time, I’ve started using Rill because it makes ad-hoc exploration very smooth and low-friction.

My process tends to be something like:

1. Explore logs or some other at least somewhat structured dataset

2. Use Claude to find useful patterns and determine how I might benefit from this data in ways I wasn’t yet aware of

3. See how often it’s useful for decision making

4. If it’s frequently useful, formalize it as a view in my Rill instance and refine the models to maximize their utility

medvezhenok•14m ago
Basically like a locally hosted Snowflake - it only shines if you have enough data to analyze (100 MB - 100 GB is probably the sweet-spot range - less than that and the benefits are small, more than that and you risk flying too close to the sun with memory usage).

It has connectors for Postgres & other stores, so I find it faster to connect to a Postgres instance, pull all of the data from a table (even if the table is like 50GB - if you have 30 cores on the machine it will pull from Postgres using 30 cores in parallel, so it will only take a minute or two) - and then any analytical queries on the data are 10+ times faster in DuckDB over native Postgres (GROUP BY, regexp_replace, count(distinct...) etc).
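
A minimal sketch of that workflow, assuming DuckDB's `postgres` extension and a hypothetical database `appdb` with a table `public.events` (the connection string, table, and column names are placeholders, not anything from the comment above):

```sql
-- Load the Postgres connector (one-time install, then load per session).
INSTALL postgres;
LOAD postgres;

-- Placeholder connection string; attach read-only so nothing is written back.
ATTACH 'dbname=appdb host=localhost user=analyst' AS pg (TYPE postgres, READ_ONLY);

-- Copy the table once into local DuckDB storage...
CREATE TABLE events_local AS
SELECT * FROM pg.public.events;

-- ...then run the analytical queries against the local copy.
SELECT event_type,
       count(DISTINCT user_id) AS users
FROM events_local
GROUP BY event_type;
```

How fast the initial pull runs will depend on the machine, the network, and the Postgres server, so treat the timings mentioned above as one person's experience.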

orthoxerox•11m ago
All kinds of data processing. For example, you download a million rows of metrics and load them in Excel to build pivot tables. It works, but now it's a billion rows. If you know SQL, it's a snap to point DuckDB at the source CSV or JSON and get the results in a second.
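
As a rough illustration of the pivot-table case, assuming a hypothetical `metrics.csv` with `host` and `latency_ms` columns:

```sql
-- Hypothetical file and column names; DuckDB infers the schema from the CSV itself.
SELECT host,
       count(*)        AS samples,
       avg(latency_ms) AS avg_latency_ms
FROM 'metrics.csv'
GROUP BY host
ORDER BY avg_latency_ms DESC;
```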
steve_adams_86•7m ago
The most interesting use case lately has been using it as the transformation and validation engine for a CLI that handles scientific data. Some datasets are small and could have been handled at the application layer, but some are quite massive (especially genomic data). DuckDB bundles with the CLI and travels around any platform, is super lightweight, allows for easily running in CI, on a user’s machine, against datasets of all sizes, and so on.

There are other embeddable options out there but I found DuckDB fit better for the potentially massive datasets, and also because of how naturally it ingests the types of data we work with, some of its unique features, and how trivial it was to learn and integrate with the project.

Otherwise I use it almost daily for doing guardrailed data exploration with LLMs. I prefer SQL over random DSLs in AWS or Sentry or what have you. I’ll ingest the data I need and just run SQL against it. I mentioned in another comment that I’ll tend to store more useful data (especially data I export routinely, like infra cost reports) on S3 and use a Rill instance to do basic exploration in a GUI (it will query remote parquet files).

thefourthchime•1h ago
I’m a huge fan, and I’ve been wanting to get into the internals. Looking forward to digging in.
anitil•1h ago
DuckDB makes so much of my life easier, though I've never used it for large problems. The ability to run `select * from 'data.json'` is just lovely. The fact that it's also a powerhouse is so impressive. I'd usually expect a project to be good at small problems (like mine) xor large problems, but not both.
medvezhenok•8m ago
Yup. And an extra benefit that you can treat any file like a table, so you can also do something like

  UPDATE my_table
  SET x = file1.x,
      y = file2.y
  FROM 'first_file.csv' file1
  LEFT JOIN 's3://my_bucket/second_file.parquet' file2
    ON file1.id = file2.id
  WHERE my_table.id = file1.id;
jdw64•43m ago
The data scientists I work with use this. Why do they use it? I don't really know much about it, but I've noticed they use it quite often. I mainly use MySQL or PostgreSQL. What are the advantages of DuckDB? It seems like they usually use it as an alternative to Pandas.
bdcravens•37m ago
Primarily the ability to work directly with data in its native format (CSV for example) without needing ETL.
jdw64•32m ago
Then it definitely makes sense. Scientists usually handle a lot of CSV files. Thank you
throwaway7783•31m ago
How does this work in a production setup? Can this be set up like a server, or is it mostly for individual users to play around with data?
blackoil•27m ago
It is an OLAP db. So you can have a pipeline storing data in parquet files in S3. And then use DuckDB to directly query on it.
orthoxerox•16m ago
The idea is that you treat data storage and data processing as two distinct tasks. You have your data in S3 or HDFS or a local directory and you run DuckDB on whatever single-node compute you have: a local machine or a container in a cluster.

There are companies that write cluster computing engines with duckdb as the byte-cruncher at their heart, but usually it's more like NumPy, Pandas or Polars on steroids. Or SQLite, but for running OLAP queries.
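
A sketch of the "storage in S3, compute wherever" setup, assuming DuckDB's `httpfs` extension and a hypothetical bucket of order data (the bucket, credentials, and column names are all placeholders):

```sql
-- httpfs provides S3 and HTTP(S) access.
INSTALL httpfs;
LOAD httpfs;

-- Placeholder credentials; how you supply them will vary by environment.
CREATE SECRET my_s3 (
    TYPE s3,
    KEY_ID 'YOUR_KEY_ID',
    SECRET 'YOUR_SECRET',
    REGION 'us-east-1'
);

-- Query the Parquet files in place; the glob matches every file under the prefix.
SELECT order_date,
       sum(amount) AS revenue
FROM read_parquet('s3://my-bucket/orders/*.parquet')
GROUP BY order_date
ORDER BY order_date;
```

The same statements should work from a laptop or from a container in a cluster, which is the point of keeping storage and processing separate.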

smithclay•36m ago
If you're reading this and curious: consider writing a duckdb community extension* or contributing to an existing one*

duckdb is becoming a kind of data superglue between a lot of data ecosystems (GIS, observability, analytics, lakehouses, object storage, etc) that don't talk to each other typically, and it's worth checking out in 2026.

* https://github.com/duckdb/extension-template * https://duckdb.org/community_extensions/

pknerd•6m ago
Just curious whether one can earn money making these exts?
codingbear•33m ago
duckdb is so nice coupled with claude code. Its extensive file support and some very interesting decisions on locally caching data (like from S3 or snowflake) make it easy to slice and dice almost any kind of tabular data.
blackoil•25m ago
> duckdb is so nice coupled with claude code

Can you expand on that? Do you mean Claude Code uses it to store its memory/state, or that it can run business queries using DuckDB?

medvezhenok•12m ago
Claude code can write exploratory queries for you to give you a quick rundown on the shape of the data set, frequencies, missing values, etc etc (without having to load it into a more persistent data store or writing custom python scripts). I also find SQL snippets inherently more re-usable than custom python code.

You can also write a skill that CC can re-use if you're analyzing a lot of similar data sets with minor variance.
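
For a sense of what such exploratory queries might look like, here is a small sketch against a hypothetical `dataset.csv` with a `category` column (both names are made up for illustration):

```sql
-- Column names and inferred types.
DESCRIBE SELECT * FROM 'dataset.csv';

-- Per-column statistics such as min, max, approximate distinct count, and null percentage.
SUMMARIZE SELECT * FROM 'dataset.csv';

-- Frequency table for one column of interest.
SELECT category,
       count(*) AS n
FROM 'dataset.csv'
GROUP BY category
ORDER BY n DESC
LIMIT 20;
```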

0xferruccio•22m ago
DuckDB is amazing for any sort of fast data analysis when the data is small enough that it can fit on your laptop

Recently at work I've been using it to analyse the Claude code sessions of every engineer at our company (that we upload to S3) and it's been extremely helpful to help us find gaps in devex and have clear metrics to back up the impact of fixing them

Another thing it's been really useful for has been getting metrics on Claude skills usage and then diving into use cases by looking at the transcripts

Other engineers that had never touched DuckDB were so impressed with how easy it is for AI agents to write queries on our dataset

holografix•9m ago
Why is DuckDB so popular when one can use Python + Pandas?

Better perf + SQL, is that mostly it?

RobinL•1m ago
I wrote a blog post a while back to address this question here: https://www.robinlinacre.com/recommend_duckdb/
pknerd•8m ago
FTA:

> ..In-process means there's no server. You don't connect to DuckDB; you load it as a library inside your program, the same way you'd load NumPy or Polars

Does it mean it can perform all statistical computations as well if I want to use it for algo trading?

Panzerschrek•8m ago
If DuckDB is so fast and has no data transfer overheads, does it need all this typical SQL machinery with filtering and joining via SELECT queries? Wouldn't it be simpler and faster to return all data to the caller code (all table rows, but only requested columns) and let it perform all other necessary data processing logic?
pknerd•5m ago
umm can we say it can replace SQLite?
3eb7988a1663•3m ago
OLAP vs OLTP. Sure you could use one for the other, but they have ideal use cases. Updating a single record in SQLite is going to be more efficient than doing the same in DuckDB.
snissn•1m ago
I'm just curious - is duckdb too slow for people? This benchmark from clickhouse shows it being fairly slow compared to some options: https://jsonbench.com/
Demiurge•26m ago
Here is the thing, it’s a write only single file format. If you need to run analytical queries, it’s optimized for reading: you just open a file and query for the parts you want. If you have multiple clients that read and write data to the database, you should use postgresql.

It’s not really a database in the traditional sense, there is no ACID complexity, it’s a library that lets you write SQL to query a tabular data file.

medvezhenok•25m ago
DuckDB has been probably my most used tool in 2026 - if you're comfortable with SQL it's incredible at quickly prototyping and slicing / dicing data.

I do a lot of experiments with regexes, and if you get used to the RE2 syntax that DuckDB uses, you can see up to 10-100x uplift in terms of speed compared to Postgres on things like regexp_matches(), regexp_extract(), etc (depending on query/table/machine specifics). It has quite powerful scripting with custom macros, and it fixes a lot of annoyances of SQL for me compared to Postgres.

I think if you have access to a machine with a lot of RAM / cores and a beefy data set, then it's basically like a RAMdisk version of Snowflake running locally on your machine.

(and of course the fact that it makes it convenient to read CSV/parquet, read/write from S3, etc) - it's a very ergonomic tool.
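
A small sketch of the regex functions and macros mentioned above, assuming a hypothetical `app_logs.csv` with a `message` column containing text like `took 42ms` (the file, column, and macro name are invented for illustration):

```sql
-- A reusable scalar macro: pull the millisecond count out of a log line.
-- TRY_CAST yields NULL instead of an error when nothing was extracted.
CREATE MACRO extract_ms(line) AS
    TRY_CAST(regexp_extract(line, 'took (\d+)ms', 1) AS INTEGER);

-- Keep only lines that match the pattern, then apply the macro.
SELECT extract_ms(message) AS duration_ms
FROM 'app_logs.csv'
WHERE regexp_matches(message, 'took \d+ms');
```

Any speedup over Postgres will depend on the query, table, and machine, as noted above.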

jdw64•21m ago
Thank you for your kind reply. I should look into it too. In my case, knowing various libraries is directly related to my livelihood. Have a good day.