
HorizonDB, a geocoding engine in Rust that replaces Elasticsearch

https://radar.com/blog/high-performance-geocoding-in-rust
115•j_kao•5h ago

Comments

maelito•5h ago
I wonder if this could help Photon, the open source ElasticSearch/OpenSearch search engine for OSM data.

It's a mini-revolution in the OSM world, where most apps have a bad search experience where typos aren't handled.

https://github.com/komoot/photon
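
To make "typos aren't handled" concrete, here is a minimal sketch of typo-tolerant name matching using edit distance. It is only an illustration: it is not how Photon or HorizonDB do it, and the names `edit_distance`, `fuzzy_lookup`, and `max_edits` are hypothetical. A real engine would use an index rather than scanning every name.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (or keep) a character
            ))
        prev = curr
    return prev[-1]


def fuzzy_lookup(query: str, names: list[str], max_edits: int = 2) -> list[str]:
    """Return names within max_edits of the query, closest first (linear scan)."""
    q = query.casefold()
    scored = [(edit_distance(q, name.casefold()), name) for name in names]
    return [name for dist, name in sorted(scored) if dist <= max_edits]


if __name__ == "__main__":
    # "Berln" is one edit away from both "Berlin" and "Bern".
    print(fuzzy_lookup("Berln", ["Berlin", "Bern", "Boston"]))
```

The linear scan keeps the sketch short; it would not scale to a planet-sized gazetteer.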

sophia01•5h ago
They're not open sourcing it though?
pbowyer•4h ago
Doesn't sound like it, but it's a nice writeup of the tools they stitched together. For someone to copy and open source... hopefully :)
cicloid•3h ago
Tempted, especially to switch to H3 instead of S2… I prototyped a similar solution a couple of weeks ago, so I could probably do a second pass
ellenhp•1h ago
What's wrong with S2? H3 is so much more complex for very little gain from what I can tell.
ellenhp•1h ago
There are a few pieces of this that rely on proprietary data, especially the FastText training step, so that's a dead end unfortunately (would love to be proven wrong). I'd consider subbing in a small BERT model with a classifier head for something FOSS without access to tons of user data, but then you lose the ability to serve high QPS.
j_kao•2h ago
It's a bit difficult at the moment, given that we have a lot of proprietary data and a lot of the logic follows it. I'm hoping we can get it to a state where it can index and serve OSM data, but that is going to take some time.

That being said, we are currently working on getting our Google S2 Rust bindings open-sourced. This is a geo-hashing library that makes it very easy to write a reverse geocoder, even from a point-in-polygon or polygon-intersection perspective.
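
A rough sketch of the cell-plus-point-in-polygon idea behind a reverse geocoder. This is not S2 and not Radar's bindings: a flat grid of `CELL_DEG`-degree cells stands in for S2's hierarchical cells on the sphere, polygons are registered under every cell their bounding box touches, and all names here (`ReverseGeocoder`, `cell_of`, `CELL_DEG`) are hypothetical.

```python
import math
from collections import defaultdict

Point = tuple[float, float]  # (lng, lat)

CELL_DEG = 1.0  # assumed cell size in degrees; S2 uses hierarchical spherical cells instead


def cell_of(lng: float, lat: float) -> tuple[int, int]:
    """Map a coordinate to the integer id of its grid cell."""
    return (math.floor(lng / CELL_DEG), math.floor(lat / CELL_DEG))


def point_in_polygon(pt: Point, ring: list[Point]) -> bool:
    """Ray casting: toggle on each edge crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's latitude
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


class ReverseGeocoder:
    def __init__(self) -> None:
        self._cells: dict[tuple[int, int], list[tuple[str, list[Point]]]] = defaultdict(list)

    def add(self, name: str, ring: list[Point]) -> None:
        # Register the polygon under every cell its bounding box touches.
        # This over-approximates a true cell covering but never misses a cell.
        lngs = [p[0] for p in ring]
        lats = [p[1] for p in ring]
        cx0, cy0 = cell_of(min(lngs), min(lats))
        cx1, cy1 = cell_of(max(lngs), max(lats))
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self._cells[(cx, cy)].append((name, ring))

    def lookup(self, lng: float, lat: float) -> list[str]:
        # The cell lookup narrows candidates; the exact test confirms containment.
        candidates = self._cells.get(cell_of(lng, lat), [])
        return [name for name, ring in candidates if point_in_polygon((lng, lat), ring)]
```

The cell index only prunes candidates cheaply; the ray-casting test is what decides containment, so false positives from the coarse covering are filtered out.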

softwaredoug•4h ago
It’s interesting as someone in the search space how many companies are aiming to “replace Elasticsearch”
mikeocool•4h ago
In my experience, the care and feeding that goes into an Elasticsearch cluster often feels substantially higher than that involved in the primary data store, which has always struck me as a little odd (particularly in cases where the primary data store is an RDBMS).

I'd be very happy to use simpler more bulletproof solutions with a subset of ES's features for different use cases.

dewey•3h ago
To add another data point: After working with ES for the past 10 years in production I have to say that ES is never giving us any headaches. We've had issues with ScyllaDB, Redis etc. but ES is just chugging along and just works.

The one issue I remember: early on, with ES 5, it regularly went down. It turned out that some _very long_ input was being passed into the search by some scraper and killed the cluster.

everfrustrated•2h ago
How big is the team that looks after it?
dewey•2h ago
Nobody is actively looking after it. Good alerting + monitoring and if there's an alert like a node going down because of some Kubernetes node shuffling or a version upgrade that has to be performed one of our few infra people will do that.

It's really not something that needs much attention in my experience.

itpragmatik•2h ago
How many clusters, how many indexes, and how many documents per index? Do you use self-hosted ES or AWS managed OpenSearch?
dewey•2h ago
12 nodes, 200 million documents / node, very high number of searches and indexing operations. Self-hosted ES on GCP managed Kubernetes.
binarymax•2h ago
Lots of other options here if you don't like managing. You can use Elastic cloud, Bonsai.io, and others
unsuitable•2h ago
In my experience, Elasticsearch lacks fundamental tooling, like a CLI that copies data between nodes.
j_kao•3h ago
Author here! We were really motivated to turn a "distributed system" problem into a "monolithic system" from an operations perspective and felt this was achievable with current hardware, which is why we went with in-process, embedded storage systems like RocksDB and Tantivy.

Memory-mapping lets us get pretty far, even with global coverage. We are always able to add more RAM, especially since we're running in the cloud.

Backfills and data updates are also trivial and can be performed in an "immutable" way without having to reason about what's currently in ES/Mongo, we just re-index everything with the same binary in a separate node and ship the final assets to S3.
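
A minimal sketch of the "immutable" rebuild pattern described above, under loud assumptions: a local directory stands in for S3, a JSON file stands in for the real RocksDB/Tantivy assets, and `build_index`, `publish`, `load_current`, and the `CURRENT` pointer file are hypothetical names, not Radar's layout.

```python
import json
import os
from pathlib import Path


def build_index(records: list[dict], out_dir: Path) -> Path:
    """Write a complete, brand-new index directory; never modify a live one."""
    out_dir.mkdir(parents=True, exist_ok=False)  # refuse to overwrite an old build
    index = {r["id"]: r for r in records}
    (out_dir / "index.json").write_text(json.dumps(index, sort_keys=True))
    return out_dir


def publish(root: Path, version_dir: Path) -> None:
    """Repoint CURRENT at the new build in a single atomic rename."""
    tmp = root / "CURRENT.tmp"
    tmp.write_text(version_dir.name)
    os.replace(tmp, root / "CURRENT")  # atomic when both paths share a filesystem


def load_current(root: Path) -> dict:
    """Readers resolve the pointer first, then open that build's files."""
    version = (root / "CURRENT").read_text().strip()
    return json.loads((root / version / "index.json").read_text())


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        publish(root, build_index([{"id": "a", "name": "old"}], root / "v1"))
        publish(root, build_index([{"id": "a", "name": "new"}], root / "v2"))
        print(load_current(root))  # served from v2; v1 is still on disk for rollback
```

Because each build lands in a fresh directory, a bad build can be rolled back by repointing `CURRENT` at the previous version rather than repairing data in place.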

pm90•4h ago
Slightly meta, but I find it's a good sign that we're back to designing and blogging about in-house data storage systems/query engines again. There was an explosion of these in the 2010s, which seemed to slow down/refocus on AI recently.
8n4vidtmkvmk•3h ago
Is it good? What's left to innovate on in this space? I don't really want experimental data stores. Give me something rock solid.
cfors•3h ago
I don't disagree that rock solid is a good choice, but there is a ton of innovation necessary for data stores.

Especially in the context of embedding search, which this article is also trying to do. We need databases that can efficiently store/query high-dimensional embeddings and handle the nuances of real-world applications, such as filtered ANN. There is a ton of innovation in this space, and it's crucial to powering the next-generation architectures of just about every company out there. At this point, data stores are becoming a bottleneck for serving embedding search, and I cannot overstate how important advancements in this area are for enabling these solutions. This is why there is an explosion of vector databases right now.

This article is a great example of where the actual data-providers are not providing the solutions companies need right now, and there is so much room for improvement in this space.
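
To show why filtered search is its own problem, here is a toy sketch comparing filtering after a top-k search with filtering before it. It uses exact brute-force cosine similarity rather than a real ANN index, and the item fields (`vec`, `country`) and function names are made up for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k(query: list[float], items: list[dict], k: int) -> list[dict]:
    """Exact nearest neighbours by cosine similarity (stand-in for an ANN index)."""
    return sorted(items, key=lambda it: cosine(query, it["vec"]), reverse=True)[:k]


def post_filter(query, items, k, pred):
    """Search first, filter afterwards: may return fewer than k hits, even none."""
    return [it for it in top_k(query, items, k) if pred(it)]


def pre_filter(query, items, k, pred):
    """Filter first, then search: always k hits if enough items match."""
    return top_k(query, [it for it in items if pred(it)], k)


ITEMS = [
    {"id": "a", "vec": [1.0, 0.0], "country": "US"},
    {"id": "b", "vec": [0.9, 0.1], "country": "US"},
    {"id": "c", "vec": [0.8, 0.2], "country": "FR"},
    {"id": "d", "vec": [0.0, 1.0], "country": "FR"},
]

if __name__ == "__main__":
    in_france = lambda it: it["country"] == "FR"
    print([it["id"] for it in post_filter([1.0, 0.0], ITEMS, 2, in_france)])
    print([it["id"] for it in pre_filter([1.0, 0.0], ITEMS, 2, in_france)])
```

With brute force, pre-filtering is trivially correct. The hard part for a real vector database is getting the same behaviour out of an approximate index without scanning everything.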

weego•2h ago
Agreed. The only caveat to that being a global rule is: 'At scale in a particular niche, even an excellent generalist platform might not be good enough'

But then the follow-on question is: "Am I really suffering the same problems that a niche already-scaled business is suffering?"

A question that is relevant to all decision making. I'm looking at you, people who use the entire react ecosystem to deploy a blog page.

jothirams•4h ago
Is HorizonDB publicly available for us to try as well?
trimbo•3h ago
This article is lacking detail. For example, how is the data sharded, how much time between indexing and serving, and how does it handle node failure, and other distributed systems questions? How does the latency compare? Etc. etc.
reactordev•3h ago
I mean, anything could replace elasticsearch, but can it actually?

It sounds like they had the wrong architecture to start with and they built a database to handle it. Kudos. Most would have just thrown cache at it or fine tuned a readonly postgis database for the geoip lookups.

Without benchmarks, these are just bold claims that we'll have to verify ourselves.

brunohaid•3h ago
Bit thin on details and not looking like they’ll open source it, but if someone clicked the post because they’re looking for their “replace ES” thing:

Both https://typesense.org/ and https://duckdb.org/ (with their spatial plugin) are excellent geo performance wise, the latter now seems really production ready, especially when the data doesn’t change that often. Both fully open source including clustered/sharded setups.

No affiliation at all, just really happy camper.

jjordan•3h ago
Typesense is an absolute beast, and it has a pretty great dev experience to boot.
sureglymop•2h ago
These are great. I am eternally grateful that projects like this are open source. I do, however, find it hard to integrate them into my own projects.

A while ago I tried to create something that has duckdb + its spatial and SQLite extensions statically linked and compiled in. I realized I was a bit in over my head when my build failed because both of them required SQLite symbols but from different versions.

j_kao•2h ago
These are great projects, we use DuckDB to inspect our data lake and for quick munging.

We will have some more blog posts in the future describing different parts of the system in more detail. We were worried too much density in a single post would make it hard to read.

kosolam•3h ago
Side note 1: ES can also be embedded in your app (on the JVM). Note 2: I actually used RocksDB to solve many use cases, and it's quite powerful and very performant. If you take anything from this post, take this: it's open source and a very solid building block. Note 3: I would like to test drive Quickwit as an ES replacement. Haven't had the time yet.
j_kao•2h ago
1 - If we were sticking with the JVM, I do wonder whether Lucene would be the right choice.

2 - It's a great tool with a lot of tuneability and support!

3 - We've been using it for K8s logs and OTEL (with Jaeger). Seems good so far, though I do wonder how the future of this will play out with the $DDOG acquisition.

mexxixan•2h ago
Would love to know how they scaled it. Also, what happens when you lose the machine and the local DB? I imagine there are backups, but they should have mentioned it. Even with backups, how do you ensure zero data loss?
tracker1•1h ago
Nice... it's cool to see how different companies are putting together best fit solutions. I'm also glad that they at least started out with off the shelf apps instead of jumping to something like a bespoke solution early on.

Quickwit[1] looks interesting, found via the Tantivy reference. Kind of like ES w/ Lucene.

1. https://github.com/quickwit-oss/quickwit