Mapping latitude and longitude to country, state, or city

https://austinhenley.com/blog/coord2state.html
115•azhenley•1d ago

Comments

codingdave•1d ago
> I set up an experiment that compares the original geometry with the simplified geometry by testing 1,000,000 random points within the US.

I'd be curious if the reliability is different if, instead of random locations, you limited it to locations with some level of population density. Because a lot of the USA is rural, so that random set is not going to correlate well to where people actually are. It probably matters more the farther east you go as well, as the population centers overlap borders more when you get to the eastern seaboard.

azhenley•1d ago
Good thinking. I discuss population density, cities near borders, and narrow borders in the last section.
madcaptenor•1d ago
As a native Philadelphian, I immediately see why you need a good resolution here - at 0.1 degrees resolution you very well could have assigned my birthplace to New Jersey. If I'm not mistaken New York and Philadelphia are the largest cities where you might have a problem. Chicago's on a state line but the Illinois-Indiana border is straight.
nocoiner•1d ago
I wonder if it’s actually straight though? In the chart on the page, Colorado is described as having 7000-something vertices, where I would have expected it to have … 4.
timewizard•1d ago
There's the congressionally approved boundary. Then there's the surveyed boundary, wherein a team of people goes out and hammers survey marks and tags into the earth or creates man-made monuments when that is not possible.
bigiain•1d ago
Another possible suggestion. Maybe choose random points that are within a set radius of points chosen along the borders? So perhaps choose first a random selection of points on the border, then choose random points within a circle (or perhaps just a square with a set delta in the lat/long) that are "nearby to the border" - then measure your error rates for those points at various boundary simplification tolerances? That'd remove the "middle of the state" random points where the border tolerance inevitably makes no difference.
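
A minimal sketch of the border-focused sampling suggested above, assuming a border is available as a list of (lon, lat) vertices. The name `sample_near_border` and the square offset `delta` are illustrative and not taken from the article's code. Segments are picked uniformly rather than weighted by length, which is a simplification.

```python
import random

def sample_near_border(border, delta, n, rng=None):
    """Return n random (lon, lat) points close to a border polyline.

    border: list of at least two (lon, lat) vertices (hypothetical input format).
    delta:  half-width, in degrees, of the square each point is jittered within.
    """
    rng = rng or random.Random()
    samples = []
    for _ in range(n):
        # Pick a random segment, then a random position along it.
        i = rng.randrange(len(border) - 1)
        (x1, y1), (x2, y2) = border[i], border[i + 1]
        t = rng.random()
        bx, by = x1 + t * (x2 - x1), y1 + t * (y2 - y1)
        # Jitter inside a square of side 2 * delta centred on the border point.
        samples.append((bx + rng.uniform(-delta, delta),
                        by + rng.uniform(-delta, delta)))
    return samples
```

Running the original-versus-simplified comparison on these points instead of uniform random ones should give an error rate that reflects only the area where simplification can matter.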
Centrino•1d ago
The right term for what you are doing is "reverse geocoding".
esalman•1d ago
In some industries, a 0.7% rate of error for a simple reverse geocoding application would not be acceptable.
kylecazar•1d ago
Good writeup/tool. Seeing the number of vertices just for state boundaries makes me a little less hostile to Google's API.
zeke•1d ago
For reducing the number of points I've often used mapshaper.org.

For deciding if a user is in Texas you could create a simple polygon completely inside Texas and one in Oklahoma. 99% would fall in the simple polygon and the rest go to the detailed polygons. Or create bounds near the complex river borders and use the detailed polygons there.

On the other hand I just use simple, non-optimized functions for qquiz.com.

mynameisash•1d ago
> For deciding if a user is in Texas you could create a simple polygon completely inside Texas and one in Oklahoma.

This seems like the obvious optimized v1: create extremely compressed (simplified) polygons wholly within the proper geopolitical borders. You get 100% true positives for a significant fraction of queries, and any negatives you can still kick to GMaps. I understand wholly-local is the goal here, but as others have pointed out, even small error rates can be unacceptable in some scenarios.
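
A rough sketch of that two-tier idea, assuming each region comes with a coarse polygon known to lie wholly inside its true border. `locate`, `inner_polygons`, and `fallback` are hypothetical names; the fallback could be the detailed polygons or an external API.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges that a ray cast towards +x from the point would cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def locate(point, inner_polygons, fallback):
    """Answer from the cheap inner polygons when possible, else defer."""
    for name, polygon in inner_polygons.items():
        if point_in_polygon(point, polygon):
            return name  # Certain hit: the inner polygon never crosses a border.
    return fallback(point)  # Near a border or outside every region.
```

A hit on an inner polygon is always correct by construction, so only points in the border strips pay for the expensive lookup.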

zeke•1d ago
Yes, just paying for the between spots is exactly what I thought later in the day. Then check every month which areas cause costs and add those to the in-house polygons.
mark-r•1d ago
This is great! Now where's the same thing for time zones?
lmm•1d ago
https://github.com/RomanIakovlev/timeshape . It's glorious.
cyberax•1d ago
Use Nominatim: https://nominatim.org/

It can be self-hosted, with constant replication. There's also Photon which is a cut-down version of it: https://photon.komoot.io

tallytarik•1d ago
We self-host nominatim as part of the iplocate.io pipeline. It works great, but the requirements are pretty heavy for something to host casually.

An in-between for OP could be something like opencagedata.com, which is still a third-party API but an order of magnitude less expensive than Google. (not affiliated but have previously explored the service)

cyberax•1d ago
Komoot is also available (I linked it), they have a rate limit of 1 request per second, but it should be enough for personal use.
jillesvangurp•1d ago
Nice approach. It reminds me of an approach I saw used to resolve coordinates to countries. Instead of loading all country polygons, the team created a bitmap and used colors to map each pixel to a country code. The bitmap wasn't super large and compresses pretty nicely in png format. This worked well enough and it dumbed down the country lookup to simply figuring out the color for a coordinate. Neat trick. And you could probably figure out if you are dealing with an edge case by simply looking at neighboring pixels and fall back to something more expensive if you hit one of those.

And of course with edge cases, there are lots of them but mostly it's fine. One case that comes to mind is that of the border town of Baarle-Nassau on the border between the Netherlands and Belgium. This village has some of the weirdest borders in the world. There are Belgian exclaves inside Dutch enclaves. In some cases the border runs through houses and you can enter in one country and leave in another. Some of the exclaves are just a few meters. There are a few more examples like this around the world.

Another issue is the fractal nature of polygons. I once found a polygon for New Zealand that was around 200MB that broke my attempts to index it. This doesn't matter of course for resolving country codes because it is an island. But it's a reason I implemented, at some point, the Douglas-Peucker algorithm mentioned in the article to simplify polygons.
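
For readers who have not seen it, here is a compact pure-Python sketch of Douglas-Peucker simplification. It is an illustration only, not the implementation from the article or from any library mentioned here. It measures distance to the infinite line through the endpoints and recurses, so very large rings may need an iterative version.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # a and b coincide (e.g. a closed ring): use plain point distance.
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)

def douglas_peucker(points, tolerance):
    """Drop vertices that deviate from the simplified line by <= tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves around it.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right
```

The tolerance is in the same units as the coordinates, so for lat/lon input it is in degrees.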

tallytarik•1d ago
I remember seeing this technique in a video by Sebastian Lague: https://youtu.be/sLqXFF8mlEU?t=787

Really cool

westnordost•1d ago
The bitmap approach you describe allows for immediate (i.e. O(1) ) lookup of region by coordinate, which is pretty neat. Space-efficiency-wise, a bitmap (+ index that maps color to country) might not be the most efficient data structure, though, as there are more than 256 countries, so you already need 16 bits for each pixel instead of 8. Then, you have the additional complexity of if you actually want the bitmap to be viewable by humans, you need to make sure that the colors for neighbouring countries at least are sufficiently distinct.

Anyway, a Kotlin library I wrote uses a similar technique to make requests for the majority of locations immediate, while also handling the edge cases - i.e. when querying a location near a border.

https://github.com/westnordost/countryboundaries (also available in Rust)

What it does is to slice up the input geometry (e.g. a GeoJson) into many small cells in a raster. So, when querying for a location, one doesn't need to do point-in-polygon checks for potentially huge polygons, but just for those little slices that are in the cell one is querying for. And of course, if a country completely covers a cell, we don't even need to do any point-in-polygon check anymore. All this slicing is done in a preprocessing step, so the actual library consumes a serialized data structure that is already in this sliced-up format.

I needed it to be fast because in my app I display a lot of POIs on the map for which there is logic that is dependent on in which country/state the POI is located.
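
To make the cell idea concrete, here is a small sketch of the lookup side only. It is not the countryboundaries data format or API; `GridIndex` and the layout of `cells` are assumptions for illustration. A cell is either a region code (one region covers the whole cell) or a list of (code, polygon) slices that still need a point-in-polygon test.

```python
import math

def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

class GridIndex:
    """Hypothetical precomputed raster: cells[row][col] is a str or a list."""

    def __init__(self, min_lon, min_lat, cell_size, cells):
        self.min_lon = min_lon
        self.min_lat = min_lat
        self.cell_size = cell_size
        self.cells = cells

    def lookup(self, lon, lat):
        col = math.floor((lon - self.min_lon) / self.cell_size)
        row = math.floor((lat - self.min_lat) / self.cell_size)
        if not (0 <= row < len(self.cells) and 0 <= col < len(self.cells[row])):
            return None  # Outside the indexed area.
        cell = self.cells[row][col]
        if isinstance(cell, str):
            return cell  # Whole cell belongs to one region: no geometry test.
        for code, polygon in cell:
            if point_in_polygon((lon, lat), polygon):
                return code
        return None  # Ocean, or a gap between slices.
```

Only the cells that a border passes through carry geometry, and each slice has few vertices, which is where the speedup over testing whole-country polygons would come from.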

jillesvangurp•1d ago
There are 249 countries with an iso code; so 8 bits might be enough. So it's not that bad. But even at 32 bits it would probably be fine and you could cram in some more data.

There are many similar things of course but nothing that was multiplatform, which I needed. I actually created a multiplatform kotlin library for working with language and country codes a few months ago: https://github.com/jillesvangurp/ko-iso

It seems we have some shared interests. I'll check out your library.

What you describe is a nice strategy for indexing things. I've done some similar things. Another library (jillesvangurp/geogeometry) I maintain allows you to figure out which map tiles cover a polygon. Map tiles are nice because they are basically quad tree paths. I have a similar algorithm that does that with geohashes. You could use both for indexing geospatial stuff.

Slicing up the polygons sounds interesting. I've been meaning to have a go at intersect/union type operations on geometries. I added a boolean intersects recently to check whether geometries intersect each other. I already had containment check.

throw0101b•1d ago
> There are 249 countries with an iso code; so 8 bits might be enough.

There are 249 ISO 3166-1 country codes:

* https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes

But 193 sovereign states recognized by the UN:

* https://en.wikipedia.org/wiki/Member_states_of_the_United_Na...

Some of the discrepancy can be accounted for by "legacy" codes like .su for the Soviet Union.

jillesvangurp•1d ago
> But 193 sovereign states recognized by the UN

All of which are also in the 249 ISO 3166-1 list; it's a super set. It doesn't include the historical ones anymore. Codes for those are interesting if you have old data perhaps.

icameron•1d ago
Wow how many reverse geocode requests does a google API key make to get a bill into the thousands!?

Maybe a good iteration of this is use the .01 accuracy line work for the 99.9% of users but anything within 100m of a border could be sent to google API to get the edge cases. Probably would be in the free tier.

VladVladikoff•1d ago
Yeah I wonder if maybe they weren’t caching the responses locally and repeatedly sending their requests to Google. I run a site which gets 1.5M unique visitors a month and we use Google geocoding api, and it comes out to $0/month, because our usage fits in their free tier.
shooshx•1d ago
One obvious optimization that would halve the size of the data and also solve the gaps problem is to keep every border between two states once and not twice (for each polygon). This would require some processing of the geometry to find the intersection points, but assuming that in the original data, the border between two states is the same exact line, it shouldn't be hard.
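
A toy sketch of storing each shared border once, borrowing the convention TopoJSON uses where a negative index `~i` means "arc `i`, reversed". The two unit squares and the helper name `ring_from_arcs` are made up for illustration.

```python
# Each border piece ("arc") is stored exactly once.
ARCS = [
    [(1, 0), (1, 1)],                  # 0: border shared by A and B
    [(1, 1), (0, 1), (0, 0), (1, 0)],  # 1: rest of A's outline
    [(1, 0), (2, 0), (2, 1), (1, 1)],  # 2: rest of B's outline
]

# Regions reference arcs by index; ~i means arc i traversed backwards.
REGIONS = {"A": [0, 1], "B": [2, ~0]}

def ring_from_arcs(arc_refs, arcs):
    """Stitch arc references back into a closed ring of vertices."""
    ring = []
    for ref in arc_refs:
        arc = arcs[ref] if ref >= 0 else arcs[~ref][::-1]
        # Consecutive arcs share an endpoint, so skip the duplicate vertex.
        ring.extend(arc if not ring else arc[1:])
    return ring
```

Because both regions rebuild their outline from the same arc, simplifying that arc once keeps the two borders identical, so no gap can open between them.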
m2fkxy•1d ago
https://github.com/topojson/topojson
ThisNameIsTaken•1d ago
Last week I looked into such simplification/decimation algorithms to simplify lines sent to a showlaser projector. Turns out there is a whole bunch of different algorithms for decimation, each with different trade-offs.

It might be interesting to see how the edge cases mentioned in the article are impacted by switching to, for example, Visvalingam-Whyatt [0].

[0]: For a Python implementation: https://github.com/urschrei/simplification

m2fkxy•1d ago
> A side effect of the geometry simplication is that there are some very small gaps between states. Based on your use case, you'll need to handle the case of the point not being within any state borders. In these rare cases, you could fall back to a different method, such as distance checking centroid points, adding an episilon to all state borders, or simply asking the user. (The user may also be in another country or in the ocean...)

This is a common topic and easily dealt with by working with topology-informed geometries; most simplification algorithms support topology handling between different features. For instance, TopoJSON can be used.

wodenokoto•1d ago
This sounds like one of those “easy if you’ve learned it”. I dabble with GIS at work, so in some sense I am a pro at this, and I don’t know how topology easily deals with this.

But I’d like to know!

m2fkxy•1d ago
That's true. I have a bias of having part of my formal education quite focused on geospatial topics. Seeing non-geospatial folks reinventing wheels taught in GIS 101 makes me both smile and grimace, thinking that we have been doing something wrong, with basic tools and aspects of the trade not being more widely known.

You can look into TopoJSON here: https://github.com/topojson/topojson And a good general introduction to topology in GIS setting is nicely found in QGIS documentation: https://docs.qgis.org/3.40/en/docs/gentle_gis_introduction/t...

panic•1d ago
I wrote something similar a while ago for doing lon/lat -> congressional district reverse geocoding. Instead of simplifying the polygons, it splits the U.S. into tiles using a k-d tree and fetches the proper tile as a static file to do the final lookup: https://github.com/ianh/district-tiler
som•1d ago
Great approach.

Worth noting that there is a 6 decimal precision on the coordinates of the 90kb (gz) `coord2state.min.js` ... which suggests an accuracy that may not be present in the simplified data (i.e. <1m).

Before you increase tolerance to decrease filesize, you could consider lowering this decimal precision to 5, 4 or even 3 decimals given the "country, state, or city" requirement.

I also like the idea of using a heavily cached, heavily compressed image that is perfect for the >95% of the country that isn't within a pixel of a border. With a subsequent request for another heavily cached vector tile that encompasses any lat/lng within your 1px tolerance.
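
On the decimal-precision point, a quick back-of-the-envelope sketch. It assumes roughly 111.32 km per degree of latitude (longitude degrees are shorter away from the equator), and `round_ring` is a hypothetical helper, not part of coord2state.

```python
METERS_PER_DEGREE_LAT = 111_320  # Approximate; varies slightly with latitude.

def resolution_m(decimals):
    """Approximate ground size of one unit in the last decimal place."""
    return METERS_PER_DEGREE_LAT * 10 ** -decimals

def round_ring(ring, decimals):
    """Round (lon, lat) vertices and drop consecutive duplicates."""
    out = []
    for lon, lat in ring:
        p = (round(lon, decimals), round(lat, decimals))
        if not out or out[-1] != p:
            out.append(p)
    return out
```

By this estimate 6 decimals is about 0.1 m, 5 is about 1.1 m, 4 is about 11 m, and 3 is about 111 m, so trimming digits also merges vertices that the simplification tolerance could never distinguish anyway.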

alexmolas•1d ago
I used to work at a logistics company and we had to map latitude and longitude to specific directions. One of the first things I learnt was to avoid storing 6 decimal precision coordinates. Also, this XKCD was shared a lot https://xkcd.com/2170/
fergonco•1d ago
That XKCD is very funny. BTW:

> You are pointing to Waldo on a page... on a specific date. Because of tectonic plates movement.

urschrei•1d ago
> A side effect of the geometry simplication is that there are some very small gaps between states. Based on your use case, you'll need to handle the case of the point not being within any state borders. In these rare cases, you could fall back to a different method, such as distance checking centroid points, adding an episilon to all state borders, or simply asking the user. (The user may also be in another country or in the ocean...)

If your pre-simplification input geometries form a coverage[0], you can use e.g. ST_CoverageSimplify[1] or coverage.simplify[2] to simplify them without introducing gaps.

[0] http://lin-ear-th-inking.blogspot.com/2022/07/polygonal-cove... [1] https://postgis.net/docs/ST_CoverageSimplify.html [2] https://shapely.readthedocs.io/en/2.1.0/reference/shapely.co...

Zobat•1d ago
How small can you get if you accept that some users might have to disambiguate when they're too close to a border? Should top out at four choices, users with no geo data have to choose between all. Feels like we can make the assumption that borders are quite sparsely populated for the most part, of course excluding cities built on borders but those are exceptions and users there might be more accepting of having to choose.
sawyna•1d ago
For reducing the number of points, I had to do something similar but for an isochrone. There were 2000 points for each isochrone and we had like 1000s of map markers. I simply picked every 200th point from the isochrone polygon and it works reasonably well.

Of course, mapbox provides a parameter in the API to reduce the number of points using Douglas-Peucker algorithm. But I didn't want to make API call every single time, so we stored it and used a simple distilling depending on the use case.

lenerdenator•1d ago
Very practical library. Might use it in one of my projects soon!

Also, Missouri has more vertices than Kansas, suck it!

voidUpdate•1d ago
I think Colorado only has about 4 vertices
montroser•1d ago
Fun fact, there are pockets of New York State that are fully enveloped by New Jersey.

https://m.youtube.com/watch?v=SgZ1f4ACZBQ

tantalor•1d ago
You could save a bunch of space by encoding the data in a compact binary format and then loading it into a Float16Array.

In a .js file, each character is UTF-16 (2 bytes). Your current encoding uses 23 characters per coordinate, or 46 bytes.

Using 16-bit floats for lat/lon gives you accuracy down to 1 meter. You would need 4 bytes per coordinate. So that's a reduction by 91%.

You can't store raw binary bytes in a .js file so it would need to be a separate file. Or you can use base64 encoding (33% bigger than raw binary) in .js file (more like 6 bytes per coordinate).

(Edited to reflect .min.js)

netsharc•1d ago
> In a .js file, each character is UTF-16 (2 bytes).

What? I'd like to challenge this. The in-memory representation of a character may be UTF-16, but the file on disk can be UTF-8. Also UTF-16 doesn't mean "2 bytes per character": https://stackoverflow.com/a/27794229

The file https://github.com/AZHenley/coord2state/blob/main/dist/coord... doesn't use anything other than the 1-byte ASCII characters.

tantalor•1d ago
Yeah you're probably right, I guessed at that.

Thanks for the correction

pixelesque•1d ago
> Using 16-bit floats for lat/lon gives you accuracy down to 1 meter.

Not for Longitude it doesn't with values > abs(128), as that for example means 132.0 has the next possible value of 132.125.

float16 precision at values > 16 is pretty poor.

Converting that discrepancy (132.125 - 132.0) to KM gives 10 KM.

Did you maybe mean Fixed-point? (but even then that's not enough precision for 1m)
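
The spacing claim can be checked with Python's `struct` module, whose `"e"` format is IEEE 754 half precision. This is only a sketch: the helper names are made up, and the 111.32 km per degree figure is an approximation that holds for longitude only at the equator.

```python
import math
import struct

def to_float16(x):
    """Round-trip a value through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def float16_spacing(x):
    """Gap between adjacent float16 values near x (normal range, x != 0)."""
    _, exponent = math.frexp(abs(x))  # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 11)     # float16 carries 11 significant bits

KM_PER_DEGREE = 111.32  # Approximate, at the equator.

# 16-bit fixed point spread over the full 360 degrees of longitude.
FIXED16_STEP_DEG = 360 / 2 ** 16

print(float16_spacing(132.0))                  # 0.125
print(to_float16(132.06), to_float16(132.1))   # 132.0 132.125
print(float16_spacing(132.0) * KM_PER_DEGREE)  # roughly 14 km at the equator
print(FIXED16_STEP_DEG * KM_PER_DEGREE)        # roughly 0.6 km per step
```

So float16 is off by kilometres at these longitudes, and even 16-bit fixed point only reaches a few hundred metres, which matches the point that 16 bits per value cannot give 1 m accuracy.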

tantalor•1d ago
Good catch, I didn't consider that.
Demiurge•1d ago
When simplifying the borders of regions, and you still want to locate any point to one of the regions, you need to simplify using topology.
1024core•1d ago
Anybody have a library to go from lat/long to DMA ID?