I wrote 200 scrapers for city permit APIs to map US construction activity

https://permitradar.io
2•twincipher•1h ago

Comments

twincipher•1h ago
So something big got approved a few miles from my house -- a data center complex -- which I found out through a local news provider. The story sparked my curiosity, and I soon went down a rabbit hole of local city government websites and public data to see what other projects might be in the works.

Then the thought occurred to me: what if I could just... scrape all of it?

One API led to another, and then another. I ended up writing 200+ scrapers across 85 cities. It turns out that when the City of Columbus uses Accela, the City of Austin uses Amanda, the City of Chicago has its own thing, and half the other cities dump CSVs on an FTP server that may or may not be online -- "just scrape it all" stops being simple quickly.

Some things I learned along the way:

- There is no standard for permit data. Every city invents its own schema.
- Geocoding more than a million addresses sounds straightforward until you realize that half of the addresses are things like "LOT 4 BLK 2 UNIT 7".
- Government APIs have rate limits that appear to have been set by someone who assumed no one would use them.
- The estimated cost field is a work of creative fiction. A $200 million data center will sometimes be listed at $1.
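To make the address and cost problems above concrete, here is a minimal TypeScript sketch of the kind of pre-checks one might run before geocoding or trusting a cost value. The function names, the regular expressions, and the $100 threshold are all assumptions made for illustration; nothing here is taken from PermitRadar's code.

```typescript
// Hypothetical pre-checks; patterns and threshold are illustrative assumptions.

type AddressKind = "street" | "legal_description" | "unknown";

// Legal descriptions tend to start with a lot/block style token.
const LEGAL_DESCRIPTION = /^(LOT|BLK|BLOCK|TRACT|PARCEL)\b/i;
// Street addresses usually start with a house number followed by a name.
const STREET_ADDRESS = /^\d+\s+[A-Za-z0-9]/;

function classifyAddress(raw: string): AddressKind {
  const text = raw.trim();
  if (LEGAL_DESCRIPTION.test(text)) return "legal_description";
  if (STREET_ADDRESS.test(text)) return "street";
  return "unknown";
}

// Flags a cost as suspect when it is missing, not a finite number,
// or below an assumed minimum that a real project would exceed.
function isSuspectCost(cost: number | null, minPlausible: number = 100): boolean {
  if (cost === null || !Number.isFinite(cost)) return true;
  return cost < minPlausible;
}
```

Only records classified as "street" would be sent to a geocoder in this sketch; the others would need a parcel lookup or manual handling, and suspect costs would be flagged rather than displayed as fact.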

PermitRadar is the result -- an interactive map + search across 1.6M+ results. You can look up any city, filter by date/cost/type, and see what's going on. If you care about a specific address (homeowner, contractor, investor), you can set up alerts that notify you when new permits are filed.

The city pages (e.g. /permits/los-angeles-ca) are server-rendered and public -- no login required. The stack is Express/TypeScript + Next.js + PostGIS + Redis + BullMQ. Scrapers run on a cron job and feed a queue that handles geocoding, normalization, and AI classification (Claude Haiku 4.5).
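The queue stages described above might look roughly like the sketch below. It is a simplification under stated assumptions: a plain in-memory array stands in for the real queue (it does not use the BullMQ API), the stages are synchronous so the example stays self-contained, and every type and function name is hypothetical. The geocoder and classifier are passed in as stubs because the real services are not described in the post.

```typescript
// Hypothetical record shapes; field names are assumptions for illustration.
interface RawPermit {
  city: string;
  address: string;
  description: string;
}

interface ProcessedPermit extends RawPermit {
  lat: number | null;
  lon: number | null;
  category: string;
}

type Geocoder = (address: string) => { lat: number; lon: number } | null;
type Classifier = (description: string) => string;

// Drains the queue, running each job through geocoding, normalization,
// and classification in that order.
function runPipeline(
  queue: RawPermit[],
  geocode: Geocoder,
  classify: Classifier
): ProcessedPermit[] {
  const results: ProcessedPermit[] = [];
  while (queue.length > 0) {
    const job = queue.shift()!;
    const point = geocode(job.address);
    // Normalization here is just trimming and upper-casing the address.
    const address = job.address.trim().toUpperCase();
    results.push({
      ...job,
      address,
      lat: point ? point.lat : null,
      lon: point ? point.lon : null,
      category: classify(job.description),
    });
  }
  return results;
}
```

In a real deployment each stage would be asynchronous and retried independently, which is the main reason to put a job queue between the scrapers and the processing steps.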

I'm happy to answer any questions that you have regarding scraping, the data normalization hellscape, or anything under the sun.

navaed01•1h ago
Really interesting project! Did you use AI at all to build the scrapers?
twincipher•1h ago
Yes and no. Claude Code helped with boilerplate and debugging, but every city's permit system is different enough to warrant hands-on work: figuring out the API endpoints, understanding the schema, handling pagination quirks, and so on. AI was extremely useful for the repetitive parts (parsing HTML and mapping fields), but the hard part is comprehending each city's unique data and normalizing it into something consistent. That's still a human problem.
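As an illustration of the field-mapping step mentioned above, one common approach is a per-source lookup table that translates each city's field names into a shared shape. This is a sketch only: the source keys, the raw field names, and the normalized schema are invented for the example and do not reflect any real city's or vendor's API.

```typescript
// Hypothetical normalized schema shared by all sources.
interface NormalizedPermit {
  permitId: string;
  address: string;
  issuedDate: string;
}

// Maps each normalized field to the field name used by a given source.
type FieldMap = Record<keyof NormalizedPermit, string>;

// Made-up sources and field names, for illustration only.
const FIELD_MAPS: Record<string, FieldMap> = {
  "example-city-a": { permitId: "PERMIT_NO", address: "SITE_ADDR", issuedDate: "ISSUE_DT" },
  "example-city-b": { permitId: "id", address: "location", issuedDate: "date_issued" },
};

function normalizeRecord(
  source: string,
  record: Record<string, unknown>
): NormalizedPermit {
  const map = FIELD_MAPS[source];
  if (!map) throw new Error(`No field map for source: ${source}`);
  // Missing fields become empty strings rather than "undefined".
  const pick = (key: keyof NormalizedPermit): string =>
    String(record[map[key]] ?? "").trim();
  return {
    permitId: pick("permitId"),
    address: pick("address"),
    issuedDate: pick("issuedDate"),
  };
}
```

The table handles the repetitive renaming; deciding what each source field actually means, and whether two cities' values are comparable, remains the manual part described in the comment.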

Convergent evolution in locomotory patterns of flying, swimming animals (2011)

https://www.nature.com/articles/ncomms1350
1•mooreds•52s ago•0 comments

Rapid Modeling (2023)

https://jbminn.com/blog/posts/rapid-modeling/
1•mooreds•1m ago•0 comments

GPL upgrades via section 14 proxy delegation

https://runxiyu.org/comp/gplproxy/
1•todsacerdoti•3m ago•0 comments

Zed now forces arbitration and opt-out requires PII

https://zed.dev/blog/terms-update
2•shock•3m ago•1 comments

Standard mental health therapies fall short for autistic adults, study suggests

https://www.psypost.org/standard-mental-health-therapies-often-fall-short-for-autistic-adults-stu...
1•pseudalopex•3m ago•0 comments

Version 1.4.1 of Bayesian SSH is available

https://github.com/abdoufermat5/bayesian-ssh
1•abdouyaya1998•4m ago•0 comments

Show HN: Costrace – Open-source LLM cost and latency tracking across providers

https://www.costrace.dev/
1•Ikotun•5m ago•0 comments

LLMs Are Antithetical to Writing and Humanity

https://thedispatch.com/article/donald-trump-third-term-steve-bannon-jd-vance/
2•speckx•5m ago•0 comments

The trackball that merges pointing and 3D control

https://rotatrix.com
1•OJFord•5m ago•0 comments

Chaotic 4 days led to man's suicide, says lawsuit against Google

https://www.sfgate.com/tech/article/suicide-lawsuit-google-ai-21955695.php
1•jamesmiller5•6m ago•0 comments

Mullvad VPN takes its banned anti-surveillance ad to the streets

https://www.techradar.com/vpn/vpn-privacy-security/mullvad-vpn-takes-its-banned-anti-surveillance...
1•nickslaughter02•6m ago•1 comments

Redis-py typing issue open since 2022

https://github.com/redis/redis-py/issues/2399
1•druml•6m ago•0 comments

Show HN: VideoNinja – paste video URLs, walk away, they download

1•hamuf•6m ago•0 comments

Neutralinojs developer framework compromised with malware

https://opensourcemalware.com/blog/neutralinojs-compromise
1•6mile•7m ago•0 comments

Extending Daniel Lemire's bit packing to handle 64-bit values

https://old.reddit.com/r/cpp/comments/1rlekeb/extending_daniel_lemires_bit_packing_to_uint64_t
1•gnusi•7m ago•0 comments

You Shouldn't Ask an AI for Advice Before Selling Your Soul to the Devil

https://ernaud-breissie.github.io/thoughts/why-you-shouldnt-ask-an-ai-for-advice-before-selling-y...
1•bussiere•7m ago•0 comments

Show HN: Pulse – personalized daily audio news briefs from topics you choose

https://pulsemedialaboratories.com
2•jvando•7m ago•1 comments

Product Price Alert Service

https://buysignal.co.uk/
1•hollywoodoo•7m ago•2 comments

My Data Quality Tools List: Tried Any?

https://toolsfordata.com/lists/data-quality-tools/
1•Arimbr•10m ago•0 comments

Baudrate: ActivityPub-enabled BBS built with Elixir and Phoenix

https://github.com/hiroshiyui/baudrate
1•rguiscard•11m ago•0 comments

Show HN: Move 37 – A strategy game inspired by AlphaGo's Move 37

https://play.google.com/store/apps/details?id=com.move37.app&hl=en_US
1•MUISIK•12m ago•0 comments

Parakaryon

https://en.wikipedia.org/wiki/Parakaryon
1•thunderbong•12m ago•0 comments

First PR Concierge – AI that matches your GitHub skills to open source issues

1•boweii•12m ago•0 comments

Course Hero to Pay $75M to Post University in Copyright Metadata Case

https://usaherald.com/connecticut-jury-orders-course-hero-parent-to-pay-75-million-to-post-univer...
1•williamcotton•13m ago•0 comments

Ask HN: Is claw-generated code copyright-able?

2•g-clef•13m ago•1 comments

2,622 Valid Certificates Exposed: A Google-GitGuardian Study

https://blog.gitguardian.com/certificates-exposed-a-google-gitguardian-study/
1•guedou•14m ago•0 comments

Show HN: Porchsongs.ai; Rewrite chordcharts/lyrics with AI to make them personal

https://porchsongs.ai
1•river_otter•16m ago•0 comments

Produced by Human

https://marius-anderie.com/blog/produced-by-human
1•moccajoghurt•16m ago•0 comments

Large genome model: open-source AI trained on trillions of bases

https://arstechnica.com/science/2026/03/large-genome-model-open-source-ai-trained-on-trillions-of...
1•Bender•16m ago•0 comments

Space Command chief throws cold water on the question of UAPs in space

https://arstechnica.com/space/2026/03/space-command-chief-throws-cold-water-on-the-question-of-ua...
1•Bender•16m ago•0 comments