
We moved our blog off Webflow and what it cost us

https://blog.bunnyhoneyclub.com/posts/why-we-moved-our-blog-off-webflow
1•shadowinbox•44s ago•0 comments

China surpasses US in research spending

https://theconversation.com/china-surpasses-us-in-research-spending-the-consequences-extend-far-b...
1•JeanKage•53s ago•0 comments

Lovable: We're Currently Experiencing Issues

https://status.lovable.dev/
1•doener•1m ago•0 comments

Why the same LLM gives different answers in different environments

https://johndwade.substack.com/p/the-environment-rewrites-the-question
1•edgecased•2m ago•1 comments

Greenest countries eye drilling as fix for Iran crisis

https://www.politico.eu/article/worlds-greenest-countries-eye-drilling-as-fix-for-iran-crisis/
1•leonidasrup•6m ago•0 comments

If this doesn't scream AI bubble is about to burst IDK what does

https://docs.github.com/en/copilot/reference/copilot-billing/models-and-pricing
1•julia-kafarska•7m ago•0 comments

Goodbye Tim Apple – daily.dev Show [video]

https://www.youtube.com/watch?v=XKO67n3xfzM
1•idosh•9m ago•0 comments

What Type of AI Usage?

https://jensrantil.github.io/posts/types-of-ai-implementations/
1•JensRantil•9m ago•1 comments

AI Is Cannibalizing Human Intelligence

https://www.wsj.com/tech/ai/is-ai-smarter-than-humans-cyborg-956e0f0e
1•JeanKage•10m ago•0 comments

$1,605: average annual ad value of a U.S. Google user

https://proton.me/blog/what-is-your-data-worth-to-google
3•muzzy19•13m ago•0 comments

A Field Guide to Bugs

https://www.stephendiehl.com/posts/field_guide_to_bugs/
1•signa11•14m ago•0 comments

Phony whistleblowers, fake journalists and cyber spies

https://www.icij.org/investigations/china-targets/fake-journalists-cyber-spies-china-targets-repo...
1•_tk_•15m ago•0 comments

AI Workflows Need Provider Escape Hatches

https://rawsignal.xyz/posts/ai-workflows-need-provider-escape-hatches/
1•chown•17m ago•0 comments

Comparing SBC prices in 2024 and 2026

https://www.cnx-software.com/2026/04/28/what-a-difference-two-years-make-comparing-sbc-prices-in-...
1•pyprism•18m ago•0 comments

GitHub Copilot code review will start consuming GitHub Actions minutes

https://github.blog/changelog/2026-04-27-github-copilot-code-review-will-start-consuming-github-a...
2•whtsky•19m ago•0 comments

AI prefers resumes written by itself: Self-preferencing in Algorithmic Hiring

https://arxiv.org/abs/2509.00462
2•ytpete•22m ago•1 comments

Notice of Obsolescence

https://thebuild.com/blog/2026/04/27/notice-of-obsolescence/
1•ggaughan•22m ago•0 comments

Donating to Open Source

https://entropicthoughts.com/open-source-donation
1•exiguus•25m ago•0 comments

The era of "malicious compliance" in AI identity is here

https://arielsakin.substack.com/p/the-era-of-malicious-compliance-in
2•asakin•27m ago•0 comments

A Complete History of Quantum Computing (and what comes next)

https://quantumzeitgeist.com/a-complete-history-of-quantum-computing/
1•Nazzareno•29m ago•0 comments

Show HN: Devicons, +1300 logos and icons in React, SVG, and icon format

https://devicons.io/
2•vorillaz•30m ago•0 comments

A Year of Hetzner Auction Data: Where Did All the Servers Go?

https://blog.iodev.org/blog/hetzner-auction-supply-crunch/
2•100ms•32m ago•0 comments

London Met Police investigates officers after using Palantir AI tool

https://www.theguardian.com/uk-news/2026/apr/25/met-police-investigates-hundreds-officers-palanti...
1•lucidplot•33m ago•2 comments

Claude is missing the 2nd Sense – Transcribe the web

https://webtranscriber.com/
1•broalkvam•33m ago•1 comments

Show HN: A free crypto trade journal with cycle support and CSV import

https://retired.today/log
1•attendos•34m ago•0 comments

Show HN: Delegare – let AI agents pay safely (x402, AP2 – base/USDC and Stripe)

https://delegare.dev/
1•tpfuetze•35m ago•0 comments

From milliseconds to 26 nanoseconds: how a $20 eBay SFP module beat my NT

https://austinsnerdythings.com/2026/04/26/ptp-osa5401-26-nanoseconds-raspberry-pi/
1•fanf2•38m ago•0 comments

How to run a local coding agent with Gemma 4 and Pi

https://patloeber.com/gemma-4-pi-agent/
1•mariuz•39m ago•0 comments

Top Negotiation Skills

https://www.pon.harvard.edu/daily/negotiation-skills-daily/top-10-negotiation-skills/
1•lucidplot•40m ago•0 comments

Bohu Laser Facility

https://en.wikipedia.org/wiki/Bohu_laser_facility
1•lumax•40m ago•0 comments

Scraping 241 UK council planning portals – 2.6M decisions so far

39•mebkorea•1h ago

UK planning data is technically public. In practice it's locked behind 400+ different council portals, some still running bespoke ASP.NET that looks like it dates from 2004, some behind AWS WAF, all with subtly different schemas. I've spent four months scraping them. I'm now at 241 councils and 2.6 million decisions across England, Scotland and Wales.

The scraping problem

Most UK councils run one of a handful of portal systems, Idox being the most common. In theory this makes things easy. In practice, every council has configured theirs differently: some block non-browser requests via TLS fingerprinting, some have rate limits that will get you banned inside 10 minutes, and a handful run the aforementioned bespoke ASP.NET.

I ended up writing several scrapers: a standard requests-based one, a Playwright-based one for councils that block anything that doesn't look like a real browser, and a curl_cffi one for councils that use TLS fingerprinting. Some councils I still can't get. Liverpool's portal sits behind AWS WAF with a JavaScript challenge. I have a working Playwright-based scraper that solves the challenge once and reuses cookies, but the WAF rate-limits the IP after about 10 requests and then blocks me for a day. So I have 60k Liverpool decisions from an old scrape and no easy way to add more.
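For illustration, the per-council dispatch described above might be sketched like this. All the names here are hypothetical (the portal-to-strategy mapping and the `TLS_FINGERPRINTED` set are made up for the example, not the author's actual code):

```python
# Hypothetical dispatch: choose a fetch strategy per council/platform.
# Mappings are illustrative, not real configuration.

PORTAL_STRATEGIES = {
    "idox": "requests",         # plain HTTP usually works
    "northgate": "requests",
    "ocellaweb": "playwright",  # blocks non-browser clients
    "bespoke_aspnet": "playwright",
}

# Councils whose front end fingerprints TLS, so plain requests gets blocked.
TLS_FINGERPRINTED = {"liverpool"}

def pick_scraper(council: str, portal: str) -> str:
    """TLS-fingerprinting councils get curl_cffi; otherwise fall back to
    the platform's default, with Playwright as the safe last resort."""
    if council in TLS_FINGERPRINTED:
        return "curl_cffi"
    return PORTAL_STRATEGIES.get(portal, "playwright")
```

The point of routing through one function is that adding council N is a dictionary entry, not a new scraper.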

What I found

The approval rate stuff is what most people come for. Nationally it's around 88%, but it varies wildly by ward within a council, not just between councils.

The more interesting finding came from the time-to-decision data. Across 119 English and Welsh councils, 36.5% of home extension applications missed the statutory 8-week target in 2025, up from 27.9% in 2019. Guildford is the worst at scale: 66% of decisions over target, averaging 13.3 weeks.

What it is now

A postcode checker (free) and paid PDF reports (£19/£79). Zero paying customers so far, which is fine. I've been heads down on data quality and coverage.

Site is planninglens.co.uk if you want to poke around. AMA on the scraping side – that's where the interesting problems are.

Comments

CJefferson•1h ago
So, this sounds exciting to me, but the postcode checker really feels like spam as a user. All it tells me is 'Mixed results'. I could make a website that prints 'mixed results'; I bet most results are 'mixed'!

I understand wanting to get money, but honestly, there is no way I would give money to this website in its current state; you are giving me far too little info before asking me to hand over a credit card.

Then, if someone gives you £19, a crazy amount of money honestly, the last page of the report is an advert to give them 4 times more!

mebkorea•1h ago
Really useful feedback, cheers. Yeah, "Mixed results" is kinda rubbish as you say. It should give you something concrete before asking for anything. I'll fix that today. Fair point on the £79 upsell at the end of a £19 report too. That's tone deaf and I'll move it. On the £19... I'll think about it, but you're right the site needs to do more to justify the spend before pulling out a card. Appreciate the honest take!
CJefferson•1h ago
Just a quick follow up, if my reply seemed very harsh, view that as a sign of how enthusiastic I was to see the website at first. I understand wanting to make money, but I'd seriously consider giving a lot more away (maybe even the basic report stuff) for free. I'd love to explore my local area, my parents', be nosey about what life is like in Oxford (a place I previously lived), but even if I was willing to pay (I'm not), having to stop, get a PDF, and download really breaks the flow.
mebkorea•1h ago
No, that's absolutely a fair follow-up and not harsh at all. It's very helpful. The "be nosey about places you used to live" use case is exactly what the postcode tool should serve (thinking about it), and right now it doesn't. You're right that PDF-downloads break flow badly. Tbh... that's a hangover from the "people want a thing they can save" assumption that I'm still stuck in, I guess. I'm still on the fence about giving the paid reports away wholesale, but the gap between "tells you nothing" and "£19 PDF" is way too big. I'm gonna need a middle layer of free but actually useful exploration on the site. Will have a solid think about this today. Appreciate the feedback!
gnfargbl•1h ago
I'm also enthusiastic, it's not often you see people find a genuinely underserved niche and you have.

I don't know if I would pay £19 for a general state-of-the-area report. I would almost certainly have paid £100-300 for a service that took my planning application, critically reviewed it and told me which aspects were and were not likely to pass, with references to specific examples within my local area.

mebkorea•1h ago
Thanks, honestly that means a lot! Yeah, the pre-submission review idea is interesting and I've thought about it. I have the data to surface "applications similar to yours in your ward, here's what got approved and what didn't" but I haven't built it as a workflow because it requires the user to upload their plans... and that's a different kind of trust ask, but yeah, it is definitely worth revisiting. £100-500 is also a much more honest price for something that genuinely changes a decision. £19 is in the awkward "too much for curiosity, too little for stakes" zone you and the other commenter are both pointing at.
ramon156•52m ago
Just checking, are you using an LLM to reply? Your replies are riddled with things LLMs are good at, like making quoted analogies that make no sense. They're not even analogies
pjc50•1h ago
What benefit would people gain from the reports? Average rate of success/time is interesting, but I'm not sure what you'd do with this information other than a bit of local press discourse. I suppose it's nicely timed for the council elections?
mebkorea•51m ago
Honest answer... I don't fully know, zero paying customers so it's still very much a hypothesis. The two use cases I think hold up: (1) people pre-buying a house with extension potential, who otherwise guess or pay £500+ for a planning consultant; (2) homeowners about to commission £2-5k of architect drawings who want a sanity check before proceeding. Someone else suggested £100-500 for a proper pre-submission review which is probably better for that second case than my £19 report. The "general state-of-area" framing is the weakest one and you're right it's mostly local press discourse — that's marketing not revenue.
beatthatflight•1h ago
Worth trying claude/gemini to see if they'll do some scraping for you. I've found some paywall sites only too happy to allow Gemini past the wall.
mebkorea•1h ago
Hadn't thought of that tbh. Worth a go on Liverpool especially... that's the AWS WAF one I'm currently blocked on and it is doing my head in. The challenge there is volume rather than access (~80k decisions to backfill), so even if an LLM gets through the wall I'd still need to script around it. But could be a way in for the initial cookie. Cheers for the tip and will look into it.
ashish-alex•1h ago
Working on a similar problem in another domain. I found the agentic direction powerful: browser use plugged into a multimodal LLM with strong agentic capability, like gpt 5.4 mini, working in a loop with an orchestrator and evaluator/judge.
mebkorea•1h ago
Nice! Yeah, I went the other way... deterministic scrapers per portal type because once you've worked out the search form quirks for an Idox or Northgate or Ocellaweb, it's the same shape across every council using that platform. So the marginal cost of adding council N is config not code. The agentic approach gets more interesting for the long tail though — the bespoke ASP.NET ones where every council is its own snowflake... and it is a GRIND honestly. How are you finding the loop on cost vs reliability?
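The "config not code" idea here could look roughly like the following sketch. Field names, the default search path, and the rate limit are illustrative assumptions, not the project's real schema:

```python
from dataclasses import dataclass

@dataclass
class CouncilConfig:
    """Per-council settings layered on a shared platform scraper.
    Field names and defaults are hypothetical examples."""
    name: str
    base_url: str
    platform: str  # e.g. "idox", "northgate", "ocellaweb"
    search_path: str = "/online-applications/search.do"
    rate_limit_s: float = 2.0  # polite delay between requests

def search_url(cfg: CouncilConfig) -> str:
    """Build the council's search endpoint from its config entry."""
    return cfg.base_url.rstrip("/") + cfg.search_path
```

Once the platform scraper handles the form quirks, onboarding a new council is one `CouncilConfig` entry.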
gnfargbl•1h ago
Deterministic scrapers are almost certainly the right answer for this task, because once those special snowflakes have paid for their bespoke IT system, they'll never change it.

On the grind, why not get an agent to help you build the long tail of deterministic scrapers? Claude etc is really shockingly good at this kind of moderate-complexity iterative work, it will just keep going around the fetch/parse/understand loop until it has what you're looking for.

mebkorea•1h ago
Yeah, that's essentially what I'm doing. Claude handles most of the "look at the portal, work out the search form, write the config" loop. The actual bottleneck isn't code tbh, it's that every (snowflake) council needs like 30+ minutes of investigation before you can even get going, and a chunk dead-end because the portal's broken or migrated. I already hit three this morning: Worcester returns connection refused, Breckland's URL is dead, Rother migrated to a different platform. The grind is "is this portal even alive" more than the scraper itself.
sublimefire•1h ago
Send a message to infoshareplus.com. They might be interested in your data because they operate a business around local govs.
mebkorea•1h ago
Thanks, hadn't come across them. I will have a poke around and reach out. Appreciate the pointer.
dabeeeenster•1h ago
Have you tried using Browserless/similar to scrape around tricky hosts?
mebkorea•1h ago
No, I haven't tried Browserless. So far, it has all been from a single residential IP which is probably the bigger issue with Liverpool than the WAF challenge itself. Once I have a valid session cookie I can solve the JS challenge fine, the rate limit is per-IP. Rotating residential proxies (or Browserless behind one) might be the answer... I'm just reluctant at this stage to bite the bullet on the cost for a single (albeit huge) council. Have you used it for similar stuff?
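The cookie-reuse-plus-budget approach described here could be sketched like this. `BudgetedSession`, the injected `fetch` callable, and the budget number are hypothetical scaffolding, not the actual scraper:

```python
WAF_BUDGET = 10          # observed requests before the WAF blocks the IP
COOLDOWN_S = 24 * 3600   # block reportedly lasts about a day

class BudgetedSession:
    """Reuse a solved-challenge cookie jar, but stop before tripping the
    per-IP WAF limit. `fetch` is injected so the logic is testable offline."""

    def __init__(self, cookies, fetch, budget=WAF_BUDGET):
        self.cookies = cookies
        self.fetch = fetch
        self.remaining = budget

    def get(self, url):
        if self.remaining <= 0:
            raise RuntimeError(f"budget exhausted; wait ~{COOLDOWN_S}s or rotate IP")
        self.remaining -= 1
        return self.fetch(url, cookies=self.cookies)
```

With rotating proxies, each proxy would carry its own session and budget.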
efaref•1h ago
Great site. This data should really be more accessible. Planning in the UK is a total crapshoot, subject to the whims of the planning authorities. In our case (a simple rear extension and dormer loft conversion, similar to hundreds of thousands across the country), we ended up having to appeal, which added 2 years and tens of thousands of pounds in costs to our extension project. Our area shows up as a high refusal area, which tracks.

It would be good to add appeal data in (also a public gateway) to show which councils are just being unreasonable.

I personally think the planning regulations in this country are the cause of many ills, including the housing shortage. It just costs so much to get through planning these days, it is often just not worth it. Data like this could help us get that changed.

ricardobayes•1h ago
Maybe a tongue-in-cheek comment but regulations are that way because you guys want it that way (maybe not you personally). If it wasn't like that, nothing would stop a garbage incinerator or a quarry popping up a few hundred meters from houses (which happens in European countries with more deregulated planning/zoning regulations).

You guys have all kinds of pro-individualistic, borderline nonsensical residential housing laws like "right to light" and "right to view". It's completely incompatible with "build more". Most British people view their privacy (or perceived privacy) as a higher priority than fixing the housing market. "It's so overlooked" is such a common comment, and it's almost bizarre to someone used to living in a higher density environment (like the UK very much is).

jayelbe•58m ago
Waste disposal and planning for quarrying and mineral extraction are different functions, decided at a higher tier of local government, and are not directly comparable to development management/planning.
imdsm•1h ago
How long did the scraping take you to build?
mebkorea•1h ago
Around four months part-time. The bulk was the first 6 to 8 weeks building the three main scrapers (Idox, Northgate, Ocellaweb). After that, councils on those platforms are mostly config. The rest has been a long tail of bespoke portals, each taking anywhere from an evening to "give up and revisit and repeat".
ferngodfather•1h ago
Your terms:

> You may not use automated tools to scrape, copy, or bulk-download data from our service.

Pot kettle, huh.

mebkorea•1h ago
Fair catch and pretty embarrassing... ngl. That's a generic template clause I didn't think hard enough about at the time and it's obviously contradictory given what the site does. I'll rewrite it today. The position I want to take is: scrape responsibly, respect rate limits, don't republish bulk data, which is what I try to do with the councils. Will fix the wording. Thanks.
mebkorea•22m ago
Updated and pushed live: planninglens.co.uk/terms. Acceptable Use clause now permits programmatic access that respects rate limits, while still protecting our derived analysis and reports. Thanks for the kick.
safehuss•1h ago
This is awesome! Worked on something similar albeit a different industry.

For the more challenging scrapes, would highly recommend using the Chrome Devtools MCP to attach the network requests being made by the browser to the site as context for your agent/LLM chat. This approach really helped me to write a solid API-based scraper (also using curl_cffi) and bypassed the old tedious playwright-based approach I used to rely on.

vr46•1h ago
Amazing! It’s so bloody hard to access this information or even to know what there is.

Careful not to expose the councils too publicly before they shut you off

mebkorea•58m ago
Cheers! Yeah, it's honestly mental how fragmented it is. Every council is its own little island. On the shutting-off worry: the data is statutorily public. Councils are legally required to publish it, and I'm respecting rate limits and not hammering anyone. So far no council has objected. Touch wood this remains the case. Tbh, I think the risk is more from the platform vendors than the councils themselves. It seems Idox etc have a commercial interest in this data being awkward to access.
nopurpose•49m ago
There was a story about how a similar initiative scraping court decisions was shut down.
pbhjpbhj•1h ago
Have you spoken to any planners, a quick search for similar applications in other LAs might be a useful thing for them.

There's a Royal Institute of Town Planners, they probably have a magazine you could advertise in (but equally that might get you blocked, idk).

RICS people could probably use the data too? I guess it's useful house-buyer info; houses in the vicinity had successful loft conversions, say.

On the data side - it's something of a moat for you now, but I could see you being successful with FOI requests. An MP might be interested in championing open data access.

pbhjpbhj•1h ago
Is any of the data on Gov.uk? Any scraping tips there? I've tried scraping some patent tribunal data but haven't been successful (just using Python, copying in session data); I guess Playwright might be useful there.
mebkorea•54m ago
Planning data on gov.uk is really patchy and not useful for what I want. There's planning.data.gov.uk which has some boundary/policy data but no actual decisions. The decisions only exist on council portals, which is the whole reason this project exists. On patent tribunal, I haven't looked into that one specifically but a few general gov.uk tips: most gov.uk content is actually clean HTML (way easier than council portals), so if requests isn't working it's usually either JS-rendered content (Playwright fixes this) or session/cookie weirdness. Things that have helped me elsewhere: Playwright with page.wait_for_selector rather than networkidle, copying real browser headers wholesale (not just User-Agent), and checking if there's a hidden JSON API behind the page (open devtools → Network tab → look for XHR/fetch requests when you click search). Often there's a clean JSON endpoint that the page is using, which is way easier to scrape than the rendered HTML.
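The header-copying and hidden-JSON-API tips above could look roughly like this. The header values and the `looks_like_json_api` heuristic are illustrative, not specific to any gov.uk endpoint:

```python
# Copy a real browser's headers wholesale, not just the User-Agent.
# Values below are examples of what you'd paste from devtools.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/124.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-GB,en;q=0.9",
}

def looks_like_json_api(request_url: str, content_type: str) -> bool:
    """Heuristic for spotting a hidden JSON endpoint in the Network tab:
    XHR/fetch responses with a JSON content type (or a .json URL) are
    usually far easier to scrape than the rendered HTML."""
    return "json" in content_type.lower() or request_url.endswith(".json")
```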
doublesocket•1h ago
It is the most ridiculous situation with council technology that they all use different providers for what are fundamentally the same functions. It's the same for council tax and a host of other services as it is for planning. Consequently, at least from the various portals I've used, they all do it badly. This absolutely could and should be done by a single, well funded central team.
mnkyokyfrnd•1h ago
Unless you use a nationalised product for this, this is the best outcome.
doublesocket•1h ago
GDS was nationalised and they certainly did a better, albeit not perfect, job than the myriad of private solutions councils use. There just doesn't appear to be the capability to properly specify and source IT at a council level.
edent•1h ago
Have you tried using FoI to get the data? I've had some success with data requests - often getting dumps in CSV or similar.

I appreciate that won't necessarily capture live / recent data. But it might be quicker than waiting for rate-limits to reset.

notarobot123•1h ago
It looks like this kind of data will start to be more open in the future. New legislation introduces mandatory data standards in England: https://mhclgdigital.blog.gov.uk/2026/04/22/data-standards-l...
niffydroid•1h ago
Ace, I can see how this could actually be quite useful for house conveyancing. You've put a lot of effort into this. How are you affected by the upcoming changes to local government? There'll no doubt be some rationalisation at some point.
morkee•50m ago
I hate to be a downer but...

> UK planning data is technically public.

it's public, but still copyrighted by those who submitted it

the councils also have database rights over their database, unless you've obtained explicit permission from them directly

https://en.wikipedia.org/wiki/Database_right#United_Kingdom

> I ended up writing several scrapers: a standard requests-based one, a Playwright-based one for councils that block anything that doesn't look like a real browser, and a curl_cffi one for TLS fingerprinting.

so they're explicitly trying to stop you doing this, and ... you're openly admitting to bypassing their technical measures to try and stop you?

have you heard of the Computer Misuse Act?

I doubt the 240 councils are going to be happy once they find out you've done this, especially if you're selling it on for profit

mebkorea•43m ago
Fair points and I appreciate the feedback. Database right is real but the threshold is "substantial part". I'm literally only showing aggregates and letting people search by postcode. I'm not completely republishing council databases. Think that's defensible, but not gonna pretend that it's 100% black and white. On CMA, I'd push back. That's about unauthorised access. These portals are public-facing and the data's published deliberately for people to view. Rotating user-agents isn't bypassing security in any meaningful way... I'm not breaking auth or guessing passwords. I back off when portals signal they're unhappy (Liverpool's WAF actively rate-limited me which is why that data's stale). No council has reached out so far. Could change ofc. Solo founder with no legal team though, so happy to be told I've got it wrong.
simonjuk•46m ago
I work with public data, and I'd love to get access to this data, but I suspect that although you have scraped the data from public websites, there are licensing and copyright implications for actually using it.

See also the open addresses project by Data Adaptive [1] which is using Freedom of Information requests to publish public council tax address data. The problem they have run into there is that their address datasets are derived from proprietary Ordnance Survey data.

It looks like data.gov.uk is in the process of standardising the planning application process, and publishing them under OGL [2].

[1]: https://www.owenboswarva.com/blog/post-addr44.htm [2]: https://www.planning.data.gov.uk/dataset/planning-applicatio...

mebkorea•36m ago
Thanks and yeah, some of my boundary data (for the choropleth) comes from ONS open boundary files which I think are OGL but I'd need to check the chain of derivation. On the data.gov.uk standardisation, I've seen it but last I looked it was policy and boundaries, not actual decisions. Has that changed? If they're publishing decisions under OGL I'd gladly ditch the scraping for a proper feed. On licensing more generally... I haven't fully nailed it down. Showing aggregates and pointing back to source, but yeah there's a gap between "data is public" and "do whatever you want with it commercially".
codeulike•39m ago
I'd be careful because even though it's 'public' data, scraping it might not be legal due to the TOS of the various sites.

I did a search for my postcode and got given results for a different area and council miles away

mebkorea•30m ago
Thanks for the feedback. On TOS: the same answer as I gave others... the data is statutorily public, I respect rate limits. That being said, I admit it's a grey area I haven't 100% nailed down. The postcode bug is more concerning. That shouldn't happen. Do you mind sharing which postcode or city/county? It could be that it's falling back to the wrong council because I don't have data for the right one, or it's a bug in my mapping. Either way, it needs fixing asap! Cheers for flagging.
lifeisstillgood•22m ago
Some thoughts

1. Brilliant! Governments (and corps) treat public data like it’s theirs not ours. Information yearns to be free.

2. Having said that, you are likely violating T&Cs by scraping at all.

3. It is a lot easier to defend your position if you are making it free and public yourself.

4. But paying for food is nice

5. I suggest the business model here is providing architects and lawyers with strong evidence of prior planning decisions nationally

Most people applying for (difficult) planning have experience locally. But the planning system is a mess because it is not coherent nationally or regionally. The win here is not providing a copy of your data (that has legal issues) but providing pointers to decisions that support the case of the person paying you.

So I want to turn an old pub into tasteful housing and a cafe for the local village. The local planning team don’t like it, I could spend money bribing them and the councillors (see how much I understand British democracy) or I could get from you the fifteen pub to housing conversion decisions from around the country and use that to help my bribed councillors defend their u-turn

Everyone wins :-)

mebkorea•1m ago
Cheers, appreciate the feedback. The architect/consultant precedent angle is interesting and a couple of other commenters have already nudged me in similar directions. Tbh... you're likely right that the strongest commercial play isn't B2C £19 reports, it's giving someone fighting a contested case the national pattern across 15 similar pub conversions, the appeal outcomes, what stuck and what didn't. That's a very different product to what I have now but the data supports it. On the T&Cs/legal stuff... I'm not going to pretend I have perfect clarity on it. The position I'd defend is that the data is statutorily public, councils are required by law to publish it, I respect rate limits, and I'm aggregating not republishing in bulk. But there is this grey area between data being public to view and being usable for a commercial product, and I haven't fully nailed it down.