
Show HN: I used NLP to turn UK planning PDFs into a clean CSV

https://www.kaggle.com/datasets/strictschema/uk-planning-decisions-schema-sample

1•david_s_data•2h ago

Comments

david_s_data•2h ago

Hi everyone. I've been spending a lot of time with UK real estate data and realized that the genuinely valuable information (like the specific reasons a council rejects a planning application) is buried in unstructured PDFs.

I decided to build an extraction pipeline to pull the policy breaches, officer notes, timelines, and so on out of those PDFs and into a clean CSV. I also had to write a quick script that strips out names and reduces exact addresses to postcode level to avoid GDPR issues.
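For readers wondering what the postcode-level step might look like, here is a minimal sketch in Python. It is not the script used for the dataset. The column names `applicant_name` and `site_address` are hypothetical, and the regular expression is a simplified UK postcode pattern that will not cover every special case.

```python
import re
from typing import Optional

# Simplified UK postcode pattern (an assumption: it handles the common
# outward/inward formats, not every special-case code).
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE)

# Hypothetical column names holding directly identifying values.
IDENTIFYING_COLUMNS = ("applicant_name", "site_address")


def postcode_only(address: str) -> Optional[str]:
    """Return the normalised postcode found in a free-text address, or None."""
    match = POSTCODE_RE.search(address)
    if match is None:
        return None
    outward, inward = match.groups()
    return f"{outward.upper()} {inward.upper()}"


def anonymise_row(row: dict) -> dict:
    """Drop identifying columns and keep only the postcode of the address."""
    cleaned = {k: v for k, v in row.items() if k not in IDENTIFYING_COLUMNS}
    cleaned["postcode"] = postcode_only(row.get("site_address", ""))
    return cleaned
```

Note that this only handles dedicated name and address columns. Names or addresses that appear inside free-text fields such as officer notes would need a separate redaction pass.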

I just put a 50-row sample of the schema up on Kaggle. Before I burn money on compute to scale this to 10,000+ rows across London, I'd really appreciate a sanity check from anyone who works with spatial or proptech data. Are there any obvious columns or data points I'm completely missing here?
