Show HN: Fuzzy-matching messy job board data against the UK Gov Visa Registry

https://apify.com/dakheera47/uk-visa-sponsor-verifier
1•dakheera47•1h ago
Hey HN,

I’ve been working on job application pipelines and kept hitting a massive data friction point: reliably filtering out UK companies that legally cannot sponsor international workers.

The UK Home Office publishes a frequently updated CSV of licensed sponsors. The problem is that the names in it rarely line up with what job boards show, so the data is practically useless for standard database joins. A job board might list a role at "Acme", but the government registry lists "Acme Technologies Holdings Limited".

If you run an exact-string match or a basic ILIKE against a scrape of 10,000 Indeed or LinkedIn jobs, you get a very high false-negative rate: licensed sponsors get dropped simply because the two names are written differently.

I wrote a TypeScript-based matching engine to solve this. Here is the pipeline:

Dynamic Ingestion: It bypasses the Gov.uk dynamic routing to pull the raw, multi-megabyte CSV directly into memory. No stale database records.

Text Normalization: I built a custom parser to strip out standard corporate suffixes ("ltd", "plc", "llp", "t/a", etc.) and handle the weird punctuation and localized encodings that break standard scrapers.

Fuzzy Scoring: It runs an optimized Levenshtein distance algorithm over the in-memory array to output a 0-100 confidence score for the match.
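
To make the normalization and scoring steps concrete, here is a minimal sketch, not the production code. The function names (normalizeName, levenshtein, matchScore) and every suffix beyond "ltd", "plc" and "llp" are illustrative assumptions, and "t/a" handling is left out because it depends on which side of the trading name you want to keep.

```typescript
// Illustrative suffix list: only "ltd", "plc" and "llp" come from the post, the rest are assumptions.
const SUFFIXES = new Set(["ltd", "limited", "plc", "llp", "inc", "co", "company"]);

// Lowercase, strip accents and punctuation, then drop trailing corporate suffixes.
function normalizeName(raw: string): string {
  const cleaned = raw
    .normalize("NFKD")                // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .toLowerCase()
    .replace(/&/g, " and ")
    .replace(/[^a-z0-9]+/g, " ")      // collapse punctuation and whitespace runs
    .trim();
  const tokens = cleaned.split(" ").filter((t) => t.length > 0);
  // Keep at least one token so a name that is only a suffix is not emptied.
  while (tokens.length > 1 && SUFFIXES.has(tokens[tokens.length - 1])) {
    tokens.pop();
  }
  return tokens.join(" ");
}

// Classic two-row dynamic-programming Levenshtein distance.
function levenshtein(a: string, b: string): number {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Map edit distance on the normalized names to a 0-100 confidence score.
function matchScore(jobBoardName: string, registryName: string): number {
  const a = normalizeName(jobBoardName);
  const b = normalizeName(registryName);
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 0;
  return Math.round((1 - levenshtein(a, b) / longest) * 100);
}
```

With this sketch, matchScore("Acme Ltd.", "ACME LIMITED") comes out at 100 because both sides normalize to "acme". It only covers suffix and punctuation differences: a short trading name against a much longer registered name still scores low under plain edit distance.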

Initially, I built this as a local cron job with a persistent on-disk cache (node:fs) for an open-source job ops project. But to make it scalable for batch processing, I ripped out the local caching and deployed it as an ephemeral Docker container. It spins up, processes an array of thousands of scraped companies entirely in-memory in a few seconds, pushes a clean JSON dataset of verified sponsors, and dies.
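
The shape of that batch run looks roughly like the sketch below. It is a simplified illustration under assumptions, not the deployed code: verifyBatch, SponsorMatch, the threshold of 85 and the stand-in exactScore scorer are all hypothetical names and values, and the scorer is passed in as a parameter so the loop stays independent of how names are compared.

```typescript
// Hypothetical result shape for one scraped company.
interface SponsorMatch {
  input: string;
  matched: string | null; // best registry name, or null when below the threshold
  score: number;          // 0-100 confidence
}

// Score every scraped name against the in-memory registry and keep the best hit.
function verifyBatch(
  companies: string[],
  registry: string[],
  score: (a: string, b: string) => number,
  threshold = 85, // assumed cut-off, tune against real data
): SponsorMatch[] {
  return companies.map((input) => {
    let best: string | null = null;
    let bestScore = 0;
    for (const candidate of registry) {
      const s = score(input, candidate);
      if (s > bestScore) {
        best = candidate;
        bestScore = s;
      }
    }
    return bestScore >= threshold
      ? { input, matched: best, score: bestScore }
      : { input, matched: null, score: bestScore };
  });
}

// Stand-in scorer for the demo only; a fuzzy scorer would be plugged in here.
const exactScore = (a: string, b: string): number =>
  a.trim().toLowerCase() === b.trim().toLowerCase() ? 100 : 0;

const results = verifyBatch(["acme ", "Unknown Co"], ["ACME", "Globex"], exactScore);

// Only confirmed matches go into the JSON output.
const verifiedJson = JSON.stringify(results.filter((r) => r.matched !== null));
console.log(verifiedJson);
```

Nothing is written to disk at any point: the registry and the scraped names are plain arrays, and the only output is the JSON string.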

If you are building a job aggregator, an ATS, or a lead-gen pipeline and don't want to waste a weekend writing your own corporate-suffix normalization logic, I've hosted it as a serverless endpoint here: https://apify.com/dakheera47/uk-visa-sponsor-verifier

I'd love any feedback on the text normalization approach, or to hear if anyone knows of specific edge cases in the Home Office data formatting that I might have missed.
