Show HN: I enhanced Soundex to correctly handle multi-word strings

4•ogora•3h ago
Hello HN.

I built Flookup Data Wrangler, a Google Sheets add-on for cleaning data without writing a single line of code.

Traditional Soundex is designed for single words like "John" and "Jonny", which makes comparing such strings during data cleaning straightforward. However, a plain Soundex code is anchored to the first letter of its input and truncated to four characters, so it cannot reliably compare multi-word or reordered strings like "John Doe" vs "Doe Jonny"; the results would be inaccurate.

To address this, I modified the Soundex algorithm to support multi-word and reordered strings by adding a helper function that re-encodes the output into a format that can be used for accurate text-to-text comparisons. The extra step adds minimal overhead, so the impact on performance is negligible.
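The post does not show the helper function, so the following is only a minimal sketch of one way to get an order-insensitive phonetic key, not Flookup's actual implementation. It assumes standard American Soundex per word, then sorts the per-word codes so that word order no longer matters. The names `soundex` and `phonetic_key` are hypothetical.

```python
# Sketch only: an assumed approach, not Flookup's real helper function.

# Standard American Soundex digit groups (B,F,P,V -> 1, and so on).
_CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(word: str) -> str:
    """Classic four-character Soundex code for a single word."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = digit
    return (code + "000")[:4]


def phonetic_key(text: str) -> str:
    """Hypothetical helper: encode each word, then sort the codes."""
    codes = sorted(c for c in (soundex(tok) for tok in text.split()) if c)
    return " ".join(codes)


print(phonetic_key("John Doe"))   # D000 J500
print(phonetic_key("Doe Jonny"))  # D000 J500
```

Two strings can then be treated as a phonetic match when their keys are equal, whatever order the words appear in. Sorting is one design choice among several; comparing the codes as sets would also ignore repeated words.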

By leveraging this enhancement, Flookup users can do the following:

+ Fuzzy matching and merging

+ Duplicate highlighting and removal

+ Extracting a list of unique values

... all based on the sound the strings or parts of the strings make (as pronounced in English).

I would love feedback, especially from those into data cleaning (which I'm guessing is everyone).

If you are curious to give it a try, here is a quick start guide: https://www.getflookup.com/get-started
