Show HN: Help improve language coverage in Common Crawl

8•ccgreg•4h ago
Hey HN, the Common Crawl Foundation is trying to expand the coverage of our crawl to more languages, regions, and cultures. If you speak a language other than English (LOTE), you can help!

By validating Language Identification data (LangID or LID): https://dynabench.org/tasks/text-language-identification

By contributing URLs for our seed crawl: https://github.com/commoncrawl/web-languages

We're also organizing a Workshop on Multilingual Data Quality Signals (WMDQS) with MLCommons and EleutherAI, where we have an open call for papers (https://wmdqs.org/cfp/) and an upcoming shared task on language identification (https://wmdqs.org/shared-task/).

Snyk Acquires Invariant Labs

https://snyk.io/news/snyk-acquires-invariant-labs-to-accelerate-agentic-ai-security-innovation/
1•od0•54s ago•0 comments

The Secret Rules of the Terminal

https://wizardzines.com/zines/terminal/
1•marvinborner•2m ago•0 comments

Scaling Pinterest ML Infrastructure with Ray: From Training to ML Pipelines

https://medium.com/pinterest-engineering/scaling-pinterest-ml-infrastructure-with-ray-from-training-to-end-to-end-ml-pipelines-4038b9e837a0
1•herbertl•4m ago•0 comments

Show HN: I built an AI thumbnail generator for YouTubers who can't design

https://thumbo.io
1•isacbuilds•4m ago•0 comments

Amish company embraced robots–then made an even bolder bet

https://fortune.com/2025/06/24/flextur-robots-automation-manufacturing-small-business/
1•Bluestein•4m ago•0 comments

AI doesn't have to reason to take your job

https://www.vox.com/future-perfect/417325/artificial-intelligence-apple-reasoning-openai-chatgpt
1•lr0•5m ago•0 comments

The Reenchanted World: On finding mystery in the digital age

https://harpers.org/archive/2025/06/the-reenchanted-world-karl-ove-knausgaard-digital-age/
1•herbertl•6m ago•0 comments

Adding to markwhen documents via SMS and email

https://docs.markwhen.com/meridiem/api/sms-email
1•koch•11m ago•0 comments

Alcohol-soaked star system could explain why life, including us, was able to form

https://www.livescience.com/space/exoplanets/alcohol-soaked-star-system-could-help-explain-why-life-including-us-was-able-to-form
1•Bluestein•11m ago•0 comments

Personal Copilot: Train Your Own Coding Assistant

https://huggingface.co/blog/personal-copilot
1•auraham•12m ago•0 comments

Agency is your secret edge

https://alanwu.xyz/posts/agency/
2•lunw•13m ago•0 comments

Stealthy ship hull cuts through waves like butter

https://news.engin.umich.edu/2025/06/stealthy-ship-hull-cuts-through-waves-like-butter/
1•gnabgib•13m ago•0 comments

What's Predictive in an AI Persona?

https://askrally.com/article/whats-predictive-in-a-persona
1•virtual_rf•14m ago•0 comments

The German automotive industry wants to develop open-source software together

https://www.vda.de/en/press/press-releases/2025/250624_PM_Automotive_industry_signs_Memorandum_of_Understanding
2•smartmic•15m ago•0 comments

I wrote 280 articles about web scraping. Here's their index grouped by tag

https://github.com/TheWebScrapingClub/ArticleIndex
2•PigiVinci83•15m ago•0 comments

LLMs can hoover up data from books, judge rules

https://www.theregister.com/2025/06/24/anthropic_book_llm_training_ok/
2•rntn•16m ago•0 comments

Cut Django Database Latency by 50-70ms with Native Connection Pooling

https://saurabh-kumar.com/articles/2025/06/cut-django-database-latency-by-50-70ms-with-native-connection-pooling/
1•selectnull•16m ago•0 comments

Show HN: Gitbasher – A simple bash utility to make Git easy to use

https://github.com/maxbolgarin/gitbasher
2•maxbolgarin•18m ago•0 comments

Biocide overdose blunder suspected in A321 dual-engine incident

https://www.flightglobal.com/safety/biocide-overdose-blunder-suspected-in-a321-dual-engine-incident/138004.article
3•worik•18m ago•0 comments

Cheapest DIY Microscope (1 min video)

https://www.youtube.com/shorts/SMjOA-P95CM
1•rmason•19m ago•0 comments

Owsla Manifesto – Can we fix Education?

https://owsla.io/manifesto
1•ChilledTonic•20m ago•0 comments

Strike Set Back Iran's Nuclear Program by Only a Few Months, U.S. Report Says

https://www.nytimes.com/2025/06/24/us/politics/iran-nuclear-sites.html
3•zzzeek•21m ago•4 comments

How to Think About Time in Programming

https://shanrauf.com/archive/how-to-think-about-time-in-programming
5•rmason•21m ago•0 comments

Vertically stacked monolithic perovskite colour photodetectors

https://www.nature.com/articles/s41586-025-09062-3
1•anfractuosity•21m ago•0 comments

Unify engineers growth by using the right model for every task

https://openai.com/index/unify/
1•gmays•22m ago•0 comments

HODL.Bar – Minimal, live Bitcoin ticker for any device

https://hodl.bar/
1•ccie_fr•23m ago•1 comment

Uranium Miner's Russian Routes Unnerve Potential Bond Investors

https://www.bloomberg.com/news/articles/2025-06-24/uranium-miner-s-russian-routes-unnerve-potential-bond-investors
1•Bluestein•23m ago•0 comments

Benchmark for Multimodal Action Models

https://twitter.com/HarshSikka/status/1937524186995732989
1•harshsikka123•24m ago•0 comments

Using an LLM for query planning in RAG –> 40% better answer relevance

https://techcommunity.microsoft.com/blog/azure-ai-services-blog/up-to-40-better-relevance-for-complex-queries-with-new-agentic-retrieval-engine/4413832
1•pmc00•25m ago•0 comments

The Economics Behind "Basic Economy" – A Masterclass in Price Discrimination

https://blog.getjetback.com/the-economics-behind-basic-economy-a-masterclass-in-price-discrimination/
11•bdev12345•27m ago•0 comments