Greater Manchester still says no to NHS data platform with Palantir at its heart

https://www.theregister.com/public-sector/2026/05/13/greater-manchester-still-says-no-to-nhs-data...
1•Bender•1m ago•0 comments

Audrey: Local-first memory guard for AI agents (source)

https://github.com/Evilander/Audrey
1•evilanders•1m ago•0 comments

What's with all the slide decks? A polycausal theory

https://dynomight.substack.com/p/slides
1•crescit_eundo•1m ago•0 comments

U.S. Intelligence Shows Iran Retains Substantial Missile Capabilities

https://www.nytimes.com/2026/05/12/us/politics/iran-missiles-us-intelligence.html
2•hebelehubele•1m ago•0 comments

Underrated Ideas in Biotech

https://nikomc.com/essays/underrated-ideas-01.html
1•mailyk•1m ago•0 comments

FCC walks back router update ban before it bricks America's network security

https://www.theregister.com/networks/2026/05/12/fcc-walks-back-router-update-ban-before-it-bricks...
1•Bender•2m ago•0 comments

Analog Computer Applications: The Lorenz-attractor [pdf]

https://anabrid.com/media/pages/publikation-dateien-nur-zur-verwaltung-der-dateien/f4766de51c-174...
1•tosh•2m ago•0 comments

NBA's Rwanda ties face scrutiny after sanctions-linked BAL withdrawal

https://www.theguardian.com/sport/2026/apr/25/nba-rwanda-sanctions-bal-apr-withdrawal-kagame
1•PaulHoule•2m ago•0 comments

SpaceX targets May 19 for debut of Starship Version 3, Launch Pad 2

https://spaceflightnow.com/2026/05/12/spacex-targets-may-19-for-debut-of-starship-super-heavy-ver...
1•bookmtn•2m ago•0 comments

Most teams optimize the prompt. Agentic systems have more moving parts

https://www.aevyra.ai/posts/prompt-optimization-agentic-systems.html
2•agunapal•3m ago•0 comments

Show HN: EleutherAI / Lm-Evaluation-Harness

https://github.com/EleutherAI/lm-evaluation-harness
1•marvinified•3m ago•0 comments

Wireloom: A Markdown extension for UI wireframes

https://github.com/StardockCorp/Wireloom
1•watbe•3m ago•0 comments

Scientists find insects may feel pain after crickets nurse sore antenna

https://www.theguardian.com/science/2026/may/13/insects-feel-pain-research
1•YeGoblynQueenne•4m ago•0 comments

BitLocker-protected drives can now be opened using files on a USB stick

https://www.tomshardware.com/tech-industry/cyber-security/microsoft-bitlocker-protected-drives-ca...
1•Timofeibu•4m ago•0 comments

In Praise of Acoustic Mathematics

https://link.springer.com/article/10.1007/s00283-026-10512-7
1•sebg•4m ago•0 comments

Show HN: BossHogg: A PostHog CLI for Agents

https://github.com/aaronkwhite/bosshogg-cli
1•aaronkwhite•4m ago•0 comments

How do agents see your website?

https://what-do-agents-see.runtype.app/
2•zackangelo•5m ago•0 comments

Discover: A Love Letter to RSS

https://brine.dev/posts/discover-a-love-letter-to-rss
1•speckx•5m ago•0 comments

The unmet needs in human disease index

https://www.convoke.bio/blog/introducing-the-unmet-needs-index
1•sebg•6m ago•0 comments

CloudFront's flat-rate plan (CDN+WAF+DNS) now scales to 6B req and 600TB/mo

https://aws.amazon.com/blogs/networking-and-content-delivery/cloudfront-premium-flat-rate-plan-su...
1•cristiangraz•6m ago•0 comments

Show HN: Ledger – Claude Code Token Spend Analyzer

https://github.com/delta-hq/cc-ledger
1•tsv650•6m ago•0 comments

Unknowable Math Can Help Hide Secrets

https://www.quantamagazine.org/how-unknowable-math-can-help-hide-secrets-20260511/
1•Xcelerate•7m ago•0 comments

Google Unveils Googlebook, a New AI Laptop Built Around Gemini

https://www.macrumors.com/2026/05/12/google-unveils-googlebook/
1•Brajeshwar•8m ago•0 comments

Some Proposals for Reviving the Philosophy of Mathematics (1979) [pdf]

https://gwern.net/doc/math/1979-hersh.pdf
1•sebg•9m ago•0 comments

Filen deleted all of my data. A heads-up for others

https://old.reddit.com/r/filen_io/comments/1t3r055/filen_deleted_all_of_my_data_a_headsup_for_oth...
1•tcp_handshaker•9m ago•0 comments

DeepSeek and Grok hallucinated the same fictitious OpenBSD manpage quote

https://stuart-thomas.com/research/the-empirical-council/
1•ethical•12m ago•2 comments

One in seven prefer consulting AI chatbots to seeing a doctor, UK study shows

https://www.theguardian.com/society/2026/may/13/one-in-seven-prefer-ai-chatbots-to-seeing-doctor-...
1•chrisjj•13m ago•1 comments

Skip – One Swift Codebase. Two Native Platforms

https://skip.dev/
3•nikolay•14m ago•0 comments

AI is making it easy but also hard

1•andrewmurphy•14m ago•2 comments

Show HN: Ratify Protocol – prove who authorized an AI agent, offline, in <1ms

https://github.com/identities-ai/ratify-protocol
2•chuks•14m ago•0 comments

Scherlok – zero-config data quality monitoring, works with dbt

https://github.com/rbmuller/scherlok
2•rbmuller•1h ago

Comments

rbmuller•1h ago
Hey folks,

I'm Robson. I work as a Data Engineer, and one of my duties is guaranteeing data quality in pipelines. The projects I've touched in the past never had a really good solution for bad data. Every data quality tool I tried (Great Expectations, Soda, dbt tests) requires you to write the rule for the failure first, which is exactly the hard part, because we can't imagine all the problems in advance.

So I have been working on Scherlok, which takes the opposite approach: profile the data first, then detect when something changes. As a result, there's no YAML to write and no deep configuration to perform. It currently profiles volumes, schemas, NULL rates, distributions, freshness cadence, and cardinality, stores the profile locally, and flags anything that changes in subsequent runs. Severity is classified into 3 levels: INFO/WARNING/CRITICAL.

The code is pure Python and z-score based, and it is intended to be a lightweight alternative to more complex and sometimes expensive market solutions.
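To make the z-score idea concrete, here is a minimal sketch of how a new metric value (say, a row count) could be scored against values from earlier runs and mapped to the three severities. This is an illustration only, not Scherlok's actual code: the function name `classify_change` and the thresholds `WARNING_Z` and `CRITICAL_Z` are assumptions made up for this example.

```python
import statistics

# Hypothetical thresholds, chosen only for illustration.
WARNING_Z = 2.0
CRITICAL_Z = 3.0


def classify_change(history, current):
    """Score `current` against past metric values and return (severity, z).

    `history` is a list of numbers from earlier runs, e.g. daily row counts.
    """
    if len(history) < 2:
        # Not enough past runs to estimate spread, so nothing can be flagged.
        return "INFO", 0.0

    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)  # sample standard deviation

    if stdev == 0:
        # Perfectly flat history: any deviation at all is treated as extreme.
        z = 0.0 if current == mean else float("inf")
    else:
        z = abs(current - mean) / stdev

    if z >= CRITICAL_Z:
        return "CRITICAL", z
    if z >= WARNING_Z:
        return "WARNING", z
    return "INFO", z


if __name__ == "__main__":
    row_counts = [1000, 1010, 990, 1005, 995]
    for today in (1002, 1020, 400):
        severity, z = classify_change(row_counts, today)
        print(f"rows={today} severity={severity} z={z:.2f}")
```

The same scoring could in principle be applied to any numeric metric in the profile, such as a column's NULL rate or distinct count, as long as a few past runs are stored to compare against.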

I decided to open source this project so we can make it more robust. It already has some contributors with merged PRs.

You can get started with 3 commands:

scherlok connect
scherlok investigate
scherlok watch

It works with dbt: it reads `target/manifest.json`, discovers every materialized model, auto-resolves the connection from `profiles.yml`, and profiles each model. CI integration is just one line.
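For anyone curious what model discovery from the manifest can look like, here is a rough sketch. It is not Scherlok's implementation; the helper names `load_manifest` and `materialized_models` are made up for this example. It relies only on the `nodes`, `resource_type`, `config.materialized`, `database`, `schema`, `alias`, and `name` fields of the dbt manifest, and it assumes ephemeral models should be skipped because they never exist as relations in the warehouse.

```python
import json
from pathlib import Path


def load_manifest(path="target/manifest.json"):
    """Read and parse the manifest that dbt writes into the target directory."""
    return json.loads(Path(path).read_text())


def materialized_models(manifest):
    """List models from a parsed dbt manifest that exist as real relations."""
    models = []
    for unique_id, node in manifest.get("nodes", {}).items():
        # The manifest also holds tests, seeds, snapshots, etc.; keep models only.
        if node.get("resource_type") != "model":
            continue
        materialized = node.get("config", {}).get("materialized")
        # Ephemeral models are inlined as CTEs, so there is nothing to profile.
        if materialized in (None, "ephemeral"):
            continue
        models.append(
            {
                "unique_id": unique_id,
                "database": node.get("database"),
                "schema": node.get("schema"),
                # dbt uses the alias as the relation name when one is set.
                "relation": node.get("alias") or node.get("name"),
                "materialized": materialized,
            }
        )
    return sorted(models, key=lambda m: m["unique_id"])


if __name__ == "__main__":
    manifest_path = Path("target/manifest.json")
    if manifest_path.exists():
        for model in materialized_models(load_manifest(manifest_path)):
            print(model["schema"], model["relation"], model["materialized"])
```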

I would love feedback, and I'd be glad to get help from folks facing data quality issues in scenarios I can't yet imagine.

https://github.com/rbmuller/scherlok/labels/good%20first%20i...

Repo: https://github.com/rbmuller/scherlok
PyPI: pip install scherlok

Thanks,
Robson