Show HN: Crovia Spider v1 – Forensic crawler exposing compliance gaps in LAION-5B

https://github.com/croviatrust/crovia-core-engine
2•crovia•2mo ago
Today we're releasing Crovia Spider v1: an open-core forensic tool that digs into existing public AI datasets (2024–2026) for license hints, provenance signals, and compliance holes – no new crawls, no private data touched. Just verifiable clarity on what's already out there.

Ran it on LAION-5B (the backbone of Stable Diffusion, etc.):

Unverified CC-BY 4.0 / 3.0 licenses

Tens of thousands of "unknown" entries

Mixed variants with zero audit trace

First-ever Compliance Score: 14/100 (every model on it inherits the risk)

Real receipts (e.g., cid:url_sha256:c7cc5b0acf8330e51ffd1ed02f108e6a9649e13ed3547a14255dad6bdf7f01c5 → cc-by-4.0 unverified).
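For readers wondering what an identifier like the one above might encode, here is a minimal Python sketch. It assumes the `cid:url_sha256:` suffix is the hex SHA-256 digest of the UTF-8 encoded source URL; the post does not confirm this, and any URL normalization is unknown. The `url_cid` helper and the example URL are hypothetical, not part of Crovia Spider.

```python
import hashlib


def url_cid(url: str) -> str:
    """Hypothetical helper: build an ID from the SHA-256 digest of a URL.

    Assumption: the digest covers the raw UTF-8 URL string with no
    normalization. The real receipt spec may define this differently.
    """
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return f"cid:url_sha256:{digest}"


if __name__ == "__main__":
    # Placeholder URL, not taken from LAION-5B.
    print(url_cid("https://example.com/image.jpg"))
```

If the assumption holds, anyone with the original URL can recompute the ID and check it against a receipt without contacting a server.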

Why? EU AI Act hits 2026: models need reproducible evidence, transparent licensing, and Annex IV bundles. Spider outputs audit packs that plug straight into Crovia Trust (offline Merkle proofs <30s). All Apache 2.0, CLI-ready.
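To make the "offline Merkle proofs" idea concrete, here is a generic Python sketch of building a Merkle root over receipt lines and verifying an inclusion proof with no network access. This is not Crovia Trust's actual construction, which the post does not describe. The leaf/node prefixes and the duplicate-last-node rule are choices made for illustration, and every function name is hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaf_level(leaves: list[bytes]) -> list[bytes]:
    # Prefix leaves with 0x00 so a leaf hash can never collide with a node hash.
    return [_h(b"\x00" + leaf) for leaf in leaves]


def _next_level(level: list[bytes]) -> list[bytes]:
    # Prefix interior nodes with 0x01; callers guarantee an even-length level.
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        raise ValueError("need at least one leaf")
    level = _leaf_level(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Return (sibling_hash, sibling_is_left) pairs from leaf up to the root."""
    level = _leaf_level(leaves)
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root


if __name__ == "__main__":
    # Stand-in receipt lines; the real receipt format is defined by the project.
    receipts = [b"receipt-1", b"receipt-2", b"receipt-3"]
    root = merkle_root(receipts)
    print(verify(receipts[2], merkle_proof(receipts, 2), root))
```

The point of the design is that an auditor only needs one receipt, its short proof, and the published root to confirm the receipt belongs to the audited set.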

Reproduce it: run crovia-spider from-laion --output receipts.ndjson on your dataset. Brutal feedback welcome. Would integrations with HF/FAISS be useful?
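Once a receipts.ndjson file exists, a quick way to sanity-check it is to tally licenses per record. The sketch below assumes each line is a JSON object with a `license` field. That field name and the sample records are guesses for illustration, so check docs/CROVIA_SPIDER_RECEIPT_v1.md for the real schema before relying on it.

```python
import json
from collections import Counter
from typing import Iterable


def tally_licenses(lines: Iterable[str], field: str = "license") -> Counter:
    """Count license values across NDJSON lines.

    Assumption: one JSON object per line with a `license` key. Records
    without that key are counted as "unknown".
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        counts[record.get(field, "unknown")] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample records, not real Crovia Spider output.
    sample = [
        '{"id": "r1", "license": "cc-by-4.0"}',
        '{"id": "r2", "license": "cc-by-sa-3.0"}',
        '{"id": "r3"}',
    ]
    for name, count in tally_licenses(sample).most_common():
        print(name, count)
```

For a real file, pass the open file handle instead of the sample list, for example `with open("receipts.ndjson") as fh: tally_licenses(fh)`.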

Let's build the governance layer AI deserves.

Repo: https://github.com/croviatrust/crovia-core-engine

(Real receipts extracted via Crovia Spider)

cid:url_sha256:c7cc5b0acf8330e51ffd1ed02f108e6a9649e13ed3547a14255dad6bdf7f01c5

License: cc-by-4.0 (unverified)

cid:url_sha256:267ad746f168458aa6aca730d82dd565ba0dbada0107317d2252d3b60d57fade

License: cc-by-sa-3.0 (unverified)

cid:url_sha256:8bad9a02f5b4b1e08e19a6417bd6fb03576c80a80deef4f4a1ca868eb9265e71

License: unknown

Docs/Spec: docs/CROVIA_SPIDER_RECEIPT_v1.md

#AIGovernance
