Data Discovery – discovering and acquiring data in plain English using AI

https://datris.ai/videos/data-discovery-ingestion-consumption
1•tfearn•1h ago

Comments

tfearn•1h ago
I've been building data infrastructure for 25+ years across Goldman, Bridgewater, Freddie Mac. The same problem exists everywhere: getting a new external data source wired up takes days. You have to figure out what datasets the source even exposes, write the ingestion code, build a pipeline, wire up scheduling, and test it — before a single byte of data lands anywhere useful.

The data discovery process that we just added to our platform (open-source) collapses that into one session.

You describe what you want in plain English ("company earnings", "option chain data", "SEC EDGAR company filings"). The AI identifies the source and enumerates every dataset it exposes, grouped by category with parameters and auth requirements, and you select what you want. From there, Discovery generates Python tap scripts in parallel, runs them immediately as a test, self-heals on failure (up to 3 attempts), creates the pipelines, and optionally schedules everything. The whole thing drops into a Data Catalog that groups related taps and pipelines together.
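To make the "self-heals on failure (up to 3 attempts)" step concrete, here is a minimal sketch of a bounded run-and-repair loop. It is not the platform's actual code: `run_with_self_heal`, `run`, and `repair` are hypothetical names, and it assumes "3 attempts" means three total runs, with a repair step between failed runs.

```python
from typing import Callable, Tuple

MAX_ATTEMPTS = 3  # assumption: "up to 3 attempts" means three total runs


def run_with_self_heal(
    script: str,
    run: Callable[[str], None],
    repair: Callable[[str, Exception], str],
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[bool, str, int]:
    """Run a script; on failure, ask `repair` for a new version and retry.

    `run` and `repair` are hypothetical callables: `run` raises on failure,
    `repair` returns a revised script given the old one and the error.
    Returns (succeeded, final_script, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            run(script)
            return True, script, attempt
        except Exception as exc:
            if attempt == max_attempts:
                break  # out of attempts, give up without another repair
            script = repair(script, exc)
    return False, script, max_attempts


# Stand-ins for illustration: the first run fails, the repaired script passes.
def fake_run(script: str) -> None:
    if "fixed" not in script:
        raise RuntimeError("tap script failed")


def fake_repair(script: str, exc: Exception) -> str:
    return script + "  # fixed"


ok, final_script, attempts = run_with_self_heal("print('tap')", fake_run, fake_repair)
```

In a real system `repair` would presumably hand the script and the error back to the model. The cap keeps a script that cannot be fixed from looping forever.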

The artifact isn't a one-time wizard output — it creates real, editable tap scripts and pipeline configs you can modify afterward. Parameters can be sourced from a table you already have ingested, a file upload, or an AI-generated list ("give me S&P 500 tickers"). Date tokens like `{{TODAY}}` are substituted at runtime so daily snapshots just work.
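As an illustration of runtime token substitution, here is a small sketch that replaces `{{TODAY}}` in a parameter string. The function name, the ISO date format, and the choice to leave unknown tokens untouched are all assumptions; the post does not say how Datris formats dates or which other tokens exist.

```python
import re
from datetime import date
from typing import Optional

# Matches tokens of the form {{NAME}}.
TOKEN_PATTERN = re.compile(r"\{\{(\w+)\}\}")


def substitute_tokens(template: str, today: Optional[date] = None) -> str:
    """Replace known date tokens in `template`; leave unknown tokens as-is."""
    today = today or date.today()
    # Assumption: TODAY renders as an ISO 8601 date (YYYY-MM-DD).
    values = {"TODAY": today.isoformat()}

    def replace(match: "re.Match[str]") -> str:
        return values.get(match.group(1), match.group(0))

    return TOKEN_PATTERN.sub(replace, template)


# example.com is a placeholder URL, not a real data source.
url = substitute_tokens(
    "https://example.com/prices?date={{TODAY}}", today=date(2026, 4, 14)
)
# url == "https://example.com/prices?date=2026-04-14"
```

Passing `today` explicitly keeps the function easy to test. A scheduled run would omit it and pick up the current date.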

The same flow is also exposed as an MCP tool (`discover_source`), so external AI agents can drive Discovery programmatically — ask "what datasets are in polygon?" and get back the same structured dataset catalog the wizard uses.
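For readers unfamiliar with MCP, a tool call is a JSON-RPC 2.0 request with method `tools/call`. The sketch below builds such a request for `discover_source`. The tool name comes from the post, but the `query` argument is a guess at the input schema and should be checked against the tool's published schema before use.

```python
import json


def build_discover_request(query: str, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request for the discover_source tool."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "discover_source",
            # Hypothetical argument name; the real schema may differ.
            "arguments": {"query": query},
        },
    }
    return json.dumps(payload)


request_body = build_discover_request("what datasets are in polygon?")
```

An MCP client library would normally build and send this for you. The raw payload is shown only to make the shape of the call visible.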

Destinations: MongoDB, PostgreSQL, Kafka, MinIO, pgvector, Qdrant, Milvus, Chroma, Weaviate, ActiveMQ, REST endpoints.

Full demo walkthrough: https://datris.ai/videos/data-discovery-ingestion-consumptio...
Docs: https://docs.datris.ai/discovery
OSS (AGPL): https://github.com/datris/datris-platform-oss

Measuring Human Performance on ARC-AGI-3

https://arcprize.org/blog/arc-agi-3-human-dataset
1•sanxiyn•1m ago•0 comments

PyCon needs 1,146 more hotel nights to meet "hotel minimum bookings."

https://twitter.com/LundukeJournal/status/2044107055049388115
1•fortran77•3m ago•0 comments

You can literally build a neural network w your hands. This CTO open-sourced how

https://twitter.com/lenadroid/status/2044185631396639195
1•lenadroid•4m ago•1 comments

The Most Boring Book in the World

https://ei3lh.eu/2026/04/14/the-most-boring-book-in-the-world/
1•austinallegro•5m ago•0 comments

A Terminal Spotify Client

https://github.com/dubeyKartikay/lazyspotify
1•Jotalea•7m ago•0 comments

Blowin' in the Wind: How Nordic Countries Made Electricity Free

https://atmos.earth/climate-solutions/blowin-in-the-wind-how-nordic-countries-made-electricity-free/
1•doener•9m ago•0 comments

Pushing Simulation to the Limit to Find Order in Chaos [video]

https://www.youtube.com/watch?v=8jVogdTJESw
1•kinderjaje•10m ago•1 comments

Scarcity and Fairness at Theme Parks

https://thelivingfossils.substack.com/p/scarcity-and-fairness-at-theme-parks
1•jger15•10m ago•0 comments

How Ukraine Is Building a Cleaner, Stronger Power Future

https://happyeconews.com/ukraines-renewable-energy-rebuild/
1•doener•10m ago•0 comments

SpankMatch, Secrets, and (Everyone's) Orphaned Google Container Registry Layers

https://amenbreakpoint.com/posts/spankmatch-gcr-orphans/
1•ian_d•16m ago•0 comments

Show HN: OtaKit – push app updates without App Store reviews

https://www.otakit.app/
1•gregolo•16m ago•1 comments

If it starts, a nuclear arms race will be unstoppable

https://economist.com/international/2026/04/14/if-it-starts-a-nuclear-arms-race-will-be-unstoppable
3•andsoitis•21m ago•1 comments

Four Choppers and a Blimp: The Piasecki Helistat

https://hackaday.com/2026/04/14/four-choppers-and-a-blimp-the-bizarre-piasecki-helistat/
3•devonnull•24m ago•0 comments

My first project ever. A platform built with AI

https://www.ugcfinder.website/
1•aiamed•26m ago•3 comments

Superintelligent Will (2012) [pdf]

https://nickbostrom.com/superintelligentwill.pdf
1•rappatic•26m ago•0 comments

The AI backlash is turning revolutionary (Fortune)

https://fortune.com/2026/04/14/ai-backlash-revolutionary-sam-altman-molotov-cocktails-data-centers/
3•nickvec•28m ago•1 comments

No one can force me to have a secure website

https://tom7.org/httpv/
4•vanyle•31m ago•0 comments

TaskRabbit founder: the pivot is the point

https://www.fastcompany.com/91513007/taskrabbit-founder-the-pivot-is-the-point-pivot-founders-bui...
1•doctaj•34m ago•0 comments

Secure private networking for users, nodes, agents, Workers – Cloudflare Mesh

https://blog.cloudflare.com/mesh/
4•cantaloupe•42m ago•1 comments

Contra Byrnes on UV and Cancer

https://hedonicescalator.substack.com/p/contra-byrnes-on-uv-and-cancer
2•paulpauper•43m ago•0 comments

Race for the best cybersecurity model heating up

https://www.reuters.com/technology/openai-unveils-gpt-54-cyber-week-after-rivals-announcement-ai-...
1•gaurangt•43m ago•0 comments

What Claude Code's Source Revealed About AI Engineering Culture

https://techtrenches.dev/p/the-snake-that-ate-itself-what-claude
1•lucketone•46m ago•0 comments

Why Affordability Isn't the Same as Falling Prices

https://www.urbanproxima.com/p/why-affordability-isnt-the-same-as
1•paulpauper•46m ago•0 comments

Show HN: We fine-tuned an AI model for log search – Accuracy 50% to 80%

https://thedex.run/blog/why-general-purpose-ai-fails-at-log-search
1•rkorlimarla•48m ago•0 comments

Show HN: GizmoSauce – no-code website widgets

https://demo.gizmosauce.com/demos/little-caesars/
1•endurant_dev•48m ago•0 comments

GitHub webhook secrets leaked in headers

https://gist.github.com/ltrgoddard/7abfc8e4123e403505dfbe767a2487ab
1•ltrg•49m ago•1 comments

Gemini Plugin for Claude Code

https://github.com/sakibsadmanshajib/gemini-plugin-cc
1•sakibss•50m ago•0 comments

Painful learnings from sponsoring a tech conference in SF

https://www.terezatizkova.com/writing/conference-booths
2•tizkovatereza•51m ago•1 comments

Civilization Is Not the Default. Violence Is

https://apropos.substack.com/p/civilization-is-a-public-good
28•paulpauper•51m ago•15 comments

MetaBrainz is looking for a new executive director

https://blog.metabrainz.org/2026/04/14/seeking-a-new-executive-director/
2•MrKomodoDragon1•54m ago•0 comments