We Replaced ETL with MCP

https://rashidazarang.com/c/how-were-making-business-software-talk-to-each-other-10x-faster

11•rashidae•6mo ago

Comments

rashidae•6mo ago

We used to spend 40–80 hours writing and maintaining brittle ETL code for every integration. Now we spend 4–8 hours deploying MCP (Model Context Protocol) interfaces and letting AI handle the rest. No hardcoded pipelines.

criticalfault•6mo ago

Can you give some more info on the results?

Meaning, correctness, completeness, etc...

Would you use it for e.g. tax information? Because if wrong, you could get fined.

rashidae•6mo ago

We're using AI to write the boring integration code that moves data from System A to System B. The actual data processing is deterministic code that's tested like any critical system.

Correctness: 100% schema mapping accuracy after human validation. We've never had a data type mismatch or field misalignment make it to production. The AI suggests mappings at ~85% accuracy, humans catch and correct the remaining 15%.
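A minimal sketch of the kind of automated check that could flag a data type mismatch or an unmapped field before a human reviews a suggested mapping. The schemas, field names, and the `COMPATIBLE` table are hypothetical illustrations, not details from the poster's system.

```python
# Hypothetical schemas: field name -> declared type.
SOURCE_SCHEMA = {"invoice_id": "int", "amount": "decimal", "issued_on": "date"}
DEST_SCHEMA = {"id": "bigint", "total": "decimal", "issued_date": "date"}

# Assumed compatibility rules: source type -> destination types considered safe.
COMPATIBLE = {
    "int": {"int", "bigint"},
    "decimal": {"decimal"},
    "date": {"date"},
}


def find_mapping_problems(mapping, source_schema, dest_schema):
    """Return a list of human-readable problems in a suggested field mapping."""
    problems = []
    for src_field, dst_field in mapping.items():
        if src_field not in source_schema:
            problems.append(f"unknown source field: {src_field}")
            continue
        if dst_field not in dest_schema:
            problems.append(f"unknown destination field: {dst_field}")
            continue
        src_type = source_schema[src_field]
        dst_type = dest_schema[dst_field]
        if dst_type not in COMPATIBLE.get(src_type, set()):
            problems.append(
                f"type mismatch: {src_field} ({src_type}) -> {dst_field} ({dst_type})"
            )
    # Source fields with no mapping at all are flagged for review too.
    for src_field in source_schema:
        if src_field not in mapping:
            problems.append(f"unmapped source field: {src_field}")
    return problems
```

A check like this only narrows what the reviewer has to look at; it does not replace the human validation step described above.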

Completeness: Zero data loss incidents. We run reconciliation reports comparing source record counts to destination. Any discrepancy fails deployment. Most common issue: the AI initially missing compound key relationships, which we catch in testing.
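The record-count reconciliation described here could look roughly like the sketch below. It uses SQLite only so the example is self-contained; the table names and the `deployment_gate` function are assumptions for illustration, not the poster's actual tooling.

```python
import sqlite3


def reconcile_counts(source_conn, dest_conn, table_pairs):
    """Compare row counts for each (source_table, dest_table) pair.

    Returns a list of (source_table, dest_table, source_count, dest_count)
    tuples for every pair whose counts differ.
    """
    mismatches = []
    for src_table, dst_table in table_pairs:
        # Table names are assumed to come from a reviewed config, not user input.
        src_count = source_conn.execute(
            f'SELECT COUNT(*) FROM "{src_table}"'
        ).fetchone()[0]
        dst_count = dest_conn.execute(
            f'SELECT COUNT(*) FROM "{dst_table}"'
        ).fetchone()[0]
        if src_count != dst_count:
            mismatches.append((src_table, dst_table, src_count, dst_count))
    return mismatches


def deployment_gate(source_conn, dest_conn, table_pairs):
    """Raise if any count differs, so a CI step running this would fail."""
    mismatches = reconcile_counts(source_conn, dest_conn, table_pairs)
    if mismatches:
        raise RuntimeError(f"reconciliation failed: {mismatches}")
```

Matching counts show that no rows went missing in aggregate, but they do not show that each row arrived intact, so a count check is a floor rather than a full correctness test.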

Tax/Financial Data: Yes, we handle financial data for several clients, including:

QuickBooks to data warehouse pipelines (invoice/payment data)

Payroll system integrations

Revenue reconciliation between CRM and accounting

Our approach for sensitive data:

AI generates the integration logic, never sees actual records

Test with synthetic data matching production schemas

Run parallel processing for 1-2 cycles to verify accuracy

Maintain full audit logs of all transformations

Human sign-off required before production cutover
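One way the parallel-run step in the list above might be verified is to fingerprint each output row from the old and new pipelines and diff them by key. This is a sketch under assumptions: the row shape, the `id` key, and the function names are hypothetical.

```python
import hashlib
import json


def row_fingerprint(row):
    """Stable hash of a row dict; keys are sorted so field order does not matter."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_parallel_runs(legacy_rows, new_rows, key):
    """Compare two pipelines' outputs, matching rows on `key`.

    Returns keys missing from the new output, keys only the new output has,
    and keys whose row contents differ. Rows sharing the same key within one
    output are collapsed to the last one seen, so keys are assumed unique.
    """
    legacy = {r[key]: row_fingerprint(r) for r in legacy_rows}
    new = {r[key]: row_fingerprint(r) for r in new_rows}
    return {
        "missing_in_new": sorted(legacy.keys() - new.keys()),
        "extra_in_new": sorted(new.keys() - legacy.keys()),
        "changed": sorted(
            k for k in legacy.keys() & new.keys() if legacy[k] != new[k]
        ),
    }
```

An empty result in all three lists over one or two full cycles would be the signal for the human sign-off; anything else goes back to review.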

rooftopzen•6mo ago

You naively replaced a deterministic process with a probabilistic one, following an uneducated trend.

I am taking screenshots of blogposts like this for a museum exhibit opening next year - lmk if you’re willing.

rashidae•6mo ago

We're not replacing deterministic processes with probabilistic ones, that would be insane for production data.

Here's what actually happens:

1. MCP exposes system schemas in a standardized way

2. AI analyzes the schemas and suggests mappings

3. Engineers review and validate every mapping

4. AI generates deterministic integration code (think: writing the SQL, not running it)

5. We test with real data before any production deployment
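To make "writing the SQL, not running it" concrete, here is a rough sketch in which a reviewed mapping is rendered into SQL text by plain code. The `FieldMapping` type, the `approved` flag, and the table names are assumptions for illustration; the thread does not describe the actual generator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    source: str      # column in the source table
    dest: str        # column in the destination table
    approved: bool   # set by a human reviewer, never by the model


def render_insert_select(source_table, dest_table, mappings):
    """Render an INSERT ... SELECT statement from reviewed mappings.

    The output depends only on the inputs, so the same approved mapping
    always yields the same SQL text. Identifiers are assumed to be trusted
    and are not quoted or escaped here.
    """
    unapproved = [m.source for m in mappings if not m.approved]
    if unapproved:
        raise ValueError(f"unapproved mappings: {unapproved}")
    dest_cols = ", ".join(m.dest for m in mappings)
    src_cols = ", ".join(m.source for m in mappings)
    return (
        f"INSERT INTO {dest_table} ({dest_cols}) "
        f"SELECT {src_cols} FROM {source_table};"
    )
```

The point of the design is that the probabilistic step ends at the mapping suggestion; everything after the reviewer's approval is ordinary code that can be diffed, tested, and rerun.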

laardaninst•6mo ago

So you've not replaced ETL with MCP, you're just using LLMs to generate SQL.

nickphx•6mo ago

That's a bold move. Hopefully there are no stray cats.

bakemawaytoys•6mo ago

I have asked AI on multiple occasions to take items from some input and output a table or a JSON structure, and every time it has simply skipped or ignored several items from the input for no reason.

This sounds like a terrible idea, and nearly impossible to debug when it inevitably drops data.

rashidae•6mo ago

Yeah, we’ve seen that too. Raw AI output isn’t reliable enough for high-stakes data work.

That’s exactly why we don’t let AI run migrations. We use it to speed up the boring parts, like mapping table structures. But humans are always in control.

SolubleSnake•6mo ago

As others have mentioned this is an extremely odd thing to expect to work....

I'll give an example. I worked for a FTSE 100 company using a very old Product Lifecycle Management system (model manager - based actually on pre-DOS technology)....we had to upgrade it to a new fancy one.

Therefore we had to migrate all data relating to the company's and group companies' engineering designs...everything to do with 2D drawings, 3D designs...any important connections etc....all electrical designs....Excel sheets related to these containing lists of PCBs and their component parts in Bills of Materials etc...There is absolutely no way in hell I would trust AI to get almost any of that right....or even to attempt a load without almost immediately erroring.

rashidae•6mo ago

Totally agree. We wouldn’t trust AI to run that kind of migration either... And we don’t.

But here’s what we do use AI for:

• Mapping legacy schemas

• Spotting patterns

• Generating boilerplate ETL code fast

Then humans step in:

• Validate every mapping

• Write custom logic for edge cases

• Test everything... every field, every BOM, every relationship

• Migrate with deterministic, human-reviewed code
