Show HN: Incorporator, Turn any API/File into typed Python graph with pipeline

https://github.com/PyPlumber/Incorporator/
2•PyPlumber•8h ago
When landing data I prefer to keep it as close to the original source as possible. Most of the Python data ingestion programs I saw treated Python more like SQL instead of harnessing object orientation. This was my attempt at translating my object-oriented columnar approach to Python. I originally did it with Requests and Pandas but the overhead costs did not seem worth it. Claude helped refactor for async and Pydantic.

Now HTTPX’s async capabilities and Pydantic’s class building took this project over the top. By harnessing their abilities I shifted the codebase from data mapper to pipeline orchestrator. I added every format I could that seemed to have an established Python library. Right now I believe I support JSON, NDJSON, XML, CSV, TSV, PSV, SQLite, and HTML out of the box. Optional extras (~30 MB pyarrow) unlock Parquet, Feather, and ORC; Avro and XLSX have their own extras. I also added every compression format I could find. Benchmarks, at least on a Windows machine, are on par with other ELT packages.

By focusing on function wrappers that keep the developer’s syntax as easy as possible for the original data mapping calls, I ended up with simple automated pipelines that need one CLI command and one JSON reference file. The JSON is basically the same syntax you would use in Python.

Both stream and fjord accept inflow and outflow Python code. Inflow code lets you set custom conversion functions and mappings for the incoming data. Outflow code lets you reshape the exported data into an entirely new object.
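To make the inflow/outflow idea concrete, here is a minimal standard-library sketch of the pattern. It is not Incorporator's API: `INFLOW`, `apply_inflow`, `apply_outflow`, `LaunchSummary`, and the sample record are all hypothetical names and data invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable

# Hypothetical inflow mapping: source field -> (target field, converter).
INFLOW: dict[str, tuple[str, Callable[[Any], Any]]] = {
    "launch_name": ("name", str.strip),
    "net": ("scheduled", datetime.fromisoformat),
    "probability": ("probability", lambda v: None if v in ("", None) else int(v)),
}


def apply_inflow(raw: dict[str, Any]) -> dict[str, Any]:
    """Rename and convert incoming fields; unmapped fields pass through unchanged."""
    out: dict[str, Any] = {}
    for key, value in raw.items():
        if key in INFLOW:
            target, convert = INFLOW[key]
            out[target] = convert(value)
        else:
            out[key] = value
    return out


@dataclass
class LaunchSummary:
    """Hypothetical outflow shape: a new object built only for export."""
    name: str
    year: int
    likely: bool


def apply_outflow(record: dict[str, Any]) -> LaunchSummary:
    """Derive the export object from an already-converted record."""
    probability = record["probability"]
    return LaunchSummary(
        name=record["name"],
        year=record["scheduled"].year,
        likely=probability is not None and probability >= 50,
    )


# Invented sample record, not real API output.
raw = {"launch_name": "  Demo Flight ", "net": "2030-01-02T03:04:05",
       "probability": "80", "id": 7}
landed = apply_inflow(raw)
summary = apply_outflow(landed)
print(summary)
```

The landed record stays close to the source (only names and types change), while the outflow step is free to build something with a different shape.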

Also, because your pipeline is basically defined by a JSON file, you should eventually be able to automate the creation of the entire pipeline. Enjoy.

https://github.com/PyPlumber/Incorporator/

How you use it: Declare a subclass with no fields, point it at a URL, and it infers a Pydantic model from the response at runtime, with full strict typing, dot-notation, and an optional registry lookup by any key.

  class Launch(Incorporator):
      pass

  launches = await Launch.incorp(
      inc_url="https://ll.thespacedevs.com/2.2.0/launch/upcoming/"
  )
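For readers wondering what "infers a model from the response at runtime" can look like, here is a rough sketch of the general technique. It is not Incorporator's implementation, and it swaps in the standard library's `dataclasses.make_dataclass` as a stand-in for Pydantic so it runs without dependencies (dataclasses do not validate types at runtime, which is what Pydantic adds). `infer_class`, `build`, and the payload are hypothetical, and the sketch assumes every key is a valid Python identifier.

```python
from dataclasses import fields, make_dataclass
from typing import Any


def infer_class(name: str, sample: dict[str, Any]) -> type:
    """Build a dataclass whose field types mirror the sample record's values."""
    specs = []
    for key, value in sample.items():
        if isinstance(value, dict):
            # Nested objects become nested classes, which gives dot-notation access.
            specs.append((key, infer_class(key.title().replace("_", ""), value)))
        else:
            specs.append((key, type(value)))
    return make_dataclass(name, specs)


def build(cls: type, record: dict[str, Any]) -> Any:
    """Instantiate cls from a dict, recursing into nested dataclass fields."""
    kwargs = {}
    for f in fields(cls):
        value = record[f.name]
        kwargs[f.name] = build(f.type, value) if isinstance(value, dict) else value
    return cls(**kwargs)


# Invented sample payload standing in for a decoded JSON response.
payload = [
    {"id": 1, "name": "Demo Flight", "pad": {"name": "Pad A", "latitude": 28.6}},
    {"id": 2, "name": "Test Hop", "pad": {"name": "Pad B", "latitude": 28.5}},
]

Launch = infer_class("Launch", payload[0])
launches = [build(Launch, record) for record in payload]

# A registry keyed on any field is just a dict over the built objects.
by_id = {launch.id: launch for launch in launches}
print(by_id[2].pad.name)
```

Inferring from the first record only is the simplest possible choice; a real tool would need to reconcile optional and inconsistent fields across records.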

These functions handle the rest of your data mapping and export format needs:
- test() lets the framework write the call kwargs for you
- refresh() re-fetches with the seed call's params auto-replayed
- export() serialises to any of the 13 formats

Then these functions create a pipeline:
- stream() runs a chunked daemon with bounded memory. It has two modes: pass-through, or stateful, where updates are held in RAM so they can be manipulated in real time.
- fjord() fans out N sources and fuses them through a user reducer. It accepts multiple sources and exports.
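The two patterns named above, chunked processing with bounded memory and fan-out followed by a user reducer, can be sketched generically with the standard library. This only illustrates the ideas and says nothing about how stream() or fjord() are actually built; `chunked`, `fake_source`, `reducer`, `fan_out`, and the sample rows are hypothetical.

```python
import asyncio
from itertools import islice
from typing import Any, Awaitable, Callable, Iterable, Iterator

Record = dict[str, Any]


def chunked(items: Iterable[Any], size: int) -> Iterator[list[Any]]:
    """Yield lists of at most `size` items, so only one chunk is held at a time."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


async def fake_source(name: str, rows: list[Record]) -> tuple[str, list[Record]]:
    """Stand-in for an API call or file read; a real source would do I/O here."""
    await asyncio.sleep(0)  # yield to the event loop as a network call would
    return name, rows


def reducer(results: dict[str, list[Record]]) -> list[Record]:
    """User-supplied fusion step: attach a team name to each driver."""
    teams = {t["team_id"]: t["team"] for t in results["teams"]}
    return [
        {"driver": d["driver"], "team": teams.get(d["team_id"], "unknown")}
        for d in results["drivers"]
    ]


async def fan_out(
    sources: list[Awaitable[tuple[str, list[Record]]]],
    reduce: Callable[[dict[str, list[Record]]], list[Record]],
) -> list[Record]:
    # Run every source concurrently, then hand the named results to the reducer.
    pairs = await asyncio.gather(*sources)
    return reduce(dict(pairs))


# Invented sample rows for the two sources.
drivers = [{"driver": "Ada", "team_id": 1}, {"driver": "Bo", "team_id": 9}]
teams = [{"team_id": 1, "team": "Red"}]

fused = asyncio.run(
    fan_out([fake_source("drivers", drivers), fake_source("teams", teams)], reducer)
)
chunks = list(chunked(range(5), 2))
print(fused)
print(chunks)
```

Keeping the reducer as a plain function over named results means the concurrency lives in one place and the fusion logic stays easy to test on its own.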

Once that all works, copy the parameters into pipeline.json and the commands can be as simple as:

  incorporator validate pipeline.json
  incorporator fjord pipeline.json --logs

Comments

PyPlumber•7h ago
The docs and examples folders have: Tutorials 1-7 with matching code files. Should be a nice progression on using the tool.

Appendices have more advanced examples. There's a fantasy racing league example with 6 API calls and 1 file source feeding 3 outflows, all run as an automated fjord pipeline from a single CLI call.

Show HN: Epiq – Distributed Git based issue tracker TUI

https://ljtn.github.io/epiq/
20•jolaflow•2h ago•6 comments

Show HN: Watch a neural net learn to play Snake

https://ppo.gradexp.xyz/
128•c1b•1d ago•31 comments

Show HN: Burn, baby, burn (those tokens)

https://github.com/dtnewman/burn-baby-burn
74•dtnewman•9h ago•15 comments

Show HN: Sx – an open-source package manager for AI skills, MCPs, and commands

https://github.com/sleuth-io/sx
39•detkin•10h ago•22 comments

Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model

https://github.com/cactus-compute/needle
747•HenryNdubuaku•3d ago•208 comments

Show HN: WolfSPDM, an embedded-focused requester SPDM 1.2 stack built on WolfSSL

https://github.com/aidangarske/wolfSPDM
3•aidangarske•4h ago•0 comments

Show HN: ShaderKit – A browser GLSL editor I built for my own art

https://shaderkit.com/app
3•scrygl•4h ago•0 comments

Show HN: Browser-based synthesizer, drum machine and sequencer

https://github.com/madmonk13/modal-16
5•madmonk•7h ago•1 comments

Show HN: VisiSign – $0.10 per envelope e-signatures with no monthly fee

https://visisign.app/
5•rdoneill•4h ago•0 comments

Show HN: Claude64, a Commodore 64 client for Claude

https://github.com/theletterf/claude64
4•theletterf•5h ago•1 comments

Show HN: SwarmWright, structured multi-agent AI defined in markdowns

https://www.swarmwright.com/
3•ralphbarendse•6h ago•0 comments

Show HN: MIT OSS LinkedIn DMs for Agents (CLI and Example TUI)

https://allman.sh
2•toobulkeh•6h ago•0 comments

Show HN: TinySearch - Give small LLMs fast web access without context bloat

https://github.com/MarcellM01/TinySearch
2•MarcellM01•6h ago•0 comments

Show HN: Profine – optimize your PyTorch training script before the run

https://github.com/ProfineAI/profine-cli
2•aisinghal•6h ago•0 comments

Show HN: GridTravel – A community based travel app for users to share routes

https://www.gridtravel.app
53•knuaym9•1d ago•35 comments

Show HN: Running the second public ODoH relay

https://numa.rs/blog/posts/odoh-anonymous-dns-without-an-account.html
121•rdme•1d ago•41 comments

Show HN: Raybeam – A better way to screen share on macOS

https://raybeam.live/
2•fisc•7h ago•0 comments

Show HN: X open sourced their algorithm

https://www.xalgorithm.xyz/en
4•hsnrique•8h ago•1 comments

Show HN: Incorporator, Turn any API/File into typed Python graph with pipeline

https://github.com/PyPlumber/Incorporator/
2•PyPlumber•8h ago•1 comments

Show HN: Nibble

https://github.com/glouw/nibble
96•glouwbug•2d ago•24 comments

Show HN: Gigacatalyst – Extend your SaaS with an embedded AI builder

60•namanyayg•3d ago•24 comments

Show HN: Termini – Open-Source Menu Bar Terminal for macOS

https://github.com/ModernProgrammer/Termini
4•reflextech•10h ago•3 comments

Show HN: Vibe Coding a $20k /Year Enterprise Logistics Platform

https://trmnl.com/blog/vibe-coding-shiphero
28•ryanckulp•13h ago•6 comments

Show HN: Nemo – A visual, local server and job runner

https://github.com/andrewchilds/nemo
4•andrewchilds•11h ago•0 comments

Show HN: Revos – architecture checks for AI-generated code

https://github.com/mattykry/revos
3•Mattykry•11h ago•0 comments

Show HN: SiteRows – Query Websites with SQL

https://siterows.com/
2•mozersky•11h ago•0 comments

Show HN: Race to the Bottom

https://race-to-the-bottom.onrender.com
59•maxwellito•1d ago•48 comments

Show HN: Chuddy, self-hosted media downloading, translation and OCR Telegram bot

https://github.com/kivirnz/chuddy
3•kivir•12h ago•0 comments

Show HN: LFK – A Fast Kubernetes TUI

https://github.com/janosmiko/lfk
4•baskInEminence•12h ago•1 comments

Show HN: TikTok but for scientific papers

https://andreaturchet.github.io/website/index.html
193•ciwrl•4d ago•77 comments