Show HN: Ragctl – document ingestion CLI for RAG (OCR, chunking, Qdrant)

https://github.com/datallmhub/ragstudio
1•ahsekka•2h ago
Hi HN — sharing ragctl, an open-source CLI for the most failure-prone part of RAG pipelines: document ingestion, OCR, parsing/cleaning, and chunking.

Vector DB setup is fairly standardized now, but getting high-quality, consistent text + metadata into it still takes a lot of brittle glue code. ragctl aims to make that “pre-vector” step repeatable: turn messy documents into retrieval-ready chunks in a few commands.

Features
• Multi-format input: PDF, DOCX, HTML, images
• OCR for scanned/image-based docs
• Semantic chunking (LangChain)
• Batch runs with retries + error handling
• Output: direct ingestion into Qdrant (for now)
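For readers who want a concrete picture of the "pre-vector" step, here is a minimal sketch of a batch run with retries and chunking. It is not ragctl's code or API: every name (`Chunk`, `chunk_text`, `with_retries`, `run_batch`, and the `load` callback) is hypothetical, and fixed-size overlapping windows stand in for the LangChain-based semantic chunking the tool actually uses.

```python
import time
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # document the chunk came from
    index: int   # position of the chunk within that document
    text: str


def chunk_text(source, text, size=200, overlap=50):
    """Split text into fixed-size character windows that overlap.

    A stand-in for semantic chunking, chosen only to keep the sketch small.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    for i, start in enumerate(range(0, len(text), step)):
        chunks.append(Chunk(source, i, text[start:start + size]))
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def with_retries(fn, attempts=3, delay=0.0):
    """Call fn(); on an exception, retry until `attempts` calls have been made."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: let the caller record the failure
            time.sleep(delay)


def run_batch(docs, load, attempts=3):
    """Chunk every document; one bad document does not stop the batch.

    `load` is a hypothetical callback that returns the extracted text
    for a document name (parsing and OCR would live behind it).
    """
    chunks, errors = [], {}
    for name in docs:
        try:
            text = with_retries(lambda: load(name), attempts=attempts)
            chunks.extend(chunk_text(name, text))
        except Exception as exc:
            errors[name] = str(exc)
    return chunks, errors
```

Returning the errors alongside the chunks, instead of raising on the first failure, is one way to keep a long batch running while still reporting which documents need attention.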

Looking for feedback
• DX: is the CLI intuitive?
• Performance / edge cases: weird PDFs, mixed layouts, tables
• Roadmap: which connectors (S3, Slack, Notion) or vector stores should be next?

Repo: https://github.com/datallmhub/ragstudio

Happy to answer questions about the architecture and chunking approach.

Don't Become the Machine

https://armeet.bearblog.dev/becoming-the-machine/
1•armeetj•5m ago•0 comments

You Can Get Every AI Model for Free

https://infiniax.ai
1•ZacharyGolinger•15m ago•1 comments

Ask HN: Critique wanted — granular-physics pyramid preprint

https://zenodo.org/records/18036910
1•Sherlock_Blight•16m ago•1 comments

The semantic layer is dead. Long live the wiki

https://promptql.io/blog/semantic-layer-dead-long-live-wiki
2•tirumaraiselvan•17m ago•0 comments

Big Space Sandwich Broke a Record

https://nautil.us/this-big-space-sandwich-broke-a-record-1256821/
1•fleahunter•22m ago•0 comments

China bans sharing 'obscene' material – potentially including sexting

https://www.washingtonpost.com/world/2025/12/23/china-porn-ban-online-censorship/
1•0in•24m ago•0 comments

Yendor: A Zach-like, rogue-like game and language made in 7 days

https://github.com/olifog/YENDOR
1•azhenley•24m ago•0 comments

China Delays Plans for Mass Production of Self-Driving Cars After Accident

https://www.nytimes.com/2025/12/23/business/china-autonomous-cars-driving.html
1•bookofjoe•24m ago•1 comments

Poetiq achieves 75% at under $8 / problem using GPT-5.2 X-High on ARC-AGI-2

https://poetiq.ai/posts/arcagi_announcement/
3•mromanuk•27m ago•0 comments

A semantic POP-style framework for structuring AI-assisted programs

https://github.com/dohuyhoang93/theus/blob/main/README.md
2•dohuyhoangvn93•30m ago•1 comments

How to Become AGI: From Capitalism to Compute-Ism

https://medium.com/@zichengxu/how-to-become-agi-a5b2d7d74bda
1•lossy_compress•31m ago•0 comments

Casuistic Alignment

https://fi-le.net/casuism/
2•fi-le•40m ago•0 comments

Show HN: Depsy – normalized SaaS dependency health in one API call (cached,fast)

https://depsy.io/
1•malik_naji•1h ago•0 comments

Show HN: Send free letters to your future self or others

https://lettertolater.com
1•sankar_builds•1h ago•0 comments

DownDownDown Come and challenge the 100th floor game

https://downdowndown.live/
1•bitvvip•1h ago•0 comments

Peter Thiel's $74M Shake-Up: Slashes Tesla, Bets Big on Microsoft and Apple

https://www.13radar.com/guru/peter-thiel
3•EvansWilson•1h ago•3 comments

Name That Part: 3D Part Segmentation and Naming

https://name-that-part.github.io/
3•unisub_guy•1h ago•1 comments

Ask HN: Thoughts on Webview vs. React Native for mobile app?

1•hnroo99•1h ago•0 comments

Correspondence Between Don Knuth and Peter van Emde Boas on Priority Deques 1977 [pdf]

https://staff.fnwi.uva.nl/p.vanemdeboas/knuthnote.pdf
10•vismit2000•1h ago•1 comments

Jim Beam pauses production at main distillery as bourbon inventories rise

https://www.cnn.com/2025/12/21/business/jim-beam-tariffs-pause-production
5•wewewedxfgdf•1h ago•1 comments

Show HN: Turn raw HTML into production-ready images for free

https://html2png.dev
5•alvinunreal•1h ago•2 comments

Autonomously navigating the real world: lessons from the PG&E outage

https://waymo.com/blog/2025/12/autonomously-navigating-the-real-world
2•scoofy•1h ago•0 comments

Palisade: Bringing Zero-Trust to the AI Model Supply Chain

https://highflame.com/blogs/launching-palisade-zero-trust-security-for-the-ai-model-supply-chain
1•sharathr•1h ago•1 comments

Could lockfiles just be SBOMs?

https://nesbitt.io/2025/12/23/could-lockfiles-just-be-sboms.html
11•zdw•1h ago•4 comments

U.S. Bars 5 European Tech Regulators and Researchers

https://www.nytimes.com/2025/12/23/technology/trump-rubio-european-tech-disinformation-digital-se...
4•2OEH8eoCRo0•1h ago•2 comments

'Dracula's Chivito': Hubble reveals largest birthplace of planets ever observed

https://phys.org/news/2025-12-chaotic-dracula-chivito-hubble-reveals.html
9•wglb•1h ago•1 comments

Show HN: T2T – Voice-to-text with MCP support local and cross-platform

https://t2t.now/
2•acoyfellow•1h ago•2 comments

Show HN: Dwani.ai – AI for Indian Languages

1•gaganyatri•1h ago•0 comments

The Rise of Classic Home Computers [video]

https://www.youtube.com/watch?v=G3aDPtL4_cE
1•ibobev•1h ago•0 comments

Tieredsort: Header only, blazing fast (3-4x) C++17 sorting for numeric types

https://github.com/Cranot/tieredsort
2•signa11•1h ago•0 comments