
Ask HN: Who is doing the best Word/PDF RAG tool with deep research?

4•_samjarman•6mo ago
Hi HN, which SaaS providers are you eyeing these days for your RAG needs with thousands of PDFs or Word docs, and with an agent that can take its time and give well-researched, cited answers? TIA!

Comments

randomname4325•6mo ago
Check out www.Airwave.us. They are focused on field services, where techs comb through thousands of pages of manuals/documentation for part numbers or specific instructions that have to be 100% accurate.
Norcim133•6mo ago
I spent the last 2 months trying out RAG/parsing plays. My use-case required high accuracy on complex tables and figures.

Ranking:
1. LlamaCloud/LlamaParse
2. GroundX
3. Unstructured.io
4. Google RAG Engine
5. Docling
... capability gap...
6. Azure - Document Intelligence
7. AWS - Textract
8. LlamaIndex (DIY)

Imanari•6mo ago
This ranking is just for the parsing, not the RAG portion, correct?
Norcim133•6mo ago
Correct-ish. LlamaCloud and GroundX do everything up to retrieval. Here is an interactive graphic of the major players along the RAG flow: https://claude.ai/public/artifacts/b872435b-1d9c-461e-a29c-b...
TXTOS•6mo ago
I've been working on something that directly targets this problem: WFGY — a reasoning engine built for RAG on large-scale PDF/Word documents, especially when you're doing deep research, not just shallow QA.

Instead of just chunking text and throwing it into an embedding model, WFGY builds a persistent semantic resonance layer — meaning it tracks context through formatting breaks, footnotes, diagram captions, even corrupted OCR sections.

The engine applies multiple self-correcting pathways (we call them BBMC and BBPF) so even when parsing is incomplete or wrong, reasoning still holds. That’s crucial if your source materials are academic papers, messy reports, or 1000+ page archives.

It’s open source. No tuning. Works with any LLM. No tricks.

Backed by the creator of tesseract.js (36k) — who gets why document mess is the real challenge.

Check it out: https://github.com/onestardao/WFGY

lisa_coicadan•6mo ago
Great thread, we’ve seen the exact same pain points around working with large volumes of complex PDFs/Word docs.

At Retab.com, we focus on the “hard pre-RAG” layer: turning raw documents (including scanned reports, OCR messes, financial statements, or regulatory filings) into clean, structured, model-ready data.

Instead of relying on embeddings over noisy text chunks, we use schema-driven generation, multi-LLM consensus, and an evaluation UI to ensure output is accurate, complete, and explainable. No manual parsing, no hallucinations, just structured JSON (or any format you want), ready for retrieval, agents, or analytics.
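
For readers unfamiliar with the term, multi-LLM consensus can be pictured as a field-wise majority vote over the structured outputs of several models. The sketch below is a generic illustration under that assumption, not Retab's actual method; the `consensus` function, the `min_agreement` parameter, and the sample invoice data are all hypothetical.

```python
import json
from collections import Counter
from typing import Any


def consensus(extractions: list[dict[str, Any]], min_agreement: float = 0.5) -> dict[str, Any]:
    """Field-wise majority vote; fields without enough agreement become None."""
    if not extractions:
        return {}
    merged: dict[str, Any] = {}
    fields = sorted({key for ext in extractions for key in ext})
    for field in fields:
        # Serialize values so that lists and dicts can be counted too.
        votes = Counter(json.dumps(ext.get(field), sort_keys=True) for ext in extractions)
        winner, count = votes.most_common(1)[0]
        # Keep the winning value only if a strict majority of models agree.
        agreed = count / len(extractions) > min_agreement
        merged[field] = json.loads(winner) if agreed else None
    return merged


# Hypothetical outputs from three models for the same invoice.
outputs = [
    {"invoice_id": "A-17", "total": 1200.0},
    {"invoice_id": "A-17", "total": 1200.0},
    {"invoice_id": "A-17", "total": 1280.0},
]
result = consensus(outputs)
```

Fields that come back as `None` are the ones worth routing to a human reviewer or a stricter re-extraction pass.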

We work with teams doing RAG on contracts, audits, earnings reports, etc., anywhere that “close enough” isn’t good enough. Happy to run your hardest docs through Retab if you want to benchmark against WFGY or LlamaParse.

_samjarman•6mo ago
What makes a PDF 'hard' in your mind?