Show HN: Robust LLM Extractor for HTML/Markdown in TypeScript

https://github.com/lightfeed/lightfeed-extract
2•andrew_zhong•2h ago
While working with LLMs for structured web data extraction, we saw issues with invalid JSON and broken links in the output. This led me to build a library focused on robust extraction and enrichment:

- Clean HTML conversion: transforms HTML into LLM-friendly markdown, with an option to extract just the main content
- LLM structured output: uses Gemini 2.5 Flash or GPT-4o mini to balance accuracy and cost; a custom prompt can also be supplied
- JSON sanitization: if the LLM structured output fails or doesn't fully match your schema, a sanitization process attempts to recover and fix the data, which is especially useful for deeply nested objects and arrays
- URL validation: all extracted URLs are validated, which covers handling relative URLs, removing invalid ones, and repairing markdown-escaped links
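To make the URL validation step concrete, here is a minimal sketch of what such a repair pass might look like. It is not taken from the library: `repairUrl` and its parameters are hypothetical names, and the set of markdown escapes it undoes is an assumption. It relies only on the standard `URL` class available in Node.js and browsers.

```typescript
// Hypothetical helper for illustration; not part of lightfeed-extract's API.
// Returns an absolute http(s) URL, or null if the input cannot be repaired.
function repairUrl(raw: string, baseUrl: string): string | null {
  // Undo markdown escaping such as "\_" or "\(" that can leak into LLM output.
  // The escaped character set here is an assumption, not an exhaustive list.
  const unescaped = raw.trim().replace(/\\([_*()\[\]~#&-])/g, "$1");
  if (unescaped === "") return null;

  try {
    // The URL constructor resolves relative paths against the page URL
    // and throws on input it cannot parse.
    const resolved = new URL(unescaped, baseUrl);
    // Drop non-web schemes such as "javascript:" or "mailto:".
    if (resolved.protocol !== "http:" && resolved.protocol !== "https:") {
      return null;
    }
    return resolved.href;
  } catch {
    return null;
  }
}

// Usage: clean a list of extracted links, dropping the ones that cannot be repaired.
const pageUrl = "https://example.com/blog/";
const extracted = ["/docs/a\\_b", "javascript:alert(1)", "post-1"];
const cleaned = extracted
  .map((link) => repairUrl(link, pageUrl))
  .filter((link): link is string => link !== null);
console.log(cleaned);
```

Returning `null` instead of throwing keeps the invalid links easy to filter out of nested results, which is one reasonable design choice among several.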

I'd love to hear if anyone else has experimented with LLMs for data extraction or if you have any questions about this approach!

GPT-4.1 will be available directly in ChatGPT starting today

https://twitter.com/OpenAI/status/1922707554745909391
1•tosh•23s ago•0 comments

Details about Lava: Airbnb's new animation format

https://twitter.com/ramon_fritsch/status/1922647368295481421
1•aqeelat•25s ago•0 comments

Oniux: Kernel-level Tor isolation for any Linux app

https://blog.torproject.org/introducing-oniux-tor-isolation-using-linux-namespaces/
1•todsacerdoti•27s ago•0 comments

Kids Online Safety Act is back

https://www.theverge.com/news/666729/kids-online-safety-act-reintroduced
1•leotravis10•2m ago•0 comments

Launch day for Maple AI – a new privacy AI in the Apple App Store with E2EE

https://blog.trymaple.ai/introducing-maple-ai-for-iphone-and-ipad-your-most-personal-ai-assistant-on-the-go/
2•markskram•3m ago•0 comments

Dropbox Is Down

https://status.dropbox.com/
1•davidcox143•4m ago•1 comments

My Microsoft MultiMedia Keyboard 1.0A is dead

https://kotaku.com/mechanical-keyboard-microsoft-squidgy-keys-1851780616
1•ericzawo•7m ago•0 comments

Yume – Transform your content with GPU shaders

https://yume.sh/
1•andrew_rfc•8m ago•0 comments

Unhappy Meals (2007)

https://www.nytimes.com/2007/01/28/magazine/28nutritionism.t.html
1•Tomte•10m ago•0 comments

How to systematically secure anything (2023)

https://github.com/veeral-patel/how-to-secure-anything
2•Tomte•10m ago•0 comments

A Framework for Defining and Refining Your ICP

https://a16z.com/framework-define-refine-icp/
1•tzury•11m ago•0 comments

Keeping Time on a Stream

https://s2.dev/blog/timestamping
1•shikhar•12m ago•0 comments

Beyond the Wrist: Debugging RSI

https://www.debugyourpain.org/docs/main_posts/understand/debugging_rsi/
5•luu•12m ago•0 comments

54 years ago, a computer programmer fixed a bug, created an existential crisis

https://www.inverse.com/innovation/blinking-cursor-history
2•cpeterso•13m ago•0 comments

How the 'end of history' illusion shapes your life choices

https://www.bbc.com/future/article/20230619-how-the-end-of-history-illusion-shapes-your-life-choices
2•jhncls•14m ago•0 comments

Warners Reverses Course: Changes Max's Name Back to HBO Max

https://www.hollywoodreporter.com/business/digital/max-name-change-hbo-max-upfronts-1236216616/
2•jaredwiener•16m ago•0 comments

How we (re)built our AI agent for code reviews in IDEs

https://www.coderabbit.ai/blog/how-we-built-our-ai-code-review-tool-for-ides
1•smb06•17m ago•1 comments

In search of a dynamist vision for safe superhuman AI

https://helentoner.substack.com/p/dynamism-vs-stasis
1•stephenflanders•17m ago•0 comments

ONOX: The all-electric tractor with swappable battery packs

https://electrek.co/2025/05/13/meet-onox-the-all-electric-tractor-with-swappable-battery-packs/
1•gnabgib•19m ago•1 comments

Trump admin ends extreme weather database that has tracked cost of disasters

https://www.cnn.com/2025/05/08/climate/noaa-ends-disaster-database
5•vinnyglennon•19m ago•0 comments

Learning pointers 10 years too late

https://codebynight.dev/posts/day-2-of-learning-go-the-pointers-i-finally-understood-10-years-later/
2•shivc•20m ago•0 comments

CISA changes vulnerabilities updates, shifts to X and emails

https://www.theregister.com/2025/05/12/cisa_vulnerabilities_updates_x/
2•rbanffy•21m ago•1 comments

TikTok is using AI-generated alt text to describe photos

https://www.theverge.com/news/666632/tiktok-accessibility-ai-generated-alt-text-contrast-bold
1•01-_-•22m ago•1 comments

Students Are Short-Circuiting Their Chromebooks for a Social Media Challenge

https://www.nytimes.com/2025/05/14/us/tiktok-trend-school-laptops-fire.html
5•ChrisArchitect•23m ago•1 comments

Show HN: I made a client MCP react app for Supabase's MCP server

https://github.com/tambo-ai/supabase-mcp-client
1•michaelmilst•24m ago•0 comments

China and Russia have signed a deal to build a nuclear power plant on the moon

https://www.scmp.com/news/china/science/article/3310315/china-and-russia-sign-nuclear-reactor-deal-fuel-lunar-research-station
2•zachguo•25m ago•0 comments

Free AI Code Reviews for Cursor, Windsurf and VS Code: CodeRabbit in IDE

https://www.coderabbit.ai/blog/ai-code-reviews-vscode-cursor-windsurf
1•smb06•29m ago•1 comments

Cell cycle duration determines oncogenic transformation capacity

https://www.nature.com/articles/s41586-025-08935-x
3•PaulHoule•30m ago•0 comments

Show HN: Turn any workflow diagram into compilable, running and stateful code

https://workflows.diagrid.io/
1•yaronsc•30m ago•0 comments

Teaching Kids about Money in the Age of Tap to Pay

https://blog.tendollaradventure.com/teaching-kids-about-money-in-a-digital-age/
2•dskhatri•32m ago•0 comments