Transform DOCX into LLM-ready data

https://contextgem.dev/converters/docx.html
15•sergiishcherbak•9mo ago

Comments

sergiishcherbak•9mo ago
As part of work on my open-source project ContextGem, I've built a native, zero-dependency DOCX converter that transforms Word documents into LLM-ready data.

This custom-built converter processes the Word XML directly and extracts content that other open-source tools often miss or do not support:

- Rich paragraph and sentence metadata for enhanced context

- Misaligned tables

- Comments, footnotes, and textboxes

- Embedded images
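
To make "processes the Word XML directly" concrete: a .docx file is a ZIP archive whose main text lives in `word/document.xml`, so it can be read with the Python standard library alone. The sketch below is only an illustration of that general idea, not ContextGem's implementation; the function name `extract_paragraphs` and the returned dict shape are assumptions made for this example. It also ignores tables, comments, footnotes, and images, and text inside a nested textbox would be counted in its enclosing paragraph as well.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def extract_paragraphs(docx_bytes: bytes) -> list[dict]:
    """Hypothetical helper: return text and style ID for every w:p element."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Join all text runs (w:t) found inside this paragraph
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        # The paragraph style, if set, is stored as w:pPr/w:pStyle/@w:val
        style = para.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else None
        paragraphs.append({"text": text, "style": style_id})
    return paragraphs
```

Calling `extract_paragraphs(open("contract.docx", "rb").read())` (the file name is a placeholder) would give one entry per paragraph, with the style ID as a small piece of per-paragraph metadata.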

The converted document can then be easily used in ContextGem's LLM extraction workflows.

Perfect for developers building contract intelligence applications where precision matters. The converter preserves document structure and relationships, empowering LLMs to better understand and analyze document content.

Try it / share with your dev team today and see the difference in your document processing pipeline!

GitHub: https://github.com/shcherbak-ai/contextgem

All DocxConverter features: https://contextgem.dev/converters/docx.html

WalterGR•9mo ago
> zero-dependency DOCX converter

I’ve read that there are a lot of OpenXML elements that are pretty opaque. They appear to basically be XML-esque representations of binary, in-memory structs used internally by Office. (Maybe this has changed over time.)

How much OpenXML does this actually handle?

> Extracts information that other open-source tools often do not capture: misaligned tables

Could you expand on what you mean by misaligned tables? Are these tables that appear as separate ‘table nodes’ in the XML, or ones that appear as a single node but have wonky formatting?

obeavs•9mo ago
Hey! This is really awesome. Do you intend to support analysis on redlining/tracked changes? That's where it would become very useful for my use cases.

eightysixfour•9mo ago
Yes, this is the one that always gets me in the MS ecosystem. Would make a few of my workflows so much better.
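
For context on what redlining support involves: WordprocessingML records tracked insertions as `w:ins` elements and tracked deletions as `w:del` elements, each carrying `w:author` and `w:date` attributes, with deleted text stored in `w:delText` rather than `w:t`. The sketch below lists those changes from a `document.xml` string using only the standard library. It is a rough illustration and not a ContextGem feature; the function name `list_tracked_changes` and the output format are assumptions, and change markers that wrap no text (for example on paragraph marks) come back with an empty `text`.

```python
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag: str) -> str:
    """Return the namespace-qualified name for a w: element or attribute."""
    return f"{{{W_NS}}}{tag}"


def list_tracked_changes(document_xml: str) -> list[dict]:
    """Hypothetical helper: collect tracked insertions and deletions in document order."""
    root = ET.fromstring(document_xml)
    changes = []
    for elem in root.iter():
        if elem.tag == w("ins"):
            kind, text_tag = "insert", w("t")
        elif elem.tag == w("del"):
            # Deleted runs keep their text in w:delText instead of w:t
            kind, text_tag = "delete", w("delText")
        else:
            continue
        text = "".join(t.text or "" for t in elem.iter(text_tag))
        changes.append({
            "kind": kind,
            "author": elem.get(w("author")),
            "date": elem.get(w("date")),
            "text": text,
        })
    return changes
```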

TiredOfLife•9mo ago
How does it compare to https://github.com/microsoft/markitdown?