
OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
594•klaussilveira•11h ago•176 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
901•xnx•17h ago•545 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
22•helloplanets•4d ago•17 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
95•matheusalmeida•1d ago•22 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
28•videotopia•4d ago•0 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
203•isitcontent•11h ago•24 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
199•dmpetrov•12h ago•91 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
313•vecti•13h ago•137 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
353•aktau•18h ago•176 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
355•ostacke•17h ago•92 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
459•todsacerdoti•19h ago•231 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
24•romes•4d ago•3 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
259•eljojo•14h ago•155 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
80•quibono•4d ago•19 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
392•lstoll•18h ago•266 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
7•bikenaga•3d ago•1 comment

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
53•kmm•4d ago•3 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
3•jesperordrup•1h ago•0 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
235•i5heu•14h ago•178 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
46•gfortaine•9h ago•13 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
122•SerCe•7h ago•103 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
136•vmatsiiako•16h ago•60 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
68•phreda4•11h ago•12 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
271•surprisetalk•3d ago•37 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
25•gmays•6h ago•7 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1044•cdrnsf•21h ago•431 comments

Zlob.h 100% POSIX and glibc compatible globbing lib that is faster and better

https://github.com/dmtrKovalenko/zlob
13•neogoose•4h ago•9 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
171•limoce•3d ago•92 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
60•rescrv•19h ago•22 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
89•antves•1d ago•66 comments

Data Activation Thoughts

https://galsapir.github.io/sparse-thoughts/2026/01/17/data_activation/
21•galsapir•2w ago
I've been working with healthcare/biobank data and keep thinking about what "data moats" mean now that LLMs can ingest anything. An a16z piece from 2019 argued that data moats were already eroding; now the question seems to be whether you can actually make your data useful to these systems, not just have it. There's some recent work (tables2traces, ehr-r1) showing you can convert structured medical data into reasoning traces that improve LLM performance, but the approaches are still rough and synthetic traces don't fully hold up to scrutiny. (Writing this to think through it, not because I have answers.)
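To make "structured data to reasoning traces" concrete, here is a minimal sketch of the general idea. This is not the pipeline from tables2traces or ehr-r1; the record fields, prompt wording, and the placeholder model call are all illustrative assumptions.

    # Minimal sketch: turn one structured patient record into a prompt that asks a
    # model to produce a step-by-step reasoning trace, then store the (input, trace)
    # pair for later reuse. Field names and wording are illustrative, not a real schema.
    import json

    record = {
        "age": 67, "sex": "F", "hba1c": 8.9, "egfr": 42,
        "meds": ["metformin"], "outcome": "ckd_progression",
    }

    def record_to_trace_prompt(rec: dict) -> str:
        facts = "\n".join(f"- {k}: {v}" for k, v in rec.items() if k != "outcome")
        return (
            "Patient facts:\n" + facts + "\n\n"
            "Reason step by step about the most likely outcome, citing which facts "
            "drive each step, then state the outcome."
        )

    prompt = record_to_trace_prompt(record)
    # response = some_llm(prompt)  # any LLM client would go here; call omitted
    trace = {"input": record, "prompt": prompt,
             "trace": "<model reasoning here>", "label": record["outcome"]}
    print(json.dumps(trace, indent=2)[:400])

The point of keeping the (input, trace, label) triple rather than just the raw table row is that the trace is what downstream models can actually learn from or imitate.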

Comments

sgt101•2w ago
How do you know whether to fine-tune/pretrain or to RL/reasoning-train, given some data set?
galsapir•2w ago
I honestly don't think there's a simple yes/no answer there. The main considerations are things like how costly it is to do and how often you think you'll need it. Traces are not as "ephemeral" as fine-tuned models, since you can use them to guide agent behaviour when a newer model is released (but still, not as evergreen as other assets: traces generated with, say, GPT-4 would look pale and outdated compared to ones created on the same dataset with Opus 4.5, I reckon).
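One way to read "traces guide agent behaviour across model generations" is to keep them as retrievable few-shot exemplars rather than baking them into weights. A minimal sketch under that assumption; the trace records and the model call are invented placeholders, not anyone's actual setup.

    # Minimal sketch: reuse stored reasoning traces as few-shot exemplars when a
    # newer model ships, instead of re-fine-tuning. Trace contents are made up.
    stored_traces = [
        {"question": "67F, HbA1c 8.9, eGFR 42: likely outcome?",
         "trace": "eGFR 42 -> stage 3b CKD; poor glycaemic control -> progression risk...",
         "answer": "CKD progression"},
    ]

    def build_prompt(new_question: str, exemplars: list[dict], k: int = 1) -> str:
        shots = "\n\n".join(
            f"Q: {t['question']}\nReasoning: {t['trace']}\nA: {t['answer']}"
            for t in exemplars[:k]
        )
        return f"{shots}\n\nQ: {new_question}\nReasoning:"

    prompt = build_prompt("72M, HbA1c 9.4, eGFR 38: likely outcome?", stored_traces)
    # answer = newer_model(prompt)  # swap in whichever model is current; call omitted
    print(prompt)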
armcat•2w ago
I've been working in the legaltech space and can definitely echo the sentiments there. There are some major legaltech/legal AI companies, but after speaking to dozens of law firms, none of the firms are finding these tools very valuable. They have signed contracts with many seats, they are busy people, and tech is not intrinsic to them, so they are not in the business of just changing tools and building things in-house (a handful of them are). And the problem is that despite a massive amount of internal data, all the solutions fail on relevance and precision.

When I sit down with actual legal associates, I can see how immensely complex these workflows are, and to fully utilize this data moat you need: (1) multi-step agentic retrieval, (2) a set of rules/heuristics to ground and steer everything per transaction/case "type", (3) adaptation/fine-tuning towards the "house language/style", (4) integration with many different data sources and tools; and you need to wrap all this with real-world evals (where LLM-as-a-judge techniques often fail).
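As a rough illustration of points (1), (2), and the eval wrapper above, here is a minimal pipeline-skeleton sketch. The step names, transaction types, rules, and the retrieval/eval stand-ins are assumptions for illustration, not any vendor's actual product.

    # Rough sketch: a multi-step retrieval loop steered by per-transaction-type
    # heuristics, wrapped by an eval hook. All names and rules are illustrative.
    RULES = {
        "m&a_nordics": {"jurisdiction": "Nordics", "exclude_practices": ["US pricing mechanics"]},
        "m&a_us": {"jurisdiction": "US", "exclude_practices": []},
    }

    def retrieve(query: str, filters: dict) -> list[str]:
        # placeholder for a real search call (vector DB, document management API, etc.)
        return [f"doc matching '{query}' under {filters}"]

    def agentic_retrieval(task: str, txn_type: str, max_steps: int = 3) -> list[str]:
        filters = RULES[txn_type]
        context: list[str] = []
        query = task
        for _ in range(max_steps):
            context.extend(retrieve(query, filters))
            # in a real system an LLM would decide the follow-up query or stop early
            query = f"follow-up for: {task}"
        return context

    def evaluate(draft: str, gold_points: list[str]) -> float:
        # stand-in for real-world evals; LLM-as-a-judge would plug in here
        return sum(p.lower() in draft.lower() for p in gold_points) / max(len(gold_points), 1)

    ctx = agentic_retrieval("warranty clauses for mid-cap Nordic M&A", "m&a_nordics")
    print(ctx, evaluate("draft text...", ["warranty", "leverage ratio"]))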
dennisy•2w ago
Could you please expand on “none of them find the tools very useful”?

I would love to know how big your sample is, in what ways the tools fail, what features are missing, etc.

armcat•2w ago
Sure! To qualify: I've been working in contract law, more specifically contract drafting. There are a tonne of other tools in areas like document management, research, regulatory, and timekeeping, so I cannot speak to those.

Sample size: around 150 law firms across the UK, Nordics and DACH (and a smattering across the US). Some were actual month-long pilots, so there were deeper interactions with those firms, whilst others were "just conversations". Let's say in each law firm it's 3-4 associates and 1-2 partners, so it's >600 lawyers.

Typically the legal AI solutions in contract drafting involve the lawyer uploading "their database", i.e. drag-and-dropping a folder or a zip file containing potentially hundreds to thousands of contracts from previous transactions.

What's missing:

- Relevance: For the current transaction the lawyer is working on, the AI tools surface irrelevant information. For example, if it's an M&A transaction in one market (e.g. Nordics), the tool suggests pricing mechanics from a different market practice (e.g. US) that are irrelevant or undesirable. The text semantics have the closest cosine (or whatever) distance, but the market characteristics are orthogonal (see the sketch at the end of this comment).

- Representation: As a lawyer you are always representing a specific party (e.g. a "buyer" purchasing another company or an asset from a "seller"). You want your side to be best represented; however, the tools often fail to "understand" who you are representing, and tend to recommend the opposite of what you want for your client.

- Diversity: The same handful of documents keeps being referenced all the time, even though there are other "better" documents that should be used to ground the responses and recommendations.

- Precision: Sometimes you want precise information, such as specific leverage ratios or very specific warranty clauses for a transaction of a particular size within a particular industry.

- Language/tonality: Lawyers talk to other lawyers, and there is a specific tonality and language used: precision, eloquence, professionalism. Each law firm also has its "house style" in terms of how the words are put together. AI tools come across as "odd" in how they respond (even when they are correct). It trips the lawyers up a bit and they lose trust somewhat.

Etc.

(there are many others)
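To make the Relevance point above concrete: one common mitigation is to filter or re-rank candidates on structured metadata (jurisdiction, party side, deal size) rather than relying on embedding similarity alone. A toy sketch follows; the documents, field names, and the crude overlap score standing in for cosine similarity are all made up for illustration.

    # Toy sketch of the relevance problem: pure text similarity vs. the same search
    # with a hard metadata filter on market/jurisdiction. Everything here is invented.
    docs = [
        {"id": "us_spa_2021", "market": "US", "text": "purchase price adjustment, escrow mechanics"},
        {"id": "no_spa_2023", "market": "Nordics", "text": "locked box pricing, Nordic market practice"},
        {"id": "se_spa_2022", "market": "Nordics", "text": "warranty limitations, locked box"},
    ]

    def similarity(query: str, text: str) -> float:
        # stand-in for cosine similarity over embeddings: crude token overlap
        q, t = set(query.lower().split()), set(text.lower().split())
        return len(q & t) / max(len(q), 1)

    def search(query: str, market: str | None = None, k: int = 2) -> list[str]:
        pool = [d for d in docs if market is None or d["market"] == market]
        ranked = sorted(pool, key=lambda d: similarity(query, d["text"]), reverse=True)
        return [d["id"] for d in ranked[:k]]

    print(search("pricing mechanics for Nordic M&A"))                    # text score alone can surface the US doc
    print(search("pricing mechanics for Nordic M&A", market="Nordics"))  # filtered to the relevant market

The same filter-then-rank pattern extends to the Representation and Precision points: party side, deal size band, and industry can all be treated as structured constraints rather than left to the embedding to infer.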