
Lessons from Building Evals for Financial AI Agents

https://www.primerapp.com/blog/lessons-from-3-years-of-evals/
3•smallwoodal•1h ago

Comments

sj8070•1h ago
How is Primer different from the legions of other finance agents?
tangweigang•54m ago
A useful distinction would be whether the agent ships with an evaluation surface, not just a workflow surface.

For finance I would look for: the exact task class it claims to handle, the data snapshot used for an answer, the tool calls it was allowed to make, a failure taxonomy, and examples where the agent chooses not to answer. If those are visible, it is much easier to compare it with other finance agents. If they are not visible, it is mostly a UI/product-positioning difference.
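
To make that checklist concrete, here is a rough sketch of what one record on such an evaluation surface might look like. Everything in it is an assumption for illustration: the `EvalRecord` fields, the `summarize` helper, and the sample task, snapshot, tool, and failure names are hypothetical and do not come from Primer or any other product.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EvalRecord:
    """One evaluated answer. All field names are hypothetical."""
    task_class: str                 # the task class the agent claims to handle
    data_snapshot: str              # identifier of the data snapshot behind the answer
    tool_calls: list[str] = field(default_factory=list)  # tools the agent invoked
    answered: bool = True           # False when the agent chose not to answer
    failure: Optional[str] = None   # label from a failure taxonomy, None if no failure


def summarize(records: list[EvalRecord]) -> dict:
    """Aggregate records into counts that can be compared across agents."""
    answered = [r for r in records if r.answered]
    failures = Counter(r.failure for r in answered if r.failure is not None)
    return {
        "total": len(records),
        "abstained": len(records) - len(answered),
        "failed": sum(failures.values()),
        "failures_by_type": dict(failures),
    }


# Made-up sample data: one clean answer, one failure, one abstention.
records = [
    EvalRecord("earnings_summary", "snapshot-a", ["search_filings"]),
    EvalRecord("earnings_summary", "snapshot-a", ["search_filings"], failure="stale_data"),
    EvalRecord("price_forecast", "snapshot-a", answered=False),
]
summary = summarize(records)
print(summary)
```

If two agents publish records in a shape like this for the same task classes and snapshots, the summaries can be compared directly, which is the difference between an evaluation surface and a UI difference.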

smallwoodal•44m ago
Absolutely agree. If fundamental investing becomes mostly about maintaining and improving your own AI research system, then a typical SaaS frontend is not enough. The financial institutions furthest ahead on adoption are already thinking this way: over time, most apps probably need to become an API surface as much as a UI surface.

For Primer to be the core research engine for a team, users need to understand not just the output, but how the agent got there: task class, source snapshot, tool calls, evidence used, failure modes, and cases where the agent should not answer.

That is a big part of how we think about the product. The workflow surface is important, but the evaluation surface is what lets users trust, compare, and improve the agent over time. Otherwise you are right: it becomes very hard to distinguish a genuinely better research system from a better UI.

Most investment firms are a long way from being able to evaluate these kinds of systems properly, so we should be helping them with that process.