Introduction to Multi-Armed Bandits

https://arxiv.org/abs/1904.07272
26•Anon84•1h ago

Comments

esafak•1h ago
One way to address the https://en.wikipedia.org/wiki/Exploration%E2%80%93exploitati...
rented_mule•15m ago
We employed bandits in a product I worked on. It was selecting which piece of content to show in a certain context, optimizing for clicks. It did a great job, but there were implications that I wish we understood from the start.

There was a constant stream of new content (i.e., arms for the bandits) to choose from. Instead of running manual experiments (e.g., A/B tests or other designs), the bandits would sample the new set of options and arrive at a new optimal mix much more quickly.
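As a rough illustration of how a bandit can absorb a stream of new arms, here is a minimal Thompson sampling sketch for click/no-click rewards. The comment does not say which algorithm was used, so the Beta-Bernoulli model, the `BetaBandit` class, and the `simulate` helper are all assumptions made for the example.

```python
import random


class BetaBandit:
    """Thompson sampling over a changing set of arms (hypothetical sketch)."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.stats = {}  # arm id -> [clicks, non-clicks]

    def add_arm(self, arm):
        # New content starts with no observations, i.e. a uniform Beta(1, 1) prior.
        self.stats.setdefault(arm, [0, 0])

    def choose(self):
        # Draw a plausible click rate for each arm and show the highest draw.
        def draw(arm):
            clicks, misses = self.stats[arm]
            return self.rng.betavariate(clicks + 1, misses + 1)

        return max(self.stats, key=draw)

    def update(self, arm, clicked):
        self.stats[arm][0 if clicked else 1] += 1


def simulate(true_rates, rounds, seed=0):
    """Run the bandit against made-up click rates and count impressions per arm."""
    bandit = BetaBandit(seed)
    env = random.Random(seed + 1)
    for arm in true_rates:
        bandit.add_arm(arm)
    shown = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = bandit.choose()
        shown[arm] += 1
        bandit.update(arm, env.random() < true_rates[arm])
    return shown


shown = simulate({"a": 0.02, "b": 0.30}, rounds=2000)
```

Calling `add_arm` at any point drops a new piece of content into the mix with a wide prior, so it gets sampled often at first and less often once its click rate looks poor.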

But we did want to run experiments with other things around the content that was managed by the bandits (e.g., UI flow, overall layout, other algorithmic things, etc.). It turns out bandits complicate these experiments significantly. Any changes to the context in which the bandits operate lead them to shift things more towards exploration to find a new optimal mix, hurting performance for some period of time.

We had a choice we could make here... treat all traffic, regardless of cohort, as a single universe that the bandits are managing (so they would optimize for the mix of cohorts as a whole). Or we could set up bandit stats for each cohort. If things are combined, then we can't use an experiment design that assumes independence between cohorts (e.g., A/B testing) because the bandits break independence. But the optimal mix will likely look different for one cohort vs. another vs. all of them combined. So it's better for experiment validity to isolate the bandits for each cohort. Now small cohorts can take quite a while to converge before we can measure how well things work. All of this puts a real limit on iteration speed.
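To make the pooled vs. per-cohort choice concrete, here is a small sketch that keys the same click events two ways. The event data, cohort names, and the helpers `ArmStats`, `build_stats`, `best_arm`, and `outcomes` are invented for illustration and are not taken from the product described.

```python
from collections import defaultdict


class ArmStats:
    """Click counts for one arm within one stats universe."""

    def __init__(self):
        self.clicks = 0
        self.shows = 0

    def record(self, clicked):
        self.shows += 1
        self.clicks += int(clicked)

    def rate(self):
        return self.clicks / self.shows if self.shows else 0.0


def build_stats(events, per_cohort):
    """Aggregate (cohort, arm, clicked) events, pooled or split by cohort."""
    stats = defaultdict(ArmStats)
    for cohort, arm, clicked in events:
        universe = cohort if per_cohort else "all"
        stats[(universe, arm)].record(clicked)
    return stats


def best_arm(stats, universe):
    """Arm with the highest observed click rate in the given universe."""
    arms = [arm for (u, arm) in stats if u == universe]
    return max(arms, key=lambda arm: stats[(universe, arm)].rate())


def outcomes(cohort, arm, clicks, shows):
    """Made-up events: `clicks` clicks out of `shows` impressions."""
    return [(cohort, arm, i < clicks) for i in range(shows)]


events = (
    outcomes("control", "x", 3, 4) + outcomes("control", "y", 1, 4)
    + outcomes("treatment", "x", 1, 4) + outcomes("treatment", "y", 2, 4)
)
pooled = build_stats(events, per_cohort=False)
split = build_stats(events, per_cohort=True)
```

With this toy data the pooled stats prefer arm `x`, while the treatment cohort on its own prefers `y`. Each split universe also sees only half of the traffic, which is where the slower convergence for small cohorts comes from.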

Things also become very difficult to reason about because there is state in the bandit stats that are being used to optimize things. You can often think of that as a black box, but sometimes you need to look inside and it can be very difficult.

Much (all?) of this comes from bandits being feedback loops - these same problems are present in other approaches where feedback loops are used (e.g., control theory based approaches). Feedback mechanisms are incredibly powerful, but they couple things together in ways that can be difficult to tease apart.

kianN•9m ago
I’ve actually run into the exact same issue. At the time we similarly had to scrap bandits. Since then I’ve had the opportunity to do a fair amount of research into hierarchical Dirichlet processes in an unrelated field.

On a random day, a light went off in my head that hierarchy perfectly addresses the stratification vs aggregation problems that arise in bandits. Unfortunately I’ve never had a chance to apply this (and thus see the issues) in a relevant setting since.
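One way to picture the hierarchy idea is partial pooling: each cohort's estimate is pulled toward the pooled rate, and the pull weakens as the cohort gathers its own data. The sketch below uses a fixed-strength Beta prior, which is a much simpler stand-in and not a hierarchical Dirichlet process. The function name `shrunk_rate` and the `prior_strength` value are assumptions for the example.

```python
def shrunk_rate(clicks, shows, pooled_rate, prior_strength=20.0):
    """Cohort click rate shrunk toward the pooled rate.

    Treats the pooled rate as a prior worth `prior_strength` pseudo-impressions,
    so small cohorts stay near the pooled estimate and large ones follow their own data.
    """
    return (clicks + prior_strength * pooled_rate) / (shows + prior_strength)


# A cohort with 2 impressions barely moves off the pooled rate of 0.10,
# while a cohort with 1000 impressions is dominated by its own 0.50.
small = shrunk_rate(clicks=1, shows=2, pooled_rate=0.10)
large = shrunk_rate(clicks=500, shows=1000, pooled_rate=0.10)
```

In a full hierarchical model the strength of the pull would be learned from the data rather than fixed by hand.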

Isotopic analysis determines that water once flowed on asteroid Ryugu

https://phys.org/news/2025-09-isotopic-analysis-asteroid-ryugu.html
1•PaulHoule•2m ago•0 comments

Bolt v2

https://twitter.com/boltdotnew/status/1973063093849567591
1•XCSme•4m ago•1 comments

FBI director gifted NZ police and intelligence chiefs 3D-printed guns

https://www.rnz.co.nz/news/national/574618/fbi-director-gifted-nz-police-and-intelligence-chiefs-...
1•colinprince•5m ago•0 comments

Dropping Upstream Nix from Determinate Nix Installer

https://determinate.systems/blog/installer-dropping-upstream/
1•kblissett•9m ago•0 comments

Claude sonet 4.5 will no longer be only for devs thing

https://www.youtube.com/watch?v=oXfVkbb7MCg
1•Cappybara12•10m ago•1 comments

Rust to the Automotive Stack

https://filtra.io/rust/interviews/volvo-sep-25
1•weinzierl•11m ago•0 comments

Live Video of Global Sumud Flotilla

https://globalsumudflotilla.org/live/
4•novateg•17m ago•1 comments

Swift to add blockchain-based ledger to its infrastructure stack

https://www.swift.com/news-events/press-releases/swift-add-blockchain-based-ledger-its-infrastruc...
1•jnord•18m ago•1 comments

The SC prepared to lie to us, and what we can do about it

https://discourse.nixos.org/t/the-sc-prepared-to-lie-to-us-and-what-we-can-do-about-it-whistleblo...
2•kblissett•20m ago•0 comments

Show HN: CulinaryWiki, a Wiki for Culinary Knowledge

https://culinarywiki.org/wiki/Main_Page
2•fromwilliam•21m ago•0 comments

Beeper Should Become Link-Friendly

1•LucCogZest•21m ago•0 comments

Show HN: NanoModal – a tiny accessible modal library

https://muffinman.io/nano-modal/
1•stanko•26m ago•0 comments

Allheadline change the look of there website

https://allheadline.com/
1•fatbrother•27m ago•0 comments

Wealthfront S-1

https://www.sec.gov/Archives/edgar/data/1524566/000162828025043113/wealthfront-sx1.htm
1•mauriziocalo•27m ago•0 comments

LLM security agent finds vulnerability in LLM engineering platform

https://www.depthfirst.com/post/how-an-authorization-flaw-reveals-a-common-security-blind-spot-cv...
1•ponderwonder•29m ago•0 comments

Samsung confirms it will begin showing you advertisements on refrigerator screen

https://fortune.com/2025/09/19/samsung-family-hub-refrigerators-advertisements/
5•hippich•30m ago•2 comments

Prototype-First Software Design with Agents

https://serce.me/posts/2025-09-30-prototype-first-software-design-with-agents
1•SerCe•31m ago•0 comments

Claude AI Now Executes Code in Real-Time (Sandboxed Python/Node.js)

https://tolearn.blog/blog/claude-ai-now-executes-code
1•leoli123•33m ago•1 comments

Offset – Autonomous Financial Analyst

https://offset.app/
1•raj_khare•34m ago•1 comments

Dreamer 4

https://arxiv.org/abs/2509.24527
1•theOGognf•38m ago•1 comments

Solid Knitting

https://textiles-lab.github.io/publications/2024-solid-knitting/
1•mathgenius•40m ago•0 comments

I was kicked out of the US by the government

https://twitter.com/itsericlay/status/1972703130279346236
12•pfexec•45m ago•0 comments

As of Q4 2025, video footage is no longer a real proof? (Sora 2)

1•Printerisreal•46m ago•2 comments

Evaluating AI Model Performance on Real-World Economically Valuable Tasks [pdf]

https://cdn.openai.com/pdf/d5eb7428-c4e9-4a33-bd86-86dd4bcf12ce/GDPval.pdf
2•Anon84•46m ago•1 comments

SmugMug accelerates business intelligence with Amazon QuickSight scenarios

https://aws.amazon.com/blogs/business-intelligence/how-smugmug-accelerates-business-intelligence-...
2•kabell•48m ago•0 comments

Simple, free and efficient ad-blocker

https://zenprivacy.net/
1•orixilus•49m ago•0 comments

Pfizer's Drug Price Cuts Yield Three-Year Tariff Reprieve

https://www.bloomberg.com/news/articles/2025-09-30/pfizer-to-cut-medicaid-drug-prices-in-deal-wit...
2•petethomas•49m ago•1 comments

We Can Just Do Things (In ATProto)

https://underreacted.leaflet.pub/3m23gqakbqs2j
2•verdverm•55m ago•0 comments

FTC Sues Zillow and Redfin

https://www.ftc.gov/news-events/news/press-releases/2025/09/ftc-sues-zillow-redfin-over-illegal-a...
7•itbeho•57m ago•1 comments

The Network Effect of Intelligence

https://rashidazarang.com/c/the-network-effect-of-intelligence
3•rashidae•59m ago•0 comments