Show HN: Git bayesect – Bayesian Git bisection for non-deterministic bugs

https://github.com/hauntsaninja/git_bayesect

63•hauntsaninja•4d ago

Comments

hauntsaninja•3d ago

git bisect works great for tracking down regressions, but relies on the bug presenting deterministically. But what if the bug is non-deterministic? Or worse, your behaviour was always non-deterministic, but something has changed, e.g. your tests went from somewhat flaky to very flaky.

In addition to the repo linked in the title, I also wrote up a little bit of the math behind it here: https://hauntsaninja.github.io/git_bayesect.html
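To make the idea concrete, here is a minimal sketch of a single Bayesian update over "which commit introduced the change". This is not git_bayesect's code. It assumes the failure rates before and after the change are known constants (`p_fail_old`, `p_fail_new`), which is a simplification made only for illustration, and every name in it is hypothetical.

```python
def update_posterior(prior, tested, failed, p_fail_old, p_fail_new):
    """Return P(first bad commit = b | observation) for every b.

    prior[b]  : prior probability that commit b introduced the change
                (commits are indexed oldest to newest).
    tested    : index of the commit the test was run on.
    failed    : True if that run failed.
    p_fail_old, p_fail_new : ASSUMED known failure rates before and
                after the change (an illustrative simplification).
    """
    unnormalised = []
    for b, p in enumerate(prior):
        # If the change landed at or before the tested commit,
        # the tested commit already shows the new failure rate.
        rate = p_fail_new if b <= tested else p_fail_old
        likelihood = rate if failed else 1.0 - rate
        unnormalised.append(p * likelihood)
    total = sum(unnormalised)
    return [w / total for w in unnormalised]


prior = [0.25] * 4  # uniform prior over four commits
posterior = update_posterior(
    prior, tested=1, failed=True, p_fail_old=0.1, p_fail_new=0.9
)
print(posterior)
```

A single failure on commit 1 moves most of the probability mass onto commits 0 and 1 (about 0.45 each) without ruling out the others, which is the key difference from plain git bisect.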

Myrmornis•1h ago

This is really cool! Is there an alternative way of thinking about it involving a hidden markov model, looking for a change in value of an unknown latent P(fail)? Or does your approach end up being similar to whatever the appropriate Bayesian approach to the HMM would be?

supermdguy•3d ago

Okay this is really fun and mathematically satisfying. Could even be useful for tough bugs that are technically deterministic, but you might not have precise reproduction steps.

Does it support running a test multiple times to get a probability for a single commit instead of just pass/fail? I guess you’d also need to take into account the number of trials to update the Beta properly.

hauntsaninja•3d ago

Yay, I had fun with it too!

IIUC the way you'd do that right now is to repeatedly record the individual observations on a single commit, which effectively gives it a probability plus the number of trials needed for the Beta update. I don't yet have a CLI entrypoint to record a batch observation of (probability, num_trials), but it would be easy to add one.
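A small sketch of why recording observations one at a time is equivalent to a batch record, assuming a plain Beta-Bernoulli model for the failure rate. The function names are hypothetical and are not part of git_bayesect's CLI or API.

```python
def beta_update(alpha, beta, failures, trials):
    """Conjugate update of a Beta(alpha, beta) prior on the failure rate."""
    return alpha + failures, beta + (trials - failures)


def batch_from_probability(alpha, beta, probability, num_trials):
    """Hypothetical batch entrypoint taking (probability, num_trials)."""
    failures = round(probability * num_trials)
    return beta_update(alpha, beta, failures, num_trials)


# Record four individual runs: fail, pass, fail, fail.
a, b = 1.0, 1.0  # Beta(1, 1) is a uniform prior
for failed in [True, False, True, True]:
    a, b = beta_update(a, b, int(failed), 1)
sequential = (a, b)

# Record the same runs as one batch: 75% failures over 4 trials.
batch = batch_from_probability(1.0, 1.0, 0.75, 4)

print(sequential, batch)
```

Both routes end at the same Beta parameters because the update only depends on the counts of failures and passes, not on their order.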

But ofc part of the magic is that git_bayesect's commit selection tells you how to be maximally sample efficient, so you'd only want to do a batch record if your test has high constant overhead.
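One common way to choose the next commit in a sample-efficient manner is to greedily pick the commit whose test result is expected to leave the least uncertainty in the posterior. The sketch below does that with expected entropy. Whether git_bayesect uses exactly this rule is an assumption here, as are the known failure rates `p_old` and `p_new` and all function names.

```python
import math


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def posterior_and_evidence(prior, tested, failed, p_old, p_new):
    """Posterior over the first bad commit, plus P(observation)."""
    weights = []
    for b, p in enumerate(prior):
        rate = p_new if b <= tested else p_old
        weights.append(p * (rate if failed else 1.0 - rate))
    evidence = sum(weights)
    return [w / evidence for w in weights], evidence


def expected_entropy(prior, tested, p_old, p_new):
    """Posterior entropy averaged over both possible test outcomes."""
    total = 0.0
    for failed in (True, False):
        post, evidence = posterior_and_evidence(
            prior, tested, failed, p_old, p_new
        )
        total += evidence * entropy(post)
    return total


def best_commit(prior, p_old, p_new):
    """Index of the commit with the lowest expected posterior entropy."""
    return min(
        range(len(prior)),
        key=lambda j: expected_entropy(prior, j, p_old, p_new),
    )


prior = [0.25] * 4
print(best_commit(prior, p_old=0.1, p_new=0.9))
```

With a uniform prior this picks the commit that splits the probability mass evenly, like ordinary bisection. Once the posterior becomes lopsided, the chosen commit shifts toward wherever the remaining uncertainty is concentrated.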

Retr0id•1h ago

Super cool!

A related situation I was in recently was where I was trying to bisect a perf regression, but the benchmarks themselves were quite noisy, making it hard to tell whether I was looking at a "good" vs "bad" commit without repeated trials (in practice I just did repeats).

I could pick a threshold and use bayesect as described, but that involves throwing away information. How hard would it be to generalize this to let me plug in a raw benchmark score at each step?

davidkunz•40m ago

Useful for tests with LLM interactions.
