Show HN: Detail, a Bug Finder

67•drob•2mo ago

Hi HN, tl;dr we built a bug finder that's working really well, especially for app backends. Try it out and send us your thoughts!

Long story below.

--------------------------

We originally set out to work on technical debt. We had all seen codebases with a lot of debt, so we had personal grudges about the problem, and AI seemed to be making it a lot worse.

Tech debt also seemed like a great problem for AI because: 1) a small portion of the work is thinky and strategic, and then the bulk of the execution is pretty mechanical, and 2) when you're solving technical debt, you're usually trying to preserve existing behavior, just change the implementation. That means you can treat it as a closed-loop problem if you figure out good ways to detect unintended behavior changes due to a code change. And we know how to do that – that's what tests are for!

So we started with writing tests. Tests create the guardrails that make future code changes safer. Our thinking was: if we can test well enough, we can automate a lot of other tech debt work at very high quality.

We built an agent that could write thousands of new tests for a typical codebase, most "merge-quality". Some early users merged hundreds of PRs generated this way, but intuitively the tool always felt "good but not great". We used it sporadically ourselves, and it usually felt like a chore.

Around this point we realized: while we had set out to write good tests, we had built a system that, with a few tweaks, might be very good at finding bugs. When we tested it out on some friends' codebases, we discovered that almost every repo has tons of bugs lurking in it that we were able to flag. Serious bugs, interesting enough that people dropped what they were doing to fix them. Sitting right there in peoples codebases, already merged, running in prod.

We also found a lot of vulns, even in mature codebases, and sometimes even right after someone had gotten a pentest.

Under the hood: - We check out a codebase and figure out how to build it for local dev and exercise it with tests. - We take snapshots of the built local dev state. (We use Runloop for this and are big fans.) - We spin up hundreds of copies of the local dev environment to exercise the codebase in thousands of ways and flag behaviors that seem wrong. - We pick the most salient, scary examples and deliver them as linear tickets, github issues, or emails.

In practice, it's working pretty well. We've been able to find bugs in everything from compilers to trading platforms (even in rust code), but the sweet spot is app backends.

Our approach trades compute for quality. Our codebase scans take hours, far beyond what would be practical for a code review bot. But the result is that we can make more judicious use of engineers’ attention, and we think that’s going to be the most important variable.

Longer term, we think compute is cheap, engineer attention is expensive. Wielded properly, the newest models can execute complicated changes, even in large codebases. That means the limiting reagent in building software is human attention. It still takes time and focus for an engineer to ingest information, e.g. existing code, organizational context, and product requirements. These are all necessary before an engineer can articulate what they want in precise terms and do a competent job reviewing the resulting diff.

For now we're finding bugs, but the techniques we're developing extend to a lot of other background, semi-proactive work to improve codebases.

Try it out and tell us what you think. Free first scan, no credit card required: https://detail.dev/

We're also scanning on OSS repos, if you have any requests. The system is pretty high signal-to-noise, but we don't want to risk annoying maintainers by automatically opening issues, so if you request a scan for an OSS repo the results will go to you personally. https://detail.dev/oss

Comments

howinator•2mo ago

I played around with Detail recently and it was super helpful to point me directly to the code causing some bugs that I know I had, but wasn't sure about the root cause.

Waxing philosophical a bit, I think tools like these are going to be super helpful as our collective understanding of the codebases we own decreases over time due to the proliferation of AI generated code. I'm not making a value judgement here, just pointing out that as we understand codebases less, tools that help us track down the root causes of bugs will be more important.

sbruchmann•2mo ago

Got redirected to a 404 after signing in with GitHub:

https://app.detail.dev/onboarding

drob•2mo ago

Fix is deploying, sorry about that!

dbworku•2mo ago

Very impressed with the results on our repo. Great stuff for managing all of the AI slop.

chrsw•2mo ago

How does this work if your repos aren't on GitHub? And what if your code has nothing to do with backend web apps?

drob•2mo ago

Github only for now. Out of curiosity, is yours on gitlab? Something else?

We should be able to find something interesting in most codebases, as long as there's some plausible way to build and test the code and the codebase is big enough. (Below ~250 files the results get iffy.) We've just tested it a lot more thoroughly on app backends, because that's what we know best.

chrsw•1mo ago

> Out of curiosity, is yours on gitlab? Something else?

Something else, it's a self-hosted Git server similar to GitHub, GitLab, etc. We have multiple repos well clear of 1k files. Almost none of it is JavaScript or TypeScript or anything like that. None of our own code is public.

ZeroConcerns•2mo ago

So, this is only for codebases hosted on Github, right? Any plans for folks not in that ecosystem? And which languages do you support? The examples show Go, (Type|Java)Script, Python, Rust and Zig, which is rather diverse, but lacks some typical 'enterprise' options. The examples look nice and quite different from the usual static analyzer slop, so that is welcome!

drob•2mo ago

Just github for now, but purely for reasons of plumbing. We'll add gitlab and others.

We support java, c/c++, kotlin, ruby, and swift as well. Did you have something specific in mind?

ZeroConcerns•2mo ago

My immediate personal use case would be C# on a self-hosted Gitea instance.

Realistically, anything paid would need to be fully self-hostable, though. There's a bunch of Java codebases that I work on that would benefit from something like this, but they're all behind two or three layers of Citrix...

suprnurd•2mo ago

Looking forward to this working with Gitlab!

munchler•2mo ago

I wanted to give this a try, but it immediately asks for authority to "Act on your behalf" on GitHub. That's not something that I'm going to grant to an unfamiliar agent.

It would make a lot more sense to me if you provided a lighter "intro" version, even if that means it can only run on public repos.

drob•1mo ago

As far as we can tell this is a github-ism, and any OAuth permission is a form of "acting on your behalf": https://dappling.medium.com/a-github-app-would-like-to-act-o...

munchler•1mo ago

That's good to know, but I would still suggest an on-ramp that only uses GitHub for authentication (i.e. no permissions needed). To that end, it would be nice if I could also authenticate with other OAuth providers instead, like Google, etc.

Again, I understand that this would limit me to scanning public repos, but that would be fine.

drob•1mo ago

Other auth providers for sure. We'll be adding shortly.

Using an alternate auth provider won't even prevent you from scanning non-public GitHub code. There's a GitHub OAuth App just for auth (which is what you're seeing here), and a separate GitHub App that you need to install either way to give Detail access to the right repos. We can swap out the former for Google/Okta/pw if you want to avoid this warning. GitHub Apps (the half that manages repo access) have a much finer grained permissions model.

bjtitus•1mo ago

I looked for an explanation of what the tool does on my behalf on your site but didn't see anything.

I guess I expected on the homepage or maybe "About" but I was looking for something related to whether you open PRs on my behalf given that OAuth prompt.

I think adding that or some explanation during onboarding about the permissions might help.

cloudhead•2mo ago

Looks interesting, but I self host so it would have to work with plain Git URLs.

hiesenbug•2mo ago

Does this work for cross-compiled projects as well? Do you only require code that's buildable on the host or also runnable? How would it behave for a firmware codebase?

drob•1mo ago

We've run it on a few firmware repos and gotten good results. A lot of firmware code tends to have really poor type-safety which means lots of low-hanging bugs.

We should be able to handle cross-compilation. Want to try it? Ping me in any direct channel (dan@detail.dev / @danlovesproofs) and we can keep an eye on your repo.

bflesch•1mo ago

On the landing page I see full names and pictures of customers but not any information about the founders and/or shareholders. I click on "about us" and "privacy" and "terms" and "trust center" and I cannot figure out: What is the name of the company, where is it located, who will be having access to my data. For a security-related startup if such information is missing it's a big red flag.

Also unfortunately the animation on the landing page makes the whole website quite slow.

drob•1mo ago

Hi bflesch, fair point – our About Us page has a lot about what we think and not about... us!

I'm the founder. Previously I was at Heap for nine years. There's a company LinkedIn with the rest of the team: https://www.linkedin.com/company/detail-dev/

We're located in SF. The About Us page lists some of our angel investors at the bottom.

Regarding security in particular, there's a lot more info in our Trust Center: https://trust.detail.dev/

If anything else seems conspicuously missing, please flag. In all likelihood it's omitted without intent.

bflesch•1mo ago

Thanks for your reply. As I said, on your website there is no address, there is no legal entity name, there is no company registration number. You could sit in north korea for all I know.

Now I spotted in the last sentence of your "about us" that "We're based in SF". Oh and only now I see on the "terms" page has "15. Contact information qqbot, Inc 3624 16th St San Francisco, CA 94114 Email: support@detail.dev"

Why not put that address into the footer or add an imprint section to the website? It's such a quick win to establish trust. Also if guillermo rauch is an angel investor why mention him at the last sentence of the "about us" page and not in the middle of your landing page. Why did guillermo not post a testimonial that add to the landing page? Did he not like the product? Or did he not review the product?

PS: When I search for "qqbot" on kagi a lot of chinese-language results show up. Is the company affiliated with china?

Sorry for challenging you. I wish you good luck if your claims hold it is a worthwhile effort.

solatic•1mo ago

$30/committer/month, while only running scans biweekly, not even including "Enterprise" pricing, is really, really steep and will be a big barrier to adoption in larger enterprises with many engineers. You're basically asking enterprises to take the $30/committer/month pricing that they're spending on something like GitLab Premium, and double it, for bug reports? They may be great bug reports, but if it's difficult enough to get teams to merge automated MRs from tools like Dependabot/Renovate, what makes you so confident that a large enterprise customer will be so willing to add Another Tool that opens More MRs that require engineers to spend More Time Reviewing that may or may not have anything to do with shipping more features out the door?

Please consider a pricing model that's closer to bug bounties. There's clearly a working pricing model where companies are willing to pay bounties for discovered vulnerabilities. Your tool finds vulnerabilities (among other classes of bugs). Why not a pricing model where customers agree up-front to pay per bug your model finds? There are definitely some tricky parts to that model - you need an automated way of grading/scoring the bugs you find, since critical-severity bugs will be worth more (and be more interesting to customers) compared to low-severity bugs, and some customers will surely appeal some of the automatic scores - but could you make it work? Customers could then have more control over scaling up usage of Detail (adding slowly to more repositories), including capping how many bugs of each severity they would like reports for (to limit their spend), allowing customers to slowly add more repositories and run scans more frequently to find more bugs as they get more proven value from the tool.

drob•1mo ago

We've been thinking about this too. We have some ideas. Thanks for the comment, in any case – gave us a lot to chew on.

eikenberry•1mo ago

How do you define "merge-quality" and how to you determine a PR is of merge quality? Particularly when you are generating a lot of them with no human oversight involved?

StrangeSound•1mo ago

How would this work with a monorepo? I tried earlier with no success unfortunately

drob•1mo ago

Should work fine. Ping me dan@detail.dev or @danlovesproofs with your account info so we can look into it?

valmarelox•1mo ago

Do you have public results for vulnerabilities found? I saw only non-security bugs in the website

We Mourn Our Craft

I Write Games in C (yes, C)

Hoot: Scheme on WebAssembly

SectorC: A C Compiler in 512 bytes

Stories from 25 Years of Software Development

U.S. Jobs Disappear at Fastest January Pace Since Great Recession

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

The AI boom is causing shortages everywhere else

Al Lowe on model trains, funny deaths and working with Disney

The Waymo World Model

Reinforcement Learning from Human Feedback

Brookhaven Lab's RHIC Concludes 25-Year Run with Final Collisions

Start all of your commands with a comma (2009)

Vocal Guide – belt sing without killing yourself

France's homegrown open source online office suite

Coding agents have replaced every framework I used

A Fresh Look at IBM 3270 Information Display System

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

History and Timeline of the Proco Rat Pedal (2021)

Selection Rather Than Prediction

72M Points of Interest

Unseen Footage of Atari Battlezone Arcade Cabinet Production

Where did all the starships go?

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

Learning from context is harder than we thought

Monty: A minimal, secure Python interpreter written in Rust for use by AI

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

Hackers (1995) Animated Experience

Making geo joins faster with H3 indexes

Sheldon Brown's Bicycle Technical Info

We Mourn Our Craft

I Write Games in C (yes, C)

Hoot: Scheme on WebAssembly

SectorC: A C Compiler in 512 bytes

Stories from 25 Years of Software Development

U.S. Jobs Disappear at Fastest January Pace Since Great Recession

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

The AI boom is causing shortages everywhere else

Al Lowe on model trains, funny deaths and working with Disney

The Waymo World Model

Reinforcement Learning from Human Feedback

Brookhaven Lab's RHIC Concludes 25-Year Run with Final Collisions

Start all of your commands with a comma (2009)

Vocal Guide – belt sing without killing yourself

France's homegrown open source online office suite

Coding agents have replaced every framework I used

A Fresh Look at IBM 3270 Information Display System

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

History and Timeline of the Proco Rat Pedal (2021)

Selection Rather Than Prediction

72M Points of Interest

Unseen Footage of Atari Battlezone Arcade Cabinet Production

Where did all the starships go?

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

Learning from context is harder than we thought

Monty: A minimal, secure Python interpreter written in Rust for use by AI

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

Hackers (1995) Animated Experience

Making geo joins faster with H3 indexes

Sheldon Brown's Bicycle Technical Info

Show HN: Detail, a Bug Finder

Comments