Necessary and sufficient tests are essential; without them, code is of limited value. At a minimum, unit test the complicated bits and integration test the big, common use cases. Add a test for every fixed bug so it never recurs. Without tests, refactoring becomes risky and damn hard. For extra confidence: benchmark, fuzz, and property test, and where applicable consider formal methods like theorem proving to validate that behavior stays within bounds. It's easy to go overboard on process, or to swing the other way and be a cowboy coder who doesn't do things properly.
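As a minimal sketch of the property-testing idea mentioned above (hand-rolled with the stdlib here; in practice a library like Hypothesis would generate the inputs), checking that `sorted()` always produces an ordered permutation of its input:

```python
import random
from collections import Counter

def check_sort_properties(trials=1000):
    """Property test: for many random inputs, sorted() output is
    ordered and is a permutation (same multiset) of the input."""
    rng = random.Random(0)  # fixed seed so failures are reproducible
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        ys = sorted(xs)
        # Property 1: output is in nondecreasing order.
        assert all(a <= b for a, b in zip(ys, ys[1:])), f"not ordered: {ys}"
        # Property 2: no elements added, dropped, or changed.
        assert Counter(xs) == Counter(ys), f"elements changed: {xs} -> {ys}"
    return trials

check_sort_properties()
```

The point is that you assert properties of the output rather than enumerating specific input/output pairs, so the generator explores cases you wouldn't think to write.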
In my experience, of all the kinds of tests one can write, it's easiest to see that regression tests carry their own weight. But they have a tendency to bitrot: after enough refactors, rewrites of core elements, and dropped features, they may need to be thrown out. Hopefully your bug tracker is in good shape, so you can recapture the test's intent in the new context when you rewrite it, or else make the case that the bug is no longer relevant and discard the test for good.
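A sketch of what that looks like in practice, with a hypothetical `slugify` function and ticket number: name the regression test after the tracker entry so its intent can be recovered later, exactly as the comment suggests.

```python
def slugify(title):
    """Turn a title into a URL slug.
    (Hypothetical function, purely for illustration.)"""
    words = title.lower().split()
    # Bug fix: this used to blow up / return "" for empty titles.
    return "-".join(words) if words else "untitled"

def test_bug_1234_empty_title_gets_default_slug():
    # Regression test pinned to (hypothetical) ticket #1234, so a future
    # rewrite can look up why this case matters before deleting it.
    assert slugify("") == "untitled"
    assert slugify("Hello World") == "hello-world"

test_bug_1234_empty_title_gets_default_slug()
```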
And on the other hand, TDD shows that having expectations about the results before coding the procedure is a good thing. It doesn't prove the implementation correct, but it does make it more likely that you produce correct code.
But where is the proof that your compiler will compile the code correctly with respect to the C standard and your target instruction set specification? How about the proof of correctness of your C library with respect to both of those, and the documented requirements of your kernel? Where is the proof that the kernel handles all programs that meet its documented requirements correctly?
Not to put too fine a point on it, but: where is the proof that your processor actually implements the ISA correctly (either as documented or as intended, given that typos in ISA documentation are not THAT rare)? This is a very serious question! There have been a number of times that processors have failed to implement the ISA spec in very bad and noticeable ways. RDRAND has been found to be badly broken more than once now. There was the Intel Skylake/Kaby Lake Hyper-Threading bug that needed microcode fixes. And these are just the issues that got publicized well enough that I noticed them; there are probably many others I never even heard about.
Where testing is weak is whole systems: as you go up the integration scale, tests get flimsier.
none2585•3mo ago
I could go on and on about this, but damn, it just feels good to "say" it aloud.
hu3•3mo ago
For example, I was refactoring a SQL query builder and the tests told me my JOINs no longer contained their ON clauses. It might seem trivial, but multiply this by a large codebase and the benefits compound.
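The kind of test that catches that regression can be tiny. Here is a sketch against a toy query builder (the `Query` class and its methods are hypothetical, just to illustrate the shape of the check):

```python
class Query:
    """A toy SQL query builder, purely for illustration."""
    def __init__(self, table):
        self.table = table
        self.joins = []

    def join(self, other, on):
        self.joins.append((other, on))
        return self  # allow chaining

    def sql(self):
        parts = [f"SELECT * FROM {self.table}"]
        for other, on in self.joins:
            # The regression described above: a refactor dropping "ON {on}"
            # here would be caught immediately by the test below.
            parts.append(f"JOIN {other} ON {on}")
        return " ".join(parts)

def test_join_keeps_on_clause():
    sql = Query("orders").join("users", "orders.user_id = users.id").sql()
    assert "JOIN users ON orders.user_id = users.id" in sql

test_join_keeps_on_clause()
```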
Another thing is that when I write functions with tests in mind, they tend to be simpler and more maintainable, because it's hard to test functions that do a ton of things and have many side effects.
hinkley•3mo ago
A lot of the time people don’t ask for things because they guess wrong about where the difficult parts will be, and they learn from our pushing back to ask for less.
heymijo•3mo ago
paulryanrogers•3mo ago
They should cover core features that pay for the business, ideally as coarsely as practical. Some coupling with the implementation is inevitable, so I prefer it be at the highest level that can be maintained.
Have you never seen tests catch bugs that otherwise would have gone into production?
burnt-resistor•3mo ago
There is nuance between Cucumber bureaucracy, and proper testing practices that check real bits.
01HNNWZ0MV43FF•3mo ago
Say I've got some algorithm like a binary search that you can implement in a page of code but it's gonna be buried three layers deep in the business logic. You could expose a debug command and test it manually, or you could throw a few unit tests on it to hit all the edge cases, make sure it gives the expected outputs for given inputs, and then you know there aren't major bugs there
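Concretely, the unit tests for that page of binary-search code look something like this (the standard algorithm, with asserts covering the edge cases mentioned: empty input, single element, first/last positions, and missing values):

```python
def binary_search(xs, target):
    """Return the index of target in the sorted list xs, or -1 if absent."""
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        elif xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# Edge cases: empty list, single element, first/last position, absent values.
assert binary_search([], 1) == -1
assert binary_search([5], 5) == 0
assert binary_search([1, 3, 5, 7], 1) == 0
assert binary_search([1, 3, 5, 7], 7) == 3
assert binary_search([1, 3, 5, 7], 4) == -1
```

A handful of asserts like these is far cheaper than exposing a debug command and poking at it by hand every time the surrounding business logic changes.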
FrostKiwi•3mo ago
Especially for UI, the number of hours saved by e2e tests with Playwright was huge for me. Tests run on iOS Safari and Desktop Chrome, each with its own layout quirks from the unexpected ways a layout adjustment plays out on a DPI x1 1920x1080 screen with mouse clicks vs a DPI x2.75 900x1600 smartphone screen with touch input.
Especially here in Japan, I have seen whole firms hired to UI-test a million different variations of these for each release candidate, and that is "the biggest waste of time in modern software engineering" I have seen so far. You can drown trying to automate all forms of testing, and human testing is important. But so is automating the basic sanity checks. Every tool can of course be misused in pursuit of an impractical extreme of either.
hinkley•3mo ago
It took me two mentors to feel like I could do tests worth having. Two more to feel I was good at it. And one more to realize everyone is terrible at them and I just suck less.
There is definitely something wrong with testing. But nobody has cracked the code so we do this even though we know it’s often suboptimal. I recently hunted down guy #5 and asked him about my conclusion and he agreed.
deterministic•3mo ago
I dump highly complex modified code straight into production and haven't had a single production bug for 6+ years.
The reason why is automatic tests.
rcxdude•3mo ago