Necessary and sufficient tests are essential; without them, code is of limited value. At a minimum, unit test the complicated bits and integration test the big, common use cases. Add a test for every fixed bug so it never recurs. Without tests, refactoring becomes risky and damn hard. For extra confidence: benchmark, fuzz, and property test, and where applicable consider formal methods like theorem proving to validate that behavior stays within bounds. It's easy to go overboard on process, or to swing the other way and be a cowboy coder who doesn't do things properly.
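As a minimal sketch of the property-testing idea mentioned above (hand-rolled with the stdlib here; in practice a library like Hypothesis would generate the inputs), checking that `sorted()` always produces an ordered permutation of its input:

```python
import random
from collections import Counter

def check_sort_properties(trials=1000):
    """Property test: for many random inputs, sorted() output is
    ordered and is a permutation (same multiset) of the input."""
    rng = random.Random(0)  # fixed seed so failures are reproducible
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        ys = sorted(xs)
        # Property 1: output is in nondecreasing order.
        assert all(a <= b for a, b in zip(ys, ys[1:])), f"not ordered: {ys}"
        # Property 2: no elements added, dropped, or changed.
        assert Counter(xs) == Counter(ys), f"elements changed: {xs} -> {ys}"
    return trials

check_sort_properties()
```

The point is that you assert properties of the output rather than enumerating specific input/output pairs, so the generator explores cases you wouldn't think to write.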
In my experience, of all the kinds of tests one can write, it's easiest to see that regression tests carry their own weight. But they have a tendency to bitrot: after enough refactors, rewrites of core elements, and dropped features, they may need to be thrown out. Hopefully your bug tracker is in good shape, so you can recapture the test's intent in the new context when you rewrite it, or else make the case that the bug is no longer relevant and discard the test for good.
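A sketch of what that looks like in practice, with a hypothetical `slugify` function and ticket number: name the regression test after the tracker entry so its intent can be recovered later, exactly as the comment suggests.

```python
def slugify(title):
    """Turn a title into a URL slug.
    (Hypothetical function, purely for illustration.)"""
    words = title.lower().split()
    # Bug fix: this used to blow up / return "" for empty titles.
    return "-".join(words) if words else "untitled"

def test_bug_1234_empty_title_gets_default_slug():
    # Regression test pinned to (hypothetical) ticket #1234, so a future
    # rewrite can look up why this case matters before deleting it.
    assert slugify("") == "untitled"
    assert slugify("Hello World") == "hello-world"

test_bug_1234_empty_title_gets_default_slug()
```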
And on the other hand, TDD shows that having expectations about the results before coding the procedure is a good thing. It doesn't prove the implementation correct, but it does make it more likely that you produce correct code.
But where is the proof that your compiler will compile the code correctly with respect to the C standard and your target instruction set specification? How about the proof of correctness of your C library with respect to both of those, and the documented requirements of your kernel? Where is the proof that the kernel handles all programs that meet its documented requirements correctly?
Not to put too fine a point on it, but: where is the proof that your processor actually implements the ISA correctly (either as documented or as intended, given that typos in ISA documentation are not THAT rare)? This is a very serious question! There have been a number of times that processors have failed to implement the ISA spec in very bad and noticeable ways. RDRAND has been found to be badly broken more than once now. There was the Intel Skylake/Kaby Lake Hyper-Threading bug that needed microcode fixes. And these are just the issues that got publicized well enough that I noticed them; there are probably many others I never even heard about.
Where testing is weak is whole systems: as you go up the integration scale, tests get flimsier.
none2585•3mo ago
I could go on and on about this, but damn, it just feels good to "say" it aloud.
hu3•3mo ago
For example, I was refactoring a SQL query builder and the tests told me my JOINs no longer contained their ON clauses. It might seem trivial, but multiply this by a large codebase and the benefits compound.
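The kind of test that catches that regression can be tiny. Here is a sketch against a toy query builder (the `Query` class and its methods are hypothetical, just to illustrate the shape of the check):

```python
class Query:
    """A toy SQL query builder, purely for illustration."""
    def __init__(self, table):
        self.table = table
        self.joins = []

    def join(self, other, on):
        self.joins.append((other, on))
        return self  # allow chaining

    def sql(self):
        parts = [f"SELECT * FROM {self.table}"]
        for other, on in self.joins:
            # The regression described above: a refactor dropping "ON {on}"
            # here would be caught immediately by the test below.
            parts.append(f"JOIN {other} ON {on}")
        return " ".join(parts)

def test_join_keeps_on_clause():
    sql = Query("orders").join("users", "orders.user_id = users.id").sql()
    assert "JOIN users ON orders.user_id = users.id" in sql

test_join_keeps_on_clause()
```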
Another thing is that when I write functions with tests in mind, they tend to be simpler and more maintainable, because it's hard to test functions that do a ton of things and have many side effects.
hinkley•3mo ago
A lot of the time people don’t ask for things because they guess wrong about where the difficult parts will be, and they learn from our pushing back to ask for less.
heymijo•3mo ago
paulryanrogers•3mo ago
They should cover core features that pay for the business, ideally as coarsely as practical. Some coupling with the implementation is inevitable, so I prefer it be at the highest level that can be maintained.
Have you never seen tests catch bugs that otherwise would have gone into production?
burnt-resistor•3mo ago
There is nuance between Cucumber bureaucracy, and proper testing practices that check real bits.
01HNNWZ0MV43FF•3mo ago
Say I've got some algorithm like a binary search that you can implement in a page of code but it's gonna be buried three layers deep in the business logic. You could expose a debug command and test it manually, or you could throw a few unit tests on it to hit all the edge cases, make sure it gives the expected outputs for given inputs, and then you know there aren't major bugs there
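Concretely, the unit tests for that page of binary-search code look something like this (the standard algorithm, with asserts covering the edge cases mentioned: empty input, single element, first/last positions, and missing values):

```python
def binary_search(xs, target):
    """Return the index of target in the sorted list xs, or -1 if absent."""
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        elif xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# Edge cases: empty list, single element, first/last position, absent values.
assert binary_search([], 1) == -1
assert binary_search([5], 5) == 0
assert binary_search([1, 3, 5, 7], 1) == 0
assert binary_search([1, 3, 5, 7], 7) == 3
assert binary_search([1, 3, 5, 7], 4) == -1
```

A handful of asserts like these is far cheaper than exposing a debug command and poking at it by hand every time the surrounding business logic changes.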
FrostKiwi•3mo ago
Especially for UI, the number of hours saved by e2e tests with Playwright was huge for me. Tests run on iOS Safari and Desktop Chrome, each with its own layout quirks from the unexpected ways a layout adjustment plays out on a DPI x1 1920x1080 screen with mouse clicks vs a DPI x2.75 900x1600 smartphone screen with touch input.
Especially here in Japan, I have seen whole firms hired to UI-test a million different variations of these for each release candidate, and that is "the biggest waste of time in modern software engineering" I have seen so far. You can drown trying to automate all forms of testing, and human testing is important. But so is automating the basic sanity checks. Every tool can of course be misused in pursuit of an impractical extreme of either.
hinkley•3mo ago
It took me two mentors to feel like I could do tests worth having. Two more to feel I was good at it. And one more to realize everyone is terrible at them and I just suck less.
There is definitely something wrong with testing. But nobody has cracked the code so we do this even though we know it’s often suboptimal. I recently hunted down guy #5 and asked him about my conclusion and he agreed.
deterministic•3mo ago
I dump highly complex modified code straight into production and haven't had a single production bug for 6+ years.
The reason why is automatic tests.
rcxdude•3mo ago