For learning? No, it's not. You should not spend as much time learning testing as you spend learning data structures.
> People should spend less time learning DSA, more time learning testing.
And you're reading it as "more total time should be spent learning testing than learning DSA." That's one reading; another is simply that people study DSA too much and testing too little. The ratio of total time can still favor DSA, but maybe instead of 10:1 it should be more like 8:1 or 5:1.
> esoteric things like Bloom filters, so you can find them later in the unlikely case you need them.
They are not esoteric; they are trivial and extremely useful in many cases.
> Less DSA, more testing.
Testing can't cover all the cases by definition, so why not property testing? Why not formal proofs?
Plus, these days it's easy to delegate test-case writing to LLMs, while they literally cannot invent new useful algorithms and data structures.
I've not run into a case where I can apply a Bloom filter. I keep looking, because it always seems like it'd be useful. The problem I have is that a Bloom filter has practically the reverse characteristics of what I want: it gives false positives and true negatives, while I most often want true positives and false negatives.
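For anyone who hasn't poked at one, here's a toy sketch of why it behaves that way (the bit-array size and hash count below are arbitrary; a real implementation would derive them from the expected item count and target error rate):

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: answers "probably present" or "definitely absent".
    Sizes and hash count here are arbitrary, not tuned."""

    def __init__(self, m_bits=1024, k_hashes=3):
        self.m = m_bits
        self.k = k_hashes
        self.bits = 0  # big int used as a bit array

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True  = "probably present" (may be a false positive)
        # False = "definitely absent" (never a false negative)
        return all(self.bits & (1 << pos) for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
assert "alice" in bf    # anything added is always reported present
print("bob" in bf)      # usually False; occasionally a false positive
```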
What's more concerning is "engineers" who are incurious about how lower levels of the stack work, or who aren't interested in learning breadth, depth, or new things.
The main benefit of being familiar with how data structures and algorithms work is that you become familiar with their runtime characteristics and thus can know when to reach for them in a real problem.
The author is correct here. You'll almost never need to implement a B-Tree. What's important is knowing that B-Trees have log n insertion times with good memory locality making them faster than simple binary trees. Knowing how the B-Tree works could help you in tuning it correctly, but otherwise just knowing the insertion/lookup efficiencies is enough.
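A sketch of that "know the characteristics, use a library" point, using the third-party Python sortedcontainers package as a stand-in ordered map (it's built on sorted lists rather than a B-tree, but it fills the same role: ordered keys, cheap range queries that a plain hash map can't do):

```python
# pip install sortedcontainers -- third-party, standing in for an ordered map.
from sortedcontainers import SortedDict

prices = SortedDict()
prices[12.0] = "order-b"
prices[17.5] = "order-a"
prices[19.9] = "order-c"

# Ordered maps give you range queries that a plain dict can't:
for price in prices.irange(12.0, 18.0):
    print(price, prices[price])
```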
If you focus on testing over data structures, you might end up testing something that you didn't need to test because you used the wrong data structures.
IMHO, too often people don't consider big O because it works fine with their 10-row test case... and then it grinds to a halt when given a real problem.
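A minimal sketch of that failure mode in Python (sizes are arbitrary): a membership check that is invisible on a ten-row fixture and dominates the runtime on real data.

```python
import random
import time

rows = [random.randrange(10**6) for _ in range(20_000)]
lookups = rows[:5_000]

# O(rows * lookups): a full list scan per lookup. Harmless on 10 rows.
start = time.perf_counter()
hits = sum(1 for x in lookups if x in rows)
print("list scan:", time.perf_counter() - start)

# O(rows + lookups): same answer, membership against a set.
start = time.perf_counter()
row_set = set(rows)
hits = sum(1 for x in lookups if x in row_set)
print("set lookup:", time.perf_counter() - start)
```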
The article is saying that it's more important to write tests than it is to learn how to write data structures. It specifically says you should learn which data structures to use, but not to focus on knowing how to implement all of them.
It calls out, specifically, that you should know that `sort` exists but you really don't need to know how to implement quicksort vs selection sort.
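In Python terms, roughly this (hypothetical data):

```python
# The useful knowledge is that sorted() exists, is stable, and runs in
# O(n log n) -- not how to hand-roll quicksort.
users = [{"name": "b", "age": 42}, {"name": "a", "age": 7}]
print(sorted(users, key=lambda u: u["age"]))
```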
You don't have to go super deep on all the sort algorithms, sure. That's like saying that learning testing implies writing a mocking library.
Not if the user can, say, farm 1,000,000 different rows 100 times over an hour and a half while gossiping with their office mates. I offer Excel as Exhibit A.
spend plenty of time studying data structures and algorithms as well as computer architecture. these are actually difficult things that take a long time to understand and will have a positive impact on your career.
study the underlying disciplines of your preferred domain.
in general, focus on more fundamental things and limit the amount of time you spend on stupid shit like frameworks, build systems, quirks of an editor or a programming language. all these things will find a way to steal your time _anyway_, and your time is extremely precious.
"testing" is not fundamental. there is no real skill to be learned there, it's just one of those things that will find a way to steal your time anyway so there is no point in focusing actively on it.
put it this way: you will NEVER get the extra time to study fundamental theory. you will ALWAYS be forced to spend time to write tests.
if you somehow find the time, spend it on things that are worth it.
Of course, "testing" is in the eye of the beholder.
Some folks are completely into TDD and insist that you need 100% code-coverage tests before writing one line of application code, and some folks think that unit tests with 100% code coverage mean the system is fully tested.
I've learned that it's a bit more nuanced than this[0].
I think a lot of people dislike testing because a lot of test suites run for hours; in my case it's almost 6 hours from start to finish. However, as a software developer I have accumulated a lot of computers which are still decent but not really usable for current development, and I don't want to throw them out yet - e.g. 8GB of RAM, 256GB SSD, an i5 CPU from 2014. Using that with Visual Studio today would be a punishment, but it is a perfect machine for compiling from the console (dotnet build or msbuild) and running tests via vstest, glued together with a PowerShell script. So this dedicated testing machine runs against my changes overnight, and in the morning I see whether the suite passed; if not, I fix the tests that failed.
This setup may feel clunky, but it allows me to make sweeping changes in a codebase and be confident that if the tests pass, it will very likely work for the customer too. The most obvious example of the tests carrying me was moving from .NET Framework 4.8 to .NET 8: I went from a 90% test failure rate to all tests passing in 3-4 iterations.
Yet much of the safety critical code we rely on for critical infrastructure (nuclear reactors, aircraft, drones, etc) is not tested in-situ. It is tested via simulation, but there's minimal testing in the operating environment which can be quite complex. Instead the code follows carefully chosen design patterns, data structures and algorithms, to ensure that the code is hazard-free, fault-tolerant and capable of graceful degradation.
So, testing has its place, but testing is really no better than simulation. And in simulation, the outputs are only as good as the inputs. It cannot guarantee code safety and is not a substitute for good software design (read: structures and algorithms).
Having said that, fuzzing is a great way to find bugs in your code, and highly recommended for any software that exposes an API to other systems.
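For illustration, the dumbest possible version of the idea in Python (parse_version is a made-up function under test; real fuzzers like AFL or libFuzzer are coverage-guided rather than purely random):

```python
import random
import string

def parse_version(text):
    """Made-up function under test."""
    major, minor, patch = text.split(".")
    return int(major), int(minor), int(patch)

# Throw random garbage at the API and flag anything that isn't the documented error.
for _ in range(10_000):
    junk = "".join(random.choices(string.printable, k=random.randrange(0, 20)))
    try:
        parse_version(junk)
    except ValueError:
        pass  # the expected failure mode for bad input
    except Exception as exc:
        print(f"unexpected {type(exc).__name__} on input {junk!r}")
```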
Tests give the freedom to refactor which results in better code.
>So, testing has its place, but testing is really no better than simulation
Testing IS simulation and simulation IS testing.
>And in simulation, the outputs are only as good as the inputs. It cannot guarantee code safety
Only juniors think that you can get guarantees of code safety. Seniors look for ways to de-risk code, knowing that you're always trending towards a minima.
One of the key skills in testing is defining good, realistic inputs.
1 - If you work on large scale software systems, especially infrastructure software of most types then you need to know and understand DSA and feel it in your bones.
2 - Most people work on CRUD apps or similar and don't really need to know this stuff. Many people in this camp don't realize that the people in (1) really do need to know it.
What someone says on this topic says more about what things they have worked on in their life than anything else.
This is the crux of the debate. If you work on CRUD apps, you basically need to know hash maps and lists, and getting better at SQL and writing clean code matters more. But there are many areas where writing the right code vs the wrong code really matters. I was writing something the other day where one small in-loop operation was the difference between a method running in milliseconds and minutes. Or choosing the right data structure can simplify a feature to 1/10th the code and make it run 100x better than the wrong one.
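A sketch of the kind of thing being described, with made-up data: the same join written with a linear scan inside the loop versus a dict index built once up front.

```python
# Made-up data: orders referencing customers by id.
customers = [{"id": i, "name": f"c{i}"} for i in range(20_000)]
orders = [{"customer_id": i % 20_000, "total": i} for i in range(50_000)]

# Wrong shape: a linear scan inside the loop, O(orders * customers) -- minutes at scale.
# enriched = [(next(c for c in customers if c["id"] == o["customer_id"])["name"], o["total"])
#             for o in orders]

# Right shape: build a dict index once, O(orders + customers) -- milliseconds.
by_id = {c["id"]: c["name"] for c in customers}
enriched = [(by_id[o["customer_id"]], o["total"]) for o in orders]
print(len(enriched))
```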
It's never just "the other day"; it's 10x a day, every day.
So, OP is still correct.
> Here is what I think in-the-trenches software engineers should know about data structures and algorithms: [...]
> If you want to prepare yourself for a career, and also stand out in job interviews, learn how to write tests: [...]
I feel like I keep writing these little context comments to fix the problem of clickbait titles or those lacking context. It helps to frame the rest of the comments which might be coming at it from different angles.
There is no dichotomy here: you need to know testing as well as data structures and algorithms.
However, the thrust of the article itself I largely agree with -- that it's less important to have such in-depth knowledge about data structures and algorithms that you can implement them from scratch and from memory. Nearly any modern language you'll program in includes a standard library robust enough that you'll almost never have to implement many of the most well-known data structures and algorithms yourself. The caveat: you still need to know enough about how they work to be capable of selecting which to use.
In the off-chance you do have to implement something yourself, there's no shortage of reference material available.
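As a small example of the "know enough to select" caveat, in Python:

```python
from collections import deque

# A FIFO queue: list.pop(0) shifts every remaining element (O(n) per pop),
# while deque.popleft() is O(1). Knowing that is the whole trick.
queue = deque()
for job in range(5):
    queue.append(job)
while queue:
    print(queue.popleft())
```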
When I went to college in the late 1990s, we were right on the verge of a major transition from DSAs being something every programmer would implement themselves to something you just pick up out of your libraries. So it makes sense that we had some pretty heavy-duty labs on implementing very basic data structures.
That said, I escaped into the dynamic-language world for the next 15 years or so, so I almost never actually did anything of significance with this. And now, even in the static world, I almost never do anything with this stuff directly, because it's all libraries for that now too. Even a lot of modern data structures work is just using associative maps and arrays together properly.
So I would agree that we could A: spend somewhat less time on this in the curriculum, and B: tune it to be more about how to use arrays and maps and less about how to bit-bang efficient hash tables (a small illustration of what I mean follows below).
People always get frosty about trying to remove or even "tune down" the amount of time spent in a curriculum, but consider the number of things you want to add and consider that curricula are essentially zero-sum games; you can't add to them without removing something. If we phrase this in terms of "what else could we be teaching other than a fifth week on pointer-based data structures" I imagine it'll sound less horrifying to tweak this.
Not that it'll be tweaked, of course. But it'd be nice to imagine that I could live in a world where we could have reasonable discussions about what should be in them.
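On the maps-and-arrays point above, a tiny Python illustration:

```python
from collections import defaultdict

# Group records by a key -- no bespoke data structure required.
orders = [("alice", 30), ("bob", 15), ("alice", 5)]
by_user = defaultdict(list)
for user, amount in orders:
    by_user[user].append(amount)
print({user: sum(amounts) for user, amounts in by_user.items()})
```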
Not everyone works on web sites using well-optimized libraries; some people need to know about O(N) and O(N log N) vs O(N^2).
> We love those engineers: they write libraries we can use off the shelf so we don’t have to implement them ourselves.
The world needs to love "infrastructure developers" more. Right now it seems only the killer-app-writing crowd is valued. Nobody really thinks about the work that goes into programming languages, libraries, and tools. It's invisible work: taken for granted, often open source, and not infrequently unpaid.
In-the-trenches experience (especially "good" or "doing it right" experience) can be hard to come by, so why not stand on the shoulders of giants when learning it the first time?
To boil down the tests I like to see: structure them with "Given/when/then" statements [1]. You don't need a framework for this; just make method calls with whatever unit test framework you are using. Keep the methods small, and don't do a whole lot of "then"s - split that into multiple tests. Structure your code so that you aren't testing too deep. Ideally, you don't need to stand up your entire environment to run a test. But do write some of those tests too; they are important for catching issues that can hide between unit tests. (A rough sketch of the shape is below.)
[1] https://cucumber.io/docs/bdd/
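A rough sketch of that Given/when/then shape in Python (the Receipt/checkout code is a made-up stand-in so the test runs on its own):

```python
from dataclasses import dataclass

# Made-up stand-in domain code, just so the test below is self-contained.
@dataclass
class Receipt:
    total: float
    discount: float

def checkout(items):
    subtotal = sum(price for _, price in items)
    discount = round(subtotal * 0.10, 2) if subtotal >= 100 else 0.0
    return Receipt(total=subtotal - discount, discount=discount)

def test_discount_applied_to_large_orders():
    # Given: a cart over the discount threshold
    items = [("widget", 120.00)]
    # When: the order is checked out
    receipt = checkout(items)
    # Then: a 10% discount shows up on the receipt
    assert receipt.discount == 12.00
```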
Property-Based Testing with PropEr, Erlang, and Elixir by Fred Hebert. While a book about a particular tool (PropEr) and pair of languages (Erlang and Elixir), it's a solid introduction to property-based testing. The techniques described transfer well to other PBT systems and other languages (a rough Python sketch of the idea is at the end of this comment).
Test-Driven Development by Kent Beck.
https://www.fuzzingbook.org/ by Zeller et al. and https://www.debuggingbook.org/ by Andreas Zeller. The latter is technically about debugging, but it has some specific techniques that you can incorporate into how you test software. Like Delta Debugging, also described in a paper by Zeller et al. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=988....
I'm not sure of other books to recommend; the rest of what I know is from learning on the job or from studying specific tooling and techniques.
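And since not everyone wants to pick up Erlang just to see the idea, here is a minimal property-based test in Python using the third-party hypothesis library, checking properties of a sort rather than hand-picked examples:

```python
from collections import Counter
from hypothesis import given, strategies as st  # pip install hypothesis

def my_sort(xs):
    # Stand-in for whatever implementation you actually want to test.
    return sorted(xs)

@given(st.lists(st.integers()))
def test_sort_properties(xs):
    out = my_sort(xs)
    assert all(a <= b for a, b in zip(out, out[1:]))  # output is ordered
    assert Counter(out) == Counter(xs)                # same elements, same counts
```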