
Software Testing Beyond Unit Testing – the Bigger Picture

Software testing beyond unit testing. Image source: http://maxpixel.freegreatpicture.com/Automation-Software-Bugs-Search-Service-Testing-It-762486

Each time I write about unit testing, people get angry. So let me explain why I rant about unit tests every once in a while.

First of all, there’s nothing wrong with unit testing. That’s not the point. Usually, I start ranting when I notice people believe in unit tests religiously. Many good developers seem to believe that unit testing is the only test tool they’ll ever need. Both agile programming and continuous delivery make us believe such nonsense.

There’s so much more to testing than just unit tests. This post gives you a short overview of different test techniques. It’s not an in-depth explanation of these approaches to testing. Instead, it focuses on telling you why they are important. You’ll see there are many important tests that can’t be automated at all. You’ll always need human beings to test your software.

Testing is an art of its own

In my eyes, testing is one of the most important jobs in the software industry. At the same time, it’s underrated. Usually, testers are paid less than programmers. Adding insult to injury, they suffer from a bad reputation. Programmers create things. Testers destroy them. Well, they don’t, but testing feels like a destructive job. The more faults a tester finds, the better. I’ll never forget my favorite tester approaching me shyly, embarrassed by having to tell me I’ve done it wrong again. In reality, I was happy she found the bug before the customer did.

To cut a long story short, I believe testing is every bit as valuable and as important as programming. A good tester intuitively knows where to look. They know the weak spots. Maybe they even test differently if they know who wrote the code. If I had a say in it, I’d pay good testers at least as well as good programmers.

What is testing? What can be automated?

In other words, I’m pretty much convinced there’s nothing that beats human intuition. Automated tests are valuable, no doubt about that. But they are only a part of the story. They cover errors we’ve encountered or imagined in the past. Automated tests are not creative. They can’t think about new error scenarios. That’s the domain of human beings.

So let’s form a test team, and forget about unit tests!

When I joined the first company I worked for, the ideas of unit testing and test driven design had yet to be invented. Predecessors of these ideas go back to the 1960s, and people like Kent Beck were already using a test driven approach, but at our company, nobody had heard about it yet. Instead, we focused on human testers. On average, there was one tester for each developer. We delivered a new version of our software every three months. A month before publishing the release, we declared a code freeze and our testers started doing their magic.

In the early days, they simply followed their intuition. They’d read the requirements, think of corresponding test scenarios, and walk through them.

In general, this approach worked very well. As a side effect, our testers became very familiar with the software, so they could do the first-level support. In the remaining time, when there was no angry customer on the phone and no test to perform, they wrote the user manual. From my perspective, the approach worked well enough.

Code regressions

However, we fought with bugs we’d already solved. For some reason, they popped up again in each new version. At other times, new features broke code we’d already tested and shipped. So we learned the hard way how important regression tests are.
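
A regression test pins down a bug that has already been fixed, so it can't silently return in a later version. Here is a minimal sketch in Python. Both the helper `parse_amount` and the bug it once had (crashing on a thousands separator) are invented for illustration; they are not taken from the software described above.

```python
def parse_amount(text: str) -> float:
    """Parse a user-entered amount such as '1,250.50' (hypothetical helper)."""
    # The imagined original bug: the thousands separator was not removed,
    # so float() raised a ValueError for input like '1,250.50'.
    return float(text.replace(",", "").strip())


def test_regression_thousands_separator():
    # Pins the fixed bug: this input used to crash.
    assert parse_amount("1,250.50") == 1250.50


def test_plain_number_still_works():
    # Guards the behavior that already worked before the fix.
    assert parse_amount(" 42 ") == 42.0
```

A test runner such as pytest would pick up both `test_` functions. Once such a test exists, the bug can only come back if somebody deliberately deletes the test.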

After a while, the test department team lead came up with test plans and checklists. Testing became much more professional and boring. Most of the time, testing wasn’t a creative job. It was simply following rules and filling in checklists.

Test automation

In other words, we’d entered the age of test automation. Our technology wasn’t ready for leaving the chore of running the automated tests to the computer, so it was still performed by human beings. But we started investigating the topic. One or two years later, Kent Beck and Martin Fowler published their book on extreme programming, and we knew what we’d missed. It’s obvious that automated tests work better if they’re not an afterthought. You have to embrace TDD to be successful!

Unfortunately, the idea of automated unit testing has a lot of consequences. You have to write your code with testing in mind. The client-server applications we wrote in the 90s couldn’t be tested automatically. Our team was very modern, with a clear separation between UI, business layer, and persistence layer. Even so, we couldn’t run the business layer without the rest of the program. At the beginning of the millennium, we started writing web applications. This kind of application separated the three layers much more rigorously, so unit testing became an option.
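
To illustrate what "writing code with testing in mind" can look like, here is a small Python sketch. It assumes a hypothetical business rule (`may_order`) that needs a credit limit from the persistence layer. All names are made up for this example. The point is only that the business layer receives its data source as a parameter, so a test can run it without a database or a UI.

```python
from typing import Protocol


class CustomerStore(Protocol):
    """What the business layer needs from persistence (hypothetical interface)."""

    def credit_limit(self, customer_id: str) -> float: ...


class InMemoryStore:
    """Test double standing in for the real database."""

    def __init__(self, limits: dict[str, float]) -> None:
        self._limits = limits

    def credit_limit(self, customer_id: str) -> float:
        return self._limits[customer_id]


def may_order(store: CustomerStore, customer_id: str, order_total: float) -> bool:
    """Business rule: an order is allowed up to the customer's credit limit."""
    return order_total <= store.credit_limit(customer_id)


def test_order_within_limit():
    # The business layer runs on its own; no database and no UI are involved.
    store = InMemoryStore({"c1": 500.0})
    assert may_order(store, "c1", 499.99)
    assert not may_order(store, "c1", 500.01)
```

If the business rule opened the database connection itself, the same test would need the whole application up and running, which is the situation described above.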

Exploratory testing

Even so, each time our CEO managed to find five minutes to look at our program, he’d find half a dozen errors. That’s what we call exploratory testing. Our CEO wasn’t restricted by our checklist. He’d simply look at the program, try to perform a task, and start sobbing. In many cases, he’d find a real bug. But more often than not, he’d find a bug that wasn’t a code quality problem. We’d implemented a customer’s requirement without thinking twice. Sometimes this requirement didn’t match the requirements of another customer using the same software. Or the user experience was bad. Often we had implemented the feature in such a convoluted way that you needed special training to do the job.

This is why you need human beings for testing. Even if you’re writing excellent unit tests, they’ll never ponder whether the algorithm under test makes sense. There are different definitions of “correct”. Your unit test checks whether the algorithm does what you wanted it to do. It can’t test whether your customer wants it to do the same thing. And by no means can it respond to other changes, such as new laws. My original algorithm calculating the annual percentage rate of a loan is still correct, but nowadays German law requires banks to calculate it using a different algorithm. By this definition, my algorithm is wrong.
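
A small Python sketch makes the difference between the two kinds of “correct” tangible. The function below converts a nominal interest rate into an effective annual rate with periodic compounding. It is a textbook formula chosen for illustration only: it is neither my original algorithm nor the formula any law prescribes, and the name `effective_annual_rate` is made up.

```python
import math


def effective_annual_rate(nominal_rate: float, periods_per_year: int = 12) -> float:
    """Effective annual rate for a nominal rate compounded periodically."""
    return (1 + nominal_rate / periods_per_year) ** periods_per_year - 1


def test_effective_annual_rate():
    # Verifies the code against the programmer's intention, nothing more.
    # 12 % nominal, compounded monthly, is about 12.68 % effective.
    assert math.isclose(effective_annual_rate(0.12), 1.01 ** 12 - 1)
    assert math.isclose(effective_annual_rate(0.0), 0.0, abs_tol=1e-12)
```

This test stays green forever, no matter what the customer or the legislator decides tomorrow. It only proves that the code still does what the programmer intended on the day the test was written.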

There’s so much more that can be put under test!

It’s high time to think about what testing is and what can be tested. Some time ago, Itamar Turner-Trauring showed this diagram in his talk about the big picture of software testing:

Now you know why I tend to rant about unit testing. Unit tests live in the lower right quadrant. They prevent change. But I’m a consultant dealing a lot with people. So my favorite residence is the upper left quadrant. This is the source of change. In my experience, requirements are not constant. They change all the time. In many cases, they change even during the development of the program.

When this happens, a common reaction of development teams is to get angry. “The customer doesn’t know what they want”, they grumble. But that’s rubbish. It’s perfectly OK to learn while the program is being developed. More often than not, the first approach leaves a lot of room for improvement. So let’s embrace change. Unit tests prevent that by definition.

So test driven development doesn’t work for me. I’m not a backend programmer. As a rule of thumb, front-ends change at a faster pace than the backend. You may stick to a database for twenty years and longer. The business rules defining the backend algorithms tend to be valid for decades, too. But front-ends are the visible part of the application. Users work with them on a daily basis. So they are either subject to fashion (if your application runs on the public internet) or they are the best leverage to improve the productivity of the company. Plus, everybody feels competent about user interfaces and user experience, so you’ll get change requests all the time.

Dimensions of testing

Looking at Itamar’s sketch, you’ll notice there are two dimensions of testing. One dimension is human testing vs. automated testing. The other dimension is about learning.

Most people assume that testing is the act of affirming that the program complies with a set of rules. The sheer number of tools and techniques developed for the lower quadrants shows how important this class of tests is. There are code reviews, linters, compiler checks, and static code analysis ensuring good code quality. There are penetration tests to ensure security from hackers. We’ve already talked about unit tests. Tools like Selenium even allow you to test the user interface.

The upper half of the diagram is more exciting. These are the tests you can’t implement using a test-first approach. On the right-hand side are the tests examining the behavior of the program. How does it behave if it’s put under stress? This is done by soak tests and stress tests. Error logs are another important source of information. Don’t forget to check the logs regularly. Depending on the corporate culture, you’ll find many errors there long before the first user takes the trouble to report them.
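
Checking the logs regularly doesn’t have to be a manual chore. The following Python sketch counts error messages so that the most frequent ones surface first. The log line format (date, time, level, message) and the sample lines are assumptions made up for this example. A real log will look different, so the regular expression has to be adapted.

```python
import re
from collections import Counter

# Assumed line format: "<date> <time> <LEVEL> <message>"
LINE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def count_errors(lines):
    """Count ERROR lines per message, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[match.group("message")] += 1
    return counts.most_common()


# Made-up sample lines, for illustration only.
sample = [
    "2017-05-02 10:15:01 INFO user logged in",
    "2017-05-02 10:15:07 ERROR could not load customer 4711",
    "2017-05-02 10:16:12 ERROR could not load customer 4711",
    "2017-05-02 10:17:30 ERROR session expired during save",
]
print(count_errors(sample))
# prints [('could not load customer 4711', 2), ('session expired during save', 1)]
```

In practice you would pass an open log file instead of the `sample` list. Deciding which of the reported messages is a real problem remains a job for a human being.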

Users matter!

Even a tool such as Piwik may be used for tests. Of course, Piwik is a web analytics tool, so it won’t detect errors. But it shows what the users do with the program. How often do they use a certain program feature? Do they use it at all? Can you safely remove the feature from the application?

Now we’ve reached the upper left quadrant. In a way, that’s where we’d started. If you’re following an agile approach, you’ve defined acceptance criteria before writing the program. Now it’s time to check the acceptance criteria. While this is the realm of unit tests, the more important question is whether the user accepts the program. Mind you, that’s not necessarily the same as guaranteeing that the program fulfills the acceptance criteria defined by the same user.

Other important tests deal with the usability of the application. A good way to do this is to simply watch users working with the program. If you’re writing a web shop, you might try an A/B test. Only half of the customers see the new version of the program. Which version of the program generates more revenue?
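
The mechanics of such an A/B test are simple enough to sketch in a few lines of Python. The sketch assumes that every visitor has a stable user id and that revenue per user is available as a dictionary. The function names are hypothetical and not taken from any A/B testing library. Hashing the user id ensures that a returning customer always sees the same version.

```python
import hashlib


def variant_for(user_id: str) -> str:
    """Assign a user to 'A' or 'B' deterministically, roughly half each."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def revenue_per_visitor(orders: dict[str, float], visitors: list[str]) -> dict[str, float]:
    """Average revenue per visitor for each variant."""
    totals = {"A": 0.0, "B": 0.0}
    counts = {"A": 0, "B": 0}
    for user_id in visitors:
        variant = variant_for(user_id)
        counts[variant] += 1
        # Visitors who bought nothing count with zero revenue.
        totals[variant] += orders.get(user_id, 0.0)
    return {v: totals[v] / counts[v] if counts[v] else 0.0 for v in totals}
```

Comparing the two averages is only the first step. Before declaring a winner, you’d also want to check whether the difference is statistically significant, which this sketch deliberately leaves out.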

Starting over

You’ve probably noticed the arrows. Testing is not something you do once and for all. If you take the user feedback loop seriously, every new version of your program generates new insight on how to do things better. In reality, budget constraints usually terminate the loop at some point. However, a wiser man than me once said: if there’s software without bugs, it’s probably so simple it wasn’t worth writing in the first place.

Test techniques explained in depth

At this point, it’s high time to explain the test techniques: load tests, MVP tests, guerilla tests, compiler checks, linters, and – yes, that too! – unit tests. This is the topic of the next article in this series.

Wrapping it up

Automated tests are valuable, no doubt about that. But they are only a part of the story. They cover errors we’ve encountered or imagined in the past. Automated tests are not creative. They can’t think about new error scenarios. Quite the contrary: they prevent change. However, modern software development usually is part of the digitization story, which is all about embracing change. So we need the cleverness and intuition of human beings.

It takes all kinds of tests to write good software.

Dig deeper

Wikipedia on Software testing
Original post of Itamar Turner-Trauring (now replaced by a link to the talk)
Itamar’s talk about the big picture of software testing