In our previous article, we have discussed Why a Software Fails?

With all of the examples we have seen so far in our previous posts, what common themes can we identify? There may be several themes that we could draw out of the examples, but one theme is clear: either insufficient testing or the wrong type of testing was done. More and better software testing seems a reasonable aim, but that aim is not quite as simple to achieve as we might expect.

Exhaustive Testing of Complex Systems is Not Possible

With the Ariane 5 rocket launch, a particular software module was reused from the Ariane 4 program. Only part of the functionality of the module was required, but the module was incorporated without changes. The unused functionality of the reused module indirectly caused a directional nozzle to move in an uncontrolled way because certain variables were incorrectly updated. In an Ariane 4 rocket the module would have performed as required, but in the Ariane 5 environment this malfunction in an area of software not even in use caused a catastrophic failure. The failure is well documented, but what is clear is that conditions were encountered in the first few seconds after the launch that were not expected, and therefore had not been tested.

If every possible test had been run, the problem would have been detected. However, if every test had been run, the testing would still be running now, and the ill-fated launch would never have taken place; this illustrates one of the general principles of software testing, which are explained below. With large and complex systems it will never be possible to test everything exhaustively; in fact it is impossible to test even moderately complex systems exhaustively.

In the Ariane 5 case it would be unhelpful to say that not enough testing was done; for this particular project, and for many others of similar complexity, that would certainly always be the case. In the Ariane 5 case the problem was that the right sort of testing was not done because the problem had not been detected.

Testing and Risk

Risk is inherent in all software development. The system may not work or the project to build it may not be completed on time, for example. These uncertainties become more significant as the system complexity and the implications of failure increase. Intuitively, we would expect to test an automatic flight control system more than we would test a video game system. Why? Because the risk is greater. There is a greater probability of failure in the more complex system and the impact of failure is also greater. What we test, and how much we test it, must be related in some way to the risk. Greater risk implies more and better testing.

Testing and Quality

Quality is notoriously hard to define. If a system meets its users' requirements that constitutes a good starting point. In the examples we looked at earlier the online tax returns system had an obvious functional weakness in allowing one user to view another user's details. While the user community for such a system is potentially large and disparate, it is hard to imagine any user that would find that situation anything other than unacceptable. In the top 10 criminals example the problem was slightly different. There was no failure of functionality in this case; the system was simply swamped by requests for access. This is an example of a non-functional failure, in that the system was not able to deliver its services to its users because it was not designed to handle the peak load that materialized after radio and TV coverage.

Of course the software development process, like any other, must balance competing demands for resources. If we need to deliver a system faster (i.e. in less time), for example, it will usually cost more. The items at the corners (or vertices) of the triangle of resources in the below figure are time, money and quality. These three affect one another, and also influence the features that are or are not included in the delivered software.

Resources Triangle

One role for testing is to ensure that key functional and non-functional requirements are examined before the system enters service and any defects are reported to the development team for rectification. Testing cannot directly remove defects, nor can it directly enhance quality. By reporting defects it makes their removal possible and so contributes to the enhanced quality of the system. In addition, the systematic coverage of a software product in testing allows at least some aspects of the quality of the software to be measured. Testing is one component in the overall quality assurance activity that seeks to ensure that systems enter service without defects that can lead to serious failures.

Deciding When ‘Enough is Enough’

How much testing is enough, and how do we decide when to stop testing?

We have so far decided that we cannot test everything, even if we would wish to. We also know that every system is subject to risk of one kind or another and that there is a level of quality that is acceptable for a given system. These are the factors we will use to decide how much testing to do.

The most important aspect of achieving an acceptable result from a finite and limited amount of testing is prioritization. Do the most important tests first so that at any time you can be certain that the tests that have been done are more important than the ones still to be done. Even if the testing activity is cut in half it will still be true that the most important testing has been done. The most important tests will be those that test the most important aspects of the system: they will test the most important functions as defined by the users or sponsors of the system, and the most important non-functional behavior, and they will address the most significant risks.
The next most important aspect is setting criteria that will give you an objective test of whether it is safe to stop testing, so that time and all the other pressures do not confuse the outcome. These criteria, usually known as completion criteria, set the standards for the testing activity by defining areas such as how much of the software is to be tested.

Priorities and completion criteria provide a basis for planning (which will be covered in above figure applies. In the end, the desired level of quality and risk may have to be compromised, but our approach ensures that we can still determine how much testing is required to achieve the agreed levels and we can still be certain that any reduction in the time or effort available for testing will not affect the balance—the most important tests will still be those that have already been done whenever we stop.

To check your level of understanding, I would like to ask few questions now.

1. Describe the interaction between errors, defects and failures.
2. Software failures can cause losses. Give three consequences of software failures.
3. What are the vertices of the ‘triangle of resources’?

You may follow the complete series of Fundamentals of Testing articles here:

Why a Software Fails?
Keeping Software Test Under Control
What Testing is and What Testing Does
Software Testing Principles
Fundamental Software Test Processes
Psychology of Software Testing
Testers Code of Ethics
ISTQB Sample Questions