In our previous article, we discussed what testing is and what testing does.
Stage at which the error is found, with its comparative cost:
- Requirements: $1
- Coding: $10
- Program testing: $100
- System testing: $1,000
- User acceptance testing: $10,000
- Live running: $100,000
Testing is a very complex activity, and the software problems described earlier highlight that it can be difficult to do well. We now describe some general testing principles, developed over the years from a variety of sources, that help testers. They are not all obvious, but their purpose is to guide testers and to prevent the types of problems described previously.
Testing Shows the Presence of Bugs
Running a test through a software system can only show that one or more defects exist. Testing cannot show that the software is error free. Consider whether the top 10 wanted criminals website was error free. There were no functional defects, yet the website failed. In this case the problem was non-functional and the absence of defects was not adequate as a criterion for release of the website into operation.
In our coming posts, we shall discuss retesting, when a previously failed test is rerun, to show that under the same conditions, the reported problem no longer exists. In this type of situation, testing can show that one particular problem no longer exists.
Although there may be other objectives, usually the main purpose of testing is to find defects. Therefore tests should be designed to find as many defects as possible.
Exhaustive Testing is Impossible
If testing finds problems, then surely you would expect more testing to find additional problems, until eventually we would have found them all. We discussed exhaustive testing earlier when looking at the Ariane 5 rocket launch, and concluded that for large complex systems, exhaustive testing is not possible. However, could it be possible to test small pieces of software exhaustively, and only incorporate exhaustively tested code into large systems?
Exhaustive testing – a test approach in which all possible data combinations are used. This includes implicit data combinations present in the state of the software/data at the start of testing.
Consider a small piece of software where one can enter a password, specified to contain up to three characters, with no consecutive repeating entries. Using only western alphabetic capital letters and completing all three characters, there are 26 × 26 × 26 input permutations (not all of which will be valid). However, with a standard keyboard, there are not 26 × 26 × 26 permutations, but a much higher number: 256 × 256 × 256, or 2^24 (16,777,216). Even then, the number of possibilities is higher. What happens if three characters are entered, and the ‘delete last character’ right arrow key removes the last two? Are special key combinations accepted, or do they cause system actions (Ctrl + P, for example)? What about entering a character, and waiting 20 minutes before entering the other two characters? It may be the same combination of keystrokes, but the circumstances are different. We can also include the situation where the 20-minute break occurs over the change-of-day interval. It is not possible to say whether there are any defects until all possible input combinations have been tried.
Even in this small example, there are many, many possible data combinations to attempt.
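The counts in the password example can be checked with a few lines of Python. This is only a sketch: it assumes that "no consecutive repeating entries" means no character may be immediately followed by the same character, and the helper name has_consecutive_repeat is made up for illustration.

```python
from itertools import product
import string

LETTERS = string.ascii_uppercase  # the 26 western capital letters


def has_consecutive_repeat(candidate):
    """Return True if any character is immediately followed by itself.

    Assumption: this is what 'no consecutive repeating entries' means.
    """
    return any(a == b for a, b in zip(candidate, candidate[1:]))


# Every complete three-character entry using capital letters only.
all_entries = ["".join(chars) for chars in product(LETTERS, repeat=3)]
valid_entries = [e for e in all_entries if not has_consecutive_repeat(e)]

letters_only_total = len(all_entries)    # 26 * 26 * 26
letters_only_valid = len(valid_entries)  # 26 * 25 * 25
byte_values_total = 256 ** 3             # 256 * 256 * 256 = 2 ** 24

print(letters_only_total, letters_only_valid, byte_values_total)
```

Even this count covers only the characters typed. It says nothing about deletions, special key combinations or timing, which is why the real number of cases is higher still.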
Unless the application under test (AUT) has an extremely simple logical structure and limited input, it is not possible to test all possible combinations of data input and circumstances. For this reason, risk and priorities are used to concentrate on the most important aspects to test. Both ‘risk’ and ‘priorities’ are covered later in more detail. Their use is important to ensure that the most important parts are tested.
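As a rough illustration of using risk to decide what to test first, the sketch below ranks areas of an application by a simple score. The scoring rule (likelihood of failure multiplied by impact, each on a 1 to 5 scale) is an assumption made here for illustration, not something defined in this article, and the area names and scores are invented.

```python
from dataclasses import dataclass


@dataclass
class FeatureArea:
    name: str
    likelihood: int  # assumed scale: 1 (unlikely to fail) to 5 (very likely)
    impact: int      # assumed scale: 1 (minor loss) to 5 (severe loss)

    @property
    def risk_score(self):
        # Assumed heuristic: risk = likelihood x impact.
        return self.likelihood * self.impact


def prioritise(areas):
    """Order areas so that the highest-risk ones are tested first."""
    return sorted(areas, key=lambda area: area.risk_score, reverse=True)


# Invented example data.
areas = [
    FeatureArea("password entry", likelihood=3, impact=5),
    FeatureArea("help page layout", likelihood=2, impact=1),
    FeatureArea("report export", likelihood=4, impact=2),
]

ordered = [area.name for area in prioritise(areas)]
print(ordered)
```

When time runs short, the areas at the bottom of such a list are the ones to drop first.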
Early Testing
When discussing why software fails, we briefly mentioned the idea of early testing. This principle is important because, as a proposed deployment date approaches, time pressure can increase dramatically. There is a real danger that testing will be squeezed, and this is bad news if the only testing we are doing is after all the development has been completed. The earlier the testing activity is started, the longer the elapsed time available. Testers do not have to wait until software is available to test.
Work-products are created throughout the software development life cycle (SDLC). As soon as these are ready, we can test them. Requirement documents are the basis for acceptance testing, so the creation of acceptance tests can begin as soon as requirement documents are available. Creating these tests forces a close reading of the requirements. Are individual requirements testable? Can we find ambiguous or missing requirements?
Many problems in software systems can be traced back to missing or incorrect requirements. The use of reviews can break the ‘error—defect—failure’ cycle. In early testing we are trying to find errors and defects before they are passed to the next stage of the development process. Early testing techniques are attempting to show that what is produced as a system specification, for example, accurately reflects that which is in the requirement documents. Ed Kit (Kit, 1995) discusses identifying and eliminating errors at the part of the SDLC in which they are introduced. If an error/defect is introduced in the coding activity, it is preferable to detect and correct it at this stage. If a problem is not corrected at the stage in which it is introduced, this leads to what Kit calls ‘errors of migration’. The result is rework. We need to rework not just the part where the mistake was made, but each subsequent part where the error has been replicated. A defect found at acceptance testing where the original mistake was in the requirements will require several work-products to be reworked, and subsequently to be retested.
Comparative cost to correct errors
This is known as the cost escalation model.
What is undoubtedly true is that the graph of the relative cost of early and late identification/correction of defects rises very steeply, as shown in the figure below.
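To make the steepness concrete, here is a small sketch that works only with the comparative cost figures quoted in this article. The function name relative_cost is made up for illustration, and the figures are comparative ratios, not real project costs.

```python
# Comparative cost figures quoted in this article, keyed by the stage
# at which the error is found.
COST_BY_STAGE = {
    "Requirements": 1,
    "Coding": 10,
    "Program testing": 100,
    "System testing": 1_000,
    "User acceptance testing": 10_000,
    "Live running": 100_000,
}


def relative_cost(found_stage, baseline_stage="Requirements"):
    """How many times more a fix costs at found_stage than at baseline_stage."""
    return COST_BY_STAGE[found_stage] / COST_BY_STAGE[baseline_stage]


# Growth factor from each stage to the next one.
stages = list(COST_BY_STAGE)
step_factors = [
    relative_cost(later, earlier) for earlier, later in zip(stages, stages[1:])
]

print(step_factors)
print(relative_cost("User acceptance testing", "Coding"))
```

With these figures every stage of delay multiplies the cost by ten. A coding error that slips through to user acceptance testing therefore costs about a thousand times more to correct than one caught during coding.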
Defect Clustering
Problems do occur in software! It is a fact. Once testing has identified (most of) the defects in a particular application, it is at first surprising that the spread of defects is not uniform. In a large application, it is often a small number of modules that exhibit the majority of the problems. This can be for a variety of reasons, some of which are:
- System complexity.
- Volatile code.
- The effects of change upon change.
- Development staff experience.
- Development staff inexperience.
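One way to see clustering in your own project is to count the reported defects per module. The sketch below does this for an invented defect log. The module names and counts are made up purely for illustration, and modules_covering is a hypothetical helper, not part of any tool.

```python
from collections import Counter


def modules_covering(defect_modules, share=0.5):
    """Return the most defect-prone modules that together account for
    at least `share` of all reported defects."""
    counts = Counter(defect_modules)
    total = sum(counts.values())
    covered = 0
    chosen = []
    for module, count in counts.most_common():
        if covered >= share * total:
            break  # the modules chosen so far already reach the target share
        chosen.append(module)
        covered += count
    return chosen


# Invented defect log: one entry per reported defect, naming its module.
defect_log = ["billing"] * 12 + ["auth"] * 5 + ["ui"] * 2 + ["reports"] * 1

print(modules_covering(defect_log, share=0.5))
print(modules_covering(defect_log, share=0.8))
```

In this invented log a single module accounts for more than half of the defects, and two of the four modules account for over 80 per cent. Modules that show up in such a list are good candidates for extra testing effort.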
The Pesticide Paradox
Running the same set of tests continually will not continue to find new defects. Developers will soon know that the test team always tests the boundaries of conditions, for example, so they will test these conditions before the software is delivered. This does not make defects elsewhere in the code less likely, so continuing to use the same test set will result in decreasing effectiveness of the tests. Using other techniques will find different defects.
For example, a small change to software could be specifically tested and an additional set of tests performed, aimed at showing that no additional problems have been introduced (this is known as regression testing). However, the software may fail in production because the regression tests are no longer relevant to the requirements of the system or the test objectives. Any regression test set needs to change to reflect business needs, and what are now seen as the most important risks.
Testing is Context Dependent
Different testing is necessary in different circumstances. A website where information can merely be viewed will be tested in a different way to an e-commerce site, where goods can be bought using credit/debit cards. We need to test an air traffic control system with more rigour than an application for calculating the length of a mortgage.
Risk can be a large factor in determining the type of testing that is needed. The higher the possibility of losses, the more we need to invest in testing the software before it is implemented.
For an e-commerce site, we should concentrate on security aspects. Is it possible to bypass the use of passwords? Can ‘payment’ be made with an invalid credit card, by entering excessive data into the card number? Security testing is an example of a specialist area, not appropriate for all applications. Such types of testing may require specialist staff and software tools.
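The card number questions can be turned into simple negative tests. The sketch below is not a real payment check: is_plausible_card_number is a hypothetical validator written for this example, and the accepted length range of 12 to 19 digits is an assumption. It uses the well-known Luhn checksum and rejects over-long or non-numeric input before doing any arithmetic.

```python
def is_plausible_card_number(text, min_len=12, max_len=19):
    """Hypothetical validator: ASCII digits only, bounded length, Luhn checksum.

    The length bounds are assumptions made for this example.
    """
    if not isinstance(text, str) or not (text.isascii() and text.isdigit()):
        return False
    if not (min_len <= len(text) <= max_len):
        return False  # rejects excessive data before any further processing
    total = 0
    for index, char in enumerate(reversed(text)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# Negative tests: each of these inputs should be rejected.
hostile_inputs = [
    "4111111111111112",               # checksum is wrong
    "4" * 10_000,                     # excessive data
    "4111111111111111" + "0" * 100,   # valid number padded with extra digits
    "4111111111111111' OR '1'='1",    # non-numeric content
    "",                               # nothing entered
]
rejected = [not is_plausible_card_number(value) for value in hostile_inputs]

print(all(rejected))
print(is_plausible_card_number("4111111111111111"))  # widely used test number
```

A check like this only shows that malformed input is refused at one point of entry. Proper security testing of a payment flow goes much further, which is why specialist staff and tools may be needed.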
Absence of Errors Fallacy
Software with no known errors is not necessarily ready to be shipped. Does the application under test match up to the users' expectations of it? The fact that no defects are outstanding is not a good reason to ship the software.
Before dynamic testing has begun, there are no defects reported against the code delivered. Does this mean that software that has not been tested (but has no outstanding defects against it) can be shipped? We think not!
To check your understanding, I would again like to ask you some questions:
Why is ‘zero defects’ an insufficient guide to software quality?
Give three reasons why defect clustering may exist.
Briefly justify the idea of early testing.
You may follow the complete series of Fundamentals of Testing articles here:
Why a Software Fails?
Keeping Software Test Under Control
What Testing is and What Testing Does
Software Testing Principles
Fundamental Software Test Processes
Psychology of Software Testing
Testers Code of Ethics
ISTQB Sample Questions
To see all articles of ISTQB-ISEB Foundation guide, see here:
Software Testing-ISTQB ISEB Foundation Guide