Prepared By:
Steven J. Offenbacher
217-60-8270
Prepared For:
Philip Hausler
The Johns Hopkins University
Establishing a test team and involving it early in the development process has proven to be a key element of a successful and usable product. Unfortunately, many software and system developers fail to understand the importance and benefits of testing early in the development life cycle, and management fails to understand and accept the cost impact on both schedule and budget. History has shown that involving the test team during the early stages helps define requirements, clarify interfaces, identify dependencies, and assist in risk identification and analysis. Many potentially successful projects have failed as a result of poorly tested requirements or the implementation of untestable requirements. This paper discusses how early testing aids in the generation of usable software products.
Table of Contents
3. Measurable Costs, Unmeasurable Savings
4. Planning to Test, Test to the Plan
6. Early to Test, Early to Resolve
7. Avoiding the Automated Testing Trap
9. The Phased Inspection Technique
10. Usability Testing Technique
List of Tables
Table 1. An Improved Inspection Process
Table 2. Improved Inspection Process Return on Investment
Table 3. The Phased Inspection Process
Table 4. Phase Inspection Process Results
List of Figures
Figure 1. Down Counter Code Error Extract
Figure 2. Data Base Error Extract
Figure 3. Corrected Down Counter Code Extract
Figure 4. Corrected Database Extract
Figure 5. Full Test Coverage Example Statistics
Software testing is frequently viewed as a necessary evil. Part of this ill feeling can be attributed to poor planning. Delaying testing until late in the software development cycle contributes to this unfavorable view of testing as developers and testers struggle to meet schedule.
The objective of this paper is to convince the reader of the value of testing from the beginning of the development cycle rather than later in the cycle. Numerous publications concur with this philosophy. The consensus in the industry is that the cost of detecting and correcting a fault correlates directly with the point in the cycle at which the fault is identified. Simply stated, the earlier a fault is detected and removed, the cheaper it is to fix. The benefit of early testing is the removal of faults and potentially fault-causing situations so that they never appear in the delivered product. Thus the goal of software engineers should be to start the testing process as early in the development cycle as possible.
Section 1 presents opening remarks and the objective of this paper. Section 2 identifies the major sections. Section 3 discusses the visibility of early testing. Section 4 defends planning as a test activity. Section 5 presents information about common test deficiencies. Section 6 discusses the benefits of early testing. Section 7 warns against dependency on automated testing. Section 8 explains the utility of inspections as a test activity. Section 9 presents information on an alternate inspection technique. Section 10 discusses the utility of usability testing. Section 11 offers concluding remarks.
3. Measurable Costs, Unmeasurable Savings
Generally, it is easier to quantify the cost of detecting, correcting, and redeploying a software product once it has been released. Industry has found that an increasing portion of their software project budget is dedicated to the maintenance phase. Of that category, the majority of the funds are spent for fault detection and removal rather than product enhancement. It is harder, if not impossible in some cases, to quantify the economic cost benefit from early testing. There are many reasons for this view.
First, the location of the fault has an impact on its effect. If the fault rarely occurs, then one could assert that the fault has little impact on the product's success and therefore has little impact on costs. On the other hand, the fault could be so serious that it has a rather large economic impact if it occurs, even though it exists in a section of the product that is used relatively little. Consider the following lengthy but pertinent example from personal experience. Although NASA's shuttle program is considered to meet level 5 of the Software Engineering Institute's Capability Maturity Model, the example shows that the same mature development process is not employed throughout the organization [4].
NASA's Tropical Rainfall Measuring Mission, a class B, 500 million dollar satellite and the last of NASA's large earth observation spacecraft, has been in development for more than 7 years. The satellite is equipped with various experiment hardware, movable solar array panels to charge the on-board batteries and provide power during daytime orbits, movable antennas for communications with the ground, various sensors used to determine orbit orientation, and numerous actuators to control, change and correct orbit orientation. All of this is controlled by the attitude control software.
When the satellite is packaged within the shroud of the launch vehicle for insertion and delivery into orbit, the solar array panels, each measuring more than 20 feet in the deployed configuration, are folded and bolted in a launch configuration using exploding bolts known as pyro bolts. The spacecraft is designed so that the pyro bolts blow off after the satellite has been deployed from the launch vehicle. Once the pyro bolts are gone, the solar array panels deploy.
While the solar array panels are folded and bolted, they cannot be moved since doing so would damage the drive assembly motors. The attitude control software receives a hardware indication that the spacecraft has separated from the launch vehicle. From this point on, the only mechanism preventing the attitude control software from commanding the solar arrays is a software counter that is initialized with a ground command. This counter is used as a down-counter which decrements at a 2 Hz rate. The design of the system is such that this down-counter command is sent once because it only has meaning at launch. When the down-counter finally decrements to zero, the attitude control software begins to command the solar array panels so they are pointed at the sun. During integration, it was discovered that a software bug could allow the attitude control software to command the solar array panels while they are still partially or even fully bolted.
What was the software bug? An inspection of the code identified two simple errors that together prevented each other from causing catastrophic results. The first error was identified in the attitude determination and control software's command handler. During ground command verification, the code used the wrong operator (shown in italics) when performing bounds checking on the down-counter parameter value in the ground command as shown in figure 1 below.
The second half of this fault was caused by a definition error (shown in italics) in the ground command database that is used to construct ground commands. Specifically, the database defined the counter maximum value as one less than the actual maximum value. The database extract is shown in figure 2.
Figures 3 and 4 show the corrected command processing code and database entry. Together, these errors remained hidden through three years of testing. The error in the database prevented the error in the code from being detected since that code could never be stimulated during testing.
#define DOWN_COUNTER_MAXIMUM 65535

switch (command_type) {
    case SET_SEPARATION_COMMAND:
        if (command_parameter < DOWN_COUNTER_MAXIMUM) {   /* error: wrong operator */
            down_counter = command_parameter;
        } else {
            NotifyFDCManager(DOWN_COUNTER_FAILURE, command_parameter);
            down_counter = 0;
        }
        break;
Figure 1. Down Counter Code Error Extract
CMDS ACSETSEPFG DESC="Set Separation Down Counter", FCTN=05
UI SEPCOUNT DESC="H II Separated Counter", MIN=0, MAX=65534
END
Figure 2. Data Base Error Extract
#define DOWN_COUNTER_MAXIMUM 65535

switch (command_type) {
    case SET_SEPARATION_COMMAND:
        if (command_parameter <= DOWN_COUNTER_MAXIMUM) {  /* corrected operator */
            down_counter = command_parameter;
        } else {
            NotifyFDCManager(DOWN_COUNTER_FAILURE, command_parameter);
            down_counter = 0;
        }
        break;
Figure 3. Corrected Down Counter Code Extract
CMDS ACSETSEPFG DESC="Set Separation Down Counter", FCTN=05
UI SEPCOUNT DESC="H II Separated Counter", MIN=0, MAX=65535
END
Figure 4. Corrected Database Extract
If the database error had been detected and corrected prior to launch while the command processing code error remained undetected, then setting the down-counter to its maximum value using the corrected ground command would trigger the command processing error.
Activation of the error-handling code, which could happen only because of the operator coding error, resets the down-counter to zero and could therefore cause the attitude control software to attempt to drive the solar arrays at the wrong time, resulting in permanent damage. Without the ability to move and point the arrays at the sun, the on-board batteries would drain and the mission would abruptly end. This example shows how devastating an error in infrequently used code can be.
Secondly, the cost saving resulting from early testing is hard to quantify because it is hard to estimate the effect of propagating a fault that has been removed early or prevented from occurring. This is analogous to estimating the cost saving resulting from the installation of uninterruptible power supplies on every system in an office building. We don't know if a power problem is going to occur, and if it does, we don't know if it will cause any damage or the extent of the damage.
4. Planning to Test, Test to the Plan
Although it is the staple software development paradigm, the waterfall model fails to accurately represent reality. Few projects are able to progress from specification through implementation without employing some variation of the spiral model. This is driven by the fact that early in the development process, specifications and requirements are not fully identified or understood. Realizing this, we admit that faults may be introduced at any step of the development process. Obviously software development must progress even with these shortcomings. For these reasons, testing and test planning must be applied at every step of the development process so that the product and its tests are developed hand-in-hand. Just as poor planning precedes software development errors, we also find that poor test planning causes many testing problems [7].
The purpose of testing is twofold. First, Layman briefly defines testing as being concerned with finding defects that have already been committed to code and with preventing them from getting there in the first place [7]. Second, I feel that a major objective of testing is not simply to show the correctness of the software or how it reacts to invalid or out-of-bounds data; equally important is showing the incorrectness of the software. In other words, we succeed at effective software testing when we are able to prove incorrect operation and fail at testing when we are unable to discover any faults. As a professional software tester for the past three years, I cannot overemphasize this last point. I continue to find defects in software that was declared to be tested years earlier. These discoveries cause me to wonder how many other defects are awaiting the opportunity to be recognized.
We in industry now recognize that it is impossible to test everything. Calculations of the time required to fully test a simple program are astronomical. As Figure 5 shows, even tomorrow's powerful platforms will be insufficient to fully test even the simplest of software.
int procedure(int A, int B)
{
    return (A + B);
}
Given that the size of an integer is 32 bits, there are 2^32 possible input
values for each of A and B. This leads to 2^64 or 1.84*10^19 possible inputs.
Assuming that 1,000,000 tests are executed each second, it would take 584,942
years to execute all the tests.
*Source Unknown
Figure 5. Full Test Coverage Example Statistics
Therefore, for testing to be sufficient and effective, we must plan to test. In planning, we decide what will be tested, why it must be tested and how it will be tested. During the planning stage, an early development process activity, areas of risk are identified so that testing can be concentrated on areas with a high likelihood of failure or a high cost as a result of failure [7].
As Layman states, "Test planning is itself a form of testing" [7]. Test planning and the involvement of the test team during the planning stages of the project are an invaluable asset. Testers view specifications and the software from the perspective of testability, whereas developers view specifications from the perspective of performance issues and implementation details. In their quest, testers help identify where development criteria are lacking, where specifications might be inconsistent or wrong, and help to clarify requirements and design issues [7]. Not only does this help to make the product testable, it provides an opportunity to build in quality. As Eldon Yi states in the concluding remarks on Software Testing In A System Development Process: A Life Cycle Perspective, "To build a quality software product, one must proceed with a careful requirements analysis followed by an effective system design - rather than purposely rely on software testing and debugging to catch and remove the errors that originate from the analysis and design process" [8].
Regardless of the amount and depth of testing, or even the number of times a piece of software has successfully executed, it must be assumed that it contains errors. In fact, during the research phase of this paper, I detected an error with a bespoke software product that had been under test and implementation for the past two years. Due to the dynamics of the environment that this software executes under, the specific environmental conditions had not been emulated by the test system allowing the software fault to remain undetected.
Consider the flawed mirror problem with NASA's $4 billion Hubble Space Telescope preventing sharply focused images and risking mission success goals. Instead of performing actual ground-based tests, computer simulations, believed to be more cost-effective, were used. As has always been the case in the computer field, garbage in results in garbage out. This certainly holds true since bad simulation input data resulted in corrupted and meaningless test results [2]. Fortunately, the Hubble was designed to be serviceable with shuttle missions, so the problem could be corrected, but only after an expensive re-engineering effort.
Orbital Sciences Corporation, a pioneer and leader in the high risk commercial aerospace business, has experienced the devastating effect of an undetected software defect first hand. After successfully developing and launching the Pegasus expendable launch vehicle, the need for greater payload capacity was realized. To speed time to market and reduce development costs, computers were used to model aerodynamic characteristics of the extended Pegasus launch vehicle instead of costly wind tunnel testing. Known as the Pegasus XL, its first launch was a failure. Because the attitude of the launch vehicle could not be controlled, it was necessary to destroy the launch vehicle, which resulted in destruction and loss of the payload. An investigation revealed that this multi-million dollar loss was the result of a software defect in the modeling software used to test and validate the extended launch vehicle design.
NASA's Tropical Rainfall Measuring Mission (TRMM) utilizes multiple embedded processors, each with specific and unique functions. One of these functions is the attitude determination and control of the entire spacecraft, which is a collection of highly complex algorithms that are executed within the Attitude Control System (ACS). To test the ACS software, a bespoke dynamic simulation/stimulation system was developed. With unconstrained resources, its models emulate the spacecraft's operational environment of space and duplicate the calculations performed by the ACS. In order to maintain contact with the ground, the ACS must control the position of the spacecraft so it can point a high-gain antenna at one of four geostationary relay communication satellites. The calculations performed by the dynamic simulation system (DSS) are considered the truth models. These results are compared with the ACS calculation results to validate ACS software compliance with requirements. Due to differences in architecture, the processing order of the models in the ACS and the DSS differs. Because the DSS was considered a test system, it was not subjected to the same performance and rigorous testing requirements as the ACS. Throughout testing of the ACS position control algorithm, the ACS and DSS results varied; however, they were accepted as valid since they were within the allowable error tolerances. This resulted in adjustment of the initially engineered ACS error tolerances based on the DSS results, which were considered to be output from truth models. During performance testing, where the ACS position control algorithm is executed for 7 days, the difference errors between the ACS and DSS grew exponentially, resulting in violation of the error tolerances. An analysis found a software defect in the DSS position control truth model. After correction, the ACS and DSS results agreed.
The net effect was a complete regression test cycle resulting in a re-evaluation of the ACS error tolerances and a three month schedule slip.
These real-life experiences show the importance of having accurate tests and validated test software. Each of the situations described above was a result of insufficient tests, which helps to highlight five common test-related characteristics that I've experienced as a tester. These five test characteristics are:
1) Tests fail to use a representative sample of the actual data to be found in the product's domain
2) Tests often fail to represent the actual nominal operational environment found in the product's domain
3) Tests fail to stimulate actual maximum load conditions found in the product's domain
4) Tests fail to stimulate actual timing conditions found in the product's domain (especially true in embedded systems)
5) Test software is not developed with the same rigor as the production software it is designed to test.
As I have found from my testing experience, testers need to be cognizant of the test system's origin. When dealing with custom test fixtures, failure of a test or receipt of unexpected test results cannot automatically be assumed to be a defect in the software under test. Although schedules usually do not make it possible to validate all test cases in detail, a quick check for the five test characteristics will function as a sanity check and help avoid corrupted tests and test results.
6. Early to Test, Early to Resolve
One of the goals of software testing is verification, which confirms that the product successfully fulfills its requirements. The exponential growth of software functionality and complexity has forced the maturation of the testing process. In the early years of software development, testing was viewed solely as a code debug activity. Traditionally, testing occurred after the coding effort [5]. Testing at the software requirements level was simply not done. After years of software development experience, industry is finally embracing the value of testing and test planning during early project stages. Hanna attributes this shift in test philosophy to the increased costs of bug fixes in the form of product maintenance once the product has been deployed into the marketplace [5]. In addition, the negative reputation from a buggy product could have a significant financial impact on the company and its success.
Software suffers from the "plurality of goals" phenomenon. Current and experimental software development processes are making progress in the area of requirements conflict identification, leading to better tests and better software products. Bender believes that more than half of the defects in a system are a result of poor requirements [5]. By examining specifications and requirements, "plurality of goals" can be identified and addressed. This early test activity can help mold a software product's specifications and requirements so they are complete, accurate, consistent, unambiguous, and testable before the first design or line of code is written. It also promotes improved product quality since quality can truly be built in as the product is developed [8].
Given the dynamics of specifications and requirements, testing should be viewed as an integral and ongoing component of the software development process. Otherwise, poor and ineffective tests will be developed. Throughout the software development process, a goal of the test team is to ensure that specifications and requirements remain clear, concise and accurate so that effective tests can be developed to demonstrate the product's compliance.
Consider the impact of poorly stated specifications and requirements on "black-box" testing. With this type of testing, testers have an external view. Normally, little if anything is known of the software's internals such as data structures or implementation details. For example, the following requirement extract from NASA's TRMM ACS software requirements specification should immediately raise many questions: "The ACS will flush and reset the packet pipe to 0 when incorrect or no data is received within a database specified threshold period."
As a black-box tester, initial questions should include: How can I demonstrate compliance with this requirement? Will my test environment allow me to create the required anomalous conditions (incorrect data or no data)? What is meant by incorrect data (out-of-range data or non-compliant packet format)? Doesn't this requirement contain implementation specific information (reset the packet pending pipe to 0)? Isn't this really two requirements?
Inspection during the specifications and requirements phases may have yielded the following improved requirement: "The ACS will reset the input packet pipe upon receipt of: 1) Out-of-range data; 2) Invalid packet formats; 3) Loss of input data." Notice the corrected requirement does not specify how to reset, or how to determine a "no data" condition. These are thought to be implementation details. Also notice that incorrect data has been clarified to mean improperly formatted packets and out-of-range data. Using the corrected requirement, testers have a clearer view of the capabilities needed to test this requirement and how to determine compliance.
Testing early in the development life cycle would also uncover invalid requirements and specifications. Consider the following example: "The ACS command processing subsystem shall not process commands not received in the command receiving buffer." One interpretation of this requirement could be the following: "Validate that if you don't send a command to the ACS, it doesn't process it." While this translation might seem ridiculous, it illustrates how individual perception can change the intent of a requirement or specification which in turn, has a dramatic effect on the test. Perhaps a better requirement would be: "The ACS command processing subsystem shall only process commands received in the command receiving buffer." The genesis of this requirement stems from previous software failures that were a result of commands being received and processed through special debug logic.
These types of specification and requirements errors result from a condition I've termed "Legacy Systems Engineering." I use this term to describe specifications and requirements that are levied on software "...because that is how it has been done in the past." Most of the time, customers have no other justification for a requirement. I have also found that software inherits the testing problems of the past along with these antiquated requirements, which often add overhead and are unnecessary. Applying test rigor early in the software development life cycle helps to circumvent these errors of legacy.
7. Avoiding the Automated Testing Trap
The consensus of studies on the effectiveness of software testing indicates that poor quality and quantity of testing is a direct result of the repetitive nature of the process as well as the lack of requirements and domain understanding. Testing is often viewed as a mundane and unrewarding task. My experience in industry indicates that many software developers believe software testing is a necessary and important task. Senior level developers believe that testing can be done by more junior personnel. If we consider that the purpose of software testing is to identify and remove bugs, then this former view could be accepted and automated testing in its current form could prove to be very useful and powerful. On the other hand, if we consider that its purpose is the identification and removal of defects, then we must also accept the fact that software testing needs to be engineered just as software is engineered.
Automated testing involves the use of tools. These tools help to augment human capability and relieve testers of mundane and routine tasks. Some tools can even generate test cases and analyze results. Automation of testing has the advantage of adding rigor to the testing process because requiring, enforcing and measuring adherence to test goals and development standards is possible. This is especially true for large projects which often appear to be unmanageable. Automation also allows for better coverage and a much higher volume of testing than manual testing for the same cost and schedule. One of the greatest strengths of automated testing is that the tests are repeatable and therefore so are the results.
The advent of automated testing has created a new type of software, referred to as testware by James Bach. With its roots in software, testware is susceptible to failure just as any other software [1]. Furthermore, Bach points out that testware actually has a higher failure rate because organizations which use automated testing and develop testware fail to apply the same care and professionalism exercised in their production software [1].
Just as Lederer and Prasad point out in their "Nine Management Guidelines for Better Cost Estimating" that users of cost estimating software should not rely on it for accurate estimates, Bach warns automated testing proponents against viewing it as the end-all solution to the testing domain problem. He cites three assumptions that entice and seduce automated testing advocates with a false sense of security. The first assumption is that manual testing can be mapped to a definable sequence of actions [1]. On the contrary, Bach suggests that manual testing is a guided process of exploration and reasoning [1]. I have found this to be very true, especially during functional testing where we often create "what if" scenarios based on what we observe.
The second assumption is that there is utility in repeating a sequence of actions [1]. To this assertion, Bach's view is that once a test case has been executed without detecting a bug, there is little chance that executing the test again will reveal additional bugs [1]. Generally I agree with this view; however, I contend that there is utility in repetition when testing boundary conditions or time critical software such as that usually found in embedded systems.
The third and most enticing assumption is that automated testing will cost significantly less than comparable manual testing [1]. If we accept this belief, then we must consider testware as vapor-ware, which is created quickly, cheaply and error free out of thin air. Management's acceptance of this third assumption greatly hinders software developers from improving the quality of their testing, their software development process and ultimately the product's quality. Management must be aware of the indirect overhead costs of automated testing. Test scenarios need to be defined, designed and tested before using them to test the product. As the product's specifications change, mutations of existing tests or creation of additional tests may be required. As staffing changes, a certain amount of the historical information base is lost. Over time, current staff is left depending solely on written documentation. It has been my experience that even the most rigorously documented project lacks the reasons why a product was built a certain way, why a certain approach was selected over alternative approaches, or even the alternative approaches. Lack of this information and historical knowledge makes automated test users reluctant to figure out what the suite actually tests, and even more reluctant to change the test suite [1]. As tests grow, so does the time required to execute the tests and analyze the results, all of which increases cost.
Certainly, it is dangerous to automate something that is not understood [1]. While automated testing has a promising future as software engineering matures and more formal methods are utilized, it undoubtedly is not the "Silver Bullet" for testing.
Recall the software defect example featured in section 3. If you read the example carefully, you would notice that the technique used to detect the error was an inspection. The inspection process has been around for many years. In the early years of software development, inspections were usually used solely for code reviews. Because inspections can be done quickly, easily and inexpensively, they have grown in popularity and applicability. In addition to error identification, inspections are also used to validate compliance with established development standards such as coding and documentation standards. Used properly, inspections can be very effective.
In an article by Louie Franz and Jonathan Shihs discussing the value of inspections and early testing of software projects, they describe an inspection process used during the development of a sales and inventory tracking system. One of their goals was to collect metrics so that they could quantify the value of using the inspection process. A testing phase followed the inspection phase to assist in determining the effectiveness of the inspection process.
The project used the basic waterfall model. Within the design, code and test phases, the inspection process identified in table 1 was used.
Improved Inspection Process

Step                         Activities
Planning                     Plan inspection, identify goals, identify inspection team
Kickoff                      Inspector training, inspection role assignment,
                             distribution of inspection material
Preparation                  Inspection of materials independently by each inspector
Issue and Question Logging   Defect logging
Cause Brainstorming          Inspectors brainstorm on defect cause and offer
                             resolution suggestions
Question and Answer          Expanded discussions with author about specific issues
Rework                       Address/fix anomalies
Follow-Up                    Verification of defect resolution
Table 1. An Improved Inspection Process
Since the goal of this effort was to determine the value of investing time and effort in early defect detection activities, metrics were collected to measure the inspection effort consisting of: 1) number of critical defects found and fixed; 2) number of noncritical defects found and fixed; 3) total time used by the inspections; and 4) total time saved by inspections [3].
The testing process consisted of test planning, unit, module and system testing. Planning what to test was based on a risk assessment to determine where to focus the testing effort. The risk assessment rated the importance to overall system functionality, technical difficulty, and complexity [3]. Two metrics were collected: 1) total number of critical defects; 2) total testing time for each test phase [3].
The theory proposed by Franz and Shihs is that it should take longer to find and fix a defect at system test or unit test than it does to find the same defect during the inspection phase; thus the return on investment is greater the earlier the defect is identified [3]. They present a return-on-investment model that allowed them to compute not only the time spent in inspections and testing, but also the time saved as a result of these processes. Table 2 shows a summary of the return on investment. Based on historical data, they used 20 hours as the cost to find and fix a critical error and 6 hours to find and fix a non-critical error.
            Critical  Noncritical  Total Time  Total Time  ROI        Process    Process
            Defects   Defects      Used        Saved       (Percent)  Cost       Savings
                                   (Hours)     (Hours)                (Dollars)  (Dollars)
Inspection  12        78           90          708         787        13,500     106,200
Test        51        0            310         710         229        46,500     106,500
Total       63        78           400         1,418       335        60,000     212,700

Table 2. Improved Inspection Process Return on Investment
To add some reality to their results, I expanded the results to include columns which represent the financial impact of utilizing this improved inspection technique, since the decisions made by management that affect testing are often influenced by economics. I arbitrarily chose a man-hour rate of $150.00. While the rate is not important, it does help to emphasize the dramatic impact inspections can have on project costs.
As the data in table 2 show, the return on investment from inspections (787 percent) is more than three times the return on investment from testing (229 percent), giving quantitative and economic evidence of the effectiveness of inspections and the value of early defect detection. It is important to reiterate, however, that inspections alone cannot be used in place of testing.
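The arithmetic behind table 2 can be reproduced in a few lines. The following Python sketch assumes that return on investment is simply hours saved divided by hours used, expressed as a percentage, which matches the inspection and test rows of the table. The function and variable names are illustrative and do not come from Franz and Shihs.

```python
# Sketch of the return-on-investment arithmetic behind table 2.
# Assumption: ROI (percent) = hours saved / hours used * 100.

CRITICAL_FIX_HOURS = 20      # historical cost to find and fix a critical defect
NONCRITICAL_FIX_HOURS = 6    # historical cost to find and fix a noncritical defect
HOURLY_RATE = 150            # arbitrary man-hour rate in dollars, as in the text


def roi_percent(hours_saved, hours_used):
    """Return ROI as a whole-number percentage."""
    return round(100 * hours_saved / hours_used)


def summarize(hours_used, hours_saved):
    """Return ROI and the dollar cost and savings for one process."""
    return {
        "roi_percent": roi_percent(hours_saved, hours_used),
        "cost_dollars": hours_used * HOURLY_RATE,
        "savings_dollars": hours_saved * HOURLY_RATE,
    }


# Inspection row: 12 critical and 78 noncritical defects found in 90 hours.
inspection_saved = 12 * CRITICAL_FIX_HOURS + 78 * NONCRITICAL_FIX_HOURS
inspection = summarize(hours_used=90, hours_saved=inspection_saved)

# Test row: hours used and hours saved are taken directly from the table.
test = summarize(hours_used=310, hours_saved=710)

print("inspection:", inspection)
print("test:", test)
```

Changing HOURLY_RATE scales the dollar columns but leaves the ROI percentages untouched, which is why the choice of rate does not affect the comparison between inspections and testing.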
9. The Phased Inspection Technique
Freedman and Weinberg report that reviews reduce the number of errors that propagate through to testing by a factor of 10, resulting in a 50 to 80 percent reduction in testing costs even after considering the costs of the reviews [6]. Knight and Myers agree that there is considerable benefit in reviews but feel that existing methods are lacking. To address these issues, they propose a phased inspection process that is rigorous, so that results are repeatable; tailorable, so that it can serve functions other than error detection, such as identifying and quantifying product characteristics like maintainability; and efficient in its use of human and computer resources [6].
The phased inspection process is designed so that it can be applied to any product in the software life cycle, including but certainly not limited to specification analysis, requirements analysis, test plan analysis, software design review and code review.
As its name implies, the phased inspection process consists of a series of coordinated partial inspections. During each phase, inspectors validate that the product exhibits a specific property, or a set of related properties, identified for that phase. Inspectors are chosen for a phase based on the goals of the phase, and are held responsible for assuring product compliance with the stated phase goals [6].
Knight and Myers define two classifications of phases. The first type, the single-inspector phase, uses lists of checks for simple but important properties, such as compliance with coding standards [6]. The second type, the multiple-inspector phase, uses several inspectors to check for properties that cannot be definitively stated in simple checklists, such as completeness [6]. The properties to be checked are defined in domain-specific and application-specific checklists. Domain-specific checklists focus the inspection on areas of difficulty in the associated domain, whereas application-specific checklists, developed by the author, challenge the inspector to understand the product being inspected in order to successfully answer the questions [6]. The multiple-inspector phase culminates with a reconciliation cycle geared towards harmonizing inspector results [6].
Table 3 summarizes the phased inspection process presented by Knight and Myers showing the application of the process to check production software source code for desired characteristics.
Phase           Activity/Goals
1               Check compliance with required internal documentation format
2               Check compliance with required format
3               Check for readability; check for compliance with naming standards
4               Check compliance with good programming practice
5               Check for proper use of programming constructs
6               Check functional correctness
Reconciliation  Compare inspector results and resolve discrepancies

Table 3. The Phased Inspection Process
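The reconciliation step of a multiple-inspector phase can be pictured as a comparison of checklist answers. The Python sketch below is an illustration only: the checklist questions, the inspector names and the reconcile function are hypothetical and are not taken from Knight and Myers. It assumes that each inspector records a yes/no answer per checklist item, and it reports the items on which the inspectors disagree so that the discrepancies can be resolved.

```python
def reconcile(results):
    """Return checklist items whose answers differ between inspectors.

    results maps inspector name -> {checklist item -> bool answer}.
    """
    all_items = set()
    for answers in results.values():
        all_items.update(answers)

    discrepancies = {}
    for item in sorted(all_items):
        # An item that an inspector did not answer is recorded as None.
        per_inspector = {name: answers.get(item) for name, answers in results.items()}
        if len(set(per_inspector.values())) > 1:
            discrepancies[item] = per_inspector
    return discrepancies


# Hypothetical checklist answers from two inspectors.
results = {
    "inspector_a": {"All loops terminate": True, "All inputs validated": False},
    "inspector_b": {"All loops terminate": True, "All inputs validated": True},
}

for item, answers in reconcile(results).items():
    print(item, answers)
```

Items on which all inspectors agree drop out, so the reconciliation discussion can concentrate on the remaining disagreements.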
In an attempt to determine the feasibility of the phased inspection process, Knight and Myers conducted an experiment which incorporated both single-inspector and multiple-inspector phases. They seeded the inspection material with errors and collected metrics on the number of seeded errors found and the time required to complete the inspection. The results, summarized in table 4, prove to be quite interesting.
Phase   Seeded  Seeded Errors  Unseeded Errors  Time
        Errors  Found          Found
3       12      9              15               8hr 47min
4       14      10             21               13hr 1min
5       7       5              3                5hr 15min
6       12      6              46               30hr 20min
Totals  45      30             85               57hr 23min

Table 4. Phase Inspection Process Results
First, the inspection resulted in detection of about 67 percent of the seeded errors (30 of 45). Second, the phased inspection process found 85 additional errors in the inspection material, 14 of which were found during the reconciliation process. As table 4 shows, a total of 115 errors were found during 57 hours and 23 minutes of inspection time, resulting in an impressive error detection rate of 2 per hour. Because the phased inspection process involves reviewing the same product in each phase, the inspector should become more familiar with the product in subsequent phases and be able to concentrate more on the checklist items. Together with the type of items being checked in phase 6, this could explain the high number of unseeded errors identified in that phase. Because of the limited metrics collected, no comment can be made on the effect of computer support. Since the experiment was only executed once, it cannot be determined whether the results are repeatable.
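The summary figures quoted above follow directly from the rows of table 4. The short Python sketch below recomputes them; the variable names are illustrative, and the only inputs are the values printed in the table.

```python
# Rows of table 4: phase -> (seeded errors, seeded found, unseeded found, minutes)
PHASES = {
    3: (12, 9, 15, 8 * 60 + 47),
    4: (14, 10, 21, 13 * 60 + 1),
    5: (7, 5, 3, 5 * 60 + 15),
    6: (12, 6, 46, 30 * 60 + 20),
}

seeded = sum(row[0] for row in PHASES.values())
seeded_found = sum(row[1] for row in PHASES.values())
unseeded_found = sum(row[2] for row in PHASES.values())
minutes = sum(row[3] for row in PHASES.values())

# Share of seeded errors that the inspectors detected.
detection_percent = 100 * seeded_found / seeded

# Overall detection rate counts both seeded and unseeded errors.
total_found = seeded_found + unseeded_found
errors_per_hour = total_found / (minutes / 60)

print(f"seeded errors detected: {seeded_found} of {seeded} ({detection_percent:.1f}%)")
print(f"total errors found: {total_found}")
print(f"inspection time: {minutes // 60}hr {minutes % 60}min")
print(f"detection rate: {errors_per_hour:.2f} errors per hour")
```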
Knight and Myers feel that the process needs improvement, since some of the seeded errors were not detected during any of the phase inspections and some errors were not detected until the reconciliation phase [6]. However, since many unseeded errors were detected, the process is achieving considerable thoroughness [6]. Although this process is experimental, it certainly has a promising future, especially because of its use of multiple inspections. Furthermore, it supports the value of early testing since it can be applied to products throughout the software development life cycle.
10. Usability Testing Technique
A monthly periodical dedicated to software development issues for Windows-based applications offers an interesting discussion of a software testing technique referred to as "usability testing." Usability testing follows a scientific approach that involves developing a hypothesis, creating and conducting an experiment, and either proving or disproving the hypothesis through an analysis of the experiment results [9]. As with any test approach, early implementation and planning are the keys to its success. The following paragraphs present a synopsis of the usability testing technique.
First, test goals need to be defined and appropriate test personnel need to be selected. Testers should be qualified. They should properly represent the user base as well as its range of abilities. The next step in the process is to create and validate test scenarios [9]. In this context, a test scenario describes a particular task that is to be accomplished by the tester using the software under test. For example, a tester of a spreadsheet product might be given the task of generating a plot of data read from a data file.
The usability testing technique involves three kinds of participants: testers, observers and a facilitator. The role of the facilitator is to brief the testers, explain the objectives of the test, clarify the tester's role, and ensure that testing progresses. Upon completion, the facilitator debriefs the tester and observers, and consolidates observations from other testers to generate the test result reports [9].
The role of the tester is to accomplish one or more tasks, each identified as a test scenario, using the software under test. The tester is encouraged to verbalize his thoughts as he works through the scenario. This allows the observers, whose main role is to watch the tester, to record how the tester works through the problem set and the assumptions made by the tester. In addition, the observers note which features are used, the time to complete the task, problems encountered and any hints given [9].
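An observer's notes for one scenario could be captured in a simple record. The Python sketch below is hypothetical: the Observation class, its field names and the sample values are invented for illustration and are not part of the technique as described in [9]. It only shows how the items an observer records, namely features used, time to complete, problems encountered and hints given, might be consolidated across testers by a facilitator.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Observation:
    """One observer's notes for one tester working through one scenario."""
    scenario: str
    tester: str
    minutes_to_complete: float
    features_used: list = field(default_factory=list)
    problems: list = field(default_factory=list)
    hints_given: int = 0


def consolidate(observations):
    """Summarize observations per scenario, as a facilitator might."""
    summary = {}
    for obs in observations:
        entry = summary.setdefault(
            obs.scenario, {"times": [], "problems": [], "hints": 0}
        )
        entry["times"].append(obs.minutes_to_complete)
        entry["problems"].extend(obs.problems)
        entry["hints"] += obs.hints_given
    return {
        scenario: {
            "mean_minutes": mean(entry["times"]),
            "problems": entry["problems"],
            "hints": entry["hints"],
        }
        for scenario, entry in summary.items()
    }


# Sample data is invented for illustration.
notes = [
    Observation("plot data from file", "tester 1", 12.0,
                ["File > Open", "Chart wizard"], ["could not find import option"], 1),
    Observation("plot data from file", "tester 2", 8.0,
                ["File > Open", "Chart wizard"], [], 0),
]

print(consolidate(notes))
```

Keeping one record per tester and scenario makes it straightforward to compare completion times and recurring problems as scenarios are revised.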
Due to its simplicity and relatively low cost, this technique can be implemented very early in the software development life cycle, allowing early problem identification and resolution and thereby significantly reducing schedule and cost. It can be particularly useful when the project involves the use of prototypes. The spiral nature of this technique supports revision of existing test scenarios and development of new test scenarios as the software and the understanding of the users' goals mature.
The topic of software testing is enormous. This paper has concentrated on the importance of testing early in the software development process as a means to improve software quality, improve test effectiveness and reduce project costs. The importance of and need for testing cannot be overstated. Management recognizes the increased importance of effective testing. This is reflected in larger budget allocations to support testing as well as funding for testing earlier in the development process. As software complexity increases, developers are also faced with the implicit requirement of designing the software so that it is testable. Software developers can help supplement the testing effort by promoting testable designs, even at the cost of implementation considerations. This concept has been applied in other engineering disciplines for years. Imagine building a suspension bridge without testing and simulating the structural properties of the design, yet software often reaches final integration without thorough testing [10].
1. Bach, J., "Test-Automation Snake Oil," Windows Tech Journal, Vol. 5, No. 10, Oct. 1996, pp. 40-44.
2. Collins, R., Miller, K. and Spielman, B., "How Good is Good Enough? An Ethical Analysis of Software Construction and Use," Communications of the ACM, Vol. 37, No. 1, Jan. 1994, pp. 81-91.
3. Franz, L. and Shihs, J., "Estimating the Value of Inspections and Early Testing for Software Projects," IEEE Engineering Management Review, Vol. 23, No. 4, Winter 1995, pp. 82-90.
4. Gibbs, W., "Software's Chronic Crisis," Scientific American, September 1994, pp. 86-95.
5. Hanna, M., "Test Early, Test Often," Software Magazine, Vol. 15, No. 10, Oct. 1995, pp. 59-68.
6. Knight, J. and Myers, A., "An Improved Inspection Technique," Communications of the ACM, Vol. 36, No. 11, Nov. 1993, pp. 51-61.
7. Layman, J., "The Best-Laid Plans," Windows Tech Journal, Vol. 5, No. 10, Oct. 1996, pp. 52-56.
8. Li, E., "Software Testing In a System Development Process: A Life Cycle Perspective," Journal of Systems Management, Vol. 41, No. 8, Aug. 1996, pp. 23-31.
9. Petersen, C., "Focus on Usability," Windows Tech Journal, Vol. 5, No. 10, Oct. 1996, pp. 34-38.
10. Quinnel, R., "Kill Bugs Early with Software Test Tools," EDN, Vol. 41, No. 11, May 23, 1996, pp. 89-98.