Also consider references regarding Code Reading, defect detection, and the WWW Formal Technical Review Archive maintained by P. Johnson, Department of Information and Computer Sciences, University of Hawaii.
Abstract: Inspection data is difficult to gather and interpret. At AT&T Bell Laboratories, the authors have defined nine key metrics that project managers can use to plan, monitor, and improve inspections.
Abstract: It is generally accepted that every programmer checks his or her code before submitting it to testing. This process has been formalised by means of "reviews", "walkthroughs", and "inspections". Inspections were used in industry by the author, following Fagan, but were not well-metricated: others have shown that the quantifiable benefits of inspections may be considerable. A final-year project at Staffordshire University has produced results for software inspection compared both to computer-based testing alone and to computer-based testing following inspection in which metrics were collected throughout the process, allowing comparisons of effectiveness and efficiency.
Abstract: Inspections of software development artifacts have become an integral part of software quality improvement. However, experience tells us that inspection effectiveness (i.e., its capability to find defects) and efficiency (i.e., its cost-effectiveness) vary significantly across organizations or, even more striking, from one inspection to the other in a given organization. Thus, we investigate in this paper the variations across inspections at one of our customers' sites. By measuring and modelling their inspection processes, we identified some of the important factors that have an impact on effectiveness and efficiency and that can explain most of their variation. The models we developed show that exponential relationships exist between the inspected document size, the effort spent on preparation, and the resulting effectiveness and efficiency. Moreover, these models appear to make accurate predictions (a goodness of fit of R² = 0.68 and R² = 0.89 for effectiveness and efficiency, respectively). We show how they can be practically used for inspection resource planning, quality control, evaluation, and improvement.
Abstract: The goal of software inspections and tests is to reduce the expected cost of software failure over the life of a product. This paper extends the use of defect triggers, the events which cause defects to be discovered, to help evaluate the effectiveness of inspection and test activities. In the case of inspections, the defect trigger is designed as a set of values which associate the skills of the inspector with the discovered defect. Similarly, for tests, the defect trigger values embody the various strategies being used in creating test scenarios.
The usefulness of triggers in evaluating the effectiveness of software inspections and tests is demonstrated by evaluating the inspection and test activities of some software products. These evaluations are used to point to both deficiencies in inspection and test strategies, and progress made in improving such strategies.
Abstract: Most of us pay lip service to the need for software project postmortems, but the literature offers little guidance on how to conduct them. The authors propose a tentative, standard process for conducting postmortem reviews and describe activities, roles, and artifacts of the process.
Abstract: This module consists of a comprehensive examination of the technical review process in the software life cycle. Formal review methodologies are analyzed in detail from the perspective of the review participants, project management and software quality assurance. Sample review agendas are also presented for common types of reviews. The objective of the module is to provide the student with the information necessary to plan and execute highly efficient and cost-effective technical reviews.
Abstract: The ability to read and understand a computer program is a critical skill for the software developer, yet this skill is seldom developed in any systematic way in the education or training of software professionals. These materials discuss the importance of program reading, and review what is known about reading strategies and other factors affecting comprehension. These materials also include reading exercises for a modest Ada program and discuss how educators can structure additional exercises to enhance program reading skills.
Abstract: Contrary to common wisdom, formal software inspections, also known as Fagan inspections, can be effectively conducted by teams including members who lack in-depth product knowledge. In fact, the differing viewpoint provided by those without detailed product knowledge may offer a more robust inspection. In this paper we show that an effective inspection can be achieved by a properly prepared and trained team even though they are not familiar with the details of the product or even the domain of the product.
Abstract: This paper considers the role of comprehension during the preparation and defect detection phases of the software inspection process. Software inspection is generally accepted as a useful technique for finding errors in both documents and code. However, there is no general agreement on how defects are best detected and, in particular, how much understanding of the product is required and how that understanding is best achieved. Some inspection processes provide no guidance. Many advocate fairly informal aids such as checklists. Recently more structured techniques, in the form of scenarios, have been proposed. The need for increased comprehension seems particularly relevant to object-oriented technology as a result of inherent features, which appear to increase inter-component dependencies. This paper reviews the Software Engineering literature investigating the role of comprehension and the related topic of program visualisation during the preparation and defect detection phases of inspection. It considers particular features of object-oriented technology that may require enhanced comprehension during inspection. It draws on similarities between software maintenance and software inspection to suggest that there are potential benefits to be obtained in using comprehension techniques and tools, developed for maintenance, during inspection.
Abstract: Over the past decade, software inspections have established an impressive track record for early defect detection and correction, and their use has grown extensively. But during this time software quality requirements have also increased, raising the level of our quality expectations. Industrial engineering has addressed similar increasing quality requirements for hardware by being able to predict problems early in development by using statistical process control (SPC). Applying SPC to software inspection data provides a similar ability to predict and remove latent problems, those that haven't been identified as yet, early in software development before they can propagate and cause later difficulty and expense. The application of SPC to software inspection data to predict defects is defined and illustrated using sample inspection data, and recommendations are made for applying predictive quality control.
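As a rough illustration of applying SPC to inspection data, the sketch below computes a u-chart (defects per unit of size with 3-sigma control limits) and flags inspections that fall outside the limits. The abstract does not say which control chart the paper uses, so the u-chart, the function name `u_chart`, and the sample numbers are all assumptions made for illustration only.

```python
import math

def u_chart(defects, sizes):
    """Return the centre line and per-inspection (rate, LCL, UCL, out_of_control).

    defects[i] is the defect count found in inspection i and sizes[i] is the
    inspected size (here in KLOC). Both lists are hypothetical inputs.
    """
    u_bar = sum(defects) / sum(sizes)          # overall defects per KLOC
    rows = []
    for c, n in zip(defects, sizes):
        sigma = math.sqrt(u_bar / n)           # Poisson-based standard error
        lcl = max(0.0, u_bar - 3 * sigma)      # lower control limit, floored at 0
        ucl = u_bar + 3 * sigma                # upper control limit
        u = c / n                              # observed defect rate
        rows.append((u, lcl, ucl, not (lcl <= u <= ucl)))
    return u_bar, rows

# Hypothetical inspection data: defects found and KLOC inspected.
defects = [10, 12, 8, 30, 11]
sizes = [1.0, 1.2, 0.8, 1.0, 1.1]

u_bar, rows = u_chart(defects, sizes)
flags = [out for (_, _, _, out) in rows]
for i, (u, lcl, ucl, out) in enumerate(rows, start=1):
    status = "OUT OF CONTROL" if out else "ok"
    print(f"inspection {i}: {u:.1f} defects/KLOC, limits [{lcl:.1f}, {ucl:.1f}] {status}")
```

With these made-up numbers only the fourth inspection exceeds its upper limit, which is the kind of signal that would prompt a closer look at that work product or at how the inspection was run.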
Abstract: The standard software development process consists of multiple stages: requirements, coding, system test, and finally delivery. An objective of this process is to minimize the number of faults in delivered code. Root cause analysis shows that many of the faults can be traced back to requirements or design faults. As part of the software development process, reviews are conducted to remove these faults before the requirements or design document is passed on to the next step. We have developed a method of instrumenting a review process to record document faults discovered by reviewers during their preparation. Then, using statistical techniques related to capture-recapture methods, we estimate the number of undiscovered faults remaining in the document. The key idea to our method is to look at how many common faults independent reviewers find and then extrapolate to the total number of faults. We do not seed the document with artificial faults; no additional faults are introduced.
We have applied our methods to 13 review sessions (either feature requirements or feature design) and are in the process of a longitudinal study tracing these features. Our results to date estimate that about 20% of the faults are undetected by reviews. When the predicted number of undetected faults is greater than 20%, consideration should be given to reworking design and/or rereviewing the result. One surprising by-product of this study is a quantification of the number of faults found by group reviews. We find that only about 10% of the (discovered) document faults are found at the review (90% are found in preparation) and that the lead time to schedule a review is about ten working days.
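The idea of extrapolating from the overlap between independent reviewers can be sketched with the simplest capture-recapture estimator, the two-sample Lincoln-Petersen formula N ≈ (n1 × n2) / m, where n1 and n2 are the faults each reviewer found and m is the number found by both. The abstract does not state which estimator the authors used, so this choice, the function name, and the fault identifiers below are assumptions for illustration.

```python
def lincoln_petersen(found_a, found_b):
    """Estimate the total number of faults from two independent reviewers.

    found_a and found_b are collections of fault identifiers (hypothetical).
    """
    a, b = set(found_a), set(found_b)
    overlap = len(a & b)                      # faults both reviewers found
    if overlap == 0:
        raise ValueError("no common faults; the estimate is undefined")
    return len(a) * len(b) / overlap

# Hypothetical preparation logs for two reviewers of the same document.
reviewer_a = {"F1", "F2", "F3", "F4", "F5", "F6"}
reviewer_b = {"F4", "F5", "F6", "F7", "F8"}

estimated_total = lincoln_petersen(reviewer_a, reviewer_b)
found = len(reviewer_a | reviewer_b)          # distinct faults actually found
undetected_fraction = 1 - found / estimated_total

print(f"estimated total faults: {estimated_total:.0f}")
print(f"distinct faults found:  {found}")
print(f"estimated undetected:   {undetected_fraction:.0%}")
```

In this made-up case the reviewers found 8 distinct faults and the estimate is 10 in total, so roughly 20% remain undetected, which sits right at the rework threshold the abstract mentions. The estimator assumes reviewers work independently and that every fault is equally easy to find; real review data usually violates both to some degree.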
This paper presents new studies and experiences that enhance the use of the inspection process and improve its contribution to development of defect-free software on time and at lower costs. Examples of benefits are cited followed by descriptions of the process and some methods of obtaining the enhanced results.
Software inspection is a method of static testing to verify that software meets its requirements. It engages the developers and others in a formal process of investigation that usually detects more defects in the product, and at a lower cost, than does machine testing. Users of the method report very significant improvements in quality that are accompanied by lower development costs and greatly reduced maintenance efforts. Excellent results have been obtained by small and large organizations in all aspects of new development as well as in maintenance. There is some evidence that developers who participate in the inspection of their own product actually create fewer defects in future work. Because inspections formalize the development process, productivity and quality enhancing tools can be adopted more easily and rapidly.
Abstract: Substantial net improvements in programming quality and productivity have been obtained through the use of formal inspections of design and of code. Improvements are made possible by a systematic and efficient design and code verification process, with well-defined roles for inspection participants. The manner in which inspection data is categorized and made suitable for process analysis is an important factor in attaining the improvements. It is shown that by using inspection results, a mechanism for initial error reduction followed by ever-improving error rates can be achieved.
Summary: This 470-page volume is arguably the most comprehensive discussion of inspection available. Chapters include: The historical background of inspection; the benefits and cost of inspection; overview of software inspection; four chapters on the individual phases of the inspection process; the inspection leader, installation and training; overcoming the difficulties; and six chapters presenting case studies at Applicon, Cray Research, Thorn EMI, Racal Redac, Sema Group, and IBM. (summary by Johnson)
Abstract: This paper describes a Bull US Applied Research Laboratory project to build a collaborative inspection and review system called Scrutiny using Conversation Builder from the University of Illinois at Urbana-Champaign. The project has several distinct aspects: technology oriented research, prototype building, experimentation and tools deployment/technology transfer. Described are the design of the current operational version of Scrutiny for inspection-only, the evolutionary design of Scrutiny to handle various forms of review, and some initial thoughts on integration with other CASE frameworks and tools. The problem domain selected, the development environment, lessons learned thus far, some ideas from related work, and the problems anticipated are discussed here.
A 20-year-old inspection technique has served developers well in the quest for software quality improvement. But radical changes are on the horizon that may seriously steer the future of formal review.
Abstract: Formal technical review is acknowledged as a preeminent software quality improvement method. The "inspection" review method, first introduced by Michael Fagan twenty years ago, has led to dramatic improvements in software quality. It has also led to a myopia within the review community, which tends to view inspection-based methods as not just effective, but as the optimal approach to formal technical review. This article challenges this view by presenting a taxonomy of software review that shows inspection to be just one among many valid approaches. The article then builds upon this framework to propose seven guidelines for the radical redesign and improvement of formal technical review during the next twenty years.
Abstract: Formal technical review (FTR) is a cornerstone of software quality assurance. However, the labor-intensive and manual nature of review, along with basic unresolved questions about its process and products, means that review is typically under-utilized or inefficiently applied within the software development process. This paper discusses our initial experiments using CSRS, an instrumented, computer-supported cooperative work environment for software review that reduces the manual, labor-intensive nature of review activities and supports quantitative study of the process and products of review. Our results indicate that CSRS increases both the breadth and depth of information captured per person-hour of review time, and that its design captures interesting measures of review process, products, and effort.
Abstract: Formal technical review (FTR) is an essential component of all modern software quality assessment, assurance, and improvement techniques, and is acknowledged to be the most cost-effective form of quality improvement when practiced effectively. However, traditional FTR methods such as inspection are very difficult to adopt in organizations: they introduce substantial new up-front costs, training, overhead, and group process obstacles. Sustained commitment from high-level management along with substantial resources is often necessary for successful technology transfer of FTR. Since 1991, we have been designing and evaluating a series of versions of a system called CSRS: an instrumented, computer-supported cooperative work environment for formal technical review. The current version of CSRS includes an FTR method definition language, which allows organizations to design their own FTR method, and to evolve it over time. This paper describes how our approach to computer supported FTR can address some of the issues in technology transfer of FTR.
Abstract: Reviews are seen as an important aid to achieving high quality software. But, there is considerable variance in the literature. Three seminal review types - Fagan's inspection, Freedman and Weinberg's technical review, and Yourdon's structured walkthrough - are briefly described. A framework is presented to identify core features and variants. The core components represent an implicit normative theory of reviews; the variants indicate possible variations to the theory. There are indications emerging that this theory may be flawed. This framework provides a foundation from which to empirically test the core components of the theory and its variants.
Abstract: The most common techniques for detecting defects in software artifacts are inspection and testing. Since both techniques are effort consuming, they are often presented as being counterparts or even rivals rather than as being complementary. Hence, few controlled empirical studies investigate the effects of inspection and testing on software quality when applied in sequence. This paper contributes a controlled experiment to shed light on this issue. Twenty subjects sequentially performed code inspection and structural testing using different coverage values as test criteria on a C-code module. We adopted this sequence because it is recommended for use in industry.
The results of this experiment show that inspection significantly outperforms structural testing with respect to (cost-)effectiveness for defect detection. Furthermore, the experimental results indicate little evidence to support the hypothesis that structural testing detects defects of a particular class that were missed by inspection and vice versa. These findings lead us to the conclusion that inspection and structural testing do not complement each other well. In fact, prior inspection seems to hinder the (cost-)effectiveness of structural testing. Since inspection outperforms structural testing and since 39 percent (on average) of the defects were not detected at all, it might be more valuable to apply inspection together with other testing techniques, such as boundary value analysis, to achieve a better defect coverage.
We are aware that a single experiment does not provide conclusive evidence. Hence, we consider it only one step in the determination of the optimal mix of defect detection techniques. Additional research as well as replication of this experiment are required to make further progress in this direction.
Abstract: It is widely accepted that software development technical reviews (SDTRs) (inspections) are a useful technique for finding defects in software products. Recent debates centre around the need for review meetings (Porter and Votta 1994, Porter et al 1995, McCarthy et al 1996, Lanubile and Visaggio 1996). This paper presents the findings of an experiment that was conducted to investigate the performance advantage of interacting groups over average individuals and nominal groups. We found that interacting groups outperform the average individuals, as is implicit in the normative SDTR theory (Kim et al 1995). The source of performance advantage of interacting groups is not in finding defects, but rather in discriminating between true defects and false positives. We also found that nominal groups might be an alternative review design in situations where individuals exhibit a low level of false positives.
Abstract: Software inspection is an effective method of defect detection. Recent research activity has considered the development of tool support to further increase the efficiency and effectiveness of inspection, resulting in a number of prototype tools being developed. However, no comprehensive evaluations of these tools have been carried out to determine their effectiveness in comparison with traditional paper-based inspection. This issue must be addressed if tool-supported inspection is to become an accepted alternative to, or even replace, paper-based inspection.
This paper describes a controlled experiment comparing the effectiveness of tool-supported software inspection with paper-based inspection, using a new prototype software inspection tool known as ASSIST (Asynchronous/Synchronous Software Inspection Support Tool). 43 students used ASSIST and paper-based inspection to inspect two C++ programs of approximately 150 lines. The subjects performed both individual inspection and a group collection meeting, representing a typical inspection process. It was found that subjects performed equally well with tool-based inspection as with paper-based, measured in terms of the number of defects found, the number of false positives reported, and meeting gains and losses.
Abstract: The benefits of the object-oriented paradigm are widely cited. At the same time, inspection is deemed to be the most cost-effective means of detecting defects in software products. Why then, is there no published experience, let alone quantitative data, on the application of inspection to object-oriented systems? We describe the facilities of the object-oriented paradigm and the issues that these raise when inspecting object-oriented code. Several problems are caused by the disparity between the static code structure and its dynamic runtime behaviour. The large number of small methods in object-oriented systems can also cause problems. We then go on to describe three areas which may help mitigate problems found. Firstly, the use of various programming methods may assist in making object-oriented code easier to inspect. Secondly, improved program documentation can help the inspector understand the code which is under inspection. Finally, tool support can help the inspector to analyse the dynamic behaviour of the code. We conclude that while both the object-oriented paradigm and inspection provide excellent benefits on their own, combining the two may be a difficult exercise, requiring extensive support if it is to be successful.
Abstract: Software inspection is a widely used method for finding defects in all types of software development documents. There are many variations on the method, each of which is designed to be used under certain circumstances or to address some perceived fault in those which already exist. A desirable attribute of inspections is that they are rigorous, i.e. the process is executed identically for each inspection that takes place. This allows feedback from each inspection to give guidance on expected future performance and also to suggest future improvements. Recent work in tool support for inspection is designed to tackle the issue of enforcing a rigorous inspection, but these tools tend to concentrate on enforcing only one, usually proprietary, inspection variation. This paper investigates existing inspection methods and derives a generic inspection process which can be used to describe any of these methods. This process is then used to determine a notation for describing any inspection, which can consequently be used as input to an inspection support tool, allowing the support of any inspection method.
Abstract: This document is a catalog to be used with The Craft of Software Testing, a text I use in on-site and correspondence courses.
For each procedure (or other chunk of code), scan the catalog, first checking whether the question applies anywhere, next whether the answer is yes. A yes answer means a probable bug. The questions are predominantly for bugs that dynamic testing is poor at discovering. Other bugs are better found via testing.
This catalog is to be kept short. A catalog with too many entries will not be used.
Abstract: We hypothesize that inspection meetings are far less effective than many people believe and that meetingless inspections are equally effective. However, two of our previous industrial case studies contradict each other on this issue. Therefore, we are conducting a multi-trial, controlled experiment to assess the benefits of inspection meetings and to evaluate alternative procedures.
The experiment manipulates four independent variables: (1) the inspection method used (two methods involve meetings, one method does not), (2) the requirements specification to be inspected (there are two), (3) the inspection round (each team participates in two inspections), and (4) the presentation order (either specification can be inspected first).
For each experiment we measure 3 dependent variables: (1) the individual fault detection rate, (2) the team fault detection rate, and (3) the percentage of faults originally discovered after the initial inspection phase (during which phase reviewers individually analyze the document).
So far we have completed one run of the experiment with 21 graduate students in computer science at the University of Maryland as subjects, but we do not yet have enough data points to draw definite conclusions. Rather than presenting preliminary conclusions, this article (1) describes the experiment's design and the provocative hypotheses we are evaluating, (2) summarizes our observations from the experiment's initial run, and (3) discusses how we are using these observations to verify our data collection instruments and to refine future experimental runs.
Abstract: Software inspection is one of the best methods of verifying software documents. Software inspection is a complex process, with many possible variations, most of which have received little or no evaluation. This paper reports on the evaluation of one component of the inspection process, detection aids, specifically using Scenario or Checklist approaches. The evaluation is by subject-based experimentation, and is currently one of three independent experiments on the same hypothesis. The paper describes the experimental process, the resulting analysis of the experimental data, and attempts to compare the results in this experiment with the other experiments.
The Inspection Method is a proven technique for finding and removing defects in specifications, software, documentation, and other deliverables as early as possible. Inspection applies the concepts of statistical process control to produce high-quality deliverables at minimum cost. Inspections can be used on ANY written document -- specifications, source code, contracts, test plans, test cases, etc. Inspections Methodology was developed by Michael Fagan of IBM ("Design and code inspections to reduce errors in program development", 1976), and has undergone continuous quality improvement itself. Fagan updated his paper in 1986 ("Advances in Software Inspections"), but a key contributor has been Tom Gilb ("Managing the Software Process"). Inspections have been used at IBM, Bell-Northern Research, Tandem, and many other corporations to find defects faster, and hence at lower cost.
Abstract: Software engineering research has focused primarily on software construction, neglecting software maintenance and evolution. Observed is a shift in research from synthesis to analysis. The process of reverse engineering is introduced as an aid in program understanding. This process is concerned with the analysis of existing software systems to make them more understandable for maintenance, re-engineering, and evolution purposes. Presented is reverse engineering technology developed as part of the Rigi project. The Rigi approach involves the identification of software artifacts in the subject system and the aggregation of these artifacts to form more abstract system representations. Early industrial experience has shown that software engineers using Rigi can quickly build mental models from the discovered abstractions that are compatible with the mental models formed by the maintainers of the underlying software.
Abstract: Although there exists a multitude of different inspection processes, the basic process has remained unchanged since it was first defined by Fagan in 1976. The process has as its central component an inspection meeting which all participants attend. But is this meeting cost effective? Recent work suggests this is not the case.
An inspection model that dispenses totally with the need for the inspectors to be in the same place at the same time is presented. It replaces the meeting with further individual inspections combined with asynchronous communication between inspectors.
A prototype tool has been developed that supports the asynchronous model. In contrast to a previously developed asynchronous inspection tool, it uses electronic mail as the basis for communication and the reasons for this approach are discussed.
The inspection model is evaluated in comparison with the traditional, meeting-oriented approach on a number of criteria. An initial attempt was made to gain quantitative data by carrying out a small-scale experiment, but whilst encouraging results were obtained, the number of subjects was too low for any significant conclusions to be drawn. Larger scale experiments are planned for the future to obtain more data.
The Software Formal Inspections Guidebook is designed to support the inspection process of software developed by and for NASA. This document provides information on how to implement a recommended and proven method for conducting formal inspections of NASA software.
This Guidebook is a companion document to NASA Standard 2202-93, Software Formal Inspections Standard, approved April 1993, which provides the rules, procedures, and specific requirements for conducting software formal inspections. Application of the Formal Inspections Standard is optional to NASA program or project management. In cases where program or project management decide to use the formal inspections method, this Guidebook provides additional information on how to establish and implement the process.
The goal of the formal inspections process as documented in the above-mentioned Standard and this Guidebook is to provide a framework and model for an inspection process that will enable the detection and elimination of defects as early as possible in the software life cycle. An ancillary aspect of the formal inspection process incorporates the collection and analysis of inspection data to effect continual improvement in the inspection process and the quality of the software subjected to the process.
The Software Formal Inspections Standard is designed to support the inspection process of software developed for NASA. Its goal is to provide a framework and model for an inspection process that will detect and eliminate defects as early as possible in the software life cycle. This Standard will have been successfully applied if it accomplishes the following:
Abstract: Disseminating information, maintaining artifact consistency, and scheduling coordinated activities are critical problems in any large-scale software development. Inadequate management of this "process overhead" can increase rework effort, decrease quality, and lengthen interval. These problems are greatly magnified when a development team is divided across two or more geographically separate locations.
For example, in traditional development settings, conflicts in scheduling meetings account for a significant portion of inspection interval. In a distributed development, inspection interval is lengthened still more by delays resulting from time-zone mismatches, travel to meetings, and long-distance (sometimes international) mailings.
In this article we present a tool, HyperCode, that supports meetingless software inspections with geographically-distributed reviewers. HyperCode is a platform independent tool, developed on top of an internet browser, that integrates seamlessly into the current development process. By seamless we mean the tool produces a paper flow that is almost identical to the current inspection process, and is consistent with ISO certification. Furthermore, HyperCode's user acceptance has been excellent.
More importantly we evaluated and compared the cost-effectiveness of HyperCode inspections with that of manual inspections. We found that the cost savings from reduced paper work and the time savings from faster distribution of the inspection package have been substantial. These savings together with the seamless integration into the existing process appear to be the major reasons for the tool's acceptance.
From our viewpoint as experimentalists, however, this acceptance came too readily and too easily: our control group insisted on using HyperCode. Therefore, we were unable to directly assess HyperCode's impact on inspection quality. Nevertheless, by using historical data we can show that meetingless inspections (like those supported by HyperCode) are at least as effective as traditional inspection with meetings.
Abstract: We have conducted an industrial experiment to assess the cost-benefit tradeoffs of several software inspection processes. Our results to date explain the variation in observed effectiveness very well, but are unable to satisfactorily explain variation in inspection interval.
In this article we examine the effect of a new factor - process environment - on inspection interval (calendar time needed to complete the inspection). Our analysis suggests that process environment does indeed influence inspection interval. In particular, we found that non-uniform work priorities, time-varying workloads, and deadlines have significant effects.
Moreover, these experiences suggest that regression models are inherently inadequate for interval modeling, and that queueing models may be more effective.
Abstract: In a previous experiment, we determined how various changes in three structural elements of the software inspection process (team size, and number and sequencing of sessions) altered effectiveness and interval. Our results showed that such changes did not significantly influence the defect detection rate, but that certain combinations of changes dramatically increased the inspection interval.
We also observed a large amount of unexplained variance in the data, indicating that other factors must be affecting inspection performance. The nature and extent of these other factors now have to be determined to ensure that they had not biased our earlier results. Also, identifying these other factors might suggest additional ways to improve the efficiency of inspection.
Acting on the hypothesis that the "inputs" into the inspection process (reviewers, authors, and code units) were significant sources of variation, we modeled their effects on inspection performance. We found that they were responsible for much more variation in defect detection than was process structure. This leads us to conclude that better defect detection techniques, not better process structures, are the key to improving inspection effectiveness.
The combined effects of process inputs and process structure on the inspection interval accounted for only a small percentage of the variance in inspection interval. Therefore, there still remain other factors which need to be identified.
Abstract: We conducted a long-term experiment to compare the costs and benefits of several different software inspection methods. These methods were applied by professional developers to a commercial software product they were creating. Because the laboratory for this experiment was a live development effort, we took special care to minimize cost and risk to the project, while maximizing our ability to gather useful data. This article has several goals: 1) to describe the experiment's design and show how we used simulation techniques to optimize it, 2) to present our results and discuss their implications for both software practitioners and researchers, and 3) to discuss several new questions raised by our findings. For each inspection, we randomly assigned three independent variables: 1) the number of reviewers on each inspection team (1, 2, or 4), 2) the number of teams inspecting the code unit (1 or 2), and 3) the requirement that defects be repaired between the first and second team's inspections. The reviewers for each inspection were randomly selected without replacement from a pool of 11 experienced software developers. The dependent variables for each inspection included inspection interval (elapsed time), total effort, and the defect detection rate. Our results showed that these treatments did not significantly influence the defect detection effectiveness, but that certain combinations of changes dramatically increased the inspection interval.
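The random assignment described in this abstract can be illustrated with a minimal sketch. It assumes equal probability for every treatment level and uses hypothetical reviewer names; the abstract does not state the actual probabilities, and it says simulation was used to optimize the design, so this is not the authors' procedure.

```python
import random

# Hypothetical pool of 11 reviewers (names are placeholders).
REVIEWER_POOL = [f"reviewer_{i:02d}" for i in range(1, 12)]


def assign_treatment(rng):
    """Draw one inspection's treatment and its reviewers.

    Assumption: each level is equally likely. The abstract does not
    give the real assignment probabilities.
    """
    team_size = rng.choice([1, 2, 4])   # reviewers per team
    num_teams = rng.choice([1, 2])      # teams inspecting the code unit
    # Repair between sessions is only meaningful with two teams.
    repair_between = rng.choice([True, False]) if num_teams == 2 else None
    # Sample without replacement so no reviewer serves twice on one inspection.
    chosen = rng.sample(REVIEWER_POOL, team_size * num_teams)
    teams = [chosen[i * team_size:(i + 1) * team_size] for i in range(num_teams)]
    return {
        "team_size": team_size,
        "num_teams": num_teams,
        "repair_between": repair_between,
        "teams": teams,
    }


if __name__ == "__main__":
    rng = random.Random(1996)  # fixed seed so the draw is repeatable
    print(assign_treatment(rng))
```

The largest treatment (two teams of four) needs eight reviewers, which fits within the pool of 11, so sampling without replacement never fails.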
Abstract: Software review is a fundamental tool for software quality assurance. Nevertheless, there are significant controversies as to the most efficient and effective review method. One of the most important questions currently being debated is the utility of meetings. Although almost all industrial review methods are centered around the inspection meeting, recent findings call their value into question. In prior research the authors of this paper separately and independently conducted controlled experimental studies to explore this issue. This paper presents new research to understand the broader implications of these two studies. To do this, we designed and carried out a process of "reconciliation" in which we established a common framework for the comparison of the two experimental studies, reanalyzed the experimental data with respect to this common framework, and compared the results. Through this process we found many striking similarities between the results of the two studies, strengthening their individual conclusions. It also revealed interesting differences between the two experiments, suggesting important avenues for future research.
Abstract: Software requirements specifications (SRS) are often validated manually. One such process is inspection, in which several reviewers independently analyze all or part of the specification and search for faults. These faults are then collected at a meeting of the reviewers and author(s).
Usually, reviewers use Ad Hoc or Checklist methods to uncover faults. These methods force all reviewers to rely on non-systematic techniques to search for a wide variety of faults. We hypothesize that a Scenario-based method, in which each reviewer uses different, systematic techniques to search for different, specific classes of faults, will have a significantly higher success rate.
We evaluated this hypothesis using a 3x2 partial factorial randomized experimental design. Forty-eight graduate students in computer science participated in the experiment. They were assembled into sixteen three-person teams. Each team inspected two SRS using some combination of Ad Hoc, Checklist or Scenario methods.
For each inspection we performed four measurements: 1) individual fault detection rate, 2) team fault detection rate, 3) percentage of faults first identified at the collection meeting (meeting gain rate), and 4) percentage of faults first identified by an individual but never reported at the collection meeting (meeting loss rate).
The experimental results are that 1) the Scenario method had a higher fault detection rate than either Ad Hoc or Checklist methods, 2) Scenario reviewers were more effective at detecting the faults their scenarios were designed to uncover, and were no less effective at detecting other faults than either Ad Hoc or Checklist reviewers, 3) Checklist reviewers were no more effective than Ad Hoc reviewers, and 4) collection meetings produced no net improvement in fault detection rate - meeting gains were offset by meeting losses.
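The meeting gain and loss rates named in this abstract can be made concrete with a small sketch. It assumes that rates are percentages of the total number of known faults in the document and that the team detection rate counts faults recorded at the collection meeting; the abstract does not state either denominator, and the function name and fault IDs below are hypothetical.

```python
def meeting_rates(individual_findings, meeting_report, total_faults):
    """Compute team detection, meeting gain, and meeting loss rates.

    individual_findings: one set of fault IDs per reviewer (preparation).
    meeting_report: set of fault IDs recorded at the collection meeting.
    total_faults: number of known faults in the document (assumed denominator).
    """
    found_in_prep = set().union(*individual_findings)
    gains = meeting_report - found_in_prep    # first identified at the meeting
    losses = found_in_prep - meeting_report   # found earlier, never reported
    return {
        "team_detection_rate": 100 * len(meeting_report) / total_faults,
        "meeting_gain_rate": 100 * len(gains) / total_faults,
        "meeting_loss_rate": 100 * len(losses) / total_faults,
    }


if __name__ == "__main__":
    # Hypothetical three-person team inspecting a document with 10 known faults.
    prep = [{1, 2, 3}, {2, 4}, {5}]
    meeting = {1, 2, 4, 6}
    print(meeting_rates(prep, meeting, total_faults=10))
    # {'team_detection_rate': 40.0, 'meeting_gain_rate': 10.0, 'meeting_loss_rate': 20.0}
```

In this made-up example the meeting adds one fault and drops two, so the loss outweighs the gain. The abstract's fourth result reports the same qualitative pattern: gains did not exceed losses.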
Abstract: Software requirements specifications (SRS) are usually validated by inspections, in which several reviewers read all or part of the specification and search for defects. We hypothesize that different methods for conducting these searches may have significantly different rates of success. Using a controlled experiment, we show that a Scenario-based detection method, in which each reviewer executes a specific procedure to discover a particular class of defects, has a higher defect detection rate than either Ad Hoc or Checklist methods. We describe the design, execution, and analysis of the experiment so others may reproduce it and test our results for different kinds of software developments and different populations of software engineers.
Abstract: A software tool called EXPLAINER has been developed for helping programmers perform new tasks by exploring previously worked-out examples. EXPLAINER is based on cognitive principles of learning from examples and problem solving by analogy. The interface is based on the principle of making examples accessible through multiple presentation views and multiple representation perspectives. Empirical evaluation has shown that programmers using EXPLAINER exhibit less variability in their performance compared to programmers using a commercially available, searchable on-line manual. These results are related to other studies of programmers and to current methodologies in software engineering.
Abstract: We developed a software tool called EXPLAINER for helping programmers complete new tasks by exploring previously worked-out examples. The implementation is based on the principle of making examples accessible through multiple perspectives and, specifically, perspectives that emphasize the programming plans underlying an example. The initial version of EXPLAINER used a simple, semantic network to represent multiple perspectives. A frame-based knowledge representation language called FrameTalk provides a more structured means of representing examples in EXPLAINER. Moreover, FrameTalk provides mechanisms that avoid deficiencies that arise when concept taxonomies must serve the dual purpose of representing specialization and composition of attributes.
Abstract: Software inspections are widely regarded as a cost-effective mechanism for removing defects in software, though performing them does not always reduce the number of customer-discovered defects. We present a case study in which an attempt was made to reduce such defects through inspection training that introduced program comprehension ideas. The training was designed to address the problem of understanding the artifact being reviewed, as well as other perceived deficiencies of the inspection process itself. Measures, both formal and informal, suggest that explicit training in program understanding may improve inspection effectiveness.
Abstract: This paper describes a software engineering experiment designed to confirm results from an earlier project which measured fault detection rates in user requirements documents (URD). The experiment described in this paper involves the creation of a standardized URD with a known number of injected faults of specific types. Nine independent inspection teams were given this URD with instructions to locate as many faults as possible using the N-fold requirements inspection technique developed by the authors. Results obtained from this experiment confirm earlier conclusions about the low rate of fault detection in requirements documents using formal inspections and the advantages to be gained using the N-fold inspection method. The experiment also provides new results concerning variability in inspection team performance and the relative difficulty of locating different classes of SRD faults.
Abstract: The importance of Software Review or Formal Technical Review (FTR) and its benefits have been well documented. However, there are many variations of the method in practice, especially those related to the group process. This paper discusses a new approach to how organizations can build their own review systems that are most suitable to them. Our basic approach is to use CSRS modeling languages to characterize the review method descriptively. The language descriptions are then compiled to generate the corresponding review systems. CSRS modeling languages are developed based on an FTR framework that models both variations in the group process and review strategies exhibited by current FTR methods.
Abstract: This document describes a pilot experiment that compares the cost effectiveness of a group-based review method (EGSM) to that of an individual-based review method (EIAM) using CSRS. In this pilot study, no significant differences in review effectiveness and review cost were found. This document provides complete details on the procedures and outcomes from this pilot study, as well as the lessons learned which will be applied to an upcoming experimental study.
Abstract: Formal Technical Review (FTR) plays an important role in modern software development. It can improve the quality of software products and the quality and productivity of their development processes. However, the effectiveness of current FTR practice is hampered by uncertainty and ambiguity. This research investigated two issues. First, what differences exist among current FTR methods? Second, what are potential review factors that impact upon the effectiveness of these methods? The approach taken by this research was to first develop an FTR framework, based on a review of literature in the field. The framework allows one to determine the similarities and differences between the review process of FTR methods, as well as to identify potential review factors. Specifically, it describes a review method in terms of seven components of a review process: phase, objective, degree of collaboration, synchronicity, role, technique, entry/exit criteria. By looking at the values of individual components, one can compare and contrast different FTR methods. Furthermore, by investigating these values empirically, one can methodically improve the practice of FTR. Second, a computer based review system, called CSRS, was developed to implement the framework. The system provides a set of declarative modeling languages, which allow one to create a wide variety of FTR methods, or to design experiments to compare the performance of two or more review methods, or to evaluate a set of review factors within a method. Finally, this research involved an empirical study using CSRS to investigate the effectiveness of a group process versus an individual process in finding program faults. Two review methods/systems were implemented using CSRS: EGSM (used by real groups) and EIAM (used by nominal groups). The experiment involved 24 groups of students (3 students per group), each reviewing two sets of source code, once using EGSM and once using EIAM.
The experiment found that there were no significant differences in detection effectiveness between the two methods, that synergy was observed in EGSM but did not contribute significantly to the total faults found, and that EGSM incurred higher cost than EIAM, but was significantly more effective in filtering out false positives.
Comments should be sent to
Richard Upchurch (rupchurch@umassd.edu)
This document
Created: June 18, 1996
by RLU
Modified: January 27, 1998