Bibliography

Defects and Defect Detection


Also consider references regarding Code Reading, reviews and inspections, and the WWW Formal Technical Review Archive maintained by P. Johnson, Department of Information and Computer Sciences, University of Hawaii.


Barnard, J. & A. Price (1994) Managing Code Inspection Information. IEEE Software, March 1994, p. 59-69.

 

Basili, V. R. & R. W. Selby (1987) Comparing the Effectiveness of Software Testing Strategies. IEEE Trans. SE, 13. p. 1278-1296.

Abstract: This study applies an experimentation methodology to compare three state-of-the-practice software testing techniques: a) code reading by stepwise abstraction, b) functional testing using equivalence partitioning, and c) structural testing using 100 percent statement coverage criteria. The study compares the strategies in three aspects of software testing: fault detection effectiveness, fault detection cost, and classes of faults detected. Thirty-two professional programmers and 42 advanced students applied the three techniques to four unit-sized programs in a fractional factorial experimental design. The major results of this study are the following. 1) With the professional programmers, code reading detected more software faults and had a higher fault detection rate than did functional or structural testing, but functional and structural testing were not different in fault detection rate. 2) In one advanced student subject group, code reading and functional testing were not different in faults found, but were both superior to structural testing, while in the other advanced student subject group there was no difference among the techniques. 3) With the advanced student subjects, the three techniques were not different in fault detection rate. 4) Number of faults observed, fault detection rate, and total effort in detection depended on the type of software tested. 5) Code reading detected more interface faults than did the other methods. 6) Functional testing detected more control faults than did the other methods. 7) When asked to estimate the percentage of faults detected, code readers gave the most accurate estimates while functional testers gave the least accurate estimates.

[Biffl03] Biffl, Stefan and Halling, Michael (2003) Investigating the Defect Detection Effectiveness and Cost Benefit of Nominal Inspection Teams. IEEE Transactions on Software Engineering, Vol. 29, No. 5, May 2003, p. 385-397.

Abstract:

Inspection is an effective but also expensive quality assurance activity to find defects early during software development. The defect detection process, team size, and staff hours invested can have a considerable impact on the defect detection effectiveness and cost-benefit of an inspection. In this paper, we use empirical data and a probabilistic model to estimate this impact for nominal (noncommunicating) inspection teams in an experiment context. Further, the analysis investigates how cutting off the inspection after a certain time frame would influence inspection performance. Main findings of the investigation are: 1) Using combinations of different reading techniques in a team is considerably more effective than using the best single technique only (regardless of the observed level of effort). 2) For optimizing the inspection performance, determining the optimal process mix in a team is more important than adding an inspector (above a certain team size) in our model. 3) A high level of defect detection effectiveness is much more costly to achieve than a moderate level since the average cost for the defects found by the inspector last added to a team increases more than linearly with growing effort investment. The work provides an initial baseline of inspection performance with regard to process diversity and effort in inspection teams. We encourage further studies on the topic of time usage with defect detection techniques and its effect on inspection effectiveness in a variety of inspection contexts to support inspection planning with limited resources.

Card, D. N. (1998) Learning from our mistakes with defect causal analysis. IEEE Software, January/February 1998, p. 56-63.

 

[Chatzigeorgiou03] Chatzigeorgiou, A. & Antoniadis, G. (2003) Efficient management of inspections in software development projects. Information and Software Technology 45(10). pp 671-680.

Abstract:

During the last two decades a universal agreement has been established on the fact that software inspections play a fundamental role in improving software quality. The number of software organizations that have incorporated formal reviews in their development process is constantly increasing and the belief that efficient inspections can not only detect defects but also reduce cycle time and lower costs is spreading. However, despite the importance of the inspections in a software development project, scheduling of inspections has not been given the necessary attention so far. As a result, inspections tend to accumulate towards internal project deadlines, possibly leading to excess overtime costs, quality degradation and difficulties in meeting milestones. In this paper, data from a major telecommunications software project is analyzed in an effort to illustrate the problems that can arise from inefficient planning of inspections and their related activities.

Chillarege, R., I. S. Bhandari, J. K. Chaar, M. J. Halliday, D. S. Moebus, B. K. Ray, & M. Wong (1992) Orthogonal Defect Classification - A Concept for In-Process Measurements. IEEE Transactions on Software Engineering, SE-18. p. 943-956.

This paper describes orthogonal defect classification (ODC), a concept that enables in-process feedback to developers by extracting signatures on the development process from defects. The ideas are evolved from an earlier finding that demonstrates the use of semantic information from defects to extract cause-effect relationships in the development process. This finding is leveraged to develop a systematic framework for building measurement and analysis methods. This paper

Duncan, I. & D. Robson (1991) An Exploratory Study of Common Coding Faults in C Programs. Computer Science, University of Durham, DH1 3LE, UK. (pdf) (Computer Science Technical Report 2/96)

Abstract: Large scale code testing can be made viable by determining and searching for the most probable faults in the system under examination. The results of an exploratory survey carried out to determine the common error factors for code written in C indicate that the program task and the programmer experience are important considerations. Using this information, testing can be directed efficiently towards the removal of prevalent coding faults. The results of the survey can also be useful in determining features of C which are likely to become the subject of corrective maintenance.

Duncan, I., D. Robson & M. Munro (1996) Defect Detection in Code. Computer Science, University of Durham, DH1 3LE, UK. (Computer Science Technical Report 2/96) (pdf)

Abstract: To allow testers to know the types of faults they are looking for and to detect fault commonality and criticality, it is important to categorise code defects. The paper reviews testing techniques and taxonomies and considers fault clustering and isolation.

Dunsmore, A., M. Roper, & M. Wood (2000) The Role of Comprehension in Software Inspection. Journal of Systems and Software 52, p. 121-129.

Abstract:

Ebrahimi, N. B. (1997) On the Statistical Analysis of the Number of Errors Remaining in a Software Design Document After Inspection. IEEE Transactions on Software Engineering, SE-23. p. 529-532.

Abstract: Sometimes complex software systems fail because of faults introduced in the requirements and design stages of the development process. Reviewing documents related to requirements and design by several reviewers can remove some of these faults but often a few remain undetected until the software is developed. In this paper, we propose a procedure leading to the estimate of the number of faults which are not discovered. The main advantage of our procedure is that we do not need the standard assumption of independence among reviewers.

Fagan, M. E. (1986) Advances in Software Inspections. IEEE Transactions on Software Engineering, SE-12. p. 744-751.

This paper presents new studies and experiences that enhance the use of the inspection process and improve its contribution to development of defect-free software on time and at lower costs. Examples of benefits are cited followed by descriptions of the process and some methods of obtaining the enhanced results.

Software inspection is a method of static testing to verify that software meets its requirements. It engages the developers and others in a formal process of investigation that usually detects more defects in the product - and at a lower cost - than does machine testing. Users of the method report very significant improvements in quality that are accompanied by lower development costs and greatly reduced maintenance efforts. Excellent results have been obtained by small and large organizations in all aspects of new development as well as in maintenance. There is some evidence that developers who participate in the inspection of their own product actually create fewer defects in future work. Because inspections formalize the development process, productivity and quality enhancing tools can be adopted more easily and rapidly.

Fagan, M. E. (1976) Design and Code Inspections to Reduce Errors in Program Development. IBM Systems Journal, 15. p. 182-211.

Abstract: Substantial net improvements in programming quality and productivity have been obtained through the use of formal inspections of design and of code. Improvements are made possible by a systematic and efficient design and code verification process, with well-defined roles for inspection participants. The manner in which inspection data is categorized and made suitable for process analysis is an important factor in attaining the improvements. It is shown that by using inspection results, a mechanism for initial error reduction followed by ever-improving error rates can be achieved.

[Fenton99] Fenton, N. & Neil, M. (1999) A Critique of Software Defect Prediction Models. IEEE Transactions on Software Engineering, Vol. 25, No. 5, September/October 1999, p. 675-689.

Abstract:

Many organizations want to predict the number of defects (faults) in software systems, before they are deployed, to gauge the likely delivered quality and maintenance effort. To help in this numerous software metrics and statistical models have been developed, with a correspondingly large literature. We provide a critical review of this literature and the state-of-the-art. Most of the wide range of prediction models use size and complexity metrics to predict defects. Others are based on testing data, the "quality" of the development process, or take a multivariate approach. The authors of the models have often made heroic contributions to a subject otherwise bereft of empirical studies. However, there are a number of serious theoretical and practical problems in many studies. The models are weak because of their inability to cope with the, as yet, unknown relationship between defects and failures. There are fundamental statistical and data quality problems that undermine model validity. More significantly many prediction models tend to model only part of the underlying problem and seriously misspecify it. To illustrate these points the "Goldilock's Conjecture," that there is an optimum module size, is used to show the considerable problems inherent in current defect prediction approaches. Careful and considered analysis of past and new results shows that the conjecture lacks support and that some models are misleading. We recommend holistic models for software defect prediction, using Bayesian Belief Networks, as alternative approaches to the single-issue models used at present. We also argue for research into a theory of "software decomposition" in order to test hypotheses about defect introduction and help construct a better science of software engineering.

Gintell, J., J. Arnold, M. Houde, J. Kruszelnicki, R. McKenney, & G. Memmi (1993) Scrutiny: A Collaborative Inspection and Review System. US Applied Research Lab, Bull HN Information Systems, Inc.

Abstract: This paper describes a Bull US Applied Research Laboratory project to build a collaborative inspection and review system called Scrutiny using Conversation Builder from the University of Illinois at Urbana-Champaign. The project has several distinct aspects: technology oriented research, prototype building, experimentation and tools deployment/technology transfer. Described are the design of the current operational version of Scrutiny for inspection-only, the evolutionary design of Scrutiny to handle various forms of review, and some initial thoughts on integration with other CASE frameworks and tools. The problem domain selected, the development environment, lessons learned thus far, some ideas from related work, and the problems anticipated are discussed here.

Jones, C. (1996) Software Defect Removal Efficiency. IEEE Computer, 29, April 1996, p. 94-95.

 

Kamsties, E. & C. M. Lott (1995) An Empirical Evaluation of Three Defect-Detection Techniques. University of Kaiserslautern, Germany, ISERN-95-02. (pdf)
Note: The version available online doesn't contain solutions for the experiment exercises.

Abstract: We replicated a controlled experiment first run in the early 1980's to evaluate the effectiveness and efficiency of 50 student subjects who used three defect-detection techniques to observe failures and isolate faults in small C programs. The three techniques were code reading by stepwise abstraction, functional (black-box) testing, and structural (white-box) testing. Two internal replications showed that our relatively inexperienced subjects were similarly effective at observing failures and isolating faults with all three techniques. However, our subjects were most efficient at both tasks when they used functional testing. Some significant differences among the techniques in their effectiveness at isolating faults of different types were seen. These results suggest that inexperienced subjects can apply a formal verification technique (code reading) as effectively as an execution-based validation technique, but they are most efficient when using functional testing.

Lanubile, F. & G. Visaggio (1996) Assessing Defect Detection Methods for Software Requirements Inspections Through External Replication. University of Bari, Italy, ISERN-96-01. (pdf)

Abstract: This paper presents the external replication of a controlled experiment which compared three defect detection techniques (Ad Hoc, Checklist, and Defect-based Scenario) for software requirements inspections, and evaluated the benefits of collection meetings after individual reviews. The results of our replication were partially different from those of the original experiment. Unlike the original experiment, we did not find any empirical evidence of better performance when using scenarios. To explain these negative findings we provide a list of hypotheses. On the other hand, the replication confirmed one result of the original experiment: the defect detection rate is not improved by the collection meetings.

The external replication was made possible by the existence of an experimental kit provided by the original investigators. We discuss what difficulties we encountered in applying the package to our environment, having different cultures and skills. We also discovered some critical problems in the original experiment which can be considered threats to its internal validity. Using our results, experience and suggestions, other researchers will be able to improve the original experimental design before attempting further replications.

Laitenberger, O. & J. DeBaud (2000) An Encompassing Life Cycle Centric Survey of Software Inspection. Journal of Systems and Software 51, p. 5-31.

Abstract:

Leszak, M., Perry, D. E., & Stoll, D. (2000) A case study in root cause defect analysis. Proceedings of the 22nd International Conference on Software Engineering, June 4-11, 2000, Limerick, Ireland, p. 428-437.

Abstract:

There are three interdependent factors that drive our software development processes: interval, quality and cost. As market pressures continue to demand new features ever more rapidly, the challenge is to meet those demands while increasing, or at least not sacrificing, quality. One advantage of defect prevention as an upstream quality improvement practice is the beneficial effect it can have on interval: higher quality early in the process results in fewer defects to be found and repaired in the later parts of the process, thus causing an indirect interval reduction.

We report a retrospective root cause defect analysis study of the defect Modification Requests (MRs) discovered while building, testing, and deploying a release of a transmission network element product. We subsequently introduced this analysis methodology into new development projects as an in-process measurement collection requirement for each major defect MR.

We present the experimental design of our case study discussing the novel approach we have taken to defect and root cause classification and the mechanisms we have used for randomly selecting the MRs to analyze and collecting the analyses via a web interface. We then present the results of our analyses of the MRs and describe the defects and root causes that we found, and delineate the countermeasures created to either prevent those defects and their root causes or detect them at the earliest possible point in the development process.

We conclude with lessons learned from the case study and resulting ongoing improvement activities.

Major, M. & McGregor, J. (1999) Using Guided Inspection to Validate UML Models.

 

O'Neill, D. (1997) Issues in Software Inspection. IEEE Software, 14, January/February 1997, p. 18-19.

 

Porter, A., Siy, H., Mockus, A. & Votta, L. (1998) Understanding the Sources of Variation in Software Inspections. ACM Transactions on Software Engineering and Methodology, Vol. 7, No. 1, January 1998, Pages 41-79.

Abstract:

In a previous experiment, we determined how various changes in three structural elements of the software inspection process (team size and the number and sequencing of sessions) altered effectiveness and interval. Our results showed that such changes did not significantly influence the defect detection rate, but that certain combinations of changes dramatically increased the inspection interval. We also observed a large amount of unexplained variance in the data, indicating that other factors must be affecting inspection performance. The nature and extent of these other factors now have to be determined to ensure that they had not biased our earlier results. Also, identifying these other factors might suggest additional ways to improve the efficiency of inspections. Acting on the hypothesis that the "inputs" into the inspection process (reviewers, authors, and code units) were significant sources of variation, we modeled their effects on inspection performance. We found that they were responsible for much more variation in defect detection than was process structure. This leads us to conclude that better defect detection techniques, not better process structures, are the key to improving inspection effectiveness. The combined effects of process inputs and process structure on the inspection interval accounted for only a small percentage of the variance in inspection interval. Therefore, there must be other factors which need to be identified.

Porter, A. & L. Votta (November/December 1997) What Makes Inspections Work? IEEE Software, p. 99-102.

 

Porter, A. A., H. P. Siy, C. A. Toman, & L. G. Votta (1997) An Experiment to Assess the Cost-Benefits of Code Inspections in Large Scale Software Development. IEEE Trans. on Software Engineering, 23. p. 329-346.

Abstract: We conducted a long-term experiment to compare the costs and benefits of several different software inspection methods. These methods were applied by professional developers to a commercial software product they were creating. Because the laboratory for this experiment was a live development effort, we took special care to minimize cost and risk to the project, while maximizing our ability to gather useful data. This article has several goals: 1) to describe the experiment's design and show how we used simulation techniques to optimize it, 2) to present our results and discuss their implications for both software practitioners and researchers, and 3) to discuss several new questions raised by our findings. For each inspection, we randomly assigned three independent variables: 1) the number of reviewers on each inspection team (1, 2, or 4), 2) the number of teams inspecting the code unit (1 or 2), and 3) the requirement that defects be repaired between the first and second team's inspections. The reviewers for each inspection were randomly selected without replacement from a pool of 11 experienced software developers. The dependent variables for each inspection included inspection interval (elapsed time), total effort, and the defect detection rate. Our results showed that these treatments did not significantly influence the defect detection effectiveness, but that certain combinations of changes dramatically increased the inspection interval.

Porter, A. A. & P. M. Johnson (1997) Assessing Software Review Meetings: Results of a Comparative Analysis of Two Experimental Studies. IEEE Trans. on Software Engineering, 23. p. 129-145.

Abstract: Software review is a fundamental tool for software quality assurance. Nevertheless, there are significant controversies as to the most efficient and effective review method. One of the most important questions currently being debated is the utility of meetings. Although almost all industrial review methods are centered around the inspection meeting, recent findings call their value into question. In prior research the authors of this paper separately and independently conducted controlled experimental studies to explore this issue. This paper presents new research to understand the broader implications of these two studies. To do this, we designed and carried out a process of "reconciliation" in which we established a common framework for the comparison of the two experimental studies, reanalyzed the experimental data with respect to this common framework, and compared the results. Through this process we found many striking similarities between the results of the two studies, strengthening their individual conclusions. It also revealed interesting differences between the two experiments, suggesting important avenues for future research.

Porter, A. A., L. G. Votta, Jr., & V. R. Basili (1995) Comparing Detection Methods for Software Requirements Inspections: A Replicated Experiment. IEEE Trans. on Software Engineering, 21. p. 563-575.

Abstract: Software requirements specifications (SRS) are often validated manually. One such process is inspection, in which several reviewers independently analyze all or part of the specification and search for faults. These faults are then collected at a meeting of the reviewers and author(s).

Usually, reviewers use Ad Hoc or Checklist methods to uncover faults. These methods force all reviewers to rely on non-systematic techniques to search for a wide variety of faults. We hypothesize that a Scenario-based method, in which each reviewer uses different, systematic techniques to search for different, specific classes of faults, will have a significantly higher success rate.

We evaluated this hypothesis using a 3x2 partial factorial randomized experimental design. Forty-eight graduate students in computer science participated in the experiment. They were assembled into sixteen three-person teams. Each team inspected two SRS using some combination of Ad Hoc, Checklist or Scenario methods.

For each inspection we performed four measurements: 1) individual fault detection rate, 2) team fault detection rate, 3) percentage of faults first identified at the collection meeting (meeting gain rate), and 4) percentage of faults first identified by an individual but never reported at the collection meeting (meeting loss rate).

The experimental results are that 1) the Scenario method had a higher fault detection rate than either Ad Hoc or Checklist methods, 2) Scenario reviewers were more effective at detecting the faults their scenarios are designed to uncover, and were no less effective at detecting other faults than either Ad Hoc or Checklist reviewers, 3) Checklist reviewers were no more effective than Ad Hoc reviewers, and 4) Collection meetings produced no net improvement in fault detection rate - meeting gains were offset by meeting losses.

Rifkin, S. & Deimel, L. (2000) Program Comprehension Techniques Improve Software Inspections: A Case Study. Proceedings of the 8th International Workshop on Program Comprehension (IWPC'00).

Abstract:

Software inspections are widely regarded as a cost-effective mechanism for removing defects in software, though performing them does not always reduce the number of customer-discovered defects. We present a case study in which an attempt was made to reduce such defects through inspection training that introduced program comprehension ideas. The training was designed to address the problem of understanding the artifact being reviewed, as well as other perceived deficiencies of the inspection process itself. Measures, both formal and informal, suggest that explicit training in program understanding may improve inspection effectiveness.

Rifkin, S. & L. Deimel (1994) Applying Program Comprehension Techniques to Improve Software Inspections. Presented at the 19th Annual NASA Software Engineering Laboratory Workshop, Greenbelt, MD, Nov. 30-Dec. 1, 1994. (pdf)

Abstract: Software inspections are widely regarded as a cost-effective mechanism for removing defects in software, though performing them does not always reduce the number of customer-discovered defects. We present a case study in which an attempt was made to reduce such defects through inspection training that introduced program comprehension ideas. The training was designed to address the problem of understanding the artifact being reviewed, as well as other perceived deficiencies of the inspection process itself. Measures, both formal and informal, suggest that explicit training in program understanding may improve inspection effectiveness.

Russell, G. W. (January 1991) Experience with Inspection in Ultralarge-Scale Developments. IEEE Software. p. 25-31.

 

[Sauer00] Sauer, C., Jeffery, D. R., Land, L. & Yetton, P. (2000) The Effectiveness of Software Development Technical Reviews: A Behaviorally Motivated Program of Research. IEEE Transactions on Software Engineering, Vol. 26, No. 1, January 2000, p. 1-14.

Abstract:

Software engineers use a number of different types of software development technical review (SDTR) for the purpose of detecting defects in software products. This paper applies the behavioral theory of group performance to explain the outcomes of software reviews. A program of empirical research is developed, including propositions to both explain review performance and identify ways of improving review performance based on the specific strengths of individuals and groups. Its contributions are to clarify our understanding of what drives defect detection performance in SDTRs and to set an agenda for future research. In identifying individuals' task expertise as the primary driver of review performance, the research program suggests specific points of leverage for substantially improving review performance. It points to the importance of understanding software reading expertise and implies the need for a reconsideration of existing approaches to managing reviews.

Seaman, C. & Basili, V. (1998) Communication and Organization: An Empirical Study of Discussion in Inspection Meetings. IEEE Transactions on Software Engineering, Vol. 24, No. 7, July 1998.

Abstract:

This paper describes an empirical study that addresses the issue of communication among members of a software development organization. In particular, data was collected concerning code inspections in one software development project. The question of interest is whether or not organizational structure (the network of relationships between developers) has an effect on the amount of effort expended on communication between developers. The independent variables in this study are various attributes of the organizational structure in which the inspection participants work. The dependent variables are measures of the communication effort expended in various parts of the code inspection process, focusing on the inspection meeting. Both quantitative and qualitative methods were used, including participant observation, structured interviews, generation of hypotheses from field notes, statistical tests of relationships, and interpretation of results with qualitative anecdotes. The study results show that past and present working relationships between inspection participants affect the amount of meeting time spent in different types of discussion, thus affecting the overall inspection meeting length. Reporting relationships and physical proximity also have an effect. The contribution of the study is a set of well-supported hypotheses for further investigation.

[Seaman97] Seaman, C. & Basili, V. (1997) An Empirical Study of Communication in Code Inspections. Proc. of the 1997 International Conference on Software Engineering, Boston, MA, May 17-24, 1997.

Abstract:

This paper describes an empirical study which addresses the issue of communication among members of a software development organization. In particular, data was collected concerning code inspections in one software development project. The question of interest is whether or not organizational structure (the network of relationships between developers) has an effect on the amount of effort expended on communication between developers. Both quantitative and qualitative methods were used, including participant observation, structured interviews, generation of hypotheses from field notes, some simple statistical tests of relationships, and interpretation of results with qualitative anecdotes. The study results show that past and present working relationships between inspection participants affect the amount of meeting time spent in different types of discussion, thus affecting the overall meeting length. Reporting relationships and physical proximity also have an effect, as well as the point in the project that an inspection occurs. All but the last of these factors are organizational structure relationships. The contribution of the study is a set of well-supported hypotheses for further investigation.

Siy, H. P. & Votta, L. () Does The Modern Code Inspection Have Value?

Abstract:

For years, it was believed that the value of inspections is in finding and fixing defects early in the development process. Otherwise, the cost to find and fix them later is much higher. However, in examining code inspection data, we are finding that inspections are beneficial for an additional reason. They make the code easier to understand and change. An analysis of data from a recent code inspection experiment shows that 60% of all issues raised in the code inspections are not problems that could have been uncovered by later phases of testing or field usage because they have little or nothing to do with the visible execution behavior of the software. Rather, they improve the maintainability of the code by making the code conform to coding standards, minimizing redundancies, improving language proficiency, improving safety and portability, and raising the quality of the documentation. We conclude that even if advances in software technology have diminished the value of inspections as a defect detection tool, in most cases, it continues to be of value as a maintenance tool.

Siy, H. P. (1996) Identifying the Mechanisms to Improve Code Inspection Costs and Benefits. Ph.D. Dissertation, University of Maryland.

Abstract:

Software inspections have long been considered to be an effective way to detect and remove defects from software. However, there are costs associated with carrying out inspections and these costs may outweigh the expected benefits.

It is important to understand the tradeoffs between these costs and benefits. We believe that these are driven by several mechanisms, both internal and external to the inspection process. Internal factors are associated with the manner in which the steps of the inspection are organized into a process (structure), as well as the manner in which each step is carried out (technique). External ones include differences in reviewer ability and code quality (inputs), and interactions with other inspections, the project schedule, personal calendars, etc. (environment).

We started a study to identify the mechanisms that strongly influence an inspection's costs and effectiveness. Most of the existing literature on inspections has discussed how to get the most benefit out of inspections by proposing changes to the process structure, but with little or no empirical work conducted to demonstrate how they worked better and at what cost.

We hypothesized that these changes will affect the defect detection effectiveness of the inspection, but that any increase in effectiveness will have a corresponding increase in inspection interval and effort. We evaluated this hypothesis with a controlled experiment on a live development project at Lucent Technologies, using professional software developers.

We found that these structural changes were largely ineffective in improving the effectiveness of inspections, but certain treatments dramatically increased the inspection interval. We also noted a large amount of unexplained variance in the data suggesting that other factors must have a strong influence on inspection performance.

On further investigation, we found that the inputs into the process (reviewers and code units) account for more of the variation than the original treatment variables, leading us to conclude that better techniques by which reviewers detect defects, not better process structures, are the key to improving inspection effectiveness.

Weiss, A. R. & Kimbrough, K. (1995) Inspection Guidelines and Standards

 

Weller, E. F. (1993) Lessons from Three Years of Inspection Data. IEEE Software, September 1993, p. 38-45.

*Wood, M., M. Roper, A. Brooks, J. Miller (1997) Comparing and Combining Software Defect Detection Techniques: a Replicated Empirical Study. Proceedings of The Sixth European Software Engineering Conference / Fifth ACM SIGSOFT Symposium on the Foundations of Software Engineering, September 1997, Mehdi Jazayeri, Helmut Schauer (Eds.), Lecture Notes in Computer Science, Volume 1301, p. 262-277.

Abstract: This report describes an empirical study comparing three defect detection techniques: a) code reading by stepwise abstraction, b) functional testing using equivalence partitioning and boundary value analysis, and c) structural testing using branch coverage. It is a replication of a study that has been carried out at least four times previously over the last 20 years. This study used 47 student subjects to apply the techniques to small C programs in a fractional factorial experimental design. The major findings of the study are: a) that the individual techniques are of broadly similar effectiveness in terms of observing failures and finding faults, b) that the relative effectiveness of the techniques depends on the nature of the program and its faults, c) these techniques are consistently much more effective when used in combination with each other. These results contribute to a growing body of empirical evidence that supports generally held beliefs about the effectiveness of defect detection techniques in software engineering.

Wohlin, C., Höst, M. & Ohlsson, M. D. (2000) Understanding the Sources of Software Defects: A Filtering Approach. Proceedings of the 8th International Workshop on Program Comprehension (IWPC'00).

Abstract:

This paper presents a method proposal of how to use product measures and defect data to enable understanding and identification of design and programming constructs that contribute more than expected to the defect statistics. The paper describes a method that can be used to identify the most defect-prone design and programming constructs and the method proposal is illustrated on data collected from a large software project in the telecommunication domain. The example indicates that it is feasible, based on defect data and product measures, to identify the main sources of defects in terms of design and programming constructs. Potential actions to be taken include less usage of particular design and programming constructs, additional resources for verification of the constructs and further education into how to use the constructs.


See also the WWW Formal Technical Review Archive maintained by P. Johnson, Department of Information and Computer Sciences, University of Hawaii.

Comments should be sent to

Richard Upchurch (rupchurch@umassd.edu)
Computer and Information Science Department
University of Massachusetts Dartmouth
285 Old Westport Rd.
N. Dartmouth, MA 02747-2300

This document
Created: March 8, 1996
by RLU

Modified: August 11, 2003