These references concern research into how programmers, novice and expert, construct an understanding of source code. The items below include empirical studies aimed at validating theoretical models (such as Brooks83), articles on tool support for the code-reading activity, and articles on instructional attempts to help students gain expertise. Curricula have paid little attention to students' ability to read code, yet from both a testing and a maintenance perspective it appears important to help students develop strategies and techniques that are founded in cognitive principles and supported by empirical evidence. The initial material in this document was taken from Deimel and Naveda (1990) Reading Computer Programs: Instructor's Guide and Exercises. CMU/SEI-90-EM-3. Their annotations for articles are left intact as Comments D&N:. Remarks added by those of us on this project are indicated by Comments UMaD:. Certain of the references included in this collection are available locally in either ps or pdf format, depending on the quality of the conversion process (to pdf). Check the link to ascertain the file type prior to transfer. Items that exist only in hardcopy, and of which the project holds a copy, are marked with an asterisk.
Abstract: In dealing with legacy systems, one often encounters poorly documented and heavily maintained software. Lack of understandability of these systems complicates the task of software maintenance, making it time consuming and limiting the possibilities of the evolution of the system. We present a methodology that helps programmers to understand programs. Our approach is compatible with the "top-down theory" of software understanding, where the programmer creates a chain of hypotheses and subsidiary hypotheses concerning the properties of the code. Then he/she looks for evidence (beacons) in the code. Our approach shortens the process of hypotheses creation and verification, and allows recording of successful hypotheses for the future maintenance. All information needed for understanding is recorded in layers of annotations. An experiment was conducted to investigate how the proposed methodology helps in program understanding. A tool supporting the methodology is presented.
[Rajlich97] Rajlich, V. (1997) Incremental Redocumentation with Hypertext. Published in the Proceedings of Euromicro'97, March 17 - 19, 1997 in Berlin, Germany. (hypertext.pdf) Abstract: Redocumentation is the recovery and recording of software comprehension. Since software comprehension is the most expensive part of software maintenance, redocumentation is the key to software maintainability. This paper describes the process and the tools of incremental redocumentation where the comprehension of the software is recorded in hypertext, in the style of the World Wide Web. The paper describes the tools which support redocumentation, and gives several examples.
[Ramalingam97] Ramalingam, V. & S. Wiedenbeck (1997) An Empirical Study of Novice Program Comprehension in the Imperative and Object-Oriented Styles. Empirical Studies of Programmers, pp. 124-139. (PDF) Abstract: The objective of this study was to determine whether the mental representation of object-oriented programs differs from that of imperative programs for novice programmers. In our study novices who had little or no previous programming experience studied and answered questions about three imperative and three object-oriented programs. The questions targeted information categories making up the program model and the domain model representations of the programs. It was found that there was a sharp contrast between the mental representations of the imperative and object-oriented programs. While the comprehension of the imperative programs was better overall than that of the object-oriented programs, the mental representations of the imperative programs focused on program-level knowledge. On the other hand, the mental representations of the object-oriented programs focused more strongly on domain-level knowledge. The results tend to support the view that language notations differ in how well they support the extraction of various kinds of information.
[Rathke94] Rathke, C. & D. Redmiles (1994) Improving the Explanatory Power of Examples by a Multiple Perspectives Representation. Proceedings of the 1994 East-West Conference on Computer Technologies in Education (EW-ED '94). P. Brusilovsky, S. Dikareva, J. Greer and V. Petrushin. Crimea, Ukraine, 1994, pp. 195-200. Abstract:
[Redmiles93] Redmiles, D. Reducing the Variability of Programmers' Performance Through Explained Examples, Human Factors in Computing Systems, INTERCHI'93 Conference Proceedings (Amsterdam, The Netherlands), ACM, 1993, pp. 67-73. (pdf) Abstract: A software tool called EXPLAINER has been developed for helping programmers perform new tasks by exploring previously worked-out examples. EXPLAINER is based on cognitive principles of learning from examples and problem solving by analogy. The interface is based on the principle of making examples accessible through multiple presentation views and multiple representation perspectives. Empirical evaluation has shown that programmers using EXPLAINER exhibit less variability in their performance compared to programmers using a commercially available, searchable on-line manual. These results are related to other studies of programmers and to current methodologies in software engineering.
[Redmiles94] Redmiles, D. Improving the Explanatory Power of Examples by a Multiple Perspectives Representation. (ps) or (pdf) Abstract: We developed a software tool called EXPLAINER for helping programmers complete new tasks by exploring previously worked-out examples. The implementation is based on the principle of making examples accessible through multiple perspectives and, specifically, perspectives that emphasize the programming plans underlying an example. The initial version of EXPLAINER used a simple, semantic network to represent multiple perspectives. A frame-based knowledge representation language called FrameTalk provides a more structured means of representing examples in EXPLAINER. Moreover, FrameTalk provides mechanisms that avoid deficiencies that arise when concept taxonomies must serve the dual purpose of representing specialization and composition of attributes.
[Rifkin00] Rifkin, S. & Deimel, L. (2000) Program Comprehension Techniques Improve Software Inspections: A Case Study. Proceedings of the 8th International Workshop on Program Comprehension (IWPC'00). Abstract: Software inspections are widely regarded as a cost-effective mechanism for removing defects in software, though performing them does not always reduce the number of customer-discovered defects. We present a case study in which an attempt was made to reduce such defects through inspection training that introduced program comprehension ideas. The training was designed to address the problem of understanding the artifact being reviewed, as well as other perceived deficiencies of the inspection process itself. Measures, both formal and informal, suggest that explicit training in program understanding may improve inspection effectiveness.
[*Rifkin94] Rifkin, S. & L. Deimel (1994) Applying Program Comprehension Techniques to Improve Software Inspections. Presented at the 19th Annual NASA Software Engineering Laboratory Workshop, Greenbelt, MD, Nov. 30-Dec. 1, 1994. Abstract: Software inspections are widely regarded as a cost-effective mechanism for removing defects in software, though performing them does not always reduce the number of customer-discovered defects. We present a case study in which an attempt was made to reduce such defects through inspection training that introduced program comprehension ideas. The training was designed to address the problem of understanding the artifact being reviewed, as well as other perceived deficiencies of the inspection process itself. Measures, both formal and informal, suggest that explicit training in program understanding may improve inspection effectiveness.
Comments UMaD: Provides convincing data to suggest that inspections may not be as effective as possible because those who review the artifact use weak comprehension strategies - "we may have paid too much attention to the global software review process and too little attention to the conduct of an individual and perhaps weighty process, namely the actual review of the software product". The data presented seem to confirm that explicit attention to comprehension strategies has benefits, increasing the number of defects identified through the inspection process. One confounding feature of this is that subjects were allowed to create their own inspection process. Second, the data suggest that comprehension skills can be improved through training (explicit attention to process and strategy). A particularly interesting observation comes from the graphs comparing the groups over time (pages 9-10): it appears that the groups not trained in comprehension but using a "better" process identify more of the types of errors usually captured through testing. The implication of this is that the number of post-release bugs does not drop; rather, the amount of time in testing is reduced but customer complaints continue. Hence, part of the comprehension training may encourage the reviewer to identify that class of error related to the problem domain.
[Robbins96a] Robbins, J., Hilbert, D., Redmiles, D. (1996) Extending Design Environments to Software Architecture Design. Proceedings of the 11th Annual Knowledge-Based Software Engineering (KBSE-96) Conference (Syracuse, NY), IEEE Computer Society, Los Alamitos, CA, September 1996, pp. 63-72. Abstract:
[Robbins96b] Robbins, J. E. & D. F. Redmiles (1996) Software Architecture Design From the Perspective of Human Cognitive Needs. Abstract: Much attention in software engineering research today is focussed on the notion of software architectures. The major motivation is that software architectures provide the appropriate level of abstraction to support the design of complex systems. The research has quickly evolved to the degree that design environments have been implemented to support software architects in creating new designs by combining components within architectural styles. We follow the same motivation with a different focus. We report on a software architecture design environment called Argo. Argo differs from other approaches by paying attention to the human, cognitive needs software architects have during design as much as the representation and manipulation of the architecture itself. We emphasize the primary considerations by contrasting an analysis of the human, cognitive design process with a systems, software design process. The corresponding, key elements are illustrated through a design scenario with Argo. Human-centered features in Argo focus on the application of critics for providing design feedback, design processes for supporting critics, and multiple architectural perspectives for aiding human designers.
Comments UMaD: The paper discusses the authors' rationale in designing and building a design environment, Argo. They claim to take cognitive design theory into account in the design of the environment. In particular, the authors contend they support reflection-in-action (Schön), opportunistic design (Guindon) and comprehension and problem solving (through the use of multiple representations). The mapping of these theories to the particular environment is useful. The notion of a design critic in the system is supported through noticing what causes breakdowns (lack of domain knowledge, lack of solution knowledge, lack of good process understanding) and how designers handle breakdowns during the design process. With an active critic, information regarding rule violations or suggestions can be incorporated early in the process. Also, in support of opportunistic design, the system helps maintain not only the current representation of the system under construction but also a representation of the process the designer is using. By aiding the designer in managing the notion of process, it becomes easier for the individual to manage the volume of to-do activities.
[*Robson91] Robson, D. J., Bennett, K. H., Cornelius, B. J., & Munro, M. (1991) Approaches to Program Comprehension. J. Systems Software, 14, p. 79-84. Abstract: Software maintenance is recognized as the most expensive phase of the software life cycle. The maintenance programmer is frequently presented with code with little or no supporting documentation, so that the understanding required to modify the program comes mainly from the code. This paper discusses some of the current approaches to theories of program comprehension and the tools for assisting the maintenance programmer with this problem.
[Rosenblum91] Rosenblum, D. S. & Alexander L. Wolf (1991) Representing Semantically Analyzed C++ Code with Reprise. From the Proceedings of the USENIX C++ Conference, Washington, DC, April 22-25, 1991. Abstract: A prominent stumbling block in the spread of the C++ programming language has been a lack of programming and analysis tools to aid development and maintenance of C++ systems. One way to make the job of tool developers easier and to increase the quality of the tools they create is to factor out the common components of tools and provide the components as easily (re)used building blocks. Those building blocks include lexical, syntactic, and semantic analyzers, tailored database derivers, code annotators and instrumentors, and code generators. From these building blocks, tools such as structure browsers, dataflow analyzers, program/specification verifiers, metrics collectors, compilers, interpreters, and the like can be built more easily and cheaply. We believe that for C++ programming and analysis tools the most primitive building blocks are centered around a common representation of semantically analyzed C++ code. In this paper we describe such a representation, called Reprise (REPResentation Including SEmantics). The conceptual model underlying Reprise is based on the use of expressions to capture all semantic information about both the C++ language and code written in C++. The expressions can be viewed as forming a directed graph, where there is an explicit connection from each use of an entity to the declaration giving the semantics of that entity. We elaborate on this model, illustrate how various features of C++ are represented, discuss some categories of tools that would create and manipulate Reprise representations, and briefly describe our current implementation. This paper is not intended to provide a complete definition of Reprise. Rather, its purpose is to introduce at a high level the basic approach we are taking in representing C++ code.
[Rugaber92] Rugaber, S. & Victoria Tisdale (????) Software Psychology Requirements for Software Maintenance Activities. Software Engineering Research Center, Georgia Institute of Technology. swpi/gatech/softpsych.pdf
[Rugaber92] Rugaber, S. (1992) Program Comprehension for Reverse Engineering. swpi/gatech/aaai.pospap.pdf
[Rugaber96] Rugaber, S. (1996) Program Understanding. In Allen Kent and James G. Williams (eds.), Encyclopedia of Computer Science and Technology, 35, Marcel Dekker, pp. 341-368. swpi/gatech/encyc.pdf
[Rugaber97] Rugaber, S. (1997) An Example of Program Understanding. College of Computing, Georgia Tech, GIT-CC-98-14. swpi/gatech/comments.pdf
[Rugaber00] Rugaber, S. (2000) The use of domain knowledge in program understanding. Annals of Software Engineering 9, pp. 143-192. swpi/gatech/annals.pdf
Abstract: Program understanding is an essential part of all software maintenance and enhancement activities. As currently practiced, program understanding consists mainly of code reading. The few automated understanding tools that are actually used in industry provide helpful but relatively shallow information, such as the line numbers on which variable names occur or the calling structure possible among system components. These tools rely on analyses driven by the nature of the programming language used. As such, they are adequate to answer questions concerning implementation details, so-called what questions. They are severely limited, however, when trying to relate a system to its purpose or requirements, the why questions.
Application programs solve real-world problems. The part of the world with which a particular application is concerned is that application's domain. A model of an application's domain can serve as a supplement to programming-language-based analysis methods and tools. A domain model carries knowledge of domain boundaries, terminology, and possible architectures. This knowledge can help an analyst set expectations for program content. Moreover, a domain model can provide information on how domain concepts are related.
This article discusses the role of domain knowledge in program understanding. It presents a method by which domain models, together with the results of programming-language-based analyses, can be used to answer both what and why questions. Representing the results of domain-based program understanding is also important, and a variety of representation techniques are discussed. Although domain-based understanding can be performed manually, automated tool support can guide discovery, reduce effort, improve consistency, and provide a repository of knowledge useful for downstream activities such as documentation, reengineering, and reuse. A tools framework for domain-based program understanding, a dowser, is presented in which a variety of tools work together to make use of domain information to facilitate understanding. Experience with domain-based program understanding methods and tools is presented in the form of a collection of case studies. After the case studies are described, our work on domain-based program understanding is compared with that of other researchers working in this area. The paper concludes with a discussion of the issues raised by domain-based understanding and directions for future work.
[*Shaft95] Shaft, T. M. & I. Vessey (1995) The Relevance of Application Domain Knowledge: The Case of Computer Program Comprehension. Information Systems Research, 6. p. 286-299. Abstract: The field of software has, to date, focused almost exclusively on application-independent approaches. In this research, we demonstrate the role of application domain knowledge in the processes used to comprehend computer programs. Our research sought to reconcile two apparently conflicting theories of computer program comprehension by proposing a key role for knowledge of the application domain under examination. We argue that programmers use more top-down comprehension processes when they are familiar with the application domain. When the application domain is unfamiliar, programmers use processes that are more bottom-up in nature. We conducted a protocol analysis study of 24 professional programmers comprehending programs in familiar and unfamiliar application domains. Our findings confirm our thesis.
[Shneiderman79] Shneiderman, B., and R. Mayer. Syntactic/Semantic Interactions in Programmer Behavior: A Model and Experimental Results. Intl. J. Comp. & Info. Sciences 8, 3 (June 1979), 219-238. Comments D&N: This paper presents a cognitive framework for describing behaviors involved in program composition, comprehension, debugging, modification, and the acquisition of new programming concepts, skills, and knowledge. An information processing model is presented which includes a long-term store of semantic and syntactic knowledge, and a working memory in which problem solutions are constructed. New experimental evidence is presented to support the model of syntactic/semantic interaction. The authors present their cognitive model of programmer behavior, the syntactic/semantic model. They suggest that this model is useful in explaining a variety of behaviors, including program reading and program writing. The authors hypothesize that programmers retain both semantic and syntactic knowledge in long-term memory, and that they use short-term and working memories in performance of various program-related tasks. Semantic knowledge and syntactic knowledge are largely independent in this model. Semantic knowledge is multilayered and substantially language-independent; syntactic knowledge applies to particular programming languages. Shneiderman and Mayer describe how their model applies to program reading, program writing, debugging, and learning programming languages. They conclude their paper with brief discussions of experiments that they offer as supporting evidence for their theory.
In program comprehension, according to this theory, the reader constructs a multileveled internal semantic structure to represent the program, a process of encoding from the program syntax, which is not memorized directly. The internal structure is built by recognizing the function of program components and fragments as chunks. These pieces are then aggregated until a description of the entire program is available.
This is a paper everyone should read. It presents a typical cognitive model in an approachable way, and shows how such models are used and verified. It also offers insight into programmer behavior. Yet, the structural complexity of the syntactic/semantic model makes the model seem less useful than it should be, primarily because a totally adequate model would be very much richer in processing details. The processes reified in this model are largely implicit in other comprehension models. Shneiderman's and Mayer's mental model of a program is quite similar to that of Brooks [Brooks78] and Letovsky [Letovsky86a]. Their description of the assimilation process, however, is strictly bottom-up.
[Shneiderman80] Shneiderman, B. Software Psychology: Human Factors in Computer and Information Systems. Cambridge, Mass.: Winthrop, 1980. Comments D&N: Software Psychology is a handbook for the application of psychology to computer-related issues. Shneiderman provides a crash course on methods of psychological research and proceeds to discuss topics from program reading to team organization and the design of interactive systems. Although this volume was written a decade ago, it remains an invaluable reference on psychological factors related to the computer. The book contains an extensive bibliography.
[Shull98] Shull, F., F. Lanubile, & V. Basili (1998) Investigating Reading Techniques for Framework Learning. ISERN-98-16. Abstract: The empirical study described in this paper addresses software reading for construction: how application developers obtain an understanding of a software artifact for use in new system development. This study focuses on the processes developers would engage in when learning and using object-oriented frameworks. We analyzed 15 student software development projects using both qualitative and quantitative methods to gain insight into what processes occurred during framework usage. The contribution of the study is not to test predefined hypotheses but to generate well-supported hypotheses for further investigation. The main hypotheses we produce are that example-based techniques are well suited to use by beginning learners while hierarchy-based techniques are not because of a larger learning curve. Other more specific hypotheses are proposed and discussed.
[*Soloway84] Soloway, E. & K. Ehrlich (1984) Empirical Studies of Programming Knowledge. IEEE Trans. on Software Engineering, SE-10, p. 595-609. Abstract:
[*Soloway88a] Soloway, E., B. Adelson, & K. Ehrlich (1988) Knowledge and Processes in the Comprehension of Computer Programs. In M. Chi, R. Glaser, & M. Farr (eds.), The Nature of Expertise. Lawrence Erlbaum Associates. p. 129-152. Introduction: We have been investigating the cognitive underpinnings of how programmers, novices and experts, read and write computer programs. Our approach has been to employ a cycle of constructing theory, carrying out empirical studies, and building and testing AI programs that embody our theory. In this chapter we present our current view on the knowledge and processing strategies programmers employ in attempting to comprehend computer programs. We first present an experiment that supports our claims as to the composition of an expert programmer's knowledge base. Next, we propose processing strategies that may be at work in comprehending programs. As support for these latter mechanisms, we draw on our experiences in building a computer program that attempts to understand computer programs written by novices.
[*Soloway88b] Soloway, E., J. Pinto, S. Letovsky, D. Littman, & R. Lampert (1988) Designing Documentation to Compensate for Delocalized Plans. CACM, 31, p. 1259-1267. Abstract:
[Spafford92] Spafford, E. & C. Viravan. (1992) Experimental Designs: Testing a Debugging Oracle Assistant. SERC-TR-120-P, August 1992. Abstract:
This paper documents the design of an experiment to test a debugging oracle assistant. A debugging oracle is responsible for judging correctness of program parts or program states. A programmer usually acts as a debugging oracle. The goal of a debugging oracle assistant is to improve the programmer's speed and accuracy. Factors that complicate our design process include: (1) programmer variability, (2) interaction between programmers and programs, (3) interaction between programs and faults, (4) possible confounding experimental factors, (5) any learning effect from the assistance, (6) any learning effect from the program, and (7) the lack of experienced programmers for our experimental studies.
This paper explains the rationale behind our design. It explains why the above factors can make other choices, such as a Latin Square design, produce misleading results. It questions the validity of the so-called within-subjects factorial design when the experimental factors exclude programmers. It explains the factors related to programs, programmers, and faults that we need to control. It also explains why we prefer to use analysis of covariance to reduce experimental error caused by programmer variability instead of grouping programmers by expertise.
The paper also covers types of analysis to (1) test our hypotheses, (2) verify assumptions behind the analysis of variance, (3) verify assumptions behind the analysis of covariance, and (4) estimate adequate sample size. Lastly, we define the inference space to which we can generalize the experimental results.
[Spafford93] Spafford, E. & C. Viravan. (1993) Pilot Studies on Debugging Oracle Assistants. SERC-TR-134-P, March 1993. Abstract:
A debugging oracle is a decision maker during a debugging process. Three major decisions during typical debugging sessions are on the identities, the locations, and the repairs of faults. A programmer usually acts as a debugging oracle. Our research objective is to help him in his decision-making process with a debugging oracle assistant. To enhance our understanding of both the debugging oracle and the debugging oracle assistant, we studied how 14 expert programmers debug a C program with over 4300 executable lines of code including real faults of omission. Four different forms of debugging oracle assistance were tested. The outcome of the studies provided insight into programmers' needs and the forms of assistants which fulfill them. We find that information alone does not improve debugging performance. The two assistants that helped programmers make more accurate decisions on faults observed when programmers needed help and provided unsolicited and customized assistance for each programmer. This customized assistance came in the form of hints, questions, confirmation, and/or explanation.
Our preliminary results are supported by research on Decision Support Systems (DSS) and Critic Systems. The problems with debugging assistants that we identified match the problems identified for DSS. Because the desirable features are also the characteristics of Critic Systems, we have reason to believe that a desirable debugging oracle assistant is a debugging critic.
[Storey97a] Storey, M., K. Wong, F.D. Fracchia & H. A. Müller (1997) On Integrating Visualization Techniques for Effective Software Exploration. In the Proceedings of IEEE Symposium on Information Visualization (InfoVis'97), Phoenix, Arizona, U.S.A., October 20-21, 1997. (pdf) Abstract: This paper describes the SHriMP visualization technique for seamlessly exploring software structure and browsing source code, with a focus on effectively assisting hybrid program comprehension strategies. The technique integrates both pan+zoom and fisheye-view visualization approaches for exploring a nested graph view of software structure. The fisheye-view approach handles multiple focal points, which are necessary when examining several subsystems and their mutual interconnections. Source code is presented by embedding code fragments within the nodes of the nested graph. Finer connections among these fragments are represented by a network that is navigated using a hypertext link-following metaphor. SHriMP combines this hypertext metaphor with animated panning and zooming motions over the nested graph to provide continuous orientation and contextual cues for the user. The SHriMP tool is currently being evaluated in several user studies. Observations of users performing program understanding tasks with the tool are discussed.
[Storey97b] Storey, M. D., K. Wong, & H. A. Müller (1997) How Do Program Understanding Tools Affect How Programmers Understand Programs? In the Proceedings of WCRE'97, Amsterdam, Holland, October 1997. (pdf) Abstract: In this paper, we explore the question of whether program understanding tools enhance or change the way that programmers understand programs. The strategies that programmers use to comprehend programs vary widely. Program understanding tools should enhance or ease the programmer's preferred strategies, rather than impose a fixed strategy that may not always be suitable. We present observations from a user study that compares three tools for browsing program source code and exploring software structures. In this study, 30 participants used these tools to solve several high-level program understanding tasks. These tasks required a broad range of comprehension strategies. We describe how these tools supported or hindered the diverse comprehension strategies used.
[Storey97c] Storey, M. D., F.D. Fracchia, & H. A. Müller (1997) Cognitive Design Elements to Support the Construction of a Mental Model During Software Visualization. Proceedings of the 5th International Workshop on Program Comprehension, Dearborn, Michigan, U.S.A., pages 17-28, May 28-30, 1997. (pdf) Abstract: The scope of software visualization tools which exist for the navigation, analysis and presentation of software information varies widely. One class of tools, which we refer to as software exploration tools, provide graphical representations of software structures linked to textual views of the program source code and documentation. This paper describes a hierarchy of cognitive issues which should be considered during the design of a software exploration tool. The hierarchy of cognitive design elements is derived through the examination of program comprehension cognitive models. Examples of how existing tools address each of these issues are provided.
[Storey97d] Storey, M. D., F.D. Fracchia, & H. A. Müller (1997) Rigi: A Visualization Environment for Reverse Engineering. Proceedings of the International Conference on Software Engineering (ICSE'97), Boston, U.S.A., pages 606-607, May 17-23, 1997. Abstract: The Rigi reverse engineering system provides two contrasting approaches for presenting software structures in its graph editor. The first displays the structures through multiple, individual windows. The second (newer) approach, Simple Hierarchical Multi-Perspective (SHriMP) views, employs fisheye views of nested graphs. We compare and contrast these two interfaces for visualizing software graphs, and provide results from user experiments.
[*Storey96] Storey, M. D., H. A. Müller, & K. Wong (1996) Manipulating and Documenting Software Structures. Software Visualization.Abstract: An effective approach to program understanding involves browsing, exploring, and creating views that document software structures at multiple levels of abstraction. While exploring the many relationships in a multi-million line legacy software system, one can easily lose context. One approach to alleviate this problem is to visualize these structures using fisheye-view techniques. This chapter introduces Simple Hierarchical Multi-Perspective (SHriMP) views. The SHriMP visualization technique has been incorporated into the Rigi reverse engineering system, greatly enhancing its capabilities for documenting software abstractions. The applicability and usefulness of SHriMP views are illustrated with selected software visualization tasks.
[Storey95] Storey, M. D. & H. A. Müller (1995) Manipulating and Documenting Software Structures using SHriMP Views. Proceedings of the 1995 International Conference on Software Maintenance (ICSM'95), Opio (Nice), France, pages 275-284, October 16-20, 1995. (pdf)Abstract:
[*Tenny88] Tenny, T. (1988) Program Readability: Procedures Versus Comments. IEEE Trans. Software Engineering, 14, 9, 1271-1279.Comments D&N: A 3 by 2 factorial experiment was performed to compare the effects of procedure format (none, internal, or external) with those of comments (absent or present) on the readability of a PL/I program. The readability of six editions of the program, each having a different combination of these factors, was inferred from the accuracy with which students could answer questions about the program after reading it. Both extremes in readability occurred in the program editions having no procedures: without comments the procedureless program was the least readable and with comments it was the most readable.
An interesting paper that defines readability within the context of maintenance: a program is readable if information needed to maintain it is easily found by reading the code. The author formalizes this definition by expressing readability as the average number of right answers to a series of questions about the program in a given length of time.
The experiment reports that six versions of the same program were used to explore the effects of comments versus the inclusion of procedures. Four editions of the program included procedures that performed the major subtasks. Both internal and external (i.e., separately compiled) procedure definitions were used. Two of the programs were procedureless. Commented and uncommented versions of each program version were used as well. The same set of questions accompanied each of the programs. Scores were tabulated, and ANOVA and F-tests were performed to determine the statistical significance of the differences between the mean scores.
The reported results are somewhat surprising. The procedureless program without comments was the least readable, whereas the same program with comments was the most readable. As far as this particular program is concerned, however, the author concludes that procedures have little effect on readability, whereas comments do seem to have an effect. Yet, there are compelling reasons to believe that a large program is more readable with the modules expressed as separate procedures. Thus, while it would be unwise to extrapolate these results to all programs, they do indicate that procedures can have little effect on the readability of programs below a certain size. The results reported by the author differ qualitatively from results he obtained in a previous experiment in which the procedureless program got higher scores than the program with internal procedures, with or without comments. Possible explanations for these differences are explored.
Aside from the statistical value of this experiment, the author's questions (which are included in the paper) are of much pedagogical value. Instructors are encouraged to read it. This information may be of limited value to beginning students. Advanced students may find this paper interesting nevertheless.
[Thomas90] Thomas, E. J., and P. W. Oman. A Bibliography of Programming Style. ACM SIGPLAN Notices 25, 2 (Feb. 1990), 7-16. Comments D&N: A lightly annotated bibliography of nearly 100 references on programming style, broadly construed. The Thomas and Oman bibliography serves as a helpful complement to this bibliography.
[Tilley96] Tilley, S. R. & D. B. Smith (1996) Coming Attractions in Program Understanding. Software Engineering Institute, Carnegie Mellon University. CMU/SEI-96-TR-019. (pdf)Abstract: Program understanding is the (ill-defined) deductive process of acquiring knowledge about a software artifact through analysis, abstraction, and generalization. This report identifies some of the emerging technologies in program understanding. We present technical capabilities currently under development that may be of significant benefit to practitioners within five years. Three areas of work are explored: investigating cognitive aspects, developing support mechanisms, and maturing the practice.
[Tilley97] Tilley, S. R. (1997) Discovering DISCOVER. Software Engineering Institute, Carnegie Mellon University. (CMU/SEI-97-TR-012 ESC-TR-97-012).(local copy in SEI folder)Abstract: This report describes investigations into DISCOVER, a modern software development and maintenance environment. The study is guided by a framework for classifying program understanding tools that is based on a description of the canonical activities that are characteristic of the reverse engineering process. Implications of this work for advanced practitioners, researchers and tool developers, and the framework itself are discussed.
[Tilley98a] Tilley, S. R. (1998) Coming Attractions in Program Understanding II: Highlights of 1997 and Opportunities in 1998. Software Engineering Institute, Carnegie Mellon University. (CMU/SEI-98-TR-001).(local copy in SEI folder)Abstract: This report highlights important developments in program-understanding work in 1997 and outlines some of the opportunities for the field in 1998. A framework of three focus areas is used to categorize research and development activities in program understanding: investigating cognitive aspects, developing support mechanisms, and maturing the practice. Although significant progress was made in these areas, the rapid changes in the software engineering landscape are giving rise to several new challenges. Three of the most important in the coming year are leveraging the Web, black-box understanding, and the Year 2000 problem.
[Tilley98b] Tilley, S. R. (1998) A Reverse-Engineering Environment Framework. Software Engineering Institute, Carnegie Mellon University. (CMU/SEI-98-TR-005). (local copy in SEI folder)Abstract: This report describes a framework for reverse-engineering environments used to aid program understanding. The framework is based on a descriptive model that categorizes important support mechanism features based on a hierarchy of attributes. The attributes include cognitive model support, reverse-engineering tasks, canonical activities that are characteristic of the reverse-engineering process, quality attributes supported by the reverse-engineering environment, and miscellaneous characteristics.
[Tiemens89] Tiemens, T. (1989) Cognitive Models of Program Comprehension. Software Engineering Research Center, Georgia Institute of Technology. swpi/gatech/cogmodels.pdf Abstract: This paper describes some cognitive models in program comprehension. The goal is to use this knowledge about cognitive models to produce a tool (the Cognitive Support Tool, CST) which can reduce the amount of effort needed to understand a program. The models discussed here were derived from both a human perspective and from a source code perspective. After reviewing these models, a synthesis section suggests some implications of the information presented. Finally, a section describing a sample interactive session with CST is presented.
[Tonella97] Tonella, P., G. Antoniol, R. Fiutem, & E. Merlo (1997) Points-to Analysis for Program Understanding. Proceedings of IWPC'97, May 28-30, 1997, Dearborn, MI. (wpc97.pdf) Abstract: Real world programs (in languages like C) heavily make use of pointers. Program understanding activities are thus made more difficult, since pointers affect the memory locations that are referenced in a statement, and also the functions called by a statement, when function pointers are used. The programmer needs to build a mental model of the memory use and of the pointers to its locations, in order to comprehend the functionalities of the system.
This paper presents an efficient flow insensitive context insensitive points-to analysis algorithm capable of dealing with the features of the C code. It is extremely promising with regard to scalability, because of the low complexity. The results are valuable by themselves, as their graphical display represents the points-to links between locations. They are also integrated with other program understanding techniques like, e.g., call graph construction, slicing, plan recognition and architectural recovery.
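As an illustration of what a flow-insensitive, context-insensitive points-to analysis computes, the following is a minimal sketch only. It uses a naive inclusion-based (Andersen-style) fixed-point iteration, which is a well-known formulation swapped in for illustration and not necessarily the algorithm of the paper. The statement encoding (`addr`, `copy`, `load`, `store`) and the function name `points_to` are assumptions made for this example.

```python
from collections import defaultdict


def points_to(statements):
    """Compute points-to sets, ignoring statement order and calling context.

    Each statement is a tuple (kind, lhs, rhs), where kind is one of:
      "addr"  : lhs = &rhs
      "copy"  : lhs = rhs
      "load"  : lhs = *rhs
      "store" : *lhs = rhs
    """
    pts = defaultdict(set)
    changed = True
    while changed:  # repeat until no points-to set grows any further
        changed = False
        for kind, lhs, rhs in statements:
            if kind == "addr":
                updates = {lhs: {rhs}}
            elif kind == "copy":
                updates = {lhs: set(pts[rhs])}
            elif kind == "load":
                reached = set()
                for target in list(pts[rhs]):
                    reached |= pts[target]
                updates = {lhs: reached}
            elif kind == "store":
                updates = {target: set(pts[rhs]) for target in list(pts[lhs])}
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            for var, new in updates.items():
                if not new <= pts[var]:
                    pts[var] |= new
                    changed = True
    # Drop variables that point to nothing.
    return {var: targets for var, targets in pts.items() if targets}


# Hypothetical C fragment, encoded by hand for the example.
program = [
    ("addr", "p", "x"),    # p = &x;
    ("addr", "q", "y"),    # q = &y;
    ("copy", "r", "p"),    # r = p;
    ("addr", "pp", "p"),   # pp = &p;
    ("store", "pp", "q"),  # *pp = q;
    ("load", "s", "pp"),   # s = *pp;
]
result = points_to(program)
```

Because statement order is ignored, `r` is reported as possibly pointing to `y` even though `r = p` precedes the store through `pp`. That loss of precision is the price of the low complexity the abstract mentions.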
[Vans99] Vans, A., von Mayrhauser, A., & Somlo, G. (1999) Program Understanding Behavior During Corrective Maintenance of Large-scale Software. International Journal of Human-Computer Studies, v 51, n 1, July 1999, p31-70. Abstract: This paper reports on a software understanding field study of corrective maintenance of large-scale software. Participants were professional software maintenance engineers. The paper reports on the general understanding process, the types of actions programmers preferred during the debugging task, the level of abstraction at which they were working and the role of hypotheses in the debugging strategies they used. The results of the observation are also interpreted in terms of the information needs of these software engineers. We found that programmers work at all levels of abstraction (code, algorithm, application domain) about equally. They frequently switch between levels of abstraction. The programmers' main concerns are with what software does and how this is accomplished, not why software was built a certain way. These questions guide the work process. Information is sought and cross-referenced from a variety of sources from application domain concepts to code-related information, outpacing current maintenance environments' capabilities which are mostly stratified by information source, making cross-referencing difficult.
[*Viravan94] Viravan, C. (1994) Enhancing Debugging Technology. Ph.D. Dissertation, Purdue University. Comments UMaD: This work details the design of a debugging critic. As a critic the system offers advice and guidance as the individual works at removing defects from software. In the process of elaborating the requirements for the critic, the author ran several empirical studies involving a "live" critic. The structure of these experiments, with the methods and material used, may provide an initial direction for constructing comprehension activities where the goal is debugging.
[vonMayrhauser93] von Mayrhauser, A. & A. M. Vans. (1993) From Code Understanding Needs to Reverse Engineering Tool Capabilities. Procs. Sixth International Workshop on Computer-Aided Software Engineering, July 19-23, 1993, Singapore, p. 230-239. (case93.pdf) Abstract: Maintenance frequently consumes more resources than new software development. A major portion of the maintenance effort is spent on the reverse engineering activity of understanding existing software. If we can learn more about how programmers understand code successfully, we can build better tools to support the understanding process. This contributes to higher quality and improved efficiency of maintenance tasks. We present an integrated code comprehension model and our experiences with it in an industrial setting. We use audio-taped, think-aloud reports to investigate how well our integrated code comprehension model works during industrial maintenance activities ranging from code fixes to enhancements, code leverage, and reuse. We analyze the tapes for information needs during maintenance activities and derive tool capabilities accordingly.
[vonMayrhauser94] von Mayrhauser, A. & A. M. Vans. (1994) Comprehension Processes During large Scale Maintenance. Procs. International Conference of Software Engineering ICSE 16, May 1994 Sorrento, Italy, p. 39-48. (icse.pdf)Abstract: We present results of observing maintenance engineers working with industrial code at actual maintenance tasks. Protocol analysis is used to explore how code understanding might differ for small versus large scale code. The experiment confirms that cognition processes work at all levels of abstraction simultaneously as programmers build a mental model of the code. Cognition processes emerged at three levels of aggregation representing lower and higher level strategies of understanding. They show differences in what triggers them and how they achieve their goals. Results are useful for defining core competencies which maintenance engineers need for their work and for documentation and development standards.
Comments UMaD: The experiments documented in this paper support, perhaps enhance, the authors' metamodel of program comprehension. Perhaps more importantly, the paper elaborates specific processes and competencies that appear when professional maintenance programmers work on real artifacts. The protocol analysis yields several important results: 1) maintenance programmers use a multi-level approach which switches between the three postulated areas (domain model, situation model, and program model), 2) maintenance activities are described by a small set of cognitive processes which aggregate into higher level processes (on several levels of abstraction), and 3) current practice does not support effective understanding activity, as it compartmentalizes information in separate documents. This last conclusion may support the need for a more hypertext-like view of software, where the requirements, design, and code (perhaps other documents) can be cross-referenced, providing navigational links between dependencies. (How close is this notion to that of literate programming by Knuth?)
[*vonMayrhauser95] von Mayrhauser, A. & A. M. Vans. (1995) Program Understanding: Models and Experiments. In M. Yovits & M. Zelkowitz (eds.), Advances in Computers, Vol 40. San Diego: Academic Press. p. 1-38. Abstract: Models of how programmers understand code they have not written have been developed and analyzed for many years. These models describe program comprehension at various levels of detail. This paper puts them in perspective, particularly with regard to specialized maintenance tasks versus general code understanding needs. Experiments support some, but not all, comprehension models. We analyze models and their validation experiments to see what the current state of knowledge about program comprehension offers. Open issues point to a need for experimental studies with experienced software engineers working on specific maintenance tasks and large-scale code in state-of-the-art environments.
Comments UMaD: Perhaps the best source for a complete assessment of the state of program comprehension research. This book chapter attempts to organize the field and elaborate on which of the suggested models the empirical findings validate. The research reviewed appears to be rather complete. The authors review the major camps (6), and provide an overview of the theoretical position and empirical support. Also included is the authors' metamodel for program comprehension, which offers a single model that incorporates both top-down and bottom-up comprehension strategies.
[vonMayrhauser96a] von Mayrhauser, A. & A. M. Vans. (1996) On the Role of Hypotheses during Opportunistic Understanding While Porting Large Scale Code. Procs. Fourth International Workshop on Program Comprehension (IWPC '96), March 1996, Berlin, Germany. (cmp4.pdf) Abstract: Hypotheses are major drivers of program comprehension. We report on a case study observing an experienced software engineer porting a large software system and the role of hypotheses in accomplishing the porting task. Observations confirm some existing theoretic models and experimental findings, but not all. While generalization based on a case study is of necessity limited, the results could be the basis for further experiments. They also point to information that would help novices to become experts faster.
[*vonMayrhauser96b] von Mayrhauser, A. & A. M. Vans. Identification of Dynamic Comprehension Processes During Large Scale Maintenance. IEEE Trans. Software Engineering 22, 1996, pp. 424-437. (mayrhauserIEEE96.pdf)Abstract: We present results of observing professional maintenance engineers working with industrial code at actual maintenance tasks. Protocol analysis is used to explore how code understanding might differ for small versus large scale code. The experiment confirms that cognition processes work at all levels of abstraction simultaneously as programmers build a mental model of the code. Analysis focused on dynamic properties and processes of code understanding. Cognition processes emerged at three levels of aggregation representing lower and higher level strategies of understanding. They show differences in what triggers them and how they achieve their goals. Results are useful for defining information which maintenance engineers need for their work and for documentation and development standards.
Comments UMaD: The hallmark of von Mayrhauser's approach is the use of professionals working on real products doing real tasks. This provides her with the opportunity to explore questions which aren't possible for those using students as subjects. Furthermore, her explorations of comprehension models are from the perspective of the maintenance programmer. Hence, her model development is couched much more in terms of comprehension and less from the developer's view. For me this is one of the important points. It would appear that the act of designing requires a certain set of skills but the act of maintenance requires a different set, especially in the realm of understanding and strategies for understanding foreign code. Perhaps a significant finding is that maintenance programmers invoke comprehension strategies opportunistically across levels of abstraction (that is, the three major components of the metamodel: domain, situation, and program model). The results indicate:
Abstract: This paper reports on a software understanding experiment during re-engineering of large-scale software. Participants were professional software maintenance engineers. The paper explains the general understanding process, the information needs of these software engineers during their tasks, and the tool capabilities that would help them to be more productive.
[vonMayrhauser97] von Mayrhauser, A. & A. M. Vans. Program Understanding Behavior During Debugging of Large Scale Software. Empirical Studies of Programmers, 1997, pp. 157-179. (p157-von_mayrhauser.pdf) Abstract: This paper reports on a software understanding experiment during corrective maintenance of large-scale software. Participants were professional software maintenance engineers. The paper reports on the general understanding process, the types of actions programmers preferred during the debugging task, and the level of abstraction at which they were working. The results of the observation are also interpreted in terms of the information needs of these software engineers during the debugging task.
[*vonMayrhauser98a] von Mayrhauser, A. (1998) From Program Comprehension to Software Maintenance Support Tools. Colorado Advanced Software Institute Technical Report, CASI-TR-98-06. Comments UMaD: This technical report details the complete design and implementation of an assessment of the Lemma tool for IBM. Basically the issue is: does the tool effectively support the task(s) of a maintenance programmer?
To adequately answer the question von Mayrhauser uses an experimental design comparing Lemma users with non-Lemma users on maintenance tasks. The data collected allowed the PI to determine the actions of the programmers and, through comparison, the influence of the tool. In general, based on von Mayrhauser's metamodel of comprehension, the focus was on the information needs of the programmers and how they gain the information with and without the CASE tool.
[*vonMayrhauser98b] von Mayrhauser, A. & A. Vans (1998) Program Understanding Behavior During Adaptation of Large Scale Software. Proceedings of the 6th International Workshop on Program Comprehension, June 24-26, 1998, Ischia, Italy, pp. 164-172. Abstract: We report on a software understanding study during adaptation of large-scale software. Participants were professional maintenance engineers. The paper reports on the general understanding process, the types of actions programmers preferred during the adaptation task, and the level of abstraction at which they were working. The results of the observation are also interpreted in terms of the information needs of these software engineers.
Comments UMaD:
The results supported the use of the tool, but beyond that the report describes how the features of the tool supported the maintenance programmer. In addition, there are recommendations for other features that the study suggested would be useful.
NOTE: There are a number of things in von Mayrhauser's work that deserve additional consideration, aside from her general metamodel: 1) information types (and sources), 2) action types, and 3) comprehension processes.
[Walker98b] Walker, R., G. Murphy, B. Freeman-Benson, D. Wright, D. Swanson, & J. Isaak (1998) Visualizing Dynamic Software System Information through High-level Models. In the 1998 ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, October 1998. (oopsla98-viz.pdf)Abstract: Dynamic information collected as a software system executes can help software engineers perform some tasks on a system more effectively. To interpret the sizable amount of data generated from a system's execution, engineers require tool support. We have developed an off-line, flexible approach for visualizing the operation of an object-oriented system at the architectural level. This approach complements and extends existing profiling and visualization approaches available to engineers attempting to utilize dynamic information. In this paper, we describe the technique and discuss preliminary qualitative studies into its usefulness and usability. These studies were undertaken in the context of performance tuning tasks.
[Walker98a] Walker, R., E. Baniassad, & G. C. Murphy (1998) An Initial Assessment of Aspect-Oriented Programming . UBC Computer Science Technical Report TR-98-12. (UBC-CS-TR-98-12.pdf)Abstract: The principle of separation of concerns has long been used by software engineers to manage the complexity of software system development. Programming languages help software engineers explicitly maintain the separation of some concerns in code. As another step towards increasing the scope of concerns that can be captured cleanly within the code, Kiczales and colleagues have introduced aspect-oriented programming. In aspect-oriented programming, explicit language support is provided to help modularize design decisions that cross-cut a functionally-decomposed program. Aspect-oriented programming is intended to make it easier to reason about, develop, and maintain certain kinds of application code. To investigate these claims, we conducted two exploratory experiments that considered the impact of aspect-oriented programming, as found in AspectJ version 0.1, on two common programming activities: debugging and change. Our experimental results provide insights into the usefulness and usability of aspect-oriented programming. Our results also raise questions about the characteristics of the interface between aspects and functionally-decomposed core code that are necessary to accrue programming benefits. Most notably, the separation provided by aspect-oriented programming seems most helpful when the interface is narrow (i.e., the separation is more complete); partial separation does not necessarily provide partial benefit.
[Weinberg71] Weinberg, Gerald M. The Psychology of Computer Programming. New York: Van Nostrand Reinhold, 1971. Comments D&N: Weinberg devotes the first chapter of his well-known book to program reading, remarking ruefully that "[e]ven programmers do not read programs." He suggests that there is much to learn from reading both good and bad programs. Most of the chapter is devoted to examples of the factors affecting what actually gets coded: limitations of the machine, the implementation language, and the programmer; historical accidents; and evolving specifications.
[Weiser81] Weiser, Mark. Program Slicing. Proc. 5th Int. Conf. on Software Eng. New York: IEEE, 1981, 439-449.Abstract: Program slicing is a method used by experienced computer programmers for abstracting from programs. Starting from a subset of a program's behavior, slicing reduces that program to a minimal form which still produces that behavior. The reduced program, called a slice, is an independent program guaranteed to faithfully represent the original program within the domain of the specified subset of behavior.
Finding a slice is in general unsolvable. A dataflow algorithm is presented for approximating slices when the behavior subset is specified as the values of a set of variables at a statement. Experimental evidence is presented that these slices are used by programmers during debugging. Experience with two automatic slicing tools is summarized. New measures of program complexity are suggested based on the organization of a program's slices. Being able to find a program slice simplifies analysis of a program. Even though program slicing cannot be fully automated, the concept of a slice is a useful one.
Comments D&N: Weiser explains slicing by pointing out that, when fixing a bug, an experienced programmer usually focuses only on those parts of the program that may obviously have something to do with the bug in question. Other parts of the program are ignored, effectively having been deleted in the programmer's mind from the code being studied. Programmers apply this same technique when making program improvements or modifications.
The paper considers the slicing of block-structured programs written in a Pascal-like language. A slice must have two desirable properties: (1) it must have been obtained from the original program by statement deletion, and (2) the behavior of the slice must be the same as that of the original program, as observed through the domain of the specified subset of behavior. Characterizations of programs in terms of flow graphs are explained, and meaning is given to a slice within those contexts. To make the problem of finding a program's slice tractable, Weiser introduces a weaker definition of slice and gives sufficient conditions for statement inclusion. Weiser also introduces a number of slice-based complexity metrics and discusses their computation.
The paper is quite technical and is recommended only for teachers and advanced students. It does, however, provide a name for and some analysis of an intuitive, widely used comprehension strategy.
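To make the idea of a slice concrete, here is a minimal sketch restricted to straight-line code (no branches or loops), so it is far simpler than the dataflow algorithm in the paper. The statement encoding, the sample program, and the function name `backward_slice` are assumptions made for this example. The criterion is taken to mean the values of the chosen variables after the given statement has executed.

```python
def backward_slice(statements, criterion_index, criterion_vars):
    """Return indices of the statements that can affect criterion_vars.

    statements: list of (text, defined_var_or_None, used_vars) tuples
                describing a straight-line program.
    criterion_index: index of the statement at which values are observed.
    criterion_vars: variables whose values at that point matter.
    """
    relevant = set(criterion_vars)
    kept = []
    # Walk backward, keeping only definitions of currently relevant variables.
    for i in range(criterion_index, -1, -1):
        _text, defined, used = statements[i]
        if defined in relevant:
            kept.append(i)
            relevant.discard(defined)  # this definition supplies the value
            relevant |= set(used)      # its inputs become relevant in turn
    return sorted(kept)


# Hypothetical program, encoded by hand with def/use information.
program = [
    ("total = 0",          "total", []),
    ("count = 0",          "count", []),
    ("total = total + a",  "total", ["total", "a"]),
    ("count = count + 1",  "count", ["count"]),
    ("mean = total / n",   "mean",  ["total", "n"]),
    ("print(count)",       None,    ["count"]),
]

slice_for_mean = backward_slice(program, 4, {"mean"})
slice_for_count = backward_slice(program, 5, {"count"})
```

Here `slice_for_mean` selects statements 0, 2, and 4, and `slice_for_count` selects statements 1 and 3. In both cases the selected statements are not textually contiguous, which is the property Weiser emphasizes.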
[Weiser82] Weiser, M. (1982) Programmers Use Slices When Debugging. CACM, 25. p. 446-452. Abstract: Computer programmers break apart large programs into smaller coherent pieces. Each of these pieces: functions, subroutines, modules, or abstract datatypes, is usually a contiguous piece of program text. The experiment reported here shows that programmers also routinely break programs into one kind of coherent piece which is not contiguous. When debugging unfamiliar programs programmers use program pieces called slices which are sets of statements related by their flow of data. The statements in a slice are not necessarily textually contiguous, but may be scattered through a program.
Comments UMaD: This paper reports an experiment which claims support for the use of program slicing by programmers. The idea behind slicing is to extract from the program the set of lines that are important for a particular variable at a particular line. It is suggested that explicit attention to slicing, teaching students to create and use slices, could improve debugging skill. It should prove instructive to compare the results found here with those of Viravan.
[Welty96] Welty, C. (1996) An Integrated Representation for Software Development and Discovery. PhD Dissertation Rensselaer Polytechnic Institute. (html)Abstract: This thesis presents research that is focused on making software systems more understandable at the code-level, the level of specification provided by programming languages and at which software is typically maintained in practice. The goal has been to make a software system easier to maintain by presenting a better represented and more understandable code-level. Previous efforts to achieve this goal in the area of Software Information Systems have been successful in limited ways. The main emphasis of this research has been to extend the underlying representations used by Software Information Systems and demonstrate how these extensions better serve the goal. The focus of these improvement efforts is primarily on two areas: representing the code-level knowledge more completely, and incorporating explicit domain knowledge.
The first focus has produced an ontology of code-level knowledge in an object-oriented software system. The explicit nature of the ontology makes possible the automatic detection of side-effects, delocalized plans, and vestigial code in methods, and greatly facilitates the implementation of graphical tools for finding information about the software. While these features in themselves clearly serve to make a software system more understandable, the relative ease with which they were implemented is further evidence that the representation does offer improvements towards meeting this goal.
Through the second focus it was discovered that there was a serious obstacle to merging software domain and code-level knowledge: code-level knowledge is second order with respect to domain knowledge. A facility to represent this kind of second-order relationship was developed which avoids the problems of undecidability inherent in fully second-order systems. This facility allows for domain knowledge to be represented explicitly at the code-level.
The representation has been tested on a sample domain, Knowledge-Based Email Distribution, and examples are given to show how the representation makes the software easier to understand. The approach is also compared to other systems used for understanding and maintenance.
[Whitney95] Whitney, M., K. Kontogiannis, J. H. Johnson, M. Bernstein, B. Corrie, E. Merlo, J. G. McDaniel, Renato De Mori, H. A. Müller, J. Mylopoulos, M. Stanley, S. R. Tilley, & K. Wong (1995) Using an Integrated Toolset for Program Understanding. Proceedings of CASCON '95, Toronto, Ontario, November 7-9, 1995, (pp. 262-274). (NRC39177.pdf)Abstract: This paper demonstrates the use of an integrated toolset for program understanding. By leveraging the unique capabilities of individual tools, and exploiting their power in combination, the resultant toolset is able to facilitate specific reverse engineering tasks that would otherwise be difficult or impossible. This is illustrated by applying the integrated toolset to several typical reverse engineering scenarios, including code localization, data flow analysis, pattern matching, system clustering, and visualization, using a mid-size production program as the reference system.
Comments UMaD:
[Wiedenbeck99] Wiedenbeck, S. & Ramalingam, V. (1999) Novice comprehension of small programs written in the procedural and object-oriented styles. International Journal of Human-Computer Studies, v 51, n 1, July 1999, p71-87 .Abstract This research studied the comprehension of small procedural and object-oriented programs by novice programmers. The objective was to find out what kinds of information novice programmers extract from small programs and to infer from this the mental representation formed during program comprehension. In particular, the question was whether novices' mental representations focus more on domain-level or program-level knowledge and whether the mental representation of object-oriented program differ from procedural programs. The experiment indicated that novices tend to develop a mental representation of small object-oriented programs strong in function-related knowledge, but weaker in data flow and program-related knowledge. By contrast, novices' mental representations of small procedural programs were stronger in program-related knowledge. The results are discussed in terms of theories of program comprehension and programming pedagogy.
[Wilde89] Wilde, N., and S. M. Thebaut. The Maintenance Assistant: Work in Progress. J. Syst. and Software 9, 1 (Jan. 1989), 3-17. Comments D&N: The Maintenance Assistant project at the Florida/Purdue Software Engineering Research Center seeks to develop methodologies and tools to assist in the complex tasks associated with making changes to software systems. Three broad approaches are currently being explored. Dependency analysis involves capturing the dependencies between different entities in a software system and the development of tools to present and analyze these dependencies. Reverse engineering involves the identification or recovery of program requirements and/or design specifications that can aid in understanding and modifying a program. Program change analysis involves methods for analyzing differences between two versions of a program in order to understand a change that has been made and detect possible maintenance-induced errors. A strength of the project has been the very close relationship with the industrial affiliates of the Software Engineering Research Center. It is hoped that these organizations will be able to apply the methodologies currently being explored in their own software projects and in tools to be used by their clients.
This paper surveys a number of program maintenance techniques currently in use in industry and under prototype development at the Software Engineering Research Center (SERC). The work described is expected to produce tools that are language-independent, semi-automatic (with human interaction required), and potentially applicable to programs of any size.
The authors discuss four broad classifications of dependency analysis: data flow dependencies, definition dependencies, calling dependencies, and functional dependencies. A prototype tool is under development to assist the programmer in exploring these dependencies. Components of the system all utilize a single program database. The prototype handles only C programs.
SERC's reverse engineering effort focuses on identifying a useful model of program comprehension. The initial goal is to establish a framework for identifying and assessing the effectiveness of strategies and techniques that either aid the comprehension process directly or partially automate it. In connection with this, SERC has surveyed some 120 program reading tools currently in use. A summary of their findings is presented in the paper.
Finally, program change analysis to assess the impact of program change is discussed. Change analysis tools can be used to help programmers identify unexpected side effects, to guide management in the allocation of resources, or to gauge the system's vulnerability to newly introduced errors. The paper reports on SERC's strategies based on incremental data flow analysis.
This paper does a good job of surveying existing tools and suggesting the nature of those that might become available in the future. Recommended reading for teachers and advanced students.
[Wilde90] Wilde, N. Understanding Program Dependencies. Curriculum Module SEI-CM-26, Software Engineering Institute, Carnegie Mellon University, Pittsburgh, Pa., Aug. 1990. Capsule Description: A key to program understanding is unraveling the interrelationships of program components. This module discusses the different methods and tools that aid a programmer in answering the questions: How does this system fit together? and If I change this component, what other components might be affected?
Comments D&N: This curriculum module discusses some of the important relationships that may exist among elements of a program. Wilde briefly discusses program comprehension and what is known about it. The bulk of the module treats dependencies among data items, types, program units, and source files: what they are, how to find them, how they can be presented to the program reader, and what tools are available to help the reader deal with them.
The author's interest is principally in what the maintainer needs to know about how program components work together. Even within this context, Wilde's scope is narrow. Nonetheless, this module is useful in its making explicit some of what the program reader may need to learn from a program.
Like all SEI curriculum modules, this report is addressed to teachers, although its concise overview may appeal to students as well.
Comments UMaD: There are a number of different types of dependencies that can be derived from the source code. Wilde presents a classification system in section 2. In section 3, he introduces techniques for finding such dependencies. The instructional considerations are sparse, and typically relate to the structure of a project. There is no discussion of the nature of the dependency and the cognitive value a particular representation has on the type of task.
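The module's second question ("If I change this component, what other components might be affected?") can be illustrated with a small sketch. It is not taken from Wilde's module: the dependency table, the component names, and the function affected_by are hypothetical, and only calling dependencies are modeled. The idea is to invert the "depends on" edges and walk them outward from the changed component.

```python
from collections import defaultdict, deque

def affected_by(change, depends_on):
    """Return every component that directly or transitively depends on `change`."""
    # Invert the edges: for each component, record who depends on it.
    dependents = defaultdict(set)
    for component, targets in depends_on.items():
        for target in targets:
            dependents[target].add(component)

    # Breadth-first walk over the inverted edges, starting at the change.
    seen = set()
    queue = deque([change])
    while queue:
        current = queue.popleft()
        for user in dependents[current]:
            if user not in seen and user != change:
                seen.add(user)
                queue.append(user)
    return seen

# Hypothetical calling dependencies: each key calls the components in its set.
DEPENDS_ON = {
    "main": {"parse_args", "report"},
    "report": {"format_row", "load_data"},
    "load_data": {"open_file"},
    "parse_args": set(),
    "format_row": set(),
    "open_file": set(),
}

print(sorted(affected_by("open_file", DEPENDS_ON)))
```

The same walk would apply to data flow or definition dependencies if those edges were recorded in the table; how to present the result to a program reader is a separate question that this sketch does not address.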
[Wilde95a] Wilde, N., S. W. Dietrich, & F. Caliss (1995) Designing Knowledge-Base Tools for Program Comprehension: A Comparison of EDATS & IMCA. Software Engineering Research Center, University of Florida, SERC-TR-79-F. Executive Summary: Since software engineers spend a large proportion of their time trying to understand computer programs, many tools have been proposed to help them with this task. The construction of such tools raises a series of specification and design issues and requires a careful choice among alternative user interfaces, tool architectures, and knowledge representations.
This paper describes and compares two such tools, the Extensible Dependency Analysis Tool Set (EDATS) and the Inter-Module Code Analysis system (IMCA). EDATS was developed as a project of the Software Engineering Research Center while IMCA is an ongoing research effort at Arizona State University.
A case study is presented showing how each tool would be used to support typical program comprehension tasks. Though the two tools have quite similar objectives, their designs are radically different, leading to interesting contrasts in flexibility and ease of use.
[Williams99] Williams, M. & Buehler, J. (1999) Comparison of visual and textual languages via task modeling. International Journal of Human-Computer Studies, v 51, n 1, July 1999, p89-115. Abstract: In order for comparative studies of programming languages to be meaningful, differences between the languages need to be carefully studied and well understood. Languages that appear to differ only in syntax (for example, visual vs. textual syntax) may in fact differ greatly in usability. Such differences can confound comparative studies unless they are controlled for. In this paper, we examine the usefulness of fine-grained task modeling for studying the usability of programming languages. We focus on program entry, and demonstrate how to create models of program entry tasks for both visual and textual languages. We also demonstrate how to derive performance time estimates from the models using keystroke-level analysis. A by-product of the model building is a collection of functional-level models that can serve as building blocks for modeling higher-level visual programming tasks. We then report on a comparative study of languages with the same semantics but different syntax (visual and textual). Model-based time predictions of program entry tasks were compared to observed times from an empirical study. The time estimates for the visual condition greatly overestimated the observed times. The primary source of the overestimates appeared to be the time estimate for pointing with the mouse. We then look at three different approaches to improving program entry models. We report on a separate study to calibrate the mouse-pointing time estimate, and demonstrate improved correlation between predicted and observed times with the new estimate. We also apply task modeling to program editing activities, in order to model error recovery behavior during program entry. Finally, we discuss language-specific customization of the keystroke-level operator for mental preparation.
We conclude that task modeling is a useful technique for studying differences in the usability of programming languages at the keystroke level.
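As a rough illustration of how a keystroke-level estimate is derived, the sketch below sums per-operator times over a task's operator sequence. The operator times are commonly cited textbook values and are assumptions here, not the calibrated figures from the paper; the two operator sequences and the recalibrated pointing time of 0.8 seconds are likewise hypothetical.

```python
# Commonly cited keystroke-level operator times in seconds.
# These are assumed values, not the ones measured by Williams and Buehler.
OPERATOR_SECONDS = {
    "K": 0.28,  # press a key or mouse button
    "P": 1.10,  # point at a target with the mouse
    "H": 0.40,  # move the hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def estimate_seconds(operators, times=OPERATOR_SECONDS):
    """Sum the operator times for a task written as a string such as 'MKKK'."""
    return sum(times[op] for op in operators)

# Hypothetical program entry tasks with the same effect in two syntaxes.
TEXTUAL_ENTRY = "M" + "K" * 12   # think, then type a 12-character statement
VISUAL_ENTRY = "MHPKPKH"         # think, reach for mouse, two point-and-clicks, return

# A hypothetical recalibration that lowers only the pointing estimate.
RECALIBRATED = dict(OPERATOR_SECONDS, P=0.80)

print(estimate_seconds(TEXTUAL_ENTRY))
print(estimate_seconds(VISUAL_ENTRY))
print(estimate_seconds(VISUAL_ENTRY, RECALIBRATED))
```

Because the visual task contains two pointing operators, lowering the pointing time changes only the visual estimate, which is the kind of sensitivity the abstract attributes to the mouse-pointing operator.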
[*Win9?] Win, N. W. (199?) Capturing and Documenting Literate C Programs. (pdf) Introduction: Documentation is crucially important as an aid to understanding the system. Even when the code is designed so that changes can be carried out efficiently, the design principles and design decisions are often not recorded in a form that is useful to future maintainers. Documentation is the aspect of software engineering most neglected by both academic researchers and practitioners. It is common to hear a programmer saying that the code is its own documentation.
When documentation is written, it is usually poorly organised, incomplete and imprecise. Often the coverage is random: a programmer or manager decides that a particular idea is clever and writes a memo about it while other topics, equally important, are ignored. In other situations, where documentation is a contractual requirement, a technical writer, who does not understand the system, is hired to write the documentation. The resulting documentation is ignored by the maintenance programmers because it is not accurate. Some projects keep two sets of books: there is the official documentation, written as required for the contract, and the real documentation.[4] Documentation that seems clear and adequate to its authors is often about as clear as mud to the programmer who must maintain the code 6 months or 6 years later. Even when the information is present, the maintenance programmer doesn't know where to look for it. It is almost as common to find that the same topic is covered twice, but that the statements in the documentation are inconsistent with each other and the code.
A major step in slowing the aging of older software, and often rejuvenating it, is to upgrade the quality of the documentation. Often documentation is neglected by the maintenance programmers because of their haste to correct problems reported by customers or to introduce features demanded by the market. When they do document their work, it is often by means of a memo that is not integrated into the previously existing documentation, but simply added to it. If the software is really valuable, the resulting unstructured documentation can, and should, be replaced by carefully structured documentation that has been reviewed to be complete and correct.
Knuth introduced the term literate programming to describe the concept that programming should produce works of literature. In [2], Knuth presents literate programming as a view of programming in which the primary purpose of a program is to communicate to other humans how the author wants the computer to perform a task, rather than the traditional view in which the program is seen as a way of instructing a computer. To support literate programming, Knuth designed a system called WEB; in WEB the Pascal program source code is combined with its documentation in a single file, which can then be processed to create either a document intended for human reading or a program to be handed to a compiler. The document created for human reading provides a rich set of cross-references and indexes, to help the reader find his way around the document.
Literate programming is normally used in the program development phase, to write the program's documentation and its source code together. This paper describes the use of literate programming in the maintenance phase of the software life cycle.
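The single-file idea described above, one source processed into either a readable document or compilable code, can be sketched in a few lines. This is a toy illustration only: the @doc and @code markers and the functions split_chunks, tangle, and weave are hypothetical and do not reproduce WEB's actual syntax, cross-references, or indexes.

```python
def split_chunks(source):
    """Split a literate source into (kind, text) chunks; kind is 'doc' or 'code'."""
    chunks = []
    kind, lines = "doc", []
    for line in source.splitlines():
        marker = line.strip()
        if marker in ("@doc", "@code"):
            # A marker closes the current chunk and opens the next one.
            if lines:
                chunks.append((kind, "\n".join(lines)))
            kind, lines = marker[1:], []
        else:
            lines.append(line)
    if lines:
        chunks.append((kind, "\n".join(lines)))
    return chunks

def tangle(source):
    """Extract only the code chunks, in order, for the compiler."""
    code = [text for kind, text in split_chunks(source) if kind == "code"]
    return "\n".join(code) + "\n"

def weave(source):
    """Produce a plain-text document: prose as written, code indented."""
    parts = []
    for kind, text in split_chunks(source):
        if kind == "code":
            text = "\n".join("    " + line for line in text.splitlines())
        parts.append(text)
    return "\n\n".join(parts) + "\n"

# A hypothetical literate C source in the toy format.
SOURCE = """@doc
Print a greeting so the reader can see the build works.
@code
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
"""

print(tangle(SOURCE))
print(weave(SOURCE))
```

In a maintenance setting the interesting direction is the reverse one, recovering such a combined file from existing C code; this sketch covers only the forward processing step.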
[Wong96] Wong, K. (1996) On Inserting Program Understanding Technologies into the Software Change Process. Proceedings of the IEEE Fourth Workshop on Program Comprehension, (Berlin, Germany; March 29-31, 1996). IEEE Computer Society Press, March 1996. Abstract: Program understanding technologies can be applied objectively in the analysis phase of a software change process. The analysis phase naturally follows a goal-driven metaprocess. Described are issues involved with inserting program understanding technology into existing practice and into such a metaprocess. The implied processes of program understanding and reverse engineering tools play an important role. These issues pose major problems for the acceptance of redocumentation tools such as Rigi, an evolvable reverse engineering tool. An example using Rigi and its analysis methodology for change-impact analysis is considered.
[Woods96] Woods, S. & Q. Yang (1996) The Program Understanding Problem: Analysis and a Heuristic Approach. ICSE '96. Proceedings of the 18th International Conference on Software Engineering, March 25-29, 1996, Berlin, Germany. (waterloo/p6-woods.pdf) Abstract: Program understanding is the process of making sense of a complex source code. This process has been considered as computationally difficult and conceptually complex. So far no formal complexity results have been presented, and conceptual models differ from one researcher to the next. In this paper we formally prove that program understanding is NP-hard. Furthermore, we show that even a much simpler subproblem remains NP-hard. However, we do not despair at this result, but rather offer an attractive problem-solving model for the program understanding problem. Our model is built on a framework for solving Constraint Satisfaction Problems, or CSPs, which are known to have interesting heuristic solutions. Specifically we can represent and heuristically address previous and new heuristic approaches to the program understanding problem with both existing and specially designed constraint propagation and search algorithms.
Comments should be sent to Richard Upchurch (rupchurch@umassd.edu).
This document: Created March 8, 1996 by RLU. Modified June 8, 2001.