So whether we like it or not, we will be co-existing with a lot of software. And by definition, understanding existing software becomes critical. This article is about my thoughts on how to ease the life of the programmer who is banished to understanding others' code. More often than not there is little or no documentation, both inside the code and outside (what do you mean documentation, you have the code!). The challenge this programmer faces is to extract design ideas out of the code. Before a programmer embarks on this journey she needs to understand very clearly why she is doing it. It could be taking ownership of the software, fixing a defect or adding a new feature. The following thoughts come to mind.
- The first step in understanding a software system is to go outside in. One needs to understand as much as possible about the execution context, the functionality exposed to the user, the external interfaces to the world and so on. Much of this information is embedded in the user interface, so "playing around" with the system is obviously desirable. Sometimes this is not possible. The programmer then has to settle for reading user manuals, relying on user descriptions, or even working only from reports and other artifacts generated by the system under consideration.
- A neglected aspect in this regard is the set of test cases for the system. If the quality assurance for the system is up to date and complete, then the test cases are an extremely rich source of information about the behaviour of the system. The programmer needs to classify the test cases and start by looking at the "end to end" cases. Executing these test cases on a running system is very desirable.
- The next step is to understand the deployment aspects of the system. If one is faced with a compiled system, the build scripts embed a lot of information about the dependencies within the system. The process of getting from code to binary executables can be quite complex. Understanding this process is critical to understanding the software. There are a few tools that help the programmer in this process - for example, one tool converts build scripts into a graphical representation that is much easier to comprehend.
- The next step is to start getting into the code. Programmers often try to understand a software system inside out, that is, they start by looking at the code first. This is not a good idea, as it can be very confusing and lead to wasted time. The better way is to go outside in. However, these three steps will need to be repeated back and forth many times. Coming back to understanding source code, there have been quite a few advances in this area.
- One of the ways to start is to set up the source code as a "project" in an appropriate Integrated Development Environment (IDE). The IDE support for modern languages is phenomenal and one needs to harness this for understanding code. For one, navigating the code becomes extremely easy.
- Another point to consider here is that Program Understanding has developed into a science in the last two decades and there are tools now available to automate this process. Some examples:
- See University of Wisconsin's program slicing tool page here: http://www.cs.wisc.edu/wpis/html/
- The C Information Abstraction system from Bell Labs - http://www.uni-koblenz.de/FB4/Institutes/IST/AGEbert/Teaching/WS0405/RE/ChNiRa90.pdf.
- JDepend (http://clarkware.com/software/JDepend.html) gives a lot of information about the structure of a Java program.
- It is rare that a programmer sets out to understand a large program just for the sake of understanding it or for leisure. More often than not, the effort is driven by a need to change the behaviour of the software under consideration - either to add new functionality or to fix a defect. In this case the programmer has a definite goal. She can now start narrowing in on this goal. The ideal situation is to understand all the flows related to the module being changed and have a good sense of the impact of changing the code.
- An interesting note here - as part of doing the above, a programmer mentally "slices" the program in various ways. In computer science this is studied as "Program Slicing". There are a few tools that can help the programmer here, but more importantly, having this theoretical awareness gives the programmer a better perspective on how to go about achieving one's goals.
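
To make the idea of slicing a little more concrete, here is a minimal sketch of a backward slice in Python. It is only an illustration under heavy assumptions: the program is a toy, straight-line sequence of assignments with no branches, loops or function calls, and the names `Stmt`, `backward_slice` and `PROGRAM` are made up for this example rather than taken from any of the tools mentioned above. Real slicers work on control-flow and dependence graphs, which this sketch does not attempt.

```python
from typing import NamedTuple


class Stmt(NamedTuple):
    """One assignment in a toy straight-line program (hypothetical model)."""
    line: int            # line number in the toy program
    defines: str         # variable assigned by this statement
    uses: frozenset      # variables read by this statement


def backward_slice(program, criterion_var):
    """Return the line numbers that can affect criterion_var at the end.

    Walks the statements from last to first, tracking which variables
    are still "of interest". Assumes straight-line code only.
    """
    relevant = {criterion_var}
    kept = []
    for stmt in reversed(program):
        if stmt.defines in relevant:
            kept.append(stmt.line)
            # This statement settles the value of stmt.defines, so we now
            # care about whatever it read instead.
            relevant.discard(stmt.defines)
            relevant.update(stmt.uses)
    return sorted(kept)


# Toy program (an assumption for illustration):
#   1: total = 0
#   2: count = 0
#   3: total = total + price
#   4: count = count + 1
#   5: tax   = total * 0.2
PROGRAM = [
    Stmt(1, "total", frozenset()),
    Stmt(2, "count", frozenset()),
    Stmt(3, "total", frozenset({"total", "price"})),
    Stmt(4, "count", frozenset({"count"})),
    Stmt(5, "tax", frozenset({"total"})),
]

if __name__ == "__main__":
    print(backward_slice(PROGRAM, "tax"))    # lines relevant to tax
    print(backward_slice(PROGRAM, "count"))  # lines relevant to count
```

If the defect being chased concerns `tax`, the slice leaves out the statements that only touch `count`, which is exactly the kind of narrowing a programmer does in her head when reading unfamiliar code.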