Michael W. Bigrigg, Copyright 2004-2007
Software robustness should not be an afterthought, left to the end of the software development process. In my former life of commercial software development, we followed a simple cycle: edit, compile, and then run. The run step consisted of running the application in a manner that would exercise the portion of code just developed. We then analyzed the results, used that information as feedback, and the cycle would start again. It was straightforward to test the majority of our code. Running the application in a failure-free environment to test the correct behavior was easy. The exceptional conditions that the application could run into, and for which we had coded behavior, were not so easily tested.
In the initial implementation of our software exception injection (SWEI) system, FlakyIO, we were able to determine that a number of common utility programs were not fully robust in the presence of exceptions. The applications either blissfully ignored exceptions or did not check for the correct exception condition. Beyond reporting the robustness benchmarking numbers, we took it upon ourselves to attempt to fix these robustness problems. There were only two dozen applications. Applications that checked for errors incorrectly, yet still had error handling code, could be fixed faster. When no error handling code was present, it was difficult to understand the correct behavior and an appropriate way of handling the error. After weeks of attempting to fix the errors, we recognized that we needed to return to our real work.
But the experience left a deep mark on us. While the exception injection systems that we developed can be used as tools to test the robustness of an application, their real power lies in their incorporation into the software development process. We also discovered that, even with our understanding of reliability, we began to approach the testing results the way I remember developing commercial software years ago. We began to do whatever it took to make the errors go away. It is unfortunate, but a testing mechanism left to the end of the development process was viewed as more of an inconvenience than a help. Therefore, we have built SWEI systems to help programmers exercise their exception handling code as they write it.
This book gives an overview of the framework that we used and shows you how to implement your own FlakyIO system based on the model we developed, outlining the key points that must be understood to build a SWEI system. We will explain some of our fundamental understanding of exception handling. The bulk of this book discusses the design and implementation of many of our SWEI systems and the results we obtained with them. We will also present some observations that arose out of our exception injection experiments.
We are not developing an explicit software fault injection (SWFI) system for I/O. A fault injection system for I/O would remove the data device or flip bits, stressing the system based on known ways in which the system fails. Fault injection perturbs the data or the system in an attempt to raise an error or exception. Exception injection raises the exception directly. Our SWEI does not contain fault models, which identify the faults through which exceptions may be introduced into the system. Our injection is based on a listing of all the possible exceptions that may be raised in the course of execution. A fault model may be necessary for non-I/O based systems that contain function calls with side effects, and is mentioned in the future work section.
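To make the distinction concrete, the following is a minimal sketch of exception injection driven by a listing of possible exceptions. It is not the FlakyIO implementation; the names FlakyReader, READ_ERRNOS, read_or_default, and run_all_injections are hypothetical and chosen only for illustration, and the listing of error codes is an assumed, incomplete example.

```python
import errno
import io
import os

# Assumed, incomplete listing of error codes a read may raise (illustrative only).
READ_ERRNOS = (errno.EIO, errno.EBADF, errno.EINTR)


class FlakyReader:
    """Hypothetical wrapper: raises a chosen OSError directly instead of
    perturbing the underlying data or device."""

    def __init__(self, wrapped, inject=None):
        self._wrapped = wrapped
        self._inject = inject  # error code to raise on the next read, or None

    def read(self, size=-1):
        if self._inject is not None:
            code = self._inject
            self._inject = None  # inject once, then behave normally
            raise OSError(code, os.strerror(code))
        return self._wrapped.read(size)


def read_or_default(reader, default=b""):
    """Stand-in for application code whose exception handling is under test."""
    try:
        return reader.read()
    except OSError:
        return default


def run_all_injections():
    """Inject every listed exception once and record what the application did."""
    results = {}
    for code in READ_ERRNOS:
        reader = FlakyReader(io.BytesIO(b"payload"), inject=code)
        results[errno.errorcode[code]] = read_or_default(reader, default=b"fallback")
    return results
```

No fault model appears anywhere in this sketch: the injector does not need to know why a read might fail, only which exceptions the read is allowed to raise.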
There are two approaches to the detection of missing exception handling. The first one is static analysis through the use of compiler techniques along with formal methods. These systems do not provide explicit validation of exception handling; they provide the mechanisms to allow for exception handling validation. The second approach is dynamic analysis or testing, often in a similar form to fault injection.
While fault injection systems will test the exception handling code, they do not provide a means of exercising a single exception handler. The ability to target one handler at a time is necessary to integrate exception handling testing into the software development cycle.
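As an illustration of what exercising a single exception handler can look like, here is a minimal sketch under stated assumptions. The Injector class, the operation names "open" and "write", and the save function are all hypothetical; an in-memory list stands in for a real file so that the example is self-contained.

```python
import errno
import os


class Injector:
    """Hypothetical injector: raises an OSError at one named operation only."""

    def __init__(self, target, code):
        self.target = target  # name of the operation to fail, or None for no failure
        self.code = code

    def check(self, operation):
        if operation == self.target:
            raise OSError(self.code, os.strerror(self.code))


def save(data, injector, log):
    """Stand-in application code with two separate exception handlers."""
    try:
        injector.check("open")
        handle = []  # stand-in for an opened file
    except OSError:
        log.append("open-handler")
        return False
    try:
        injector.check("write")
        handle.append(data)
    except OSError:
        log.append("write-handler")
        return False
    return True


# Exercise only the write handler, leaving the open handler untouched.
log = []
ok = save("record", Injector("write", errno.ENOSPC), log)
```

Because the programmer chooses the operation and the exception, the handler that was just written can be run immediately as part of the edit, compile, and run cycle, rather than waiting for a fault to happen to reach it.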
The Flaky Family consists of several systems that have been developed using the Flaky exception injection model. It is a model and not an integrated system, simply because the components are so diverse that it was not possible to make one single Flaky system that encompassed everything. The FlakyIO family consists of SWEI systems for programming languages (FlakyC, FlakyJava, and FlakyPHP), for handheld systems (FlakyPalm and FlakyWinCE), for networking systems (FlakyNet and FlakyCORBA), and for systems such as FlakyDisk. Overall, the results show that every system has application instances that do not handle exceptions correctly.