Michael W. Bigrigg, Copyright
2004-2007
Computing devices, in particular storage devices, are no longer passive entities. Technology trends are increasing the computing power that can be cost-effectively collocated with embedded computing components, such as disk drives, graphics cards, and network interface cards. The benefit of embedded processors for bringing products to market faster is well known.
These devices communicate with the host via a network using a communication protocol. A good example is a SCSI (Small Computer System Interface) disk. A SCSI disk is a device that combines storage with a processor. SCSI is also the communication protocol between a host computer and a storage device. The communication between the host and these devices acts as a client-server computing environment.
The processor on the device performs activities such as disk scheduling and buffer management. In addition, the processor is used to gather information on the state of the disk, making the device somewhat self-aware. Information such as the nature of an error is passed, in the form of exceptions, from the device to the host computer.
Building a robust system is a chicken-and-egg problem. In order to create a system that is responsive to exceptions from underlying devices, the majority of devices must convey their exceptional conditions. In order for the devices to be built to convey their exceptional conditions, the operating system must be built to acknowledge and use the exceptions. Given a reliability-aware device, a disk, we explore the capabilities of operating systems to handle the exceptions in an appropriate manner. Anecdotal evidence has suggested that some operating systems will hang applications when the SCSI disk is not available. The question we tried to answer was: What conditions lead to the operating system hanging?
Exceptions are different from faults. A SCSI disk fault injection system would emulate faults such as bits flipped in messages during transmission, lost messages, or a device that stops responding. Systems must first be resilient to faults; then, given correct request-reply messages, the system must also be resilient to exceptions.
An exception is a message from a sub-system or underlying device that conveys additional information. Typically an exception designates a failure, but it often also provides more information about the cause.
For us to test the operating system's exception handling ability, the communication protocol must support exceptions. SCSI embeds exceptions into its protocol as a characteristic of a reply message.
We implemented an exception-injection SCSI disk emulator, called FlakyDisk, to understand the response of operating system I/O subsystems to device exceptions. We converted a SCSI target mode device system available in FreeBSD 4.8. The base system uses a desktop computer running in SCSI target mode to act as a SCSI disk. An initiator machine is attached to the same SCSI bus and communicates with the target machine as if it were a SCSI disk. The target machine emulates a SCSI hard disk and is seen as a hard disk from the initiator machine's view. The target machine runs FreeBSD 4.8 Release and contains a SCSI card that supports target mode. We used an Adaptec AHC. The SCSI commands are captured within the operating system and are then passed to a user-level application for processing.
We modified the user-level application on the target machine that is responsible for SCSI command processing to include the flaky engine, which places the injection point on the device side. The boundary was drawn between the client and server processing. The SCSI protocol provides exception support.
We tested Solaris 7, Solaris 8, and FreeBSD 4.8. In all situations, we initially formatted our FlakyDisk and populated it with files without injecting any exceptions. We then emulated a transient and then a permanent failure, with results in Table 1. Most configurations resulted in the operating system handling the error correctly (HC). Solaris 7 permanent exception injection resulted in an infinite loop (L).
First, we ran the FlakyDisk emulating a transient failure. We allowed five read operations to complete successfully, and then reported a failure back to the initiating machine on the sixth read request. All subsequent read requests were reported as successful. All systems (Solaris 7, Solaris 8, and FreeBSD) retransmitted the read request after the initial failure. Next, we ran the FlakyDisk emulating a catastrophic failure in which the first five read commands were reported as having completed successfully, and then all subsequent read operations were reported back to the initiator machine as having failed.
FlakyDisk is an exception-injection tool that tests the operating system's ability to acknowledge and respond to exceptions generated from a storage device via the SCSI protocol. It can emulate transient or permanent failures. We insert code, similar to an interceptor, between the operating system and the SCSI disk. The interceptor could be placed on either side of the request/reply stream between the operating system and the disk, as shown in Figure 1. In our implementation, we insert code on the side of the SCSI disk.
The request/reply between the application and the runtime system is bound by the language implementation. In the C I/O library fread call, the reply is transmitted via the return value of the function call. It returns the number of items read, which is never more than the number requested. A short return value, or even a return value of 0, does not by itself signify an error; it may mean only that no more data is available, such as at the end of a file. Because the return type is unsigned, fread cannot return a negative value; the application must call ferror to learn whether an error/exception occurred (the lower-level POSIX read call, by contrast, signals an error with a return value of -1). When a disk fails or raises an exception, unless some additional reliability mechanism masks it, the exception should be propagated up to the application to be handled.
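As an illustration of how an application can separate a real I/O error from an ordinary end-of-file when using fread, here is a minimal sketch in standard C. The names checked_read and read_status are hypothetical and are not part of FlakyDisk or of any system tested here.

```c
#include <assert.h>
#include <stdio.h>

/* Outcome of one read attempt, as seen by the application (hypothetical type). */
enum read_status { READ_OK, READ_EOF, READ_ERROR };

/*
 * Read up to `len` bytes from `fp` into `buf` and classify the result.
 * `*got` receives the number of bytes actually read.
 * fread() reports a short read only through its count, so feof()/ferror()
 * are needed to tell end-of-file apart from an I/O error.
 */
enum read_status checked_read(FILE *fp, void *buf, size_t len, size_t *got)
{
    assert(fp != NULL && buf != NULL && got != NULL);

    *got = fread(buf, 1, len, fp);
    if (*got == len)
        return READ_OK;      /* the full request was satisfied */
    if (ferror(fp))
        return READ_ERROR;   /* e.g. an exception propagated up from the device */
    return READ_EOF;         /* short read because the data ran out */
}
```

An application that treats every short read as end-of-file would silently ignore a device exception; checking ferror is what lets the exception reach the caller.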
We need a working system that will allow us to put in our software module to intercept and modify the messages. Rather than modify an actual SCSI disk, we used a SCSI emulation system. In our implementation of a FlakyDisk, we converted a SCSI target mode device system available in FreeBSD 4.8, which emulates a SCSI disk using a standalone computer.
A SCSI card in target mode makes the machine it is installed in act as a SCSI target, that is, as the device (server) side of the bus rather than the initiating host. The emulation system uses a desktop computer running in SCSI target mode to act as a SCSI disk. A host machine is attached to the same SCSI bus and communicates with the target machine as if it were a SCSI disk. The set-up is shown in Figure 2.
The target machine emulates a SCSI hard disk using a SCSI card that supports target mode. We used an Adaptec AHC. The target machine operating system is FreeBSD 4.8 Release. The target machine is then seen as a hard disk from the host machine's view.
Our exception injection system is composed of three parts: the engine, the script processor, and the identification of behavior. The engine inserts the exception into the reply based on the processing of a user-supplied script. Informational messages are displayed to allow the user to determine whether the behavior is correct.
The exception injection engine is a software component that is added into a working system to intercept requests. It either hands a request off to the real working system for correct processing or generates an exception to be passed to the caller in lieu of a reply.
For instance, the operating system may ask to read a sector of data. The working system returns the data as payload in a reply message. Our FlakyDisk could, alternatively, construct a reply message containing a real exception to be passed back to the host as a reply message. The SCSI protocol encodes exceptions into the reply message. To insert an exception, we modify the reply message.
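To make the reply modification concrete, the following sketch builds a reply that carries an exception. It assumes the standard SCSI fixed-format sense data layout (response code 0x70 in byte 0, sense key in the low four bits of byte 2, additional sense code in byte 12, and its qualifier in byte 13) and the CHECK CONDITION status value 0x02. The struct flaky_reply and the function flaky_make_exception are hypothetical names for illustration; they are not taken from the FlakyDisk or scsi_target sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCSI_STATUS_CHECK_CONDITION 0x02  /* status byte announcing an exception */
#define SENSE_LEN 18                      /* common fixed-format sense data size */

/* Hypothetical reply record: a status byte plus sense data. */
struct flaky_reply {
    uint8_t status;
    uint8_t sense[SENSE_LEN];
};

/*
 * Fill `r` with an exception reply in place of a normal reply.
 * key, code, and ext correspond to the sense key, sense code, and
 * ext sense code reported to the host.
 */
void flaky_make_exception(struct flaky_reply *r,
                          uint8_t key, uint8_t code, uint8_t ext)
{
    assert(r != NULL);

    memset(r, 0, sizeof *r);
    r->status    = SCSI_STATUS_CHECK_CONDITION;
    r->sense[0]  = 0x70;            /* current error, fixed format (112 decimal) */
    r->sense[2]  = key & 0x0F;      /* sense key occupies the low four bits */
    r->sense[7]  = SENSE_LEN - 8;   /* number of sense bytes following byte 7 */
    r->sense[12] = code;            /* additional sense code */
    r->sense[13] = ext;             /* additional sense code qualifier */
}
```

For example, flaky_make_exception(&r, 6, 41, 1) yields error code 112, sense key 6, sense code 41, and ext sense code 1.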
The host machine runs the file system in the configuration that we wish to test. We used the hard disk emulated by the target machine as a second hard disk on its own SCSI bus. The primary hard disk containing the operating system was a non-SCSI drive so as not to interfere with the SCSI messages being delivered by the target machine.
The emulation system captures the SCSI messages within the operating system of the target machine and then passes them to a user-level application for processing. We modified the user-level application on the target machine to insert our flaky engine, which selectively returns a failure to the host machine. Typically the command is processed normally by the emulation system and is reported as completing successfully. We also guarantee that the command does in fact complete successfully.
We initiate the user-level application, which does the SCSI device emulation, by calling our modified scsi_target program to emulate a disk on SCSI bus 0, with a SCSI ID of 3. The test_file is a file on the target machine to be used as local storage.
The FlakyDisk system is customizable: the user determines when an exception should be raised and for how long it should continue to be raised. The duration is not measured in time but in the number of command iterations that have been processed. This models the type of behavior we want to test, namely, the response of the operating system to device exceptions. For instance, we can specify that only the third read request should return an exception, to model a transient error.
Our FlakyDisk system reads a file, scsierror.txt, which lists the exceptions that the user wants to inject. Each line in the file describes an error and has the following syntax:
SCSI_CMD: This is the name of the SCSI command the user wants to generate an exception for. The valid commands are:
ITER: An error will be generated when the FlakyDisk program receives the (iteration_number)th command. There are three different forms for this parameter:
n : An error is generated at the nth iteration of the command.
n-m: An error is generated for each iteration between the nth and the mth iteration.
n+ : An error is generated for any iteration greater than or equal to n.
KEY, CODE, EXT: The sense key, sense code, and the ext sense code are parameters to return in the error report. They specify the type of error that was encountered. You can find a list of the possible values that these parameters can take at http://www.arkeia.com/resources/scsi_rsc.html.
You can use the same SCSI_CMD several times if you want to generate several errors at different times for the same type of command. Figure 3 shows an example in which the 10th READ_6 command and the 5th WRITE_6 command raise exceptions.
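The three ITER forms described above (n, n-m, and n+) can be handled by a small parser. The sketch below is one possible way to do it, not the FlakyDisk script processor itself; the names iter_range, parse_iter, and should_inject are hypothetical. It assumes the ITER field has already been split out of the script line, since the exact line layout is not reproduced here.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Inclusive range of command iterations that should raise an exception. */
struct iter_range {
    unsigned long first;
    unsigned long last;
};

/* Parse an unsigned decimal number; return the position after it, or NULL. */
static const char *parse_number(const char *s, unsigned long *out)
{
    char *end;

    if (!isdigit((unsigned char)*s))
        return NULL;                 /* rejects signs, blanks, and empty input */
    errno = 0;
    *out = strtoul(s, &end, 10);
    return (errno == ERANGE) ? NULL : end;
}

/* Parse "n", "n-m", or "n+". Returns 0 on success, -1 if malformed. */
int parse_iter(const char *s, struct iter_range *out)
{
    unsigned long n, m;
    const char *p;

    assert(s != NULL && out != NULL);

    p = parse_number(s, &n);
    if (p == NULL)
        return -1;
    if (*p == '\0') {                          /* "n": a single iteration */
        out->first = out->last = n;
        return 0;
    }
    if (p[0] == '+' && p[1] == '\0') {         /* "n+": n and everything after */
        out->first = n;
        out->last = ULONG_MAX;
        return 0;
    }
    if (*p == '-') {                           /* "n-m": n through m inclusive */
        p = parse_number(p + 1, &m);
        if (p != NULL && *p == '\0' && m >= n) {
            out->first = n;
            out->last = m;
            return 0;
        }
    }
    return -1;
}

/* Nonzero when the given iteration of a command should raise an exception. */
int should_inject(const struct iter_range *r, unsigned long iteration)
{
    return iteration >= r->first && iteration <= r->last;
}
```

With this representation, a transient failure on the sixth read is the specification "6", and a permanent failure from the sixth read onward is "6+".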
The understanding of correct behavior requires a higher-level knowledge of what the behavior should be.
The file cmdlog.txt logs each command received by FlakyDisk. It indicates when the command was received and displays the input parameters. Then it indicates when the command ended. Figure 4 contains an example cmdlog.txt file. We log the incoming requests, designated as "cmd received". The request information includes the command, such as INQUIRY or READ(6), and the associated parameters. We give an instance number to each command. This allows the user to run the system normally and then use the log to determine when an exception should be inserted. We also log the corresponding reply, designated as "response sent". If we send an exception rather than a normal response, we identify the exception information sent.
INQUIRY #1 cmd received, Lun: 0
INQUIRY #2 cmd received, Lun: 0
INQUIRY #3 cmd received, Lun: 0
REQUEST SENSE #1 cmd received, Lun: 0
TEST UNIT READY #1 cmd received, Lun: 0
Check Condition sent, Error code: 112, Sense Key: 6, Sense Code: 41, Ext Sense Code: 1
REQUEST SENSE #2 cmd received, Lun: 0
READ CAP #1 cmd received, Lun: 0, Logical Block: 0
READ CAP #2 cmd received, Lun: 0, Logical Block: 0
READ(6) #1 cmd received, Lun: 0, Logical Block: 0, Length: 1
READ(6) #2 cmd received, Lun: 0, Logical Block: 33, Length: 1
WRITE(6) #1 cmd received, Lun: 0, Logical Block: 768, Length: 4
We tested Solaris 7, Solaris 8, and FreeBSD 4.8. In all situations, we initially formatted our FlakyDisk, and populated it with files without injecting any exceptions. We then emulated a transient and then a permanent failure with results already reported in Table 1.
We ran the FlakyDisk emulating a transient failure. We allowed five read operations to complete successfully, and then reported a failure back to the initiating machine on the sixth read request. All subsequent read requests were reported as successful. All systems (Solaris 7, Solaris 8, and FreeBSD) retransmitted the read request after the initial failure. Thus, all systems tested were resilient to transient exceptions.
Next, we ran the FlakyDisk emulating a catastrophic failure in which the first five read commands were reported as having completed successfully, and then all subsequent read operations were reported back to the initiator machine as having failed.
Solaris 7 continued to resend the read request. The net result was that the application responsible for the read request hung. This did not interfere with other applications running on the machine.
Solaris 8 and FreeBSD 4.8 did not retry the read indefinitely. They properly sent the exception up the call stack and informed the application of the exception.