Currently, there are three categories of software fault. Software fault tolerance techniques are designed to allow a system to tolerate software faults that remain in the system after its development. The philosophy in fault-tolerant software is totally different. Software fault tolerance is a necessary component for constructing the next generation of highly available and reliable computing systems, from embedded systems to data warehouse systems. A fault-tolerant system is one that can continue to perform its specified tasks correctly in the presence of failures. Diversity works well in nature, where it is the basis of natural selection, a phenomenon that helps biological populations survive as they are challenged by hazards in their environments. Single-version techniques aim to improve the fault tolerance of a software component by adding to it mechanisms for fault detection, containment, and recovery.
Fault tolerance is the way in which an operating system (OS) responds to a hardware or software failure. To handle faults gracefully, some computer systems have two or more ... Fault tolerance is a concept used in many fields, but it is particularly important to data storage and information technology infrastructure. The goal of fault tolerance methods is to include safety features in the software design or source code to ensure that the software will respond correctly to input data errors and prevent output and control errors. Software faults are what we commonly call bugs. The key challenge then is how to provide highly dependable software. Data diversity is described, and the results of a pilot study are presented. A much stronger assumption is that ideal diverse software would exhibit failure ... Data diversity (DD) has been said to be orthogonal to design diversity [8]. Fault tolerance also resolves potential service interruptions related to software or logic errors.
Both schemes are based on software redundancy, assuming that coincident software failures are rare. The N-version approach to fault-tolerant software depends on a generalization of the multiple computation method that has been successfully applied to the tolerance of physical faults. Since correctness and safety are really system-level concepts, the need for software fault tolerance, and the degree to which it is used, is directly dependent ... Possible options include full disk encryption, database ... All fault tolerance techniques must use some form of redundancy to tolerate faults. Fault-tolerant technology is the capability of a computer system, electronic system, or network to deliver uninterrupted service despite one or more of its components failing.
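The N-version idea mentioned above can be illustrated with a minimal sketch. It assumes three hypothetical, independently written versions of one function (`square_a`, `square_b`, `square_c`, none of which come from the source) and an exact-match majority voter; real systems often need inexact voting for floating-point results.

```python
from collections import Counter
from typing import Callable, Sequence


def n_version_execute(versions: Sequence[Callable[[int], int]], x: int) -> int:
    """Run every version on the same input and return the majority result."""
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            # A version that crashes contributes no vote.
            continue
    if not results:
        raise RuntimeError("all versions failed")
    value, votes = Counter(results).most_common(1)[0]
    # Require a strict majority of all versions, not just of the survivors.
    if votes <= len(versions) // 2:
        raise RuntimeError("no majority among versions")
    return value


# Hypothetical versions written from the same specification: "square x".
def square_a(x: int) -> int:
    return x * x


def square_b(x: int) -> int:
    return x ** 2


def square_c(x: int) -> int:
    # Seeded design fault for illustration: wrong answer only when x == 3.
    return x * x + 1 if x == 3 else x * x


VERSIONS = [square_a, square_b, square_c]
```

With these assumed versions, `n_version_execute(VERSIONS, 3)` returns 9 because the two correct versions outvote the faulty one. The scheme only helps when the versions rarely fail on the same input, which is the assumption stated above.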
Software fault tolerance relies either on design diversity or on a single design using robust data structures. Therefore fault tolerance is achieved by using diversity in the data space. Data-diverse software fault tolerance techniques complement design diversity by compensating for its limitations. They involve obtaining a related set of points in the program data space, executing the same software on those points, and then using a decision algorithm to determine the resulting output. Thus, data diversity works in cases when a single channel (thus relying on time redundancy) or identically redundant, i.e. ... Software fault tolerance is a necessary component, as it provides protection against errors in translating the requirements and algorithms into a programming language. Crucial computer applications require extremely reliable software. Nowadays the reliability of software is often the main goal in the software development process. A SIHFT technique can provide an inexpensive alternative to hardware and/or information redundancy.
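The three steps of data diversity described above (obtain related points in the data space, run the same software on each, apply a decision algorithm) can be sketched as follows. Everything here is an assumption made for illustration: `faulty_total` is a hypothetical program with a seeded fault, the re-expressions are permutations (which leave a sum unchanged), and the decision algorithm is a simple majority vote.

```python
from collections import Counter
from typing import Iterator, List


def faulty_total(values: List[int]) -> int:
    """Hypothetical single-version program: sum a list of integers."""
    # Seeded fault: the failure region is "first element is negative".
    if values and values[0] < 0:
        return sum(values[1:])  # wrong: silently drops the first element
    return sum(values)


def reexpressions(values: List[int]) -> Iterator[List[int]]:
    """Yield logically equivalent inputs; a sum is invariant under permutation."""
    yield list(values)                  # original input
    yield list(reversed(values))        # reversed order
    yield values[1:] + values[:1]       # rotated left by one


def n_copy_total(values: List[int]) -> int:
    """Run the same program on every re-expressed input and vote on the result."""
    results = [faulty_total(v) for v in reexpressions(values)]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= len(results) // 2:
        raise RuntimeError("no majority among re-expressed runs")
    return value
```

For the input `[-5, 2, 3]` the unprotected call `faulty_total` returns 5, while `n_copy_total` returns the correct 0, because two of the three re-expressed inputs fall outside the failure region. If most re-expressions land inside the failure region, the vote fails, so the choice of re-expression matters.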
Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of, or one or more faults within, some of its components. Ammann and Knight [a1123] proposed data diversity as a software fault tolerance strategy to complement design diversity. It would be very difficult to sum it up in one article, since there are multiple ways to achieve fault tolerance in software. To adequately understand software fault tolerance, it is important to understand the nature of the problem that software fault tolerance is supposed to solve. Tolerance of design faults: retry, restart, and reboot. When a fault occurs, these techniques provide mechanisms to ... To complement design diversity in the quest for fault-tolerant software, there exist several data diversity ... Fault-tolerant strategies: fault tolerance in a computer system is achieved through redundancy in hardware, software, information, and/or time. Software fault tolerance is the ability of computer software to continue its normal operation despite the presence of system or hardware faults. There are two basic techniques for obtaining fault-tolerant software.
Data encryption has the goal of encrypting data so that, when the data is stolen, it can't be read. A recent survey emphasizes the most recent advances in the field. Data diversity as a complementary software fault tolerance strategy to design diversity was ... Fault tolerance (FT) provides a plausible method for improving reliability claims in the presence of systematic failures in software. Diversity has also been employed widely in engineering and has become an important part of computer engineering. There are two types of software fault tolerance techniques. Written by Joe Kozlowicz on Thursday, September 20th, 2018. We discuss a new view of fault tolerance of software-based systems. CPUs that are used in host machines for fault-tolerant VMs must be compatible with vSphere vMotion or improved with Enhanced vMotion. One of the main concerns in safety-critical software applications is to ensure sufficient reliability if one cannot prove the absence of faults.
The objective of creating a fault-tolerant system is to prevent disruptions arising from a single point of failure, ensuring ... In this context, fault tolerance refers to the ability of a computer system or storage subsystem to suffer failures in component hardware or software parts yet continue to function without a service interruption and without losing data or ...
Currently, the areas of software fault tolerance and object-oriented techniques have been developed separately. This paper describes an approach to merge these two areas, providing a framework to utilise object-oriented approaches to achieve software fault tolerance incorporating both design and data diversity techniques. The Tandem data shows that it is not necessary for software to be inherently ... The different areas of software diversity are discussed in surveys on diversity for fault tolerance or for security. The regions of the input space that cause failure for certain experimental programs are discussed, and data re-expression, the way in which alternate input data sets can be obtained, is examined. It is reasonable to assume that some software FT techniques offer more protection than others, but the relative effectiveness of different software ... It offers you a thorough understanding of the operation of critical software fault tolerance techniques and guides you through their design, operation and performance.
One of the main concerns in safety-critical software is to ensure sufficient reliability, because proof of the absence of systematic failures has proved to be an unrealistic goal. Before using vSphere Fault Tolerance (FT), consider the high-level requirements, limits, and licensing that apply to this feature. The following CPU and networking requirements apply to FT. It is plausible that some software FT techniques offer more protection than others. Fault tolerance refers to the ability of a system (computer, network, cloud cluster, etc.) ... This chapter concentrates on software fault tolerance based on design diversity. Fault-tolerant software assures system reliability by using protective redundancy at the software level.
The software FT architecture in this research uses data diversity (DD), a complementary approach to design diversity. These principles deal with desktop and server applications and/or SOA. Unlike design diversity, data diversity uses only one version of the software and applies it to fault-related points in the data space. The limitations of some design-diverse techniques led to the development of data-diverse software fault tolerance techniques. Fault tolerance can be built right into software, and resilience can be improved through load balancing, virtualization and other techniques. For a typical system, current proof techniques and testing methods cannot guarantee the absence of software faults, but careful use of redundancy may allow the system to tolerate them.
Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time. Since redundancy alone is not sufficient to help detect and tolerate software ... Although building a truly practical fault-tolerant system touches upon in-depth distributed computing theory and complex computer science principles, there are many software tools, many of them open source, that alleviate undesirable results by building a fault-tolerant system. The term essentially refers to a system's ability to allow for failures or malfunctions, and this ability may be provided by software, hardware or a combination of both. Data diversity can also be applied to software testing and greatly facilitates the automation of testing. Fault tolerance (FT) is one method for improving reliability claims. Several software fault tolerance schemes proposed in [46, 47, 48, 49, 50] are based on software design diversity in order to tolerate software design bugs. They include the recovery block scheme (RBS), consensus recovery block programming, N-version programming (NVP), N self-checking programming (NSCP) and data diversity. For some data center operators, that means selecting software instead of hardware to achieve resilience.
Software fault tolerance relies either on design diversity or on a single design using robust data structures. However, it has one disadvantage: it does not provide explicit protection against errors in specifying the requirements. Design diversity is the generation of different implementations (codes) from a common specification [3, 8]. Design fault tolerance by means of design diversity is a concept that traces back to the very early age of informatics. This diversity is normally applied in the form of recovery blocks or N-version programming. If a fault-tolerant system's operating quality decreases at all, the decrease is proportional to the severity of the failure, as compared to a naively designed system, in which even a small failure can cause total breakdown. Fault-tolerant computing is the art and science of building computing systems that ... Data diversity relies on a different form of redundancy from existing approaches to software fault tolerance and is substantially less expensive to implement. The limitations of some design-diverse techniques led to the development of data-diverse software fault tolerance techniques, which are meant to complement, rather than replace, design-diverse techniques. The steps are to obtain a related set of points in the program data space and to execute the same software on those points. One form of this is checkpoint and restart using data diversity with input re-expression. Beryl has taken an interest in the published research on diversity and its practical applications, and to this end has designed and co-presented diversity sessions at Grace Hopper India and ...
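Checkpoint and restart with input re-expression can be sketched as a retry loop: run the program, check the result with an acceptance test, and on failure re-express the input and run the same program again. The names below (`faulty_isqrt`, `acceptable`, `retry_block_isqrt`) and the seeded fault are assumptions for illustration. The re-expression relies on the identity isqrt(n) = isqrt(n * s * s) // s for a positive integer s, so the output is adjusted back after each run.

```python
import math


def faulty_isqrt(n: int) -> int:
    """Hypothetical program under protection: integer square root."""
    if n == 10:
        return 4  # seeded fault: the correct answer is 3
    return math.isqrt(n)


def acceptable(n: int, r: int) -> bool:
    """Acceptance test: r is the integer square root of n."""
    return r >= 0 and r * r <= n < (r + 1) * (r + 1)


def retry_block_isqrt(n: int, max_retries: int = 3) -> int:
    """Re-run the same program on re-expressed inputs until a result passes."""
    scale = 1  # scale == 1 means the original, unmodified input
    for _ in range(max_retries + 1):
        # Re-express the input as n * scale^2, then map the output back.
        candidate = faulty_isqrt(n * scale * scale) // scale
        if acceptable(n, candidate):
            return candidate
        # The "checkpoint" here is just the saved input n; a stateful
        # program would also restore its saved state before retrying.
        scale *= 2
    raise RuntimeError("no re-expressed input produced an acceptable result")
```

With the seeded fault, the first attempt on 10 is rejected by the acceptance test, and the retry on the re-expressed input 40 yields 6 // 2 = 3, which passes. Unlike the voting approach, this variant uses time redundancy and depends on the quality of the acceptance test.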
Depending on the class of faults [76], redundant devices, networks, data or applications are used. Fault masking is any process that prevents faults in a system. Efforts to attain software that can tolerate software design faults (programming errors) have made use of static and dynamic redundancy approaches.
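As an example of the dynamic redundancy approach, a recovery block tries a primary routine first and falls back to an alternate only when an acceptance test rejects the result. The sketch below is illustrative only: `primary_sort` (with a seeded fault), `alternate_sort`, and `acceptance_test` are hypothetical names, not taken from any cited work.

```python
from collections import Counter
from typing import Callable, List, Sequence


def primary_sort(items: List[int]) -> List[int]:
    """Hypothetical primary routine with a seeded fault: it drops duplicates."""
    return sorted(set(items))


def alternate_sort(items: List[int]) -> List[int]:
    """Simpler alternate routine: a plain insertion sort."""
    out: List[int] = []
    for item in items:
        i = len(out)
        while i > 0 and out[i - 1] > item:
            i -= 1
        out.insert(i, item)
    return out


def acceptance_test(original: List[int], result: List[int]) -> bool:
    """Accept only an ordered permutation of the original input."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and Counter(original) == Counter(result)


def recovery_block(
    items: List[int],
    routines: Sequence[Callable[[List[int]], List[int]]] = (primary_sort, alternate_sort),
) -> List[int]:
    checkpoint = list(items)  # saved state, restored before each attempt
    for routine in routines:
        try:
            result = routine(list(checkpoint))
        except Exception:
            continue  # treat a crash like a failed acceptance test
        if acceptance_test(checkpoint, result):
            return result
    raise RuntimeError("all routines failed the acceptance test")
```

For `[3, 1, 3, 2]` the primary returns `[1, 2, 3]`, which the acceptance test rejects, and the alternate then produces `[1, 2, 3, 3]`. Static redundancy, by contrast, runs all versions every time and masks the fault by voting.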
Look to this innovative resource for the most comprehensive coverage of software fault tolerance techniques available in a single volume. Design diversity has been used for many years now as a means of achieving a degree of fault tolerance in software-based systems. Whilst there is clear evidence that the approach can be expected to deliver some increase in reliability compared with a single version, there is not agreement about the extent of this. Design diversity is a solution to software fault tolerance only so far as it is possible ... Software fault tolerance refers to the use of techniques to increase the likelihood that the final design embodiment will produce correct and/or safe outputs. Software fault tolerance techniques are employed during the procurement, or development, of the software.