|
ABSTRACT
Title |
: |
A Review of Checkpointing Based Fault Tolerance Techniques in Mobile Distributed Systems |
Authors |
: |
Rachit Garg, Praveen Kumar |
Keywords |
: |
Fault tolerance, coordinated checkpointing, consistent global state, mobile distributed system
|
Issue Date |
: |
July 2010 |
Abstract |
: |
Fault tolerance techniques enable systems to perform tasks in the presence of faults. A checkpoint is a local state of a process saved on stable storage. In a distributed system, the processes do not share memory, so a global state of the system is defined as a set of local states, one from each process. When a fault occurs in a distributed system, checkpointing allows the execution of a program to resume from a previous consistent global state rather than from the beginning. In this way, the amount of useful processing lost because of the fault is significantly reduced. Coordinated checkpointing is an effective fault tolerance technique in distributed systems because it avoids the domino effect and requires minimal storage. Most earlier coordinated checkpointing algorithms, however, have at least one of the following drawbacks: they are minimum-process but block computation during checkpointing; they are non-blocking but force processes to take checkpoints, many of which may not be necessary; they are non-blocking and minimum-process but take useless checkpoints; or they reduce useless checkpoints at the cost of higher synchronization message overhead or a long checkpoint request propagation time. In this paper, we discuss various issues related to checkpointing in distributed systems and mobile computing environments. We also present a survey of some checkpointing algorithms for distributed systems.
|
Page(s) |
: |
1052-1063 |
ISSN |
: |
0975–3397 |
Source |
: |
Vol. 2, Issue 4 |
|