
Fault Tolerance Research @ Open Systems Laboratory

Transparent Checkpoint/Restart in Open MPI


Overview

Transparent checkpoint/restart process fault tolerance allows an application to be preserved to a stable storage device and recovered at a later time. This technique does not require any changes to the application source code, making it a convenient solution for complex, legacy applications and for scheduler-based dynamic resource management.

We provide a transparent checkpoint/restart process fault tolerance solution for MPI-1.3 compliant applications using Open MPI. Our solution was incorporated into the development trunk of Open MPI in March 2007, and later released as part of the v1.3 release series.

Open MPI provides a transparent, coordinated checkpoint/restart implementation that relies primarily on the Berkeley Lab Checkpoint/Restart (BLCR) library.

Publications

Demonstration

Currently Supported

  • Checkpoint/Restart Services:
  • Interconnects (BTLs):
    • SELF - Loopback interface
    • TCP - Ethernet
    • MX - Myrinet
    • openib - Open Fabrics devices (e.g., iWarp and InfiniBand devices)
    • sm - Shared Memory
  • Collective Components:
    • Any collective component layered over point-to-point operations
    • basic
    • tuned
    • libnbc
    • self
  • SIGSTOP/SIGCONT
  • Recovery Techniques
    • crmig - Addition to the hnp ErrMgr component: C/R-enabled Process Migration
    • autor - Addition to the hnp ErrMgr component: C/R-enabled Automatic Recovery

Notes

No special code is required in an MPI application to take advantage of Open MPI's checkpoint/restart functionality, although some limitations may be imposed (depending on the back-end checkpointing system that is used).


Open MPI's checkpoint/restart functionality involves only the MPI processes: the Open MPI runtime environment is not checkpointed.


Open MPI does not yet support checkpointing/restarting MPI-2 applications. In particular, Open MPI's behavior is undefined when checkpointing MPI processes that invoke any MPI-2 functionality (including dynamic functions and IO).


Checkpoints can only be performed after all processes have invoked MPI_INIT and before any process has invoked MPI_FINALIZE.


Threaded checkpoint coordination support was added in February 2008. This allows an application to make progress on a checkpoint operation whether or not the process is inside the MPI library. To use this feature, you must enable both MPI threads and the checkpoint thread when configuring Open MPI:

./configure --enable-ft-thread --with-ft=cr --enable-mpi-threads
After r22841, the --enable-mpi-threads option was replaced by --enable-opal-multi-threads, so you should use the following instead:
./configure --enable-ft-thread --with-ft=cr --enable-opal-multi-threads
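
The choice between the two flag spellings can be scripted when building from different revisions of the source tree. The following is a minimal sketch, not part of Open MPI: the function name ft_configure_flags is hypothetical, and it assumes that "after r22841" means revisions strictly greater than 22841.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Open MPI): print the configure flags
# needed for threaded checkpoint coordination, given the Subversion revision
# of the source tree as the first argument.
# Assumption: revisions strictly greater than 22841 use the renamed flag.
ft_configure_flags() {
    rev=$1
    if [ "$rev" -gt 22841 ]; then
        thread_flag="--enable-opal-multi-threads"
    else
        thread_flag="--enable-mpi-threads"
    fi
    echo "--enable-ft-thread --with-ft=cr $thread_flag"
}
```

It could then be used as ./configure $(ft_configure_flags 23000), with the revision number replaced by that of your own checkout.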

Do not use the BLCR command-line tools! You must use the tools provided by Open MPI. Open MPI's behavior is currently undefined if you use the cr_checkpoint and cr_restart tools.
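
As an illustration of going through the Open MPI tools rather than the BLCR ones, the sketch below wraps the two operations in small shell functions. It assumes that your C/R-enabled build installs the tools under the names ompi-checkpoint and ompi-restart (check your installation). The function names and the DRY_RUN variable are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch only: thin wrappers around the Open MPI checkpoint/restart tools.
# Assumptions: ompi-checkpoint and ompi-restart are the tool names installed
# by your build; run_cmd, take_checkpoint, restart_job and DRY_RUN are
# hypothetical names used for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Request a checkpoint of a running job, identified by the PID of mpirun.
take_checkpoint() {
    run_cmd ompi-checkpoint "$1"
}

# Restart a job from the snapshot reference reported by the checkpoint tool.
restart_job() {
    run_cmd ompi-restart "$1"
}
```

Setting DRY_RUN=1 prints the command that would be issued, which is a convenient way to check a job script before running it against a live job.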


Currently, the only fully supported threading model is MPI_THREAD_SINGLE. Other MPI threading models may work, but have not received any testing.


The SELF checkpoint interface has changed slightly. Be sure to read the attached documentation for the new function call specifications. If you require backwards compatibility, let us know on the users list, and we can discuss options there.