Fault-tolerant communication systems rely on recovery strategies that are often error-prone (e.g. a programmer manually specifies recovery strategies) or inefficient (e.g. the whole system is restarted from the beginning). This paper proposes a static analysis based on multiparty session types that can efficiently compute a safe global state from which a system of interacting processes should be recovered. We statically analyse the communication flow of a program, given as a multiparty protocol, to extract the causal dependencies between processes and to localise failures. We formalise our recovery algorithm and prove its safety. A recovered communication system is free from deadlocks, orphan messages and reception errors. Our recovery algorithm incurs lower communication cost (only affected processes are notified) and shorter overall execution time (only required states are repeated). On top of our analysis, we design and implement a runtime framework in Erlang where failed processes and their dependencies are soundly restarted from a computed safe state. We evaluate our recovery framework on message-passing benchmarks and a use case for crawling webpages. The experimental results indicate that our framework outperforms a built-in static recovery strategy in Erlang when part of the protocol can be safely recovered.
Sun 5 Feb
15:30 - 16:30
|Granullar: Gradual Nullable Types for Java|
Dan Brotherston (University of Waterloo, Canada), Werner Dietl (University of Waterloo, Canada), Ondřej Lhoták (University of Waterloo, Canada)
|Let It Recover: Multiparty Protocol-Induced Recovery|