"A Checkpoint-on-Failure Protocol for Algorithm-Based Recovery in Standard MPI", 18th International European Conference on Parallel and Distributed Computing (Euro-Par 2012), Christos Kaklamanis, Theodore Papatheodorou and Paul Spirakis eds., Springer-Verlag, Rhodes, Greece, August 27-31, 2012.
"An Evaluation of User-Level Failure Mitigation Support in MPI", Proceedings of Recent Advances in Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI, Springer, Vienna, Austria, September 23-26, 2012.
"On Scalability for MPI Runtime Systems", no. ICL-UT-11-05: Innovative Computing Laboratory, University of Tennessee, May 2011.
"Constructing Resilient Communication Infrastructure for Runtime Environments", Parallel Computing: From Multicores and GPU's to Petascale: IOS Press, pp. 441-451, 2010.