# -*- text -*-
#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
#                         University Research and Technology
#                         Corporation.  All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
#                         of Tennessee Research Foundation.  All rights
#                         reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart, 
#                         University of Stuttgart.  All rights reserved.
# Copyright (c) 2004-2006 The Regents of the University of California.
#                         All rights reserved.
# $COPYRIGHT$
# 
# Additional copyrights may follow
# 
# $HEADER$
#
# This is the US/English general help file for Open MPI.
#
[btl_mvapi:retry-exceeded]
The retry count is a down counter initialized on creation of the QP. Retry
count is defined in the InfiniBand Spec 1.2 (12.7.38): 
The total number of times that the sender wishes the receiver to retry tim- 
eout, packet sequence, etc. errors before posting a completion error.

Note that two mca parameters are involved here: 
btl_mvapi_ib_retry_count - The number of times the sender will attempt to
retry  (defaulted to 7, the maximum value). 

btl_mvapi_ib_timeout - The local ack timeout parameter (defaulted to 10). The
actual timeout value used is calculated as: 
(4.096 micro-seconds * 2^btl_mvapi_ib_timeout). 
See InfiniBand Spec 1.2 (12.7.34) for more details.

What to do next: 
One item to note is the hosts on which this error has occured, it has been
observed that rebooting or removing a particular host from the job can resolve
this issue. Should you be able to identify a specific cause or additional
trouble shooting information please report  this to devel@open-mpi.org.

