SGE_SHADOWD(8)              System Manager's Manual              SGE_SHADOWD(8)

NAME
       sge_shadowd - Univa Grid Engine shadow master daemon

SYNOPSIS
       sge_shadowd

DESCRIPTION
       sge_shadowd is a lightweight process which can be run on so-called
       shadow master hosts in a Univa Grid Engine cluster to detect failure
       of the current Univa Grid Engine master daemon, sge_qmaster(8), and
       to start up a new sge_qmaster(8) on the host on which sge_shadowd
       runs. If multiple shadow daemons are active in a cluster, they run a
       protocol which ensures that only one of them will start up a new
       master daemon.

       Hosts used as shadow master hosts must have shared root read/write
       access to the directory $SGE_ROOT/$SGE_CELL/common as well as to the
       master daemon spool directory (by default
       $SGE_ROOT/$SGE_CELL/spool/qmaster). The names of the shadow master
       hosts must be listed in the file
       $SGE_ROOT/$SGE_CELL/common/shadow_masters.

RESTRICTIONS
       sge_shadowd may only be started by root.

ENVIRONMENT VARIABLES
       SGE_ROOT
              Specifies the location of the Univa Grid Engine standard
              configuration files.

       SGE_CELL
              If set, specifies the default Univa Grid Engine cell. To
              address a Univa Grid Engine cell sge_shadowd uses (in the
              order of precedence):

                     The name of the cell specified in the environment
                     variable SGE_CELL, if it is set.

                     The name of the default cell, i.e. default.

       SGE_DEBUG_LEVEL
              If set, specifies that debug information should be written to
              stderr. The value also defines the level of detail of the
              debug information that is generated.

       SGE_QMASTER_PORT
              If set, specifies the TCP port on which sge_qmaster(8) is
              expected to listen for communication requests. Most
              installations will instead use a services map entry for the
              service "sge_qmaster" to define that port.

       SGE_DELAY_TIME
              This variable controls how long sge_shadowd pauses after a
              takeover bid fails. This value is used only when there are
              multiple sge_shadowd instances and they are contending to be
              the master.
              The default is 600 seconds.

       SGE_CHECK_INTERVAL
              This variable controls the interval at which sge_shadowd
              checks the heartbeat file (60 seconds by default).

       SGE_GET_ACTIVE_INTERVAL
              This variable controls the interval after which an
              sge_shadowd instance tries to take over when the heartbeat
              file has not changed. The default is 240 seconds.

FILES
       //common
              Default configuration directory
       //common/shadow_masters
              Shadow master hostname file.
       //spool/qmaster
              Default master daemon spool directory
       //spool/qmaster/heartbeat
              The heartbeat file.

SEE ALSO
       sge_intro(1), sge_conf(5), sge_qmaster(8), Univa Grid Engine
       Installation and Administration Guide.

COPYRIGHT
       See sge_intro(1) for a full statement of rights and permissions.

Univa Grid Engine Administrative Commands    UGE 8.5.4    SGE_SHADOWD(8)
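
The interplay of the timing variables described under ENVIRONMENT VARIABLES can be illustrated with a small Python sketch. It is not the sge_shadowd implementation. The names ShadowTimings, heartbeat_path, seconds_until_takeover and retry_time_after_failed_bid are hypothetical, the heartbeat readings are treated as opaque values because the file format is not described here, and the decision rule (attempt a takeover once the heartbeat has been unchanged for SGE_GET_ACTIVE_INTERVAL seconds) is one reading of the text above.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class ShadowTimings:
    """Timing values in seconds, named after the variables documented above."""
    check_interval: int = 60         # SGE_CHECK_INTERVAL default
    get_active_interval: int = 240   # SGE_GET_ACTIVE_INTERVAL default
    delay_time: int = 600            # SGE_DELAY_TIME default


def timings_from_env(env):
    """Read the three intervals from a mapping such as os.environ."""
    return ShadowTimings(
        check_interval=int(env.get("SGE_CHECK_INTERVAL", 60)),
        get_active_interval=int(env.get("SGE_GET_ACTIVE_INTERVAL", 240)),
        delay_time=int(env.get("SGE_DELAY_TIME", 600)),
    )


def heartbeat_path(env):
    """Build the heartbeat path for the default spool directory.

    Cell precedence follows the SGE_CELL entry: the variable if set,
    otherwise the cell named "default". A site with a non-default
    qmaster spool directory would need a different path.
    """
    cell = env.get("SGE_CELL") or "default"
    return posixpath.join(env["SGE_ROOT"], cell, "spool", "qmaster", "heartbeat")


def seconds_until_takeover(readings, timings):
    """Return the elapsed seconds at which a takeover would be attempted.

    readings holds the heartbeat values seen at successive checks, spaced
    check_interval seconds apart. Returns None if the heartbeat never
    stays unchanged for get_active_interval seconds (assumed rule).
    """
    last = readings[0]
    unchanged_for = 0
    elapsed = 0
    for value in readings[1:]:
        elapsed += timings.check_interval
        if value != last:
            last = value
            unchanged_for = 0          # master is alive, reset the timer
        else:
            unchanged_for += timings.check_interval
        if unchanged_for >= timings.get_active_interval:
            return elapsed
    return None


def retry_time_after_failed_bid(now, timings):
    """A shadow that loses a takeover bid pauses for delay_time seconds."""
    return now + timings.delay_time
```

With the default values, a heartbeat that stops changing is noticed at the next 60-second check, and a takeover would be attempted four checks after the last observed change. A shadow that loses the bid to another instance would then wait 600 seconds before trying again.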