REPORTING(5)                  File Formats Manual                 REPORTING(5)

NAME
       reporting - Univa Grid Engine reporting file format

DESCRIPTION
       A Univa Grid Engine system writes a reporting file to $SGE_ROOT/default/common/reporting. The reporting file contains data that can be used for accounting, monitoring and analysis purposes. It contains information about the cluster (hosts, queues, load values, consumables, etc.), about the jobs running in the cluster and about sharetree configuration and usage. All information is time related; events are dumped to the reporting file at a configurable interval. This allows monitoring of the "real time" status of the cluster as well as historical analysis.

FORMAT
       The reporting file is an ASCII file. Each line contains one record, and the fields of a record are separated by a delimiter (:). The reporting file contains records of different types. Each record type has a specific record structure.

       The first two fields are common to all reporting records:

       time   Time (64bit GMT unix time stamp in milliseconds) when the record was created.

       record type
              Type of the reporting record.

       The different types of records and their structure are described in the following text.

   new_job
       The new_job record is written whenever a new job enters the system (usually by a submitting command). It has the following fields:

       submission_time
              Time (64bit GMT unix time stamp in milliseconds) when the job was submitted.

       job_number
              The job number.

       task_number
              The array task id. Always has the value -1 for new_job records (as we don't have array tasks yet).

       pe_taskid
              The task id of parallel tasks. Always has the value "none" for new_job records.

       job_name
              The job name (from -N submission option).

       owner  The job owner.

       group  The unix group of the job owner.

       project
              The project the job is running in.

       department
              The department the job owner is in.

       account
              The account string specified for the job (from -A submission option).

       priority
              The job priority (from -p submission option).
       job_class
              If the job has been submitted into a job class, the name of the job class, otherwise "".

       submit_host
              The submit host name.

       submit_cmd
              The command line used for job submission. As the delimiter used by the reporting file (colon ":") can be part of the command line, all colons in the command line are replaced by ASCII code 255. When reading the reporting file, characters with ASCII code 255 have to be converted back to colons. Line feeds that are part of the command line are replaced by a space character. For jobs submitted via the DRMAA interface or via the qmon graphical user interface the reporting file contains "NONE" as submit_cmd.

   job_log
       The job_log record is written whenever a job, an array task or a pe task changes status. A status change can be the transition from pending to running, but can also be triggered by user actions like suspension of a job. It has the following fields:

       event_time
              Time (64bit GMT unix time stamp in milliseconds) when the event was generated.

       event  A one word description of the event.

       job_number
              The job number.

       task_number
              The array task id. Always has the value -1 for new_job records (as we don't have array tasks yet).

       pe_taskid
              The task id of parallel tasks. Always has the value "none" for new_job records.

       state  The state of the job after the event was processed.

       user   The user who initiated the event (or the special usernames "qmaster", "scheduler" and "execd" for actions of the system itself like scheduling jobs, executing jobs etc.).

       host   The host from which the action was initiated (e.g. the submit host, the qmaster host, etc.).

       state_time
              Reserved field for later use.

       priority
              The job priority (from -p submission option).

       submission_time
              Time (64bit GMT unix time stamp in milliseconds) when the job was submitted.

       job_name
              The job name (from -N submission option).

       owner  The job owner.

       group  The unix group of the job owner.

       project
              The project the job is running in.

       department
              The department the job owner is in.
       account
              The account string specified for the job (from -A submission option).

       job_class
              If the job has been submitted into a job class, the name of the job class, otherwise "".

       message
              A message describing the reported action.

   online_usage
       Online usage records are written per array task or pe task of running jobs if online usage reporting is configured in the global cluster configuration (see sge_conf(5)) or per job via the -rou option (see submit(1)). An online usage record contains the following fields:

       report_time
              Time (64bit GMT unix time stamp in milliseconds) when the usage values were generated by sge_execd.

       job_number
              The job number.

       task_number
              The array task id.

       pe_taskid
              The task id of parallel tasks.

       usage  Comma separated list of name=value tuples.

   acct
       Records of type acct are accounting records. Normally, they are written whenever a job, a task of an array job, or the task of a parallel job terminates. However, for long running jobs an intermediate acct record is created once a day after midnight. This results in multiple accounting records for a particular job and allows for fine-grained resource usage monitoring over time. Accounting records comprise the following fields:

       qname  Name of the cluster queue in which the job has run.

       hostname
              Name of the execution host.

       group  The effective group id of the job owner when executing the job.

       owner  Owner of the Univa Grid Engine job.

       job_name
              Job name.

       job_number
              Job identifier - job number.

       account
              An account string as specified by the qsub(1) or qalter(1) -A option.

       priority
              Priority value assigned to the job corresponding to the priority parameter in the queue configuration (see queue_conf(5)).

       submission_time
              Submission time (64bit GMT unix time stamp in milliseconds).

       start_time
              Start time (64bit GMT unix time stamp in milliseconds).

       end_time
              End time (64bit GMT unix time stamp in milliseconds).
       failed Indicates the problem which occurred in case a job could not be started on the execution host (e.g. because the owner of the job did not have a valid account on that machine). If Univa Grid Engine tries to start a job multiple times, this may lead to multiple entries in the accounting file corresponding to the same job ID.

       exit_status
              Exit status of the job script (or Univa Grid Engine specific status in case of certain error conditions).

       ru_wallclock
              Difference between end_time and start_time (see above).

       The remainder of the accounting entries follows the contents of the standard UNIX rusage structure as described in getrusage(2). Depending on the operating system where the job was executed, some of the fields may be 0. The following entries are provided:

       ru_utime
       ru_stime
       ru_maxrss
       ru_ixrss
       ru_ismrss
       ru_idrss
       ru_isrss
       ru_minflt
       ru_majflt
       ru_nswap
       ru_inblock
       ru_oublock
       ru_msgsnd
       ru_msgrcv
       ru_nsignals
       ru_nvcsw
       ru_nivcsw

       On Windows, only the values ru_wallclock, ru_utime and ru_stime are accounted. These values are the final usage values of the Windows Job object that is used to reflect the Univa Grid Engine Job, not the sum of the usage of all processes.

       project
              The project which was assigned to the job.

       department
              The department which was assigned to the job.

       granted_pe
              The parallel environment which was selected for that job.

       slots  The number of slots which were dispatched to the job by the scheduler.

       task_number
              Array job task index number.

       cpu    The cpu time usage in seconds.

       mem    The integral memory usage in Gbytes seconds.

       io     The amount of data transferred in Gbytes. On Linux, data transferred means all bytes read and written by the job through the read(), pread(), write() and pwrite() system calls. On Windows, this is the sum of all bytes transferred by the job by doing write, read and other operations. It is not documented what these other operations are.

       category
              A string specifying the job category.

       iow    The io wait time in seconds.
       ioops  The number of io operations.

       pe_taskid
              If this identifier is set, the task was part of a parallel job and was passed to Univa Grid Engine via the qrsh -inherit interface.

       maxvmem
              The maximum vmem size in bytes.

       arid   Advance reservation identifier. If the job used resources of an advance reservation, this field contains a positive integer identifier, otherwise the value is "0".

       ar_submission_time
              If the job used resources of an advance reservation, this field contains the submission time (64bit GMT unix time stamp in milliseconds) of the advance reservation, otherwise the value is "0".

       job_class
              If the job has been running in a job class, the name of the job class, otherwise "NONE".

       qdel_info
              If the job (the array task) has been deleted via qdel, "<username>@<hostname>", else "NONE". If qdel was called multiple times, every invocation is recorded in a comma separated list.

       maxrss The maximum resident set size in bytes.

       maxpss The maximum proportional set size in bytes.

       submit_host
              The submit host name.

       cwd    The working directory the job ran in as specified with the qsub / qalter switches -cwd and -wd. As the delimiter used by the accounting file (colon ":") can be part of the working directory, all colons in the working directory are replaced by ASCII code 255.

       submit_cmd
              The command line used for job submission. As the delimiter used by the reporting file (colon ":") can be part of the command line, all colons in the command line are replaced by ASCII code 255. When reading the reporting file, characters with ASCII code 255 have to be converted back to colons. Line feeds that are part of the command line are replaced by a space character. For jobs submitted via the DRMAA interface or via the qmon graphical user interface the reporting file contains "NONE" as submit_cmd.

       wallclock
              The wallclock time the job spent in running state. Times during which the job was suspended are not counted as wallclock time.
              For loosely integrated jobs and for tightly integrated jobs with accounting summary enabled, the wallclock time reported for the master task is the wallclock time multiplied by the number of job slots.

   queue
       Records of type queue contain state information for queues (queue instances). A queue record has the following fields:

       qname  The cluster queue name.

       hostname
              The hostname of a specific queue instance.

       report_time
              The time (64bit GMT unix time stamp in milliseconds) when a state change was triggered.

       state  The new queue state.

   queue_consumable
       A queue_consumable record contains information about queue consumable values in addition to queue state information:

       qname  The cluster queue name.

       hostname
              The hostname of a specific queue instance.

       report_time
              The time (64bit GMT unix time stamp in milliseconds) when a state change was triggered.

       state  The new queue state.

       consumables
              Description of consumable values. Information about multiple consumables is separated by space. A consumable description has the format <name>=<actual_value>=<configured_value>.

   host
       A host record contains information about hosts and host load values. It contains the following information:

       hostname
              The name of the host.

       report_time
              The time (64bit GMT unix time stamp in milliseconds) when the reported information was generated.

       state  The new host state. Currently, Univa Grid Engine doesn't track a host state; the field is reserved for future use. Always contains the value X.

       load values
              Description of load values. Information about multiple load values is separated by space. A load value description has the format <name>=<value>.

   host_consumable
       A host_consumable record contains information about hosts and host consumables. Host consumables can for example be licenses. It contains the following information:

       hostname
              The name of the host.

       report_time
              The time (64bit GMT unix time stamp in milliseconds) when the reported information was generated.

       state  The new host state.
              Currently, Univa Grid Engine doesn't track a host state; the field is reserved for future use. Always contains the value X.

       consumables
              Description of consumable values. Information about multiple consumables is separated by space. A consumable description has the format <name>=<actual_value>=<configured_value>.

   sharelog
       The Univa Grid Engine qmaster can dump information about sharetree configuration and use to the reporting file. The parameter sharelog sets the interval at which sharetree information will be dumped. It is set in the format HH:MM:SS. A value of 00:00:00 configures qmaster not to dump sharetree information. Intervals of several minutes up to hours are sensible values for this parameter. The record contains the following fields:

       current time
              The present time

       usage time
              The time used so far

       node name
              The node name

       user name
              The user name

       project name
              The project name

       shares The total shares

       job count
              The job count

       level  The percentage of shares used

       total  The adjusted percentage of shares used

       long target share
              The long target percentage of resource shares used

       short target share
              The short target percentage of resource shares used

       actual share
              The actual percentage of resource shares used

       usage  The combined shares used

       cpu    The cpu used

       mem    The memory used

       io     The IO used

       long target cpu
              The long target cpu used

       long target mem
              The long target memory used

       long target io
              The long target IO used

   new_ar
       A new_ar record contains information about advance reservation objects. Entries of this type will be added if an advance reservation is created. It contains the following information:

       submission_time
              The time (64bit GMT unix time stamp in milliseconds) when the advance reservation was created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_owner
              The owner of the advance reservation.

   ar_attribute
       The ar_attribute record is written whenever a new advance reservation was added or an attribute of an existing advance reservation has changed. It has the following fields:
       event_time
              The time (64bit GMT unix time stamp in milliseconds) when the event was generated.

       submission_time
              The time (64bit GMT unix time stamp in milliseconds) when the advance reservation was created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_name
              Name of the advance reservation.

       ar_account
              An account string which was specified during the creation of the advance reservation.

       ar_start_time
              Start time.

       ar_end_time
              End time.

       ar_granted_pe
              The parallel environment which was selected for an advance reservation.

       ar_granted_resources
              The granted resources which were selected for an advance reservation.

       ar_sr_cal_week
              In case of a standing reservation, the week calendar describing the reservation points, max. 2048 characters. See also the -cal_week option in the qrsub(1) man page.

       ar_sr_depth
              In case of a standing reservation, the SR depth (the number of reservations being done at a time). See also the -cal_depth option in the qrsub(1) man page.

       ar_sr_jmp
              In case of a standing reservation, the number of un-allocatable reservations to accept. See also the -cal_jmp option in the qrsub(1) man page.

   ar_log
       The ar_log record is written whenever an advance reservation changes status. A status change can be from pending to active, but can also be triggered by system events like host outage. It has the following fields:

       ar_state_change_time
              The time (64bit GMT unix time stamp in milliseconds) when the event occurred which caused a state change.

       submission_time
              The time (64bit GMT unix time stamp in milliseconds) when the advance reservation was created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_state
              The new state.

       ar_event
              An event id identifying the event which caused the state change.

       ar_message
              A message describing the event which caused the state change.

       ar_sr_id
              In case of a standing reservation, the SR id (a number >= 0); in case of an advance reservation, -1.
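The record layout described so far (two common fields, then the type-specific fields, all separated by colons) can be read with a few lines of code. The following is a minimal sketch, not a reference implementation. It assumes that the ar_log fields follow the two common fields in the order listed above and that ar_message contains no unescaped colon. The helper names and the sample line are hypothetical, and the state, event and message values in the sample are made up.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Field names of an ar_log record, in the order listed in this manual.
AR_LOG_FIELDS = (
    "ar_state_change_time", "submission_time", "ar_number",
    "ar_state", "ar_event", "ar_message", "ar_sr_id",
)
TIME_FIELDS = ("ar_state_change_time", "submission_time")


def ms_to_utc(value):
    """Convert a millisecond unix time stamp (as text) to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=int(value))


def parse_record(line):
    """Split one reporting line into (time, record_type, remaining_fields)."""
    parts = line.rstrip("\n").split(":")
    # Character 255 stands in for a colon inside submit_cmd and cwd.
    # Restoring it in every field is an assumption made for simplicity.
    fields = [part.replace("\xff", ":") for part in parts[2:]]
    return ms_to_utc(parts[0]), parts[1], fields


def parse_ar_log(line):
    """Parse an ar_log line into a dict keyed by the documented field names."""
    written, record_type, fields = parse_record(line)
    if record_type != "ar_log":
        raise ValueError("not an ar_log record: " + record_type)
    if len(fields) != len(AR_LOG_FIELDS):
        raise ValueError("unexpected number of ar_log fields")
    record = dict(zip(AR_LOG_FIELDS, fields))
    for name in TIME_FIELDS:
        record[name] = ms_to_utc(record[name])
    record["ar_number"] = int(record["ar_number"])
    record["ar_sr_id"] = int(record["ar_sr_id"])
    record["time"] = written  # common first field of every record
    return record


# Hypothetical sample line; STATE, EVENT and the message are placeholders.
SAMPLE = ("1500000000000:ar_log:1500000000000:1499999000000:"
          "42:STATE:EVENT:example message:-1")

if __name__ == "__main__":
    print(parse_ar_log(SAMPLE))
```

When reading a real file, opening it with encoding="latin-1" maps byte 255 to the character "\xff" that parse_record() looks for. Splitting on the delimiter first and restoring colons afterwards keeps escaped colons from being mistaken for field boundaries.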
   ar_acct
       The ar_acct records are accounting records which are written for every queue instance whenever an advance reservation terminates. Advance reservation accounting records comprise the following fields:

       ar_termination_time
              The time (64bit GMT unix time stamp in milliseconds) when the advance reservation terminated.

       submission_time
              The time (64bit GMT unix time stamp in milliseconds) when the advance reservation was created.

       ar_number
              The advance reservation number identifying the reservation.

       ar_qname
              Cluster queue name which the advance reservation reserved.

       ar_hostname
              The name of the execution host.

       ar_slots
              The number of slots which were reserved.

       ar_sr_id
              In case of a standing reservation, the SR id (a number >= 0); in case of an advance reservation, -1.

SEE ALSO
       sge_conf(5), host_conf(5).

COPYRIGHT
       See sge_intro(1) for a full statement of rights and permissions.

Univa Grid Engine File Formats     UGE 8.5.4                      REPORTING(5)