Difference: SystemAdministratorGuideForEMI1 (6 vs. 7)

Revision 72011-04-17 - MassimoSgaravatto

Line: 1 to 1
 
META TOPICPARENT name="SystemAdministratorDocumentation"

System Administrator Guide for CREAM for EMI-1 release

Line: 72 to 72
 

0.0.1 Choose the BLAH BLparser deployment model

Changed:
<
<
The BLAH Blparser is ....
>
>
The BLAH Blparser is the component of the CREAM CE responsible to notify CREAM about job status changes.
 
Added:
>
>
For LSF and PBS/Torque it is possible to configure the BLAH blparser in two possible ways:

  • The new BLAH BLparser, which relies on the status/history batch system commands
  • The old BLAH BLparser, which parses the batch system log files

For SGE and Condor, only the configuration with the new BLAH blparser is possible

0.0.0.1 New BLAH Blparser

The new Blparser runs on the CREAM CE machine and it is automatically installed when installing the CREAM CE. The configuration of the new BLAH Blparser is done when configuring the CREAM CE (i.e. it is not necessary to configure the Blparser separately from the CREAM CE).

To use the new BLAH blparser, it is just necessary to set:

BLPARSER_WITH_UPDATER_NOTIFIER=true

in the siteinfo.def. This is the default value.

The new BLParser doesn't parse the log files. However the bhist (for LSF) and tracejob (for Torque) commands (used by the new BLParser) require the batch system log files, which therefore must be available (in case e.g. via NFS in the CREAM CE node. Actually for Torque the blparser uses tracejob (which requires the log files) only when qstat can't find anymore the job. And this can happen if the job has been completed more than keep_completed seconds ago and the blparser was not able to detect before that the job completed/was cancelled/whatever. This can happen e.g. if keep_completed is too short or if the BLAH blparser for whatever reason didn't run for a while. If the log files are not available and the tracejob command is issued (for the reasons specified above), the BLAH blparser will not be able to find the job, which will considered "lost" (DONE-FAILED wrt CREAM).

The init script of the new Blparser is /etc/init.d/glite-ce-blahparser. Please note that it is not needed to explicitly start the new blparser: when CREAM is started, it starts also this new BLAH Blparser if it is not already running.

When the new Blparser is running, you should see the following two processes on the CREAM CE node:

  • /usr/bin/BUpdaterxxx
  • /usr/bin/BNotifier

Please note that the user tomcat on the CREAM CE should be allowed to issue the relevant status/history commands (for Torque: qstat, tracejob, for LSF: bhist, bjobs). Some sites configure the batch system so that users can only see their own jobs (e.g. in torque:

set server query_other_jobs = False
). If this is done at the site, then the tomcat user will need a special privilege in order to be exempt from this setting (in torque:
set server operators += tomcat@creamce.yoursite.domain
).
 

0.1 Installation

 
This site is powered by the TWiki collaboration platformCopyright © 2008-2021 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback