Difference: SystemAdministratorGuideForEMI1 (53 vs. 54)

Revision 54 2011-11-04 - GoncaloBorges

Line: 1 to 1
 
META TOPICPARENT name="SystemAdministratorDocumentation"

System Administrator Guide for CREAM for EMI-1 release

Line: 26 to 26
  If you plan to use LSF as batch system for your CREAM CE, you have to install and configure it before installing and configuring the CREAM software. Since LSF is a commercial software it can't be distributed together with the middleware.
Changed:
<
<
If you plan to use SGE as batch system for your CREAM CE, it will be installed and configured along with the middleware (i.e. you don't have to install and configure it in advance)
>
>
If you plan to use GE as batch system for your CREAM CE, you have to install and configure it before installing and configuring the CREAM software. The CREAM CE integration was tested with GE 6.2u5 but it should work with any forked version of the original GE software. The support of the GE batch system software (or any of its forked versions) is out of the scope of this activity.
 

0.1 Plan how to deploy the CREAM CE

0.1.1 CREAM CE and gLite-cluster

Line: 124 to 124
 
  • The new BLAH BLparser, which relies on the status/history batch system commands
  • The old BLAH BLparser, which parses the batch system log files
Changed:
<
<
For SGE and Condor, only the configuration with the new BLAH blparser is possible
>
>
For GE and Condor, only the configuration with the new BLAH blparser is possible
 

0.0.0.1 New BLAH Blparser

Line: 182 to 182
  Click here for information how to deploy the databases on a machine different wrt the CREAM-CE.
Changed:
<
<

0.1 Installation

>
>

0.1 CREAM CE Installation

  This section explains how to install:
Line: 292 to 292
 yum install emi-lsf-utils
Changed:
<
<
  • If you are running SGE, install the emi-ge-utils metapackage:
>
>
  • If you are running GE, install the emi-ge-utils metapackage:
 
yum install emi-ge-utils
Line: 355 to 355
 yum install emi-lsf-utils
Changed:
<
<
  • If you are running SGE, install the emi-ge-utils metapackage:
>
>
  • If you are running GE, install the emi-ge-utils metapackage:
 
yum install emi-ge-utils
Line: 410 to 410
  The CREAM CLI is part of the EMI-UI. To install it please refer to https://twiki.cern.ch/twiki/bin/view/EMI/EMIui#Client_Installation_Configuratio .
Changed:
<
<

0.1 Configuration

>
>

0.1 CREAM CE configuration

 

0.0.1 Using the YAIM configuration tool

Line: 471 to 471
  /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n LSF_utils
Changed:
<
<
  • Configuration of a CREAM CE in no cluster mode using SGE as batch system
>
>
  • Configuration of a CREAM CE in no cluster mode using GE as batch system
 
     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils 

0.0.1 Configuration of a CREAM CE node in cluster mode

Line: 516 to 516
 
     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n LSF_utils
Changed:
<
<
  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on a different node) using SGE as batch system
>
>
  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on a different node) using GE as batch system
 
    /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils 
  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on a different node) using Torque as batch system, with the CREAM CE being also Torque server
Line: 537 to 537
  /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n LSF_utils -n glite-CLUSTER
Changed:
<
<
  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on the same node of the CREAM-CE) using SGE as batch system
>
>
  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on the same node of the CREAM-CE) using GE as batch system
 
     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils -n glite-CLUSTER
  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on the same node of the CREAM-CE) using Torque as batch system, with the CREAM CE being also Torque server
Line: 671 to 670
 
  • Restart tomcat on ce01.mydomain and ce02.mydomain
You can of course replace 33333, 33334, 56565, 56566 (reported in the above examples) with other port numbers
Deleted:
<
<
 

0.0.0.1 Configuration of the new BLAH Blparser to to use cached batch system commands

Changed:
<
<
With BLAH version >= 1.16-3, the new BLAH blparser can be configured in order to not interact directly with the batch system, but through a program (to be implemented by the site admin) which can implement some caching functionality. This is the case for example of CommandProxyTools, implement at Cern
>
>
With BLAH version >= 1.16-3, the new BLAH blparser can be configured in order to not interact directly with the batch system, but through a program (to be implemented by the site admin) which can implement some caching functionality. This is the case for example of CommandProxyTools, implement at Cern
  To enable this feature, add in /etc/blah.config (the example below is for lsf, with /usr/bin/runcmd.pl as name of the "caching" program):
Line: 684 to 681
 batch_command_caching_filter=/usr/bin/runcmd.pl
Changed:
<
<
So the blparser, insead of issuing bjobs -u ...., will issue /usr/bin/runcmd.pl bjobs -u ..."
>
>
So the blparser, insead of issuing bjobs -u ...., will issue /usr/bin/runcmd.pl bjobs -u ..." </verbatim>
 

0.0.1 Configuration of the CREAM databases on a host different than the CREAM-CE

Line: 793 to 784
 
 mysqlshow --password="$MYSQL_PASSWORD" | grep "creamdatabase" > /dev/null 2>&1
Deleted:
<
<
 
Changed:
<
<
>
>
 mysql -u root --password="$MYSQL_PASSWORD" -e "DROP DATABASE creamdatabase"
Deleted:
<
<
 
Changed:
<
<
>
>
 mysqlshow --password="$MYSQL_PASSWORD" | grep "delegationdatabase" > /dev/null 2>&1
Deleted:
<
<
 
Changed:
<
<
>
>
 mysql -u root --password="$MYSQL_PASSWORD" -e "DROP DATABASE delegationdatabase"
Line: 842 to 832
 
Added:
>
>

Batch system integration

Grid Engine

Requirements

You have to install and configure the GE batch system software before installing and configuring the CREAM software. The CREAM CE integration was tested with GE 6.2u5 but it should work with any forked version of the original GE software. The support of the GE batch system software (or any of its forked versions) is out of the scope of this activity.

Before proceeding, please take note of the following remarks:

  1. CREAM CE must be installed on a node separate from the GE SERVER (GE QMASTER).
  2. CREAM CE must work as a GE submission host (use qconf -as <CE.MY.DOMAIN> on the GE QMASTER to set it up).
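
Remark 2 can be checked before adding the host. The following is a minimal sketch, assuming that qconf -ss prints the current submit hosts one per line; the helper name is_submit_host is hypothetical and the host name is only an example.

```shell
# Hypothetical helper: succeeds if host $1 appears in the submit-host
# list read from standard input (one host name per line).
is_submit_host() {
    grep -qiFx -- "$1"
}

# Possible use on the GE QMASTER (assumes qconf is in PATH):
#   qconf -ss | is_submit_host ce01.mydomain || qconf -as ce01.mydomain
```

The helper only reads standard input, so it can be tried against a saved host list without touching the GE configuration.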

Integration plugins

The GE integration with CREAM CE consists of deploying specific BLAH plugins and configuring them to interoperate properly with the Grid Engine batch system. The following GE BLAH plugins are deployed with the CREAM CE installation: BUpdaterSGE, sge_hold.sh, sge_submit.sh, sge_resume.sh, sge_status.sh and sge_cancel.

Installation

If you are running GE, install the emi-ge-utils metapackage: yum install emi-ge-utils

Configuration

Set up your site-info.def file, which is the input file used by YAIM. Documentation about the YAIM variables relevant for CREAM CE and GE is available at

The most relevant GE YAIM variables to set in your site-info.def are:
  1. BLPARSER_WITH_UPDATER_NOTIFIER="true"
  2. JOB_MANAGER=sge
  3. CE_BATCH_SYS=sge
  4. SGE_ROOT=<Path to your SGE installation>. Default: "/usr/local/sge/pro"
  5. SGE_CELL=<Path to your SGE CELL>. Default: "default"
  6. SGE_QMASTER=<SGE QMASTER PORT>. Default: "536"
  7. SGE_EXECD=<SGE EXECD PORT>. Default: "537"
  8. SGE_SPOOL_METH="classic"
  9. BATCH_SERVER=<FQDN of your QMASTER>
  10. BATCH_LOG_DIR=<Path for the GE accounting file>
  11. BATCH_BIN_DIR=<Path for the GE binaries>
  12. BATCH_VERSION=<GE version>
Some sites use GE installations shared via NFS (or equivalent) in the CREAM CE. In order to prevent changes in that setup when YAIM is executed, define SGE_SHARED_INSTALL=yes in your site-info.def, otherwise YAIM may change your setup according to the definitions in your site-info.def.
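
Before running YAIM it can help to confirm that none of the variables listed above was forgotten. The following is a minimal sketch, assuming that each variable is assigned in site-info.def on a line of the form NAME=value; the helper name missing_ge_vars is hypothetical.

```shell
# Hypothetical helper: print the GE-related YAIM variables that are
# not assigned in the site-info.def file given as $1.
missing_ge_vars() {
    for var in BLPARSER_WITH_UPDATER_NOTIFIER JOB_MANAGER CE_BATCH_SYS \
               SGE_ROOT SGE_CELL SGE_QMASTER SGE_EXECD SGE_SPOOL_METH \
               BATCH_SERVER BATCH_LOG_DIR BATCH_BIN_DIR BATCH_VERSION; do
        # A variable counts as set if some line starts with "NAME=".
        grep -q "^[[:space:]]*${var}=" "$1" || echo "$var"
    done
}

# Possible use:
#   missing_ge_vars site-info.def
```

An empty output means every listed variable is assigned. Variables that have a default are reported too, so the output is a checklist rather than a list of errors.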

The CREAM CE GE integration is then configured by running YAIM:

  • no cluster mode: /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils
  • in cluster mode with glite-CLUSTER deployed on a different node: /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils
  • in cluster mode with glite-CLUSTER deployed on the same node of the CREAM-CE: /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils -n glite-CLUSTER
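
The three cases above differ only in the trailing -n glite-CLUSTER option. The following is a minimal sketch that prints the matching command instead of running it; the helper name ge_yaim_command and the keyword same-node are hypothetical.

```shell
# Hypothetical helper: print the YAIM command for a GE-based CREAM CE.
#   $1 = path to site-info.def
#   $2 = "same-node" if glite-CLUSTER runs on the CREAM CE itself;
#        anything else (or nothing) covers the other two cases.
ge_yaim_command() {
    cmd="/opt/glite/yaim/bin/yaim -c -s $1 -n creamCE -n SGE_utils"
    if [ "${2:-}" = "same-node" ]; then
        cmd="$cmd -n glite-CLUSTER"
    fi
    printf '%s\n' "$cmd"
}

# Possible use:
#   ge_yaim_command site-info.def same-node
```

Printing the command rather than executing it leaves room to review it before YAIM changes the node configuration.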

Important notes

File transfers

Besides the input/output sandbox files (transferred via GFTP), some other files need to be transferred between the CREAM sandbox directory on the CE node and the Worker Node, namely:

  • The CREAM job wrapper and the user proxies, that are staged from the CE node to the WN where the job will run
  • The standard output and error files of the CREAM job wrapper, that are copied from the WN to the CE when the job completes its execution.
Since GE does not implement staging capabilities by default, we distribute the sge_filestaging file with the GE CREAM software. In order to enable the copy of the previous files:
  1. Copy the sge_filestaging file to all your WNs (or to a shared directory mounted on your WNs)
  2. Add <path>/sge_filestaging --stagein and <path>/sge_filestaging --stageout to your prolog and epilog defined in GE global configuration (use qconf -mconf), or alternatively, in each queue configuration (qconf -mq <QUEUE>).
  3. If you do not share the CREAM sandbox area between the CREAM CE node and the Worker Node, the sge_filestaging file requires configuring the ssh trust between CE and WNs.
  4. If you share the CREAM sandbox area between the CREAM CE node and the Worker Node, sge_filestaging has to be changed as follows:

# diff -Nua sge_filestaging.modified sge_filestaging.orig
--- sge_filestaging.modified    2010-03-25 19:38:11.000000000 +0000
+++ sge_filestaging.orig    2010-03-25 19:05:43.000000000 +0000
(hunk at line 21)
 my $remotefile = $3;

 if ( $STAGEIN ) {
-    system( 'cp', $remotefile, $localfile );
+    system( 'scp', "$remotemachine:$remotefile", $localfile );
 } else {
-    system( 'cp', $localfile, $remotefile );
+    system( 'scp', $localfile, "$remotemachine:$remotefile" );
 }
 }

In this diff the lines marked with - belong to the modified file (plain cp on the shared area) and the lines marked with + belong to the distributed original (scp to the remote machine).
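
For step 2 of the list above, the prolog and epilog entries can be generated from the directory where sge_filestaging was copied. The following is a minimal sketch, assuming that no other prolog or epilog is already configured and that the entries use the usual "name value" layout shown by qconf -mconf; the helper name and the example path are hypothetical.

```shell
# Hypothetical helper: print the prolog/epilog entries for the directory
# ($1) that holds sge_filestaging on the worker nodes.
filestaging_conf_lines() {
    printf 'prolog %s/sge_filestaging --stagein\n' "$1"
    printf 'epilog %s/sge_filestaging --stageout\n' "$1"
}

# Possible use (paste the output into the editor opened by qconf -mconf):
#   filestaging_conf_lines /opt/sge/util
```

If a prolog or epilog already exists at the site, the sge_filestaging call has to be merged into it by hand instead.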

GE accounting file

BUpdaterSGE needs to consult the GE accounting file to determine how a given job ended. Therefore, the GE accounting file must be shared between the GE SERVER / QMASTER and the CREAM CE.

Moreover, to guarantee that the accounting file is updated on the fly, the GE configuration should be tuned (using qconf -mconf) by adding the following definitions to reporting_params: accounting=true accounting_flush_time=00:00:00
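
Whether both definitions are in place can be verified from the output of qconf -sconf. The following is a minimal sketch, assuming that the output has one "name value" entry per parameter and that long values may be wrapped with a trailing backslash; the helper name accounting_flushed is hypothetical.

```shell
# Hypothetical helper: read `qconf -sconf` style output on standard input
# and succeed only if reporting_params carries both accounting settings.
accounting_flushed() {
    # Join lines that end with a backslash, then keep reporting_params.
    params=$(awk '{ if (sub(/\\$/, "")) { buf = buf $0; next }
                    print buf $0; buf = "" }
                  END { if (buf != "") print buf }' |
             grep '^reporting_params' || true)
    case "$params" in
        *accounting=true*accounting_flush_time=00:00:00*) return 0 ;;
        *accounting_flush_time=00:00:00*accounting=true*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use on a host with GE admin access:
#   qconf -sconf | accounting_flushed || echo "reporting_params needs tuning"
```

The check is read-only; any change still has to be made with qconf -mconf.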

GE SERVER (QMASTER) tuning

The following suggestions should be implemented to achieve better performance when integrating with CREAM CE:

  1. The CREAM CE machine must be set up as a submission host
  2. The GE QMASTER configuration should define execd_params INHERIT_ENV=false (use qconf -mconf to set it up). This setting allows the environment of the submission machine (CREAM CE) to be propagated to the execution machine (WN).
 

1 Postconfiguration

Have a look at the Known issues page.

 