
System Administrator Guide for CREAM for EMI-2 release

1 Installation and Configuration

1.1 Prerequisites

1.1.1 Operating system

The following operating systems are supported:

  • SL5 64 bit
  • TBC

It is assumed that the operating system is already properly installed.

1.1.2 Node synchronization

A general requirement for the Grid nodes is that they are synchronized. This requirement may be fulfilled in several ways. One of the most common is to use the NTP protocol with a time server.
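For example, on SL5 a minimal NTP setup might look like the following (the time server hostname is illustrative; use your site or national time servers):

yum install ntp
# point ntpd to your preferred time server (illustrative hostname)
echo "server ntp.mysite.domain" >> /etc/ntp.conf
chkconfig ntpd on
service ntpd start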

1.1.3 Cron and logrotate

Many components deployed on the CREAM CE rely on the presence of cron (including support for the /etc/cron.* directories) and logrotate. You should make sure these utilities are available on your system.
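On SL5 a quick check might be the following (package names are those typically found on RHEL5-derived systems and may differ):

rpm -q vixie-cron crontabs logrotate
service crond status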

1.1.4 Batch system

If you plan to use Torque as the batch system for your CREAM CE, it will be installed and configured along with the middleware (i.e. you don't have to install and configure it in advance).

If you plan to use LSF as the batch system for your CREAM CE, you have to install and configure it before installing and configuring the CREAM software. Since LSF is commercial software, it cannot be distributed together with the middleware.

If you plan to use GE as the batch system for your CREAM CE, you have to install and configure it before installing and configuring the CREAM software. The CREAM CE integration was tested with GE 6.2u5, but it should work with any forked version of the original GE software.

Support for the batch system software itself is outside the scope of this activity.

More information about batch system integration is available in the relevant section.

1.2 Plan how to deploy the CREAM CE

1.2.1 CREAM CE and gLite-cluster

glite-CLUSTER is a node type that can publish information about clusters and subclusters in a site, referenced by any number of compute elements. In particular, it makes it possible to deal with sites having multiple CREAM CE nodes and/or multiple subclusters (i.e. disjoint sets of worker nodes, each set having sufficiently homogeneous properties).

In Glue1, batch system queues are represented through GlueCE objectclasses. Each GlueCE refers to a Cluster, which can be composed of one or more SubClusters. However, the gLite WMS requires the publication of exactly one SubCluster per Cluster (and hence per batch queue).

Thus sites with heterogeneous hardware have two possible choices:

  • publish a SubCluster with a representative/minimum hardware description (e.g. the minimum memory on any node)
  • define separate batch queues for each hardware configuration, e.g. low/high memory queues, and attach the corresponding GlueCE objects to separate Cluster/SubCluster pairs. For attributes with discrete values, e.g. SL4 vs SL5, this second option is the only one which makes sense.

However, without the use of glite-CLUSTER, YAIM allows configuring only a single Cluster per CREAM head node.

A second problem, addressed by glite-CLUSTER, arises for larger sites which install multiple CE head nodes submitting to the same batch queues for redundancy or load balancing. Without the use of glite-CLUSTER, YAIM generates a separate Cluster/SubCluster pair for each head node even though they all describe the same hardware. This causes no problems for job submission, but by default it would overcount the installed capacity at the site by a multiple of the number of SubClusters. The workaround, before the introduction of glite-CLUSTER, was to publish zero values for the installed capacity from all but one of the nodes (but this is clearly far from being an ideal solution).

The glite-CLUSTER node addresses this issue. It contains a subset of the functionality incorporated in the CREAM node types: the publication of the Glue1 GlueCluster and its dependent objects, the publication of the Glue1 GlueService object for the GridFTP endpoint, and the directories which store the RunTimeEnvironment tags, together with the YAIM functions which configure them.

So, gLite-CLUSTER should be considered:

  • if in the site there are multiple CE head nodes, and/or
  • if in the site there are multiple disjoint sets of worker nodes, each set having sufficiently homogeneous properties

When configuring a gLite-cluster, please consider that:

  • There should be one cluster for each set of worker nodes having sufficiently homogeneous properties
  • There should be one subcluster for each cluster
  • Each batch system queue should refer to the WNs of a single subcluster

glite-CLUSTER can be deployed on the same host as a CREAM-CE or on a different one.

The following deployment models are possible for a CREAM-CE:

  • CREAM-CE can be configured without worrying about the glite-CLUSTER node. This can be useful for small sites that don't want to worry about cluster/subcluster configurations because they have a very simple setup. In this case the CREAM-CE will publish a single cluster/subcluster. This is called no cluster mode. This is done, as described below, by defining the yaim setting CREAMCE_CLUSTER_MODE=no (or by not defining that variable at all).
  • CREAM-CE can work in cluster mode using the glite-CLUSTER node type. This is done, as described below, by defining the yaim setting CREAMCE_CLUSTER_MODE=yes. The CREAM-CE can be on the same host as the glite-CLUSTER node or on a different one.

More information about glite-CLUSTER can be found at https://twiki.cern.ch/twiki/bin/view/LCG/CLUSTER and in this note.

Information concerning glue2 publication is available here.

1.2.2 Define a DNS alias to refer to a set of CREAM CEs

In order to distribute the load of job submissions, it is possible to deploy multiple CREAM CE head nodes referring to the same set of resources. As explained in the previous section, this should be implemented with:

  • a gLite-CLUSTER node
  • multiple CREAM CEs configured in cluster mode

It is then also possible to define a DNS alias to refer to the set of CREAM head nodes: after the initial contact from outside clients to the CREAM-CE alias name for job submission, all further actions on that job are based on the jobid, which contains the physical hostname of the CREAM-CE to which the job was submitted. This allows switching the DNS alias in order to distribute load.

The alias shouldn't be published in the information service, but should be simply communicated to the relevant users.

There are various techniques to change an alias entry in the DNS. The choice depends strongly on the way the network is set up and managed. For example, at DESY a self-written service called POISE is used; using metrics (which take into account in particular the load and the sandbox size), it decides the physical instance the alias should point to. Another possibility to define aliases is to use commercial network techniques such as F5.

It must be noted that, as observed by DESY sysadmins, the propagation of alias (CNAME record) changes among DNS servers is not well defined. Therefore changes of an alias can sometimes take hours to propagate to other sites.

The use of an alias for job submission is a good solution to improve load balancing and availability of the service (the unavailability of a physical CREAM CE would be hidden by the use of the alias). It must however be noted that:

  • The list operation (glite-ce-job-list command of the CREAM CLI) issued on an alias returns the identifiers of the jobs submitted to the physical instance currently pointed to by the alias, and not the identifiers of all the jobs submitted to all CREAM CE instances
  • The operations to be done on all jobs (e.g. cancel all jobs, return the status of all jobs, etc.), i.e. the ones issued using the options -a -e of the CREAM CLI, when issued on an alias refer just to the physical CREAM instance currently pointed to by the alias (and not to all CREAM CE instances)
  • The use of an alias is not supported for submissions through the gLite-WMS

1.2.3 Choose the authorization model

The CREAM CE can be configured to use one of the following authorization systems:

  • the ARGUS authorization framework
OR

  • the grid Java Authorization Framework (gJAF)

In the former case, an ARGUS box (recommended to be at site level: it can of course serve multiple CEs of that site), where the policies for the CREAM CE box are defined, is needed.

To use ARGUS as the authorization system, the yaim variable USE_ARGUS must be set in the following way:

USE_ARGUS=yes

In this case it is also necessary to set the following yaim variables:

  • ARGUS_PEPD_ENDPOINTS The endpoint of the ARGUS box (e.g. "https://cream-43.pd.infn.it:8154/authz")
  • CREAM_PEPC_RESOURCEID The id of the CREAM CE in the ARGUS box (e.g. "http://pd.infn.it/cream-18")
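Putting it together, a possible siteinfo.def fragment is sketched below (the endpoint and resource id values are illustrative):

USE_ARGUS=yes
ARGUS_PEPD_ENDPOINTS="https://argus.mysite.domain:8154/authz"
CREAM_PEPC_RESOURCEID="http://mysite.domain/creamce"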

If instead gJAF is to be used as the authorization system, the yaim variable USE_ARGUS must be set in the following way:

USE_ARGUS=no

1.2.4 Choose the BLAH BLparser deployment model

The BLAH BLparser is the component of the CREAM CE responsible for notifying CREAM about job status changes.

For LSF and PBS/Torque it is possible to configure the BLAH blparser in two ways:

  • The new BLAH BLparser, which relies on the status/history batch system commands
  • The old BLAH BLparser, which parses the batch system log files

For GE and Condor, only the configuration with the new BLAH blparser is possible

1.2.4.1 New BLAH Blparser

The new BLparser runs on the CREAM CE machine and is automatically installed when installing the CREAM CE. The configuration of the new BLAH BLparser is done when configuring the CREAM CE (i.e. it is not necessary to configure the BLparser separately from the CREAM CE).

To use the new BLAH blparser, it is just necessary to set:

BLPARSER_WITH_UPDATER_NOTIFIER=true

in the siteinfo.def and then configure the CREAM CE. This is the default value.

The new BLParser doesn't parse the log files. However, the bhist (for LSF) and tracejob (for Torque) commands used by the new BLParser require the batch system log files, which therefore must be available on the CREAM CE node (e.g. via NFS). Actually, for Torque the blparser uses tracejob (which requires the log files) only when qstat can no longer find the job. This can happen if the job completed more than keep_completed seconds ago and the blparser was not able to detect earlier that the job had completed/been cancelled/etc., e.g. because keep_completed is too short or because the BLAH blparser for whatever reason didn't run for a while. If the log files are not available and the tracejob command is issued (for the reasons specified above), the BLAH blparser will not be able to find the job, which will be considered "lost" (DONE-FAILED wrt CREAM).
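keep_completed is a Torque server attribute; it can be inspected and, if needed, increased with qmgr on the Torque server, e.g. (the value of 600 seconds is only an illustrative example):

qmgr -c "list server keep_completed"
qmgr -c "set server keep_completed = 600"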

The init script of the new BLparser is /etc/init.d/glite-ce-blah-parser. Please note that it is not necessary to explicitly start the new blparser: when CREAM is started, it also starts the new BLAH BLparser if it is not already running.

When the new Blparser is running, you should see the following two processes on the CREAM CE node:

  • /usr/bin/BUpdaterxxx
  • /usr/bin/BNotifier
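Their presence can be checked, for example, with:

ps -ef | egrep 'BUpdater|BNotifier'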

Please note that the user tomcat on the CREAM CE should be allowed to issue the relevant status/history commands (for Torque: qstat, tracejob; for LSF: bhist, bjobs). Some sites configure the batch system so that users can only see their own jobs (e.g. in Torque:

set server query_other_jobs = False

). If this is done at the site, then the tomcat user will need a special privilege in order to be exempt from this setting (in torque:

set server operators += tomcat@creamce.yoursite.domain

).
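For reference, these Torque settings are applied on the Torque server with qmgr, e.g.:

qmgr -c "set server query_other_jobs = False"
qmgr -c "set server operators += tomcat@creamce.yoursite.domain"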

1.2.4.2 Old BLAH Blparser

The old BLAH blparser must be installed on a machine where the batch system log files are available (let's call this host BLPARSER_HOST). The BLPARSER_HOST can be the batch system master or a different machine where the log files are available (e.g. because they have been exported via NFS). There are two possible layouts:

  • The BLPARSER_HOST is the CREAM CE host
  • The BLPARSER_HOST is different than the CREAM CE host

If the BLPARSER_HOST is the CREAM CE host, after having installed and configured the CREAM CE, it is necessary to configure the old BLAH Blparser (as explained below) and then to restart tomcat.

If the BLPARSER_HOST is different than the CREAM CE host, after having installed and configured the CREAM CE it is necessary:

  • to install the old BLAH BLparser software on this BLPARSER_HOST as explained below
  • to configure the old BLAH BLparser
  • to restart tomcat on the CREAM-CE

On the CREAM CE, to use the old BLAH blparser, it is necessary to set:

BLPARSER_WITH_UPDATER_NOTIFIER=false

in the siteinfo.def before configuring via yaim.

1.2.5 Deployment models for CREAM databases

The databases used by CREAM can be deployed in the CREAM CE host (which is the default scenario) or on a different machine.

Click here for information on how to deploy the databases on a machine different from the CREAM-CE.

1.3 CREAM CE Installation

This section explains how to install:

  • a CREAM CE in no cluster mode
  • a CREAM CE in cluster mode
  • a glite-CLUSTER node
For all these scenarios, the setting of the repositories is the same.

1.3.1 Repositories

For a successful installation, you will need to configure your package manager to reference a number of repositories (in addition to your OS):

  • the EPEL repository
  • the EMI middleware repository
  • the CA repository

and to REMOVE (!!!) or DEACTIVATE (!!!)

  • the DAG repository

1.3.1.1 The EPEL repository

On sl5_x86_64, you can install the EPEL repository, issuing:

rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-4.noarch.rpm

TBC

1.3.1.2 The EMI middleware repository

On sl5_x86_64 you can install the EMI-2 yum repository, issuing:

wget TBD
yum install ./TBD

TBC

1.3.1.3 The Certification Authority repository

The most up-to-date version of the list of trusted Certification Authorities (CA) is needed on your node. The relevant yum repo can be installed issuing:

wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo -O /etc/yum.repos.d/EGI-trustanchors.repo

1.3.1.4 Important note on automatic updates

An update of the packages not followed by reconfiguration can cause problems. Therefore WE STRONGLY RECOMMEND NOT TO USE ANY KIND OF AUTOMATIC UPDATE PROCEDURE.

By running the script available at http://forge.cnaf.infn.it/frs/download.php/101/disable_yum.sh (implemented by Giuseppe Platania, INFN Catania), yum autoupdate will be disabled.
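A possible way to download and run it (assuming a plain shell invocation is sufficient):

wget http://forge.cnaf.infn.it/frs/download.php/101/disable_yum.sh
sh ./disable_yum.sh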

1.3.2 Installation of a CREAM CE node in no cluster mode

On sl5_x86_64 first of all install the yum-protectbase rpm:

  yum install yum-protectbase.noarch 

Then proceed with the installation of the CA certificates.

TBC

1.3.2.1 Installation of the CA certificates

On sl5_x86_64, the CA certificate can be installed issuing:

yum install ca-policy-egi-core 

TBC

1.3.2.2 Installation of the CREAM CE software

On sl5_x86_64 first of all install xml-commons-apis:

yum install xml-commons-apis 

This is due to a dependency problem within the Tomcat distribution

Then install the CREAM-CE metapackage:

yum install emi-cream-ce

TBC

1.3.2.3 Installation of the batch system specific software

After the installation of the CREAM CE metapackage it is necessary to install the batch system specific metapackage(s):

On sl5_x86_64:

  • If you are running Torque, and your CREAM CE node is the torque master, install the emi-torque-server and emi-torque-utils metapackages:

yum install emi-torque-server
yum install emi-torque-utils

  • If you are running Torque, and your CREAM CE node is NOT the torque master, install the emi-torque-utils metapackage:

yum install emi-torque-utils

  • If you are running LSF, install the emi-lsf-utils metapackage:

yum install emi-lsf-utils

  • If you are running GE, install the emi-ge-utils metapackage:

yum install emi-ge-utils

TBC

1.3.3 Installation of a CREAM CE node in cluster mode

On sl5_x86_64, first of all install the yum-protectbase rpm:

  yum install yum-protectbase.noarch 

Then proceed with the installation of the CA certificates.

1.3.3.1 Installation of the CA certificates

On sl5_x86_64, the CA certificates can be installed issuing:

yum install ca-policy-egi-core 

1.3.3.2 Installation of the CREAM CE software

On sl5_x86_64, first of all install xml-commons-apis:

yum install xml-commons-apis 

This is due to a dependency problem within the Tomcat distribution

Then install the CREAM-CE metapackage:

yum install emi-cream-ce

1.3.3.3 Installation of the batch system specific software

After the installation of the CREAM CE metapackage it is necessary to install the batch system specific metapackage(s).

On sl5_x86_64:

  • If you are running Torque, and your CREAM CE node is the torque master, install the emi-torque-server and emi-torque-utils metapackages:

yum install emi-torque-server
yum install emi-torque-utils

  • If you are running Torque, and your CREAM CE node is NOT the torque master, install the emi-torque-utils metapackage:

yum install emi-torque-utils

  • If you are running LSF, install the emi-lsf-utils metapackage:

yum install emi-lsf-utils

  • If you are running GE, install the emi-ge-utils metapackage:

yum install emi-ge-utils

TBC

1.3.3.4 Installation of the cluster metapackage

If the CREAM CE node also has to host the glite-CLUSTER, install the relevant metapackage as well.

On sl5_x86_64:

yum install emi-cluster 

TBC

1.3.4 Installation of a glite-cluster node

On sl5_x86_64, first of all install the yum-protectbase rpm:

  yum install yum-protectbase.noarch 

Then proceed with the installation of the CA certificates.

1.3.4.1 Installation of the CA certificates

On sl5_x86_64, the CA certificates can be installed issuing:

yum install ca-policy-egi-core 

1.3.4.2 Installation of the cluster metapackage

Install the glite-CLUSTER metapackage.

On sl5_x86_64:

yum install emi-cluster 

1.3.5 Installation of the BLAH BLparser

If the new BLAH Blparser must be used, there isn't anything to be installed for the BLAH Blparser (i.e. the installation of the CREAM-CE is enough).

This is also the case when the old BLAH Blparser must be used AND the BLPARSER_HOST is the CREAM-CE.

Only when the old BLAH Blparser must be used AND the BLPARSER_HOST is different than the CREAM-CE, it is necessary to install the BLParser software on this BLPARSER_HOST. This is done in the following way:

On sl5_x86_64:

yum install glite-ce-blahp 
yum install glite-yaim-cream-ce

TBC

1.3.6 Installation of the CREAM CLI

The CREAM CLI is part of the EMI-UI. To install it please refer to TBD.

1.4 CREAM CE configuration

1.4.1 Manual and automatic (yaim) configuration

The following sections describe the configuration steps needed for the two following scenarios:

  • Manual configuration
  • Automatic configuration via yaim

For a detailed description on how to configure the middleware with YAIM, please check the YAIM guide.

The necessary YAIM modules needed to configure a certain node type are automatically installed with the middleware.

1.4.2 Configuration of a CREAM CE node in no cluster mode

1.4.2.1 Install host certificate

The CREAM CE node requires the host certificate/key files to be installed. Contact your national Certification Authority (CA) to understand how to obtain a host certificate if you do not have one already.

Once you have obtained a valid certificate:

  • hostcert.pem - containing the machine public key
  • hostkey.pem - containing the machine private key
make sure to place the two files on the target node in the /etc/grid-security directory. Then set the proper mode and ownership:

chown root.root /etc/grid-security/hostcert.pem
chown root.root /etc/grid-security/hostkey.pem
chmod 600 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem
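You may then want to verify the certificate subject, validity dates and file permissions, e.g.:

openssl x509 -in /etc/grid-security/hostcert.pem -noout -subject -dates
ls -l /etc/grid-security/hostcert.pem /etc/grid-security/hostkey.pem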

1.4.2.2 Manual configuration

TBD

1.4.2.3 Configuration via yaim

1.4.2.3.1 Configure the siteinfo.def file

Set your siteinfo.def file, which is the input file used by yaim. Documentation about yaim variables relevant for CREAM CE is available at TBD.

Be sure that CREAMCE_CLUSTER_MODE is set to no (or not set at all, since no is the default value).

1.4.2.3.2 Run yaim

After having filled the siteinfo.def file, run yaim:

/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n <LRMSnode> 

Examples:

  • Configuration of a CREAM CE in no cluster mode using Torque as batch system, with the CREAM CE being also Torque server

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n TORQUE_server -n TORQUE_utils

  • Configuration of a CREAM CE in no cluster mode using Torque as batch system, with the CREAM CE NOT being also Torque server

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n TORQUE_utils

  • Configuration of a CREAM CE in no cluster mode using LSF as batch system

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n LSF_utils 

  • Configuration of a CREAM CE in no cluster mode using GE as batch system

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils 

1.4.3 Configuration of a CREAM CE node in cluster mode

1.4.3.1 Install host certificate

The CREAM CE node requires the host certificate/key files to be installed. Contact your national Certification Authority (CA) to understand how to obtain a host certificate if you do not have one already.

Once you have obtained a valid certificate:

  • hostcert.pem - containing the machine public key
  • hostkey.pem - containing the machine private key
make sure to place the two files on the target node in the /etc/grid-security directory. Then set the proper mode and ownership:

chown root.root /etc/grid-security/hostcert.pem
chown root.root /etc/grid-security/hostkey.pem
chmod 600 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem

1.4.3.2 Manual configuration

TBD

1.4.3.3 Configuration via yaim

1.4.3.3.1 Configure the siteinfo.def file

Set your siteinfo.def file, which is the input file used by yaim.

Variables which are required in cluster mode are described at TBD.

When the CREAM CE is configured in cluster mode it will stop publishing information about clusters and subclusters. That information should be published by the glite-CLUSTER node type instead. A specific set of yaim variables has been defined for configuring the information which is still required by the CREAM CE in cluster mode. The names of these variables follow this syntax:

  • In general, in variables based on hostnames, queues or VOViews, the characters '.' and '-' should be transformed into '_'
  • <host-name>: identifier that corresponds to the CE hostname in lower case. Example: ctb-generic-1.cern.ch -> ctb_generic_1_cern_ch
  • <queue-name>: identifier that corresponds to the queue in upper case. Example: dteam -> DTEAM
  • <voview-name>: identifier that corresponds to the VOView id in upper case. '/' and '=' should also be transformed into '_'. Example: /dteam/Role=admin -> DTEAM_ROLE_ADMIN

Be sure that CREAMCE_CLUSTER_MODE is set to yes

1.4.3.3.2 Run yaim

After having filled the siteinfo.def file, run yaim:

/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n <LRMSnode> [-n glite-CLUSTER]

-n glite-CLUSTER must be specified only if the glite-CLUSTER is deployed on the same node as the CREAM-CE

Examples:

  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on a different node) using LSF as batch system

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n LSF_utils

  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on a different node) using GE as batch system

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils 

  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on a different node) using Torque as batch system, with the CREAM CE being also Torque server

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n TORQUE_server -n TORQUE_utils

  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on a different node) using Torque as batch system, with the CREAM CE NOT being also Torque server

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n TORQUE_utils

  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on the same node of the CREAM-CE) using LSF as batch system

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n LSF_utils -n glite-CLUSTER

  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on the same node of the CREAM-CE) using GE as batch system

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n SGE_utils -n glite-CLUSTER

  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on the same node of the CREAM-CE) using Torque as batch system, with the CREAM CE being also Torque server

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n TORQUE_server -n TORQUE_utils  -n glite-CLUSTER

  • Configuration of a CREAM CE in cluster mode (with glite-CLUSTER deployed on the same node of the CREAM-CE) using Torque as batch system, with the CREAM CE NOT being also Torque server

     /opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n TORQUE_utils -n glite-CLUSTER

1.4.4 Configuration of a glite-CLUSTER node

1.4.4.1 Install host certificate

The glite-CLUSTER node requires the host certificate/key files to be installed. Contact your national Certification Authority (CA) to understand how to obtain a host certificate if you do not have one already.

Once you have obtained a valid certificate:

  • hostcert.pem - containing the machine public key
  • hostkey.pem - containing the machine private key
make sure to place the two files on the target node in the /etc/grid-security directory. Then set the proper mode and ownership:

chown root.root /etc/grid-security/hostcert.pem
chown root.root /etc/grid-security/hostkey.pem
chmod 600 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem

1.4.4.2 Manual configuration

1.4.4.3 Configuration via yaim

1.4.4.3.1 Configure the siteinfo.def file

Set your siteinfo.def file, which is the input file used by yaim. Documentation about yaim variables relevant for glite-CLUSTER is available at TBD.

1.4.4.3.2 Run yaim

After having filled the siteinfo.def file, run yaim:

/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n glite-CLUSTER

1.4.5 Configuration of the BLAH Blparser

If the new BLAH Blparser must be used, there isn't anything to be configured for the BLAH Blparser (i.e. the configuration of the CREAM-CE is enough).

If the old BLparser must be used, it is necessary to configure it on the BLPARSER_HOST (which, as said above, can be the CREAM-CE node or a different host). This is done via yaim in the following way:

/opt/glite/yaim/bin/yaim -r -s <site-info.def> -n creamCE -f config_cream_blparser

In case of manual configuration, TBD

Then it is necessary to restart tomcat on the CREAM-CE node:

service tomcat5 restart

1.4.5.1 Configuration of the old BLAH Blparser to serve multiple CREAM CEs

The configuration instructions reported above explain how to configure a CREAM CE and the BLAH blparser (old model) considering the scenario where the BLAH blparser has to "serve" a single CREAM CE.

Considering that the blparser (old model) has to run where the batch system log files are available, let's consider a scenario where there are 2 CREAM CEs (ce1.mydomain and ce2.mydomain) that must be configured. Let's suppose that the batch system log files are not available on these 2 CREAM CE machines, but on another machine (blhost.mydomain), where the old blparser has to be installed.

The following summarizes what must be done:

  • In the /services/glite-creamce file for ce1.mydomain set:

BLPARSER_HOST=blhost.mydomain
BLAH_JOBID_PREFIX=cre01_
BLP_PORT=33333

and configure ce1.mydomain via yaim:

/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n <LRMSnode> [-n glite-CLUSTER]

  • In the /services/glite-creamce file for ce2.mydomain set:

BLPARSER_HOST=blhost.mydomain
BLAH_JOBID_PREFIX=cre02_
BLP_PORT=33334

and configure ce2.mydomain via yaim:

/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n creamCE -n <LRMSnode> [-n glite-CLUSTER]

  • In the /services/glite-creamce file for blhost.mydomain set:

CREAM_PORT=56565

and configure blhost.mydomain via yaim:

/opt/glite/yaim/bin/yaim -r -s <site-info.def> -n creamCE -f config_cream_blparser

  • In blhost.mydomain edit the file /etc/blparser.conf setting (considering the pbs/torque scenario):

GLITE_CE_BLPARSERPBS_NUM=2

# ce1.mydomain
GLITE_CE_BLPARSERPBS_PORT1=33333
GLITE_CE_BLPARSERPBS_CREAMPORT1=56565

# ce2.mydomain
GLITE_CE_BLPARSERPBS_PORT2=33334
GLITE_CE_BLPARSERPBS_CREAMPORT2=56566

  • Restart the blparser on blhost.mydomain:

/etc/init.d/glite-ce-blparser restart

  • Restart tomcat on ce1.mydomain and ce2.mydomain

You can of course replace 33333, 33334, 56565, 56566 (used in the above examples) with other port numbers.

1.4.5.2 Configuration of the new BLAH Blparser to use cached batch system commands

The new BLAH blparser can be configured so that it does not interact directly with the batch system, but goes through a program (to be implemented by the site admin) which can implement some caching functionality. This is the case, for example, of CommandProxyTools, implemented at CERN.

To enable this feature, add the following to /etc/blah.config (the example below is for LSF, with /usr/bin/runcmd.pl as the name of the "caching" program):

lsf_batch_caching_enabled=yes
batch_command_caching_filter=/usr/bin/runcmd.pl

So the blparser, instead of issuing bjobs -u ..., will issue /usr/bin/runcmd.pl bjobs -u ...
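The wrapper receives the full batch system command line as its arguments. A minimal pass-through sketch (hypothetical: a real implementation would add caching of the command output, as done for example by CommandProxyTools) could be:

#!/bin/sh
# Hypothetical minimal batch_command_caching_filter wrapper.
# BLAH invokes it as: /usr/bin/runcmd.pl bjobs -u ...
# A real implementation would cache the output of the wrapped command.
exec "$@"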

1.4.6 Configuration of the CREAM databases on a host different than the CREAM-CE (using yaim)

To configure the CREAM databases on a host different than the CREAM-CE:

  • Set in the siteinfo.def file the variable CREAM_DB_HOST to the remote host (where MySQL must already be installed)
  • Set in the siteinfo.def file the variable MYSQL_PASSWORD to the MySQL password of the remote host
  • On this remote host, grant appropriate privileges to root@CE_HOST (a sketch is shown after this list)
  • Configure via yaim
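A sketch of the grant step, to be run on the remote database host (the CE hostname is illustrative, and the privileges should be restricted according to your site policy):

mysql -u root -p -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'cream-ce.mysite.domain' IDENTIFIED BY '<MYSQL_PASSWORD>' WITH GRANT OPTION; FLUSH PRIVILEGES;"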

1.4.7 Configuration of the CREAM CLI

The CREAM CLI is part of the EMI-UI. To configure it please refer to https://twiki.cern.ch/twiki/bin/view/EMI/EMIui#Client_Installation_Configuratio.

1.4.8 Configurations possible only manually

yaim allows setting the most important parameters (via yaim variables) related to the CREAM-CE. It is also possible to tune some other attributes by manually editing the relevant configuration files.

Please note that:

  • After having manually modified a configuration file, it is then necessary to restart the service
  • Manual changes done in the configuration files are overwritten by subsequent yaim reconfigurations


1.5 Batch system integration

1.5.1 Torque

1.5.1.1 Installation

-- MassimoSgaravatto - 2011-12-20
