Notes about Installation and Configuration of a CREAM Computing Element - EMI-2 - SL6 (external Torque, external Argus, MPI enabled)

Service installation

O.S. and Repos

# cat /etc/redhat-release
Scientific Linux release 6.2 (Carbon)

# yum install yum-priorities yum-protectbase epel-release
# rpm -ivh http://emisoft.web.cern.ch/emisoft/dist/EMI/2/sl6/x86_64/base/emi-release-2.0.0-1.sl6.noarch.rpm
# cd /etc/yum.repos.d/
# wget http://repo-pd.italiangrid.it/mrepo/repos/egi-trustanchors.repo

Check that SELinux is disabled:

# getenforce
Disabled

yum install

# yum clean all
# yum install ca-policy-egi-core emi-cream-ce emi-torque-utils glite-mpi
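To double-check that the metapackages were actually pulled in, you can query rpm afterwards (a quick sanity check, not part of the official procedure):

# rpm -q ca-policy-egi-core emi-cream-ce emi-torque-utils glite-mpi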
Service configuration

You have to copy the configuration files in another path, for example /root, and set them properly (see later):

# cp -vr /opt/glite/yaim/examples/siteinfo .

host certificate

Make sure the host certificate and key are in place with the right ownership and permissions:

# ll /etc/grid-security/host*
-rw-r--r-- 1 root root 1440 Oct 18 09:31 /etc/grid-security/hostcert.pem
-r-------- 1 root root 887 Oct 18 09:31 /etc/grid-security/hostkey.pem
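It is also worth verifying that the certificate is not expired and was issued to the right host (plain openssl queries on the files above):

# openssl x509 -in /etc/grid-security/hostcert.pem -noout -subject -dates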
vo.d directory

Create the directory siteinfo/vo.d and fill it with a file for each supported VO. You can download them from HERE.

# cat /root/siteinfo/vo.d/comput-er.it
SW_DIR=$VO_SW_DIR/computer
DEFAULT_SE=$SE_HOST
STORAGE_DIR=$CLASSIC_STORAGE_DIR/computer
VOMS_SERVERS="'vomss://voms2.cnaf.infn.it:8443/voms/comput-er.it?/comput-er.it'"
VOMSES="'comput-er.it voms2.cnaf.infn.it 15007 /C=IT/O=INFN/OU=Host/L=CNAF/CN=voms2.cnaf.infn.it comput-er.it' 'comput-er.it voms-02.pd.infn.it 15007 /C=IT/O=INFN/OU=Host/L=Padova/CN=voms-02.pd.infn.it comput-er.it'"
VOMS_CA_DN="'/C=IT/O=INFN/CN=INFN CA' '/C=IT/O=INFN/CN=INFN CA'"

# cat /root/siteinfo/vo.d/dteam
SW_DIR=$VO_SW_DIR/dteam
DEFAULT_SE=$SE_HOST
STORAGE_DIR=$CLASSIC_STORAGE_DIR/dteam
VOMS_SERVERS='vomss://voms.hellasgrid.gr:8443/voms/dteam?/dteam/'
VOMSES="'dteam lcg-voms.cern.ch 15004 /DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch dteam 24' 'dteam voms.cern.ch 15004 /DC=ch/DC=cern/OU=computers/CN=voms.cern.ch dteam 24' 'dteam voms.hellasgrid.gr 15004 /C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms.hellasgrid.gr dteam 24' 'dteam voms2.hellasgrid.gr 15004 /C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms2.hellasgrid.gr dteam 24'"
VOMS_CA_DN="'/DC=ch/DC=cern/CN=CERN Trusted Certification Authority' '/DC=ch/DC=cern/CN=CERN Trusted Certification Authority' '/C=GR/O=HellasGrid/OU=Certification Authorities/CN=HellasGrid CA 2006' '/C=GR/O=HellasGrid/OU=Certification Authorities/CN=HellasGrid CA 2006'"

# cat /root/siteinfo/vo.d/gridit
SW_DIR=$VO_SW_DIR/gridit
DEFAULT_SE=$SE_HOST
STORAGE_DIR=$CLASSIC_STORAGE_DIR/gridit
VOMS_SERVERS="'vomss://voms.cnaf.infn.it:8443/voms/gridit?/gridit' 'vomss://voms-01.pd.infn.it:8443/voms/gridit?/gridit'"
VOMSES="'gridit voms.cnaf.infn.it 15008 /C=IT/O=INFN/OU=Host/L=CNAF/CN=voms.cnaf.infn.it gridit' 'gridit voms-01.pd.infn.it 15008 /C=IT/O=INFN/OU=Host/L=Padova/CN=voms-01.pd.infn.it gridit'"
VOMS_CA_DN="'/C=IT/O=INFN/CN=INFN CA' '/C=IT/O=INFN/CN=INFN CA'"

# cat /root/siteinfo/vo.d/igi.italiangrid.it
SW_DIR=$VO_SW_DIR/igi
DEFAULT_SE=$SE_HOST
STORAGE_DIR=$CLASSIC_STORAGE_DIR/igi
VOMS_SERVERS="'vomss://vomsmania.cnaf.infn.it:8443/voms/igi.italiangrid.it?/igi.italiangrid.it'"
VOMSES="'igi.italiangrid.it vomsmania.cnaf.infn.it 15003 /C=IT/O=INFN/OU=Host/L=CNAF/CN=vomsmania.cnaf.infn.it igi.italiangrid.it'"
VOMS_CA_DN="'/C=IT/O=INFN/CN=INFN CA'"

# cat /root/siteinfo/vo.d/infngrid
SW_DIR=$VO_SW_DIR/infngrid
DEFAULT_SE=$SE_HOST
STORAGE_DIR=$CLASSIC_STORAGE_DIR/infngrid
VOMS_SERVERS="'vomss://voms.cnaf.infn.it:8443/voms/infngrid?/infngrid' 'vomss://voms-01.pd.infn.it:8443/voms/infngrid?/infngrid'"
VOMSES="'infngrid voms.cnaf.infn.it 15000 /C=IT/O=INFN/OU=Host/L=CNAF/CN=voms.cnaf.infn.it infngrid' 'infngrid voms-01.pd.infn.it 15000 /C=IT/O=INFN/OU=Host/L=Padova/CN=voms-01.pd.infn.it infngrid'"
VOMS_CA_DN="'/C=IT/O=INFN/CN=INFN CA' '/C=IT/O=INFN/CN=INFN CA'"

# cat /root/siteinfo/vo.d/ops
SW_DIR=$VO_SW_DIR/ops
DEFAULT_SE=$SE_HOST
STORAGE_DIR=$CLASSIC_STORAGE_DIR/ops
VOMS_SERVERS="vomss://voms.cern.ch:8443/voms/ops?/ops/"
VOMSES="'ops lcg-voms.cern.ch 15009 /DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch ops 24' 'ops voms.cern.ch 15009 /DC=ch/DC=cern/OU=computers/CN=voms.cern.ch ops 24'"
VOMS_CA_DN="'/DC=ch/DC=cern/CN=CERN Trusted Certification Authority' '/DC=ch/DC=cern/CN=CERN Trusted Certification Authority'"
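Since YAIM sources these files as shell fragments, a quick syntax check can save a failed configuration run later (a minimal sketch; it only catches shell syntax errors, not wrong VOMS data):

# for f in /root/siteinfo/vo.d/*; do bash -n "$f" && echo "$f: syntax OK"; done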
users and groups

You can download them from HERE.

Munge

Copy the key /etc/munge/munge.key from the Torque server to every host of your cluster, adjust the permissions and start the service:

# chown munge:munge /etc/munge/munge.key
# ls -ltr /etc/munge/
total 4
-r-------- 1 munge munge 1024 Jan 13 14:32 munge.key
# chkconfig munge on
# /etc/init.d/munge restart
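To confirm that munge authentication works end to end, you can encode a credential locally and decode it on the batch server (standard munge test commands; the host name matches the BATCH_SERVER used below):

# munge -n | unmunge
# munge -n | ssh batch.cnaf.infn.it unmunge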
site-info.def

KISS: Keep it simple, stupid! For your convenience there is an explanation of each yaim variable. For more details look HERE.

SUGGESTION: use the same site-info.def for CREAM and WNs: for this reason in this example file there are yaim variables used by CREAM, TORQUE or emi-WN.

# cat site-info.def
CE_HOST=cream-01.cnaf.infn.it
SITE_NAME=IGI-BOLOGNA
BATCH_SERVER=batch.cnaf.infn.it
BATCH_LOG_DIR=/var/torque
#BDII_HOST=egee-bdii.cnaf.infn.it
CE_BATCH_SYS=torque
JOB_MANAGER=pbs
BATCH_VERSION=torque-2.5.7
#CE_DATADIR=
CE_INBOUNDIP=FALSE
CE_OUTBOUNDIP=TRUE
CE_OS="ScientificSL"
CE_OS_RELEASE=6.2
CE_OS_VERSION="Carbon"
CE_RUNTIMEENV="IGI-BOLOGNA"
CE_PHYSCPU=8
CE_LOGCPU=16
CE_MINPHYSMEM=16000
CE_MINVIRTMEM=32000
CE_SMPSIZE=8
CE_CPU_MODEL=Xeon
CE_CPU_SPEED=2493
CE_CPU_VENDOR=intel
CE_CAPABILITY="CPUScalingReferenceSI00=1039 glexec"
CE_OTHERDESCR="Cores=1,Benchmark=4.156-HEP-SPEC06"
CE_SF00=951
CE_SI00=1039
CE_OS_ARCH=x86_64
CREAM_PEPC_RESOURCEID="http://cnaf.infn.it/cremino"
USERS_CONF=/root/siteinfo/ig-users.conf
GROUPS_CONF=/root/siteinfo/ig-users.conf
VOS="comput-er.it dteam igi.italiangrid.it infngrid ops gridit"
QUEUES="cert prod"
CERT_GROUP_ENABLE="dteam infngrid ops /dteam/ROLE=lcgadmin /dteam/ROLE=production /ops/ROLE=lcgadmin /ops/ROLE=pilot /infngrid/ROLE=SoftwareManager /infngrid/ROLE=pilot"
PROD_GROUP_ENABLE="comput-er.it gridit igi.italiangrid.it /comput-er.it/ROLE=SoftwareManager /gridit/ROLE=SoftwareManager /igi.italiangrid.it/ROLE=SoftwareManager"
VO_SW_DIR=/opt/exp_soft
WN_LIST="/root/siteinfo/wn-list.conf"
MUNGE_KEY_FILE=/etc/munge/munge.key
CONFIG_MAUI="no"
MYSQL_PASSWORD=*********************************
APEL_DB_PASSWORD=not_used
APEL_MYSQL_HOST=not_used
SE_LIST="darkstorm.cnaf.infn.it"
SE_MOUNT_INFO_LIST="none"
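Before running YAIM it is worth making sure the file parses as shell and that no placeholder was left in (a minimal sketch; the starred passwords above are redacted values to replace, so the grep below should come back empty on a finished file):

# bash -n /root/siteinfo/site-info.def && echo "syntax OK"
# grep -n '\*\*\*' /root/siteinfo/site-info.def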
WN list

Set in this file the WNs list, for example:

# less /root/siteinfo/wn-list.conf
wn05.cnaf.infn.it
wn06.cnaf.infn.it
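Each entry must be resolvable from the CE, so a quick loop over the file can catch typos early (plain shell; the host command comes from bind-utils):

# for wn in $(cat /root/siteinfo/wn-list.conf); do host $wn || echo "$wn does NOT resolve"; done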
services/glite-mpi_ce

# cp /opt/glite/yaim/examples/siteinfo/services/glite-mpi_ce /root/siteinfo/services/

# cat services/glite-mpi_ce
# Setup configuration variables that are common to both the CE and WN
if [ -r ${config_dir}/services/glite-mpi ]; then
    source ${config_dir}/services/glite-mpi
fi

# The MPI CE config function can create a submit filter for
# Torque to ensure that CPU allocation is performed correctly.
# Change this variable to "yes" to have YAIM create this filter.
# Warning: if you have an existing torque.cfg it will be modified.
MPI_SUBMIT_FILTER=${MPI_SUBMIT_FILTER:-"yes"}
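With MPI_SUBMIT_FILTER set to "yes", YAIM registers a submit filter in torque.cfg; after configuration you can check that the directive is there (both the torque.cfg location and the filter path below are assumptions that depend on how your Torque installation is laid out):

# cat /var/torque/torque.cfg
SUBMITFILTER /var/torque/submit_filter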
services/glite-creamce

# cat /root/siteinfo/services/glite-creamce
#
# YAIM creamCE specific variables
#
# CE-monitor host (by default CE-monitor is installed on the same machine as
# cream-CE)
CEMON_HOST=$CE_HOST
#
# CREAM database user
CREAM_DB_USER=********************
CREAM_DB_PASSWORD=****************************
#
# Machine hosting the BLAH blparser.
# In this machine batch system logs must be accessible.
BLPARSER_HOST=$CE_HOST
#
# Value to be published as GlueCEStateStatus instead of Production
#CREAM_CE_STATE=Special
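After YAIM has run, you can verify that the CREAM database account actually works (a sketch assuming MySQL runs on the CE itself, as in this setup; replace the placeholder with your CREAM_DB_USER value):

# mysql -u <CREAM_DB_USER> -p -e 'SHOW DATABASES;'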
services/dgas_sensors (not available yet)

TODO

yaim check

Verify that you have set all the yaim variables by launching:

# /opt/glite/yaim/bin/yaim -v -s /root/siteinfo/site-info.def -n creamCE -n TORQUE_utils
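The verification step lends itself to scripting (a shell-level sketch built on the assumption that yaim follows the usual convention of exiting non-zero on failure):

# /opt/glite/yaim/bin/yaim -v -s /root/siteinfo/site-info.def -n creamCE -n TORQUE_utils && echo "verification passed" || echo "verification FAILED"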
yaim config

# /opt/glite/yaim/bin/yaim -c -s /root/siteinfo/site-info.def -n creamCE -n TORQUE_utils
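Configuration output is long, so it can help to keep a copy and scan it for problems afterwards (the log file name is just an example):

# /opt/glite/yaim/bin/yaim -c -s /root/siteinfo/site-info.def -n creamCE -n TORQUE_utils 2>&1 | tee /root/yaim-config.log
# grep -iE 'error|warning' /root/yaim-config.log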
Service Checks
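Two standard Torque client commands give a quick view of the batch side (run from the CE; they assume the pbs client tools can reach the external server):

# qmgr -c 'p s'
# pbsnodes -a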