---+!! Notes about Installation and Configuration of a Torque server (no cream) - EMI-2 - SL6 x86_64

   * *These notes are provided by site admins on a best-effort basis as a contribution to the IGI communities and MUST NOT be considered a substitute for the [[http://wiki.italiangrid.it/twiki/bin/view/IGIRelease/IgiEmi][Official IGI documentation]].*
   * This document is addressed to site administrators responsible for middleware installation and configuration.
   * The goal of this page is to provide some hints and examples on how to install and configure an EMI torque server based on EMI-2 middleware.

%TOC%

---++ References

   1. [[http://www.italiangrid.it/][About IGI - Italian Grid infrastructure]]
   1. [[http://wiki.italiangrid.it/twiki/bin/view/IGIRelease/WebHome][About IGI Release]]
   1. [[http://www.eu-emi.eu/emi-2-matterhorn][EMI-2 Release]]
   1. [[https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide400][Yaim Guide]]
   1. [[https://twiki.cern.ch/twiki/bin/view/LCG/Site-info_configuration_variables#site_info_def][TO BE CHANGED - site-info.def yaim variables]]
   1. [[https://twiki.cern.ch/twiki/bin/view/LCG/Site-info_configuration_variables#site_BDII][TO BE CHANGED - site-BDII yaim variables]]
   1. [[https://wiki.egi.eu/wiki/Tools/Manuals/SiteProblemsFollowUp][Troubleshooting Guide for Operational Errors on EGI Sites]]
   1. [[https://wiki.egi.eu/wiki/Tools/Manuals/AdministrationFaq][Grid Administration FAQs page]]

---++ Service installation

---+++ O.S. and Repos

   * Start from a fresh installation of Scientific Linux 6.x (x86_64).
<verbatim>
# cat /etc/redhat-release
Scientific Linux release 6.2 (Carbon)
</verbatim>
   * Install the additional repositories: EPEL, Certification Authority, EMI-2
<verbatim>
# yum install yum-priorities yum-protectbase epel-release
# rpm -ivh http://emisoft.web.cern.ch/emisoft/dist/EMI/2/sl6/x86_64/base/emi-release-2.0.0-1.sl6.noarch.rpm
# cd /etc/yum.repos.d/
# wget http://repo-pd.italiangrid.it/mrepo/repos/egi-trustanchors.repo
</verbatim>
   * Be sure that SELinux is disabled (or permissive). Details on how to disable SELinux are [[http://fedoraproject.org/wiki/SELinux/setenforce][here]]:
<verbatim>
# getenforce
Disabled
</verbatim>

---+++ yum install

<verbatim>
# yum clean all
Loaded plugins: downloadonly, kernel-module, priorities, protect-packages, protectbase, security, verify, versionlock
Cleaning up Everything
</verbatim>
<verbatim>
# yum install ca-policy-egi-core
# yum install emi-torque-server emi-torque-utils
</verbatim>

---++ Service configuration

Copy the example configuration files to another path, for example =/root=, and set them properly (see later):
<verbatim>
# cp -vr /opt/glite/yaim/examples/siteinfo .
</verbatim>

---+++ vo.d

Create the directory =siteinfo/vo.d= and fill it with one file for each supported VO. You can download the files from [[https://forge.cnaf.infn.it/plugins/scmsvn/viewcvs.php/branches/BRANCH-4_0_X/ig-yaim/examples/siteinfo/vo.d/?root=igrelease][HERE]].

---+++ users and groups

You can download the users and groups files from [[https://forge.cnaf.infn.it/plugins/scmsvn/viewcvs.php/branches/BRANCH-4_0_X/ig-yaim/examples/?rev=6231&root=igrelease#dirlist][HERE]].

---+++ site-info.def

KISS: Keep it simple, stupid! For your convenience there is an explanation of each yaim variable. For more details, look [[https://twiki.cern.ch/twiki/bin/view/LCG/Site-info_configuration_variables#TORQUE][HERE]].
<verbatim>
# cat siteinfo/site-info.def
BATCH_SERVER=batch.cnaf.infn.it
CE_HOST=cream-01.cnaf.infn.it
CE_SMPSIZE=8
USERS_CONF=/root/siteinfo/ig-users.conf
GROUPS_CONF=/root/siteinfo/ig-groups.conf
VOS="comput-er.it dteam igi.italiangrid.it infngrid ops gridit"
QUEUES="cert prod"
CERT_GROUP_ENABLE="dteam infngrid ops /dteam/ROLE=lcgadmin /dteam/ROLE=production /ops/ROLE=lcgadmin /ops/ROLE=pilot /infngrid/ROLE=SoftwareManager /infngrid/ROLE=pilot"
PROD_GROUP_ENABLE="comput-er.it gridit igi.italiangrid.it /comput-er.it/ROLE=SoftwareManager /gridit/ROLE=SoftwareManager /igi.italiangrid.it/ROLE=SoftwareManager"
WN_LIST="/root/siteinfo/wn-list.conf"
MUNGE_KEY_FILE=/etc/munge/munge.key
CONFIG_MAUI="no"
SITE_NAME=IGI-BOLOGNA
APEL_DB_PASSWORD=not_used
APEL_MYSQL_HOST=not_used
</verbatim>

---+++ WN list

List the WNs in this file, one per line, for example:
<verbatim>
# less /root/siteinfo/wn-list.conf
wn05.cnaf.infn.it
wn06.cnaf.infn.it
</verbatim>

---+++ munge configuration

   * Generate a key by launching =/usr/sbin/create-munge-key=
<verbatim>
# ls -ltr /etc/munge/
total 4
-r-------- 1 munge munge 1024 Jan 13 14:32 munge.key
</verbatim>
   * Copy the key =/etc/munge/munge.key= to every host of your cluster, adjusting the permissions:
<verbatim>
# chown munge:munge /etc/munge/munge.key
</verbatim>
   * Start the munge daemon on each node:
<verbatim>
# service munge start
Starting MUNGE: [ OK ]
# chkconfig munge on
</verbatim>

---+++ tomcat and ldap users

It is necessary to create the tomcat and ldap users on the torque server, otherwise the computing elements will fail to connect to the server. When those users do not exist on the server, you will see errors like the following on the CE:
<verbatim>
2012-04-24 15:37:29 lcg-info-dynamic-scheduler: LRMS backend command returned nonzero exit status
2012-04-24 15:37:29 lcg-info-dynamic-scheduler: Exiting without output, GIP will use static values
Can not obtain pbs version from host [...]
</verbatim>
while the torque server log shows:
<verbatim>
04/24/2012 14:00:46;0080;PBS_Server;Req;req_reject;Reject reply code=15021(Invalid credential), aux=0, type=StatusJob, from tomcat@cream-01.cnaf.infn.it
04/24/2012 14:01:02;0080;PBS_Server;Req;req_reject;Reject reply code=15021(Invalid credential), aux=0, type=StatusJob, from ldap@cream-01.cnaf.infn.it
</verbatim>
*Solution*: add the tomcat and ldap users and groups to the torque host, since they exist only on the CreamCE host, and then restart pbs_server.
<verbatim>
# echo 'tomcat:x:91:91:Tomcat:/usr/share/tomcat5:/bin/sh' >> /etc/passwd
# echo 'ldap:x:55:55:LDAP User:/var/lib/ldap:/bin/false' >> /etc/passwd
# echo 'tomcat:x:91:' >> /etc/group
# echo 'ldap:x:55:' >> /etc/group
</verbatim>

---+++ yaim check

Verify that all the yaim variables are set by launching:
<verbatim>
# chmod -R 600 siteinfo/
# /opt/glite/yaim/bin/yaim -v -s siteinfo/site-info.def -n TORQUE_server -n TORQUE_utils
[...]
INFO: YAIM terminated succesfully.
</verbatim>

---+++ yaim config

<verbatim>
# /opt/glite/yaim/bin/yaim -c -s siteinfo/site-info.def -n TORQUE_server -n TORQUE_utils
[...]
INFO: YAIM terminated succesfully.
</verbatim>

---++ Service Checks

---+++ TORQUE checks

   * Check the pbs settings:
<verbatim>
# qmgr -c 'p s'
</verbatim>
   * Check the WNs state:
<verbatim>
# pbsnodes -a
</verbatim>

---++ maui settings

   * In order to reserve a job slot for test jobs, you need to apply some settings in the maui configuration (=/var/spool/maui/maui.cfg=). Suppose you have enabled the test VOs (ops, dteam and infngrid) on the "cert" queue and that you have 8 job slots available. Add the following lines to the =maui.cfg= file:
<verbatim>
CLASSWEIGHT 1
QOSWEIGHT 1
QOSCFG[normal] MAXJOB=7
CLASSCFG[prod] QDEF=normal
CLASSCFG[cert] PRIORITY=5000
</verbatim>
After the modification, restart maui.
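The value =MAXJOB=7= in the example comes from the 8 available job slots minus the 1 slot kept free for the "cert" queue. The sketch below only illustrates that arithmetic for other cluster sizes: the function name =maui_reservation_lines= and its two arguments are hypothetical and not part of maui or yaim, and it assumes one running job per slot. It prints the lines to standard output and does not touch =maui.cfg=.

```shell
#!/bin/sh
# Hypothetical helper (not part of maui or yaim): print the maui.cfg lines
# used on this page for a given number of job slots.
#   $1 = total job slots in the cluster
#   $2 = slots to keep free for the "cert" queue
# Assumption: each running job occupies exactly one slot.
maui_reservation_lines() {
    total_slots=$1
    reserved_slots=$2

    # At least one slot must remain for the "prod" queue.
    if [ "$reserved_slots" -ge "$total_slots" ]; then
        echo "reserved slots must be fewer than total slots" >&2
        return 1
    fi

    # Jobs in the "normal" QOS (the prod queue) are capped at the remainder.
    max_prod_jobs=$((total_slots - reserved_slots))

    cat <<EOF
CLASSWEIGHT 1
QOSWEIGHT 1
QOSCFG[normal] MAXJOB=${max_prod_jobs}
CLASSCFG[prod] QDEF=normal
CLASSCFG[cert] PRIORITY=5000
EOF
}

# The case described above: 8 slots, 1 reserved for test jobs.
maui_reservation_lines 8 1
```

With 8 slots and 1 reserved, the output matches the lines listed above. Review the printed lines before copying them into =maui.cfg= by hand.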
   * To prevent yaim from overwriting =maui.cfg= during host reconfiguration, set:
<verbatim>
CONFIG_MAUI="no"
</verbatim>
in your =site-info.def= (the first time you launch the yaim script, it has to be set to "yes").

---++ Revisions

| *Date* | *Comment* | *By* |
| 2012-05-24 | First draft | Paolo Veronesi |

-- Main.PaoloVeronesi - 2012-05-24

-- Main.PaoloVeronesi - 2012-05-25
Topic revision: r4 - 2012-06-04 - PaoloVeronesi
Copyright © 2008-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.