%TOC%

---+ EMI-CREAM-Torque using the latest Torque (2.5.7-7) and Maui 3.3-4: installation with MPI (CE and WN)

---++ On CE Host (also BATCH Master PBS/Torque)

The following steps install the latest CREAM CE using the Torque Staged-Rollout release in the ig/gLite distribution.

---+++ INSTALLATION:

---++++ Repository settings:
<pre>
cd /etc/yum.repos.d/
mv dag.repo dag.repo.orig
wget http://repo-pd.italiangrid.it/mrepo/repos/egi-trustanchors.repo
wget http://repo-pd.italiangrid.it/mrepo/repos/igi/sl5/x86_64/igi-cert-emi.repo
cd /root/
wget http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/sl5/x86_64/updates/emi-release-1.0.1-1.sl5.noarch.rpm
</pre>

---++++ Packages installation (CA, epel, emi, cream, torque,...):
<pre>
yum install ca-policy-egi-core
yum localinstall *.rpm
yum install xml-commons-apis
yum install emi-cream-ce
yum install emi-torque-server emi-torque-utils
yum install glite-mpi
</pre>

---++++ Munge configuration
<pre>
/usr/sbin/create-munge-key
service munge start
chkconfig munge on
</pre>

---++++ Starting PBS
<pre>
/etc/init.d/pbs_server start
</pre>

---++++ File LOG
[[%ATTACHURL%/InstallCert-09-CreamCETroqueMPI.log][Installation File on CE - cert-09.pd.infn.it Work Log]]

---+++ CONFIGURATION:
<verbatim>
/opt/glite/yaim/bin/yaim -c -d 6 -s /usr/local/nfs/cert-3_2/rtc_mpi/rtc-site-info.def -n MPI_CE -n creamCE -n TORQUE_server -n TORQUE_utils 2>&1 | tee /root/conf_EMI_CREAM_Torque_MPI.`hostname -s`.`date +%Y%m%d-%H%M%S`.log
</verbatim>

---++++ SSH Customization:
Modify the file /etc/ssh/sshd_config following the attached example [[%ATTACHURL%/sshd_config][here]] <BR/>
Modify the file /etc/ssh/shosts.equiv following the attached example [[%ATTACHURL%/shosts.equiv][here]]
<pre>
service sshd restart
</pre>

---++++ File LOG:
[[%ATTACHURL%/Configuration-cert-09.log][Yaim Configuration File on CE cert-09.pd.infn.it]]

---++ On WN Hosts:

---+++ INSTALLATION:

---++++ Repository settings:
<pre>
cd /etc/yum.repos.d/
mv dag.repo dag.repo.orig
wget http://repo-pd.italiangrid.it/mrepo/repos/egi-trustanchors.repo
wget http://repo-pd.italiangrid.it/mrepo/repos/igi/sl5/x86_64/igi-cert-emi.repo
cd /root/
wget http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/sl5/x86_64/updates/emi-release-1.0.1-1.sl5.noarch.rpm
</pre>

---++++ Packages installation (CA, epel, emi, cream, torque,...):
<pre>
yum install ca-policy-egi-core
yum localinstall *.rpm
yum install igi-wn_torque_noafs
yum install glite-mpi
yum install openmpi openmpi-devel mpich2
</pre>

---++++ Munge configuration:
<pre>
scp cert-09:/etc/munge/munge.key /etc/munge/
chown munge.munge /etc/munge/munge.key
service munge start
</pre>

---++++ File LOG:
[[%ATTACHURL%/InstallCert-wn64-08-WNTroqueMPI.log][Installation File on WN - cert-wn64-08.pn.pd.infn.it Work Log]]

---+++ CONFIGURATION:
<verbatim>
/opt/glite/yaim/bin/yaim -c -d 6 -s /usr/local/nfs/cert-3_2/rtc_mpi/rtc-site-info.def -n MPI_WN -n WN_torque_noafs 2>&1 | tee /root/conf_WN_Torque_MPI.`hostname -s`.`date +%Y%m%d-%H%M%S`.log
</verbatim>

---++++ SSH Customization:
Modify the file /etc/ssh/sshd_config following the attached example [[%ATTACHURL%/sshd_config][here]] <BR/>
Modify the file /etc/ssh/shosts.equiv following the attached example [[%ATTACHURL%/shosts.equiv][here]]
<pre>
service sshd restart
</pre>

---++++ File LOG:
[[%ATTACHURL%/Configuration-cert-wn64-08.log][Yaim Configuration File on WN cert-wn64-08.pn.pd.infn.it]]

---++ TESTING:

---+++ JOB Submission:

---++++ First Test (simple job):
From an EMI UI, create a proxy and submit the job:
<pre>
-bash-3.2$ glite-ce-job-submit -r cert-09.pd.infn.it:8443/cream-pbs-cert -a testCream.jdl
https://cert-09.pd.infn.it:8443/CREAM043342708

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM043342708
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM043342708]
   Status = [RUNNING]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM043342708
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM043342708]
   Status = [REALLY-RUNNING]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM043342708
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM043342708]
   Status = [DONE-OK]
   ExitCode = [0]
</pre>

First Test : %GREEN% *PASSED* %ENDCOLOR% <BR/>

---++++ Second Test (simple MPI job with 2 cores):
<pre>
-bash-3.2$ glite-ce-job-submit -r cert-09.pd.infn.it:8443/cream-pbs-cert -a mpi-start-wrapper_Cream.jdl
https://cert-09.pd.infn.it:8443/CREAM986827158

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM986827158
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM986827158]
   Status = [IDLE]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM986827158
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM986827158]
   Status = [RUNNING]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM986827158
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM986827158]
   Status = [REALLY-RUNNING]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM986827158
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM986827158]
   Status = [REALLY-RUNNING]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM986827158
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM986827158]
   Status = [DONE-OK]
   ExitCode = [1]

-bash-3.2$ glite-ce-job-status -L 2 https://cert-09.pd.infn.it:8443/CREAM986827158
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM986827158]
   Current Status = [DONE-OK]
   Working Dir = [[reserved]]
   ExitCode = [1]
   Grid JobID = [N/A]
   LRMS Abs JobID = [[reserved]]
   LRMS JobID = [[reserved]]
   Deleg Proxy ID = [613f3558926a0ba8642c51cfcaeb019b88a5b791]
   DelegProxyInfo = [Valid From : 2/14/12 2:04 PM (GMT)
      Valid To : 2/14/12 11:14 PM (GMT)
      Holder Subject : /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Sergio Traldi
      Holder CA : /C=IT/O=INFN/CN=INFN CA
      VO : dteam
      AC Issuer : CN=voms2.hellasgrid.gr,OU=hellasgrid.gr,O=HellasGrid,C=GR
      Attribute : /dteam/Role=NULL/Capability=NULL
      /dteam/NGI_IT/Role=NULL/Capability=NULL
   ]
   Worker Node = [cert-wn64-08.pn.pd.infn.it]
   Local User = [dteam009]
   CREAM ISB URI = [gsiftp://cert-09.pd.infn.it/var/cream_sandbox/dteam/_C_IT_O_INFN_OU_Personal_Certificate_L_Padova_CN_Sergio_Traldi_dteam_Role_NULL_Capability_NULL_dteam009/98/CREAM986827158/ISB]
   CREAM OSB URI = [gsiftp://cert-09.pd.infn.it/var/cream_sandbox/dteam/_C_IT_O_INFN_OU_Personal_Certificate_L_Padova_CN_Sergio_Traldi_dteam_Role_NULL_Capability_NULL_dteam009/98/CREAM986827158/OSB]
   JDL = [[ Arguments = "name_mpi OPENMPI"; QueueName = "cert"; JobType = "Normal"; Executable = "mpi-start-wrapper.sh"; VirtualOrganisation = "dteam"; InputSandbox = { "/home/traldi/JOB_MPI/mpi-start-wrapper.sh","/home/traldi/JOB_MPI/mpi-hooks.sh","/home/traldi/JOB_MPI/name_mpi.c" }; CPUNumber = 2; StdOutput = "std.out"; Type = "Job"; OutputSandboxBaseDestUri = "gsiftp://prod-se-01.pd.infn.it/tmp"; StdError = "std.err"; BatchSystem = "pbs"; OutputSandbox = { "std.err","std.out" } ]]
   Type = [Normal]

   Job status changes:
   -------------------
   Status = [REGISTERED] - [Tue 14 Feb 2012 15:09:29] (1329228569)
   Status = [PENDING] - [Tue 14 Feb 2012 15:09:31] (1329228571)
   Status = [IDLE] - [Tue 14 Feb 2012 15:09:31] (1329228571)
   Status = [RUNNING] - [Tue 14 Feb 2012 15:09:36] (1329228576)
   Status = [REALLY-RUNNING] - [Tue 14 Feb 2012 15:09:40] (1329228580)
   Status = [DONE-OK] - [Tue 14 Feb 2012 15:09:43] (1329228583)

   Issued Commands:
   -------------------
   *** Command Name = [JOB_REGISTER]
      Command Category = [JOB_MANAGEMENT]
      Command Status = [SUCCESSFULL]
      Creation Time = [Tue 14 Feb 2012 15:09:29] (1329228569)
      Start Scheduling Time = [Tue 14 Feb 2012 15:09:29] (1329228569)
      Start Processing Time = [Tue 14 Feb 2012 15:09:29] (1329228569)
      Execution Completed Time = [Tue 14 Feb 2012 15:09:29] (1329228569)
   *** Command Name = [JOB_START]
      Command Category = [JOB_MANAGEMENT]
      Command Status = [SUCCESSFULL]
      Creation Time = [Tue 14 Feb 2012 15:09:31] (1329228571)
      Start Scheduling Time = [Tue 14 Feb 2012 15:09:31] (1329228571)
      Start Processing Time = [Tue 14 Feb 2012 15:09:31] (1329228571)
      Execution Completed Time = [Tue 14 Feb 2012 15:09:38] (1329228578)
</pre>

Second Test : %GREEN% *PASSED* %ENDCOLOR% <BR/>

---++++ Third Test: (MPI 4 cores required)
Test submission with 4 cores required:
<pre>
-bash-3.2$ glite-ce-job-submit -r cert-09.pd.infn.it:8443/cream-pbs-cert -a mpi-start-wrapper_Cream.jdl
https://cert-09.pd.infn.it:8443/CREAM888211702

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM888211702
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM888211702]
   Status = [ABORTED]
   ExitCode = []
   FailureReason = [BLAH error: submission command failed (exit code = 1) (stdout:) (stderr:qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes-) N/A (jobId = CREAM888211702)]
</pre>

Third Test : %RED% *NOT PASSED* %ENDCOLOR% <BR/>

---++++ Third Test (bis): (MPI 4 cores required)
Run the third test again, submitting a job that uses 4 cores. <BR/>
One YAIM variable was modified in the services/glite-mpi_ce file:
<pre>
MPI_SUBMIT_FILTER=${MPI_SUBMIT_FILTER:-"yes"}
</pre>
and the CE was reconfigured:
<pre>
-bash-3.2$ glite-ce-job-submit -r cert-09.pd.infn.it:8443/cream-pbs-cert -a mpi-start-wrapper_Cream.jdl
https://cert-09.pd.infn.it:8443/CREAM115768488

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM115768488
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM115768488]
   Status = [IDLE]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM115768488
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM115768488]
   Status = [RUNNING]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM115768488
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM115768488]
   Status = [REALLY-RUNNING]

-bash-3.2$ glite-ce-job-status https://cert-09.pd.infn.it:8443/CREAM115768488
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM115768488]
   Status = [DONE-OK]
   ExitCode = [0]

-bash-3.2$ glite-ce-job-status -L 2 https://cert-09.pd.infn.it:8443/CREAM115768488
****** JobID=[https://cert-09.pd.infn.it:8443/CREAM115768488]
   Current Status = [DONE-OK]
   Working Dir = [[reserved]]
   ExitCode = [0]
   Grid JobID = [N/A]
   LRMS Abs JobID = [[reserved]]
   LRMS JobID = [[reserved]]
   Deleg Proxy ID = [e1444f9cc9df997f65b2b6d247d1dc582814c451]
   DelegProxyInfo = [Valid From : 2/15/12 2:12 PM (GMT)
      Valid To : 2/15/12 9:26 PM (GMT)
      Holder Subject : /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Sergio Traldi
      Holder CA : /C=IT/O=INFN/CN=INFN CA
      VO : dteam
      AC Issuer : CN=voms2.hellasgrid.gr,OU=hellasgrid.gr,O=HellasGrid,C=GR
      Attribute : /dteam/Role=NULL/Capability=NULL
      /dteam/NGI_IT/Role=NULL/Capability=NULL
   ]
   Worker Node = [cert-wn64-08.pn.pd.infn.it]
   Local User = [dteam009]
   CREAM ISB URI = [gsiftp://cert-09.pd.infn.it/var/cream_sandbox/dteam/_C_IT_O_INFN_OU_Personal_Certificate_L_Padova_CN_Sergio_Traldi_dteam_Role_NULL_Capability_NULL_dteam009/11/CREAM115768488/ISB]
   CREAM OSB URI = [gsiftp://cert-09.pd.infn.it/var/cream_sandbox/dteam/_C_IT_O_INFN_OU_Personal_Certificate_L_Padova_CN_Sergio_Traldi_dteam_Role_NULL_Capability_NULL_dteam009/11/CREAM115768488/OSB]
   JDL = [[ Arguments = "name_mpi OPENMPI"; QueueName = "cert"; JobType = "Normal"; Executable = "mpi-start-wrapper.sh"; VirtualOrganisation = "dteam"; InputSandbox = { "/home/traldi/JOB_MPI/mpi-start-wrapper.sh","/home/traldi/JOB_MPI/mpi-hooks.sh","/home/traldi/JOB_MPI/name_mpi.c" }; CPUNumber = 4; StdOutput = "std.out"; Type = "Job"; OutputSandboxBaseDestUri = "gsiftp://prod-se-01.pd.infn.it/tmp"; StdError = "std.err"; BatchSystem = "pbs"; OutputSandbox = { "std.err","std.out" } ]]
   Type = [Normal]

   Job status changes:
   -------------------
   Status = [REGISTERED] - [Wed 15 Feb 2012 15:18:00] (1329315480)
   Status = [PENDING] - [Wed 15 Feb 2012 15:18:04] (1329315484)
   Status = [IDLE] - [Wed 15 Feb 2012 15:18:04] (1329315484)
   Status = [RUNNING] - [Wed 15 Feb 2012 15:18:11] (1329315491)
   Status = [REALLY-RUNNING] - [Wed 15 Feb 2012 15:18:14] (1329315494)
   Status = [DONE-OK] - [Wed 15 Feb 2012 15:23:24] (1329315804)

   Issued Commands:
   -------------------
   *** Command Name = [JOB_REGISTER]
      Command Category = [JOB_MANAGEMENT]
      Command Status = [SUCCESSFULL]
      Creation Time = [Wed 15 Feb 2012 15:17:59] (1329315479)
      Start Scheduling Time = [Wed 15 Feb 2012 15:17:59] (1329315479)
      Start Processing Time = [Wed 15 Feb 2012 15:17:59] (1329315479)
      Execution Completed Time = [Wed 15 Feb 2012 15:18:01] (1329315481)
   *** Command Name = [JOB_START]
      Command Category = [JOB_MANAGEMENT]
      Command Status = [SUCCESSFULL]
      Creation Time = [Wed 15 Feb 2012 15:18:04] (1329315484)
      Start Scheduling Time = [Wed 15 Feb 2012 15:18:04] (1329315484)
      Start Processing Time = [Wed 15 Feb 2012 15:18:04] (1329315484)
      Execution Completed Time = [Wed 15 Feb 2012 15:18:11] (1329315491)
</pre>

*Inside CE:*
<pre>
[root@cert-09 ~]# qstat -n

cert-09.pd.infn.it:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
5.cert-09.pd.inf     dteam009 cert     cream_115768488   29356     2   4     --    -- R 00:02
   cert-wn64-08+cert-wn64-08+cert-wn64-07+cert-wn64-07
</pre>

Third Test (bis) : %GREEN% *PASSED* %ENDCOLOR% <BR/>

-- Main.SergioTraldi - 2012-02-15
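
---+++ Note: polling the job status from a script

In the tests above, =glite-ce-job-status= was rerun by hand until the job left the running states. A minimal sketch of how that loop could be scripted is given here. It assumes the default-verbosity output contains a single line of the form =Status = [STATE]=, as in the transcripts; the function names =parse_status=, =is_terminal= and =wait_for_job= are hypothetical and not part of any gLite or EMI package.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the CREAM CLI.

# Read glite-ce-job-status output on stdin and print the first state found,
# e.g. "Status = [DONE-OK]" -> "DONE-OK".
parse_status() {
    sed -n 's/.*Status *= *\[\([A-Z-]*\)\].*/\1/p' | head -n 1
}

# Succeed if the given CREAM state is a final one.
is_terminal() {
    case "$1" in
        DONE-OK|DONE-FAILED|ABORTED|CANCELLED) return 0 ;;
        *) return 1 ;;
    esac
}

# Poll a job until it reaches a final state, printing each observed state.
# $1 = CREAM job ID (URL), $2 = seconds between polls (default 10).
wait_for_job() {
    job_id=$1
    interval=${2:-10}
    while :; do
        state=$(glite-ce-job-status "$job_id" | parse_status)
        if [ -z "$state" ]; then
            echo "could not read job state" >&2
            return 2
        fi
        echo "$state"
        if is_terminal "$state"; then
            break
        fi
        sleep "$interval"
    done
    # Exit status 0 only when the job ended in DONE-OK.
    [ "$state" = "DONE-OK" ]
}
```

Example use from the UI: =wait_for_job https://cert-09.pd.infn.it:8443/CREAM115768488 5=. Note that =DONE-OK= alone does not imply a zero exit code (the second test ended in =DONE-OK= with =ExitCode = [1]=), so the =ExitCode= line should still be checked separately.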
Topic revision: r3 - 2012-02-17 - SergioTraldi