How to enable WeNMR job submission to OSG
This is the procedure to enable a gLite UI to submit jobs to OSG via the SBGrid VO Frontend of the glideinWMS system.
You must follow the instructions at
http://www.uscms.org/SoftwareComputing/Grid/WMS/glideinWMS/doc.prd/components/pool_install.html
to install the Condor Submit Node on your SL5/x86_64 gLite UI.
I used the tarballs condor-7.6.3-x86_64_rhap_5-stripped.tar.gz and glideinWMS_v2_5_2.tgz. After untarring glideinWMS_v2_5_2.tgz, I ran ./glideinWMS_install, selected item 5 (user schedd), and continued as described in section 5 of the page linked above.
In the configuration, use /DC=org/DC=doegrids/OU=Services/CN=glidein/glidein.nebiogrid.org as the Collector DN and /DC=org/DC=doegrids/OU=Services/CN=frontend/glidein.nebiogrid.org as the Frontend DN.
See here for an example of a condor_config.local file.
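Since a mistyped DN is easy to miss, a quick check that both DNs really appear in the configuration can help. The following is only a minimal sketch: the function name check_dns is hypothetical, and it simply searches a file for the two DN strings quoted on this page without validating the Condor configuration itself.

```shell
# DNs quoted on this page for the Collector and the Frontend.
COLLECTOR_DN="/DC=org/DC=doegrids/OU=Services/CN=glidein/glidein.nebiogrid.org"
FRONTEND_DN="/DC=org/DC=doegrids/OU=Services/CN=frontend/glidein.nebiogrid.org"

# check_dns FILE (hypothetical helper)
# Prints each DN that is missing from FILE and returns 1 if any is missing.
check_dns() {
    config_file=$1
    missing=0
    for dn in "$COLLECTOR_DN" "$FRONTEND_DN"; do
        # -F treats the DN as a fixed string, -q suppresses normal output
        if ! grep -qF -- "$dn" "$config_file"; then
            echo "missing DN: $dn"
            missing=1
        fi
    done
    return "$missing"
}
```

It could be called as check_dns /opt/glidecondor/etc/condor_config.local, where the path is an assumption based on the /opt/glidecondor prefix used elsewhere on this page; adjust it to wherever your installation keeps condor_config.local.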
Then send the DN of your gLite UI's certificate to the HMS grid managers operating the SBGrid VO Frontend by opening a ticket in the OSG GOC Ticketing System.
Start Condor as root:
$ /etc/rc.d/init.d/condor start
Check that things are working (back in your regular Linux user account):
$ condor_status -schedd
Name                 Machine    TotalRunningJobs TotalIdleJobs TotalHeldJobs
glidein@glidein.nebi glidein.ne                2             0             0
ui-wenmr.pd.infn.it  ui-wenmr.p                0             0             0
                                TotalRunningJobs TotalIdleJobs TotalHeldJobs
                          Total                2             0             0
You should see the hostname of your gLite UI listed in the output. In the example it is ui-wenmr.pd.infn.it.
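The same check can be scripted. This is a small sketch, not part of the original procedure: the helper name schedd_listed is hypothetical, and it only looks for an exact match in the first (Name) column of the text it reads on standard input.

```shell
# schedd_listed HOST (hypothetical helper)
# Reads `condor_status -schedd` output on stdin and succeeds
# only if HOST appears exactly in the Name column.
schedd_listed() {
    awk -v host="$1" '$1 == host { found = 1 } END { exit !found }'
}
```

A typical call would be condor_status -schedd | schedd_listed ui-wenmr.pd.infn.it. Note that the default listing truncates long names (as with glidein@glidein.nebi in the example), so a long hostname may not match exactly; printing the full Name attribute with condor_status -schedd -format '%s\n' Name should avoid that.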
Now try submitting a CSRosetta test job with your enmr.eu VOMS proxy:
$ voms-proxy-init -voms enmr.eu:/enmr.eu/csrosetta
$ condor_submit csrosetta.job
Submitting job(s).
1 job(s) submitted to cluster "some number".
where the executable is run-csRosetta. The csrosetta.job file is the Condor analogue of the JDL file for gLite.
Notice the line +DESIRED_Sites = "UCSD,SPRACE,Purdue,Michigan". It lists the sites where the software needed by CSRosetta has already been installed by running a job that executes the script install-csrosetta.sh. The Harvard and Nebraska sites also have the software installed, but the test does not work there because the liblfc.so library is not available for their Debian OS. Nevertheless, these two sites are also eligible to run CSRosetta production jobs, as long as the scripts do not use lcg-* commands that require interaction with the enmr.eu LFC file catalog.
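For readers who have never seen a Condor submit description, the snippet below writes a hypothetical one to a scratch file. It is not the actual csrosetta.job: only the executable name, the +DESIRED_Sites line, and the output/csRosetta.<cluster>.<proc> naming come from this page, while the universe and file-transfer settings are assumptions for illustration.

```shell
# Write a hypothetical submit description (NOT the real csrosetta.job).
# The quoted 'EOF' keeps $(Cluster) and $(Process) literal for Condor.
cat > csrosetta-sketch.job <<'EOF'
# Assumed settings: universe and file-transfer policy
universe                = vanilla
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT

# Taken from this page: executable, output naming, site list
executable = run-csRosetta
output     = output/csRosetta.$(Cluster).$(Process).out
error      = output/csRosetta.$(Cluster).$(Process).err
log        = output/csRosetta.$(Cluster).$(Process).log
+DESIRED_Sites = "UCSD,SPRACE,Purdue,Michigan"

queue
EOF
```

The real job file will likely carry extra directives (input files, arguments, output files to transfer back), so treat this only as a reading aid for the JDL comparison.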
You can check the status of your job with:
$ condor_q
-- Submitter: ui-wenmr.pd.infn.it : <193.206.210.130:9615?sock=18144_b6d6_2> : ui-wenmr.pd.infn.it
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  67.0   verlato         3/6  17:22   0+00:00:00 I  0   0.0  run-csRosetta
1 jobs; 1 idle, 0 running, 0 held
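In the ST column, I means idle, R running, H held, and C completed. When polling from a script, the state of one job can be pulled out of the listing. The helper below is a sketch with a hypothetical name (job_state); it assumes the default column layout shown in the example, where ST is the sixth whitespace-separated field.

```shell
# job_state ID (hypothetical helper)
# Reads condor_q output on stdin and prints the ST column of job ID.
# Returns non-zero if the job is not in the listing.
job_state() {
    # Appending "" forces a string comparison, so 67.1 and 67.10 stay distinct.
    awk -v id="$1" '($1 "") == (id "") { print $6; found = 1 } END { exit !found }'
}
```

For example, condor_q | job_state 67.0 would print I for the listing above. A non-zero exit status means the job is no longer in the queue.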
When finished, the job will disappear from the condor_q output. You can still get information about it with condor_history (add -l for verbose output):
$ condor_history 67.0
 ID      OWNER            SUBMITTED     RUN_TIME ST   COMPLETED CMD
  67.0   verlato         3/6  17:22   0+00:00:38 C   3/6  17:22 /home/verlato/s
In the output/ directory you will see the csRosetta.67.0.err/log/out files plus the other output files as expected from the csrosetta.job directives.
How to account WeNMR jobs submitted to OSG with Gratia
This is the procedure to enable Gratia sensors on the Condor Submit Node.
You must follow the instructions at
https://twiki.grid.iu.edu/bin/view/Accounting/ProbeConfigGlideinWMS#Installation
Instead of using yum install, try the following:
[root@ui-wenmr ~]# wget http://koji-hub.batlab.org/mnt/koji/packages/gratia-probe/1.10/0.7.osg.el5/noarch/gratia-probe-common-1.10-0.7.osg.el5.noarch.rpm
[root@ui-wenmr ~]# wget http://koji-hub.batlab.org/mnt/koji/packages/gratia-probe/1.10/0.7.osg.el5/noarch/gratia-probe-condor-1.10-0.7.osg.el5.noarch.rpm
[root@ui-wenmr ~]# rpm -Uvh --nodeps gratia-probe-common-1.10-0.7.osg.el5.noarch.rpm gratia-probe-condor-1.10-0.7.osg.el5.noarch.rpm
Then register your submit host at
https://oim.grid.iu.edu/oim/home
and modify the /etc/gratia/condor/ProbeConfig file and condor_config.local as described in the instructions.
The "Non-Standard Condor Install" instructions do not work. I manually modified the file /etc/cron.d/gratia-probe-condor.cron as shown below:
[root@ui-wenmr ~]# cat /etc/cron.d/gratia-probe-condor.cron
CONDOR_CONFIG=/opt/glidecondor/etc/condor_config
PATH=/opt/glidecondor/bin:/opt/glidecondor/sbin
0,15,30,45 * * * * root /usr/share/gratia/common/cron_check /etc/gratia/condor/ProbeConfig && /usr/share/gratia/condor/condor_meter -s 900
Do not forget to enable the Gratia probe cron job:
[root@ui-wenmr ~]# service gratia-probes-cron start condor
You can check whether your probe is working by looking at the log files in /var/log/gratia/.
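A rough first pass over those logs can be automated. The sketch below counts lines mentioning "error" in every file under a directory. The helper name count_log_errors is hypothetical, and the assumption that problems show up as lines containing the word "error" is mine, not something documented for the Gratia probe.

```shell
# count_log_errors DIR (hypothetical helper)
# Prints how many lines across all regular files in DIR contain
# "error" (case-insensitive). Always returns 0, even with no matches.
count_log_errors() {
    log_dir=$1
    # grep -c exits non-zero on zero matches, hence the trailing "|| true"
    find "$log_dir" -type f -exec cat {} + | grep -ci 'error' || true
}
```

Running count_log_errors /var/log/gratia prints a single number. A non-zero count is only a hint to open the log files and read the messages themselves.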
Accounting records produced by your probe can be displayed, for example, at
http://gratiaweb.grid.iu.edu/gratia/xml/glidein_hours_bar_smry
by putting your probe name (e.g. condor:ui-wenmr.pd.infn.it) in the Variables table and pressing the "Query again" button.
Many other interesting histograms are available from the "Glidein and Campus Grid Bar Graphs" top menu. All of them can be filtered by User, VO, and VO group/role (named role in the Variables table). For example, putting "csrosetta" in the role box and pressing the "Query again" button shows the records of jobs submitted with the user proxy created by
voms-proxy-init -voms enmr.eu:/enmr.eu/csrosetta
--
MarcoVerlato - 2012-02-29