How to enable WeNMR job submission to OSG

This is the procedure to enable a gLite UI to submit jobs to OSG via the SBGrid VO Frontend of the glideinWMS system.

You must follow the instructions at http://www.uscms.org/SoftwareComputing/Grid/WMS/glideinWMS/doc.prd/components/pool_install.html to install the Condor Submit Node on your SL5/x86_64 gLite UI.

I used the tarballs condor-7.6.3-x86_64_rhap_5-stripped.tar.gz and glideinWMS_v2_5_2.tgz, and ran ./glideinWMS_install after untarring glideinWMS_v2_5_2.tgz, selecting item 5 (user schedd) and proceeding as described in section 5 of the link above. Be sure to use /DC=org/DC=doegrids/OU=Services/CN=glidein/glidein.nebiogrid.org and /DC=org/DC=doegrids/OU=Services/CN=frontend/glidein.nebiogrid.org as the Collector and Frontend DN, respectively, in the configuration. See here for an example condor_config.local file.
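
For reference, the unpack-and-install sequence sketched below matches the steps above; it is only a sketch (the directory created by the glideinWMS tarball is an assumption, and ./glideinWMS_install is interactive, asking among other things for the Collector and Frontend DNs given above):

$ tar xzf condor-7.6.3-x86_64_rhap_5-stripped.tar.gz
$ tar xzf glideinWMS_v2_5_2.tgz
$ cd glideinWMS
$ ./glideinWMS_install
(select item 5, "user schedd", then answer the prompts as described in section 5 of the linked guide)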

Then send the DN of your gLite UI's certificate to the HMS grid managers operating the SBGrid VO Frontend, by opening a ticket at the OSG GOC Ticketing System.

Start condor by doing, as root:

# /etc/rc.d/init.d/condor start
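
To confirm the daemons actually came up, you can check the process list (exactly which daemons run depends on the install options chosen above, but condor_master should always be present):

$ ps -ef | grep condor_master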

Check that things are working by running (back in your normal user account):

$ condor_status -schedd

Name                 Machine    TotalRunningJobs TotalIdleJobs TotalHeldJobs

glidein@glidein.nebi glidein.ne                2             0             0
ui-wenmr.pd.infn.it  ui-wenmr.p                0             0             0

                      TotalRunningJobs      TotalIdleJobs      TotalHeldJobs

               Total                 2                  0                  0

You should see the hostname of your gLite UI listed in the output. In the example above it was ui-wenmr.pd.infn.it.

Try now to submit a CSRosetta test job with your enmr.eu VOMS proxy:

$ voms-proxy-init -voms enmr.eu:/enmr.eu/csrosetta
$ condor_submit csrosetta.job
Submitting job(s).
1 job(s) submitted to cluster "some number".

where the executable launched by the job is run-csRosetta.

The csrosetta.job file is the Condor analogue of a gLite JDL. Notice the line +DESIRED_Sites = "UCSD,SPRACE,Purdue,Michigan": it lists the sites where the software needed by CSRosetta has already been installed, by running a job that executes the script install-csrosetta.sh. The Harvard and Nebraska sites also have the software installed, but the test does not work there because the liblfc.so library is not available for their Debian OS. Nevertheless, these two sites are still eligible to run CSRosetta production jobs, as long as the scripts do not use lcg-* commands that require interaction with the enmr.eu LFC file catalog.
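As a hedged sketch, a minimal csrosetta.job could look like the following. Only the +DESIRED_Sites line and the run-csRosetta executable come from this page; the other attributes (universe, transfer settings, the output/ naming pattern, the proxy path placeholder) are illustrative assumptions:

# sketch of a csrosetta.job submit file -- adapt to your setup
universe                = vanilla
executable              = run-csRosetta
# $(Cluster).$(Process) reproduces names like csRosetta.67.0.out
output                  = output/csRosetta.$(Cluster).$(Process).out
error                   = output/csRosetta.$(Cluster).$(Process).err
log                     = output/csRosetta.$(Cluster).$(Process).log
# path of the VOMS proxy created above (placeholder: typically /tmp/x509up_u<uid>)
x509userproxy           = /tmp/x509up_u<uid>
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
# sites where the CSRosetta software is installed (from this page)
+DESIRED_Sites          = "UCSD,SPRACE,Purdue,Michigan"
queue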

You can check the status of your job by doing:

$ condor_q
-- Submitter: ui-wenmr.pd.infn.it : <193.206.210.130:9615?sock=18144_b6d6_2> : ui-wenmr.pd.infn.it
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
  67.0   verlato         3/6  17:22   0+00:00:00 I  0   0.0  run-csRosetta     

1 jobs; 1 idle, 0 running, 0 held

When finished, the job will disappear from the condor_q output. You can still get information about it by running:

$ condor_history 67.0        # add -l for verbose output
 ID      OWNER            SUBMITTED     RUN_TIME ST   COMPLETED CMD            
  67.0   verlato         3/6  17:22   0+00:00:38 C   3/6  17:22 /home/verlato/s

In the output/ directory you will find the csRosetta.67.0.err/.log/.out files, plus the other output files expected from the csrosetta.job directives.
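
For example, for the run above (cluster 67.0) you would inspect:

$ ls output/
$ cat output/csRosetta.67.0.out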

How to account WeNMR jobs submitted to OSG with Gratia

This is the procedure to enable the Gratia sensors on the Condor Submit Node.

You must follow the instructions at https://twiki.grid.iu.edu/bin/view/Accounting/ProbeConfigGlideinWMS#Installation

Instead of using yum install, try the following:

[root@ui-wenmr ~]# wget http://koji-hub.batlab.org/mnt/koji/packages/gratia-probe/1.10/0.7.osg.el5/noarch/gratia-probe-common-1.10-0.7.osg.el5.noarch.rpm
[root@ui-wenmr ~]# wget http://koji-hub.batlab.org/mnt/koji/packages/gratia-probe/1.10/0.7.osg.el5/noarch/gratia-probe-condor-1.10-0.7.osg.el5.noarch.rpm 
[root@ui-wenmr ~]# rpm -Uvh --nodeps gratia-probe-common-1.10-0.7.osg.el5.noarch.rpm gratia-probe-condor-1.10-0.7.osg.el5.noarch.rpm

Then register your submit host at https://oim.grid.iu.edu/oim/home, and modify the /etc/gratia/condor/ProbeConfig file and condor_config.local as described in the instructions.
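The key ProbeConfig fields to check are sketched below. This assumes the standard Gratia ProbeConfiguration XML schema; the CollectorHost value is an assumption, the MeterName follows the condor:<hostname> format used later on this page, and SiteName is a placeholder that must match your OIM registration:

<ProbeConfiguration
    CollectorHost="gratia-osg-prod.opensciencegrid.org:80"
    MeterName="condor:ui-wenmr.pd.infn.it"
    SiteName="your-OIM-registered-resource-name"
    EnableProbe="1"
/>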

The "Non-Standard Condor Install" instructions do not work. I manually modified the file /etc/cron.d/gratia-probe-condor.cron as follows:

[root@ui-wenmr ~]# cat /etc/cron.d/gratia-probe-condor.cron
CONDOR_CONFIG=/opt/glidecondor/etc/condor_config
PATH=/opt/glidecondor/bin:/opt/glidecondor/sbin
0,15,30,45 * * * * root /usr/share/gratia/common/cron_check  /etc/gratia/condor/ProbeConfig && /usr/share/gratia/condor/condor_meter -s 900

Do not forget to enable the gratia probes cron by running:

[root@ui-wenmr ~]# service gratia-probes-cron start condor

You can check whether your probe is working by looking at the log files in /var/log/gratia/.

Accounting records produced by your probe can be displayed, e.g., at http://gratiaweb.grid.iu.edu/gratia/xml/glidein_hours_bar_smry by putting your probe name (e.g. condor:ui-wenmr.pd.infn.it) in the Variables table and pressing the "Query again" button.

Many other interesting histograms are available from the "Glidein and Campus Grid Bar Graphs" top menu. All of them can be queried by User, VO, and also VO group/role (named "role" in the Variables table). E.g., by putting "csrosetta" in the role box and pressing the "Query again" button, you'll see the records of jobs submitted with a user proxy created with voms-proxy-init -voms enmr.eu:/enmr.eu/csrosetta.


-- MarcoVerlato - 2012-02-29

Topic revision: r5 - 2012-03-21 - MarcoVerlato