How to enable WeNMR job submission to OSG

This is the procedure to enable a gLite UI to submit jobs to OSG via the SBGrid VO Frontend of the glideinWMS system.

First, install the Condor Submit Node on your SL5/x86_64 gLite UI by following the instructions at http://www.uscms.org/SoftwareComputing/Grid/WMS/glideinWMS/doc.prd/components/pool_install.html.

I used the tarballs condor-7.6.3-x86_64_rhap_5-stripped.tar.gz and glideinWMS_v2_5_2.tgz. After untarring glideinWMS_v2_5_2.tgz, run ./glideinWMS_install, select item 5 (user schedd), and continue as described in section 5 of the page linked above. In the configuration, use the following DNs:

Collector DN: /DC=org/DC=doegrids/OU=Services/CN=glidein/glidein.nebiogrid.org
Frontend DN: /DC=org/DC=doegrids/OU=Services/CN=frontend/glidein.nebiogrid.org

See the example condor_config.local file for reference.

Then, send the DN of the certificate of your gLite UI to the HMS grid managers operating the SBGrid VO Frontend.
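
To find the DN to send, you can print the subject of the certificate with openssl. This is a minimal sketch: the default path /etc/grid-security/hostcert.pem is only the conventional host certificate location (an assumption; use whichever certificate your UI actually presents), and print_cert_dn is a hypothetical helper name.

```shell
# Print the subject DN of an X.509 certificate.
# The default path is the conventional grid host certificate location (assumption).
print_cert_dn() {
    cert="${1:-/etc/grid-security/hostcert.pem}"
    # "openssl x509 -subject" prints a line starting with "subject=";
    # strip that prefix and any spaces after it to leave just the DN.
    openssl x509 -in "$cert" -noout -subject | sed 's/^subject= *//'
}

# Usage:
#   print_cert_dn /etc/grid-security/hostcert.pem
```

Depending on the OpenSSL version, the DN may be printed in a comma-separated form rather than the slash-separated form used above, so check the format before sending it.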

Start Condor as root:

$ /etc/rc.d/init.d/condor start

Check that everything is working (back in your regular Linux user account):

$ condor_status -schedd

Name                 Machine    TotalRunningJobs TotalIdleJobs TotalHeldJobs

glidein@glidein.nebi glidein.ne                2             0              0
ui-wenmr.pd.infn.it  ui-wenmr.p                0             0              0

                      TotalRunningJobs      TotalIdleJobs      TotalHeldJobs

               Total                 2                  0                  0

You should see the hostname of your gLite UI listed in the output. In the example it is ui-wenmr.pd.infn.it.
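
The same check can be scripted instead of read by eye. The sketch below assumes the schedd ad's Machine attribute holds the full hostname of the UI; schedd_listed is a hypothetical helper name.

```shell
# Succeed if a schedd running on the given host is advertised in the pool.
schedd_listed() {
    host="$1"
    # -format prints only the Machine attribute of each schedd ad, one per line;
    # grep -Fxq then looks for an exact whole-line match without printing it.
    condor_status -schedd -format '%s\n' Machine | grep -Fxq "$host"
}

# Usage:
#   schedd_listed ui-wenmr.pd.infn.it && echo "schedd is registered"
```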

Now try submitting a CSRosetta test job with your enmr.eu VOMS proxy:

$ voms-proxy-init -voms enmr.eu:/enmr.eu/csrosetta
$ condor_submit csrosetta.job
Submitting job(s).
1 job(s) submitted to cluster "some number".

Here "some number" stands for the cluster ID that Condor assigns to the job, and the executable is run-csRosetta.
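
Before submitting, it can help to confirm that the proxy is still valid and carries the csrosetta group. This is a rough sketch, not part of the original procedure: proxy_ready is a hypothetical helper, the VOMS_PROXY_INFO variable exists only to make the command replaceable, and the 3600-second default is an arbitrary choice.

```shell
# Command used to inspect the proxy; overridable, e.g. for testing (assumption).
VOMS_PROXY_INFO="${VOMS_PROXY_INFO:-voms-proxy-info}"

# Succeed if the current proxy has at least $1 seconds of lifetime left
# (default 3600, an arbitrary choice) and carries the /enmr.eu/csrosetta group.
proxy_ready() {
    min_left="${1:-3600}"
    # -timeleft prints the remaining proxy lifetime in seconds.
    left="$("$VOMS_PROXY_INFO" -timeleft 2>/dev/null)" || return 1
    [ "${left:-0}" -ge "$min_left" ] || return 1
    # -fqan lists the proxy's FQANs, one per line.
    "$VOMS_PROXY_INFO" -fqan 2>/dev/null | grep -q '^/enmr.eu/csrosetta'
}

# Usage:
#   proxy_ready 3600 && condor_submit csrosetta.job
```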

The csrosetta.job file is the Condor analogue of a gLite JDL file. Note the line +DESIRED_Sites = "UCSD,SPRACE,Purdue,Michigan": it lists the sites where the software needed by CSRosetta has already been installed, by running a job that executes the install-csrosetta.sh script. The Harvard and Nebraska sites also have the software installed, but the test fails there because the liblfc.so library is not available for their Debian OS. These two sites can nevertheless run CSRosetta production jobs, as long as the scripts do not use lcg-* commands that require interaction with the enmr.eu LFC file catalog.
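
For orientation, a submit description file of this kind might look roughly like the one written by the sketch below. This is not the author's actual csrosetta.job: only the executable name, the +DESIRED_Sites line and the output/csRosetta.<cluster>.<process> file naming come from this page, and every other setting (the vanilla universe, the file transfer options) is an assumption. write_csrosetta_job is a hypothetical helper name.

```shell
# Write a hypothetical csrosetta.job submit description file.
# Only the executable, +DESIRED_Sites and the output file naming are taken
# from this page; all other settings are assumptions for illustration.
write_csrosetta_job() {
    job_file="${1:-csrosetta.job}"
    cat > "$job_file" <<'EOF'
universe                = vanilla
executable              = run-csRosetta
output                  = output/csRosetta.$(Cluster).$(Process).out
error                   = output/csRosetta.$(Cluster).$(Process).err
log                     = output/csRosetta.$(Cluster).$(Process).log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
+DESIRED_Sites          = "UCSD,SPRACE,Purdue,Michigan"
queue
EOF
}

# Usage (the output/ directory must exist before submitting):
#   write_csrosetta_job csrosetta.job && mkdir -p output
```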

You can check the status of your job by doing:

$ condor_q
-- Submitter: ui-wenmr.pd.infn.it : <193.206.210.130:9615?sock=18144_b6d6_2> : ui-wenmr.pd.infn.it
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
  67.0   verlato         3/6  17:22   0+00:00:00 I  0   0.0  run-csRosetta     

1 jobs; 1 idle, 0 running, 0 held

When finished, the job disappears from the condor_q output. You can still get information about it with condor_history (add -l for verbose output):

$ condor_history 67.0
 ID      OWNER            SUBMITTED     RUN_TIME ST   COMPLETED CMD            
  67.0   verlato         3/6  17:22   0+00:00:38 C   3/6  17:22 /home/verlato/s
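
The two commands can be combined into a single lookup that works whether or not the job is still queued. The sketch below is one possible approach, assuming the standard JobStatus codes (1 idle, 2 running, 3 removed, 4 completed, 5 held); job_state is a hypothetical helper name.

```shell
# Print a one-word state for a job ID such as 67.0.
job_state() {
    job_id="$1"
    # JobStatus is an integer ClassAd attribute; condor_q prints nothing
    # once the job has left the queue.
    code="$(condor_q -format '%d\n' JobStatus "$job_id" 2>/dev/null)" || code=""
    if [ -z "$code" ]; then
        # Fall back to the history for jobs that are no longer queued.
        code="$(condor_history -format '%d\n' JobStatus "$job_id" 2>/dev/null | head -n 1)" || code=""
    fi
    case "$code" in
        1) echo idle ;;
        2) echo running ;;
        3) echo removed ;;
        4) echo completed ;;
        5) echo held ;;
        *) echo unknown ;;
    esac
}

# Usage:
#   job_state 67.0
```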

In the output/ directory you will find the files csRosetta.67.0.err, csRosetta.67.0.log and csRosetta.67.0.out, plus the other output files specified by the csrosetta.job directives.

-- MarcoVerlato - 2012-02-29
Topic revision: r3 - 2012-03-08 - MarcoVerlato