Grid Usage at a Glance
About
This module features basic examples of:
- how to browse the storage and computing resources available to a given Virtual Organization
- how to submit a job to the grid computing resources
- how to store a file on a grid storage element, and retrieve it back
Goals
- Understand basics of job submission and monitoring.
- Understand grid data management fundamentals.
Prerequisites
- Access to a User Interface
- A personal X.509 certificate
- Virtual Organization membership: this tutorial assumes gridit VO membership, but it can be smoothly used with any other VO of your choice by simply replacing the VO name in the examples.
Contents
Authentication
Once you have logged in to your account on a User Interface, the first thing to do is to create a proxy. This will authenticate you towards the grid services. To create your proxy, you need to have a valid certificate in the .globus directory, and you need to be a member of a Virtual Organisation (VO). If the VO you belong to is configured on the User Interface, the proxy certificate can be created, and it will be used by the other grid clients in order to authenticate your requests to the grid services. You are asked for a password: it is the one you chose when you exported the certificate from your browser.
The command for proxy creation is
voms-proxy-init --voms _your VO name_
[egiorgio@ui2 ~]$ ls -l .globus
total 4
-rw------- 1 emidio emidio 3852 Jul 16 11:13 usercred.p12
[egiorgio@ui2 ~]$ voms-proxy-init --voms gridit
Enter GRID pass phrase for this identity:
Contacting voms.cnaf.infn.it:15008 [/C=IT/O=INFN/OU=Host/L=CNAF/CN=voms.cnaf.infn.it] "gridit"...
Remote VOMS server contacted succesfully.
Created proxy in /tmp/x509up_u500.
Your proxy is valid until Wed Jul 17 01:08:54 CEST 2013
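You can inspect the proxy you have just created, including its remaining lifetime and the VOMS attributes, with voms-proxy-info; the command below is a quick check (the output depends on your certificate and VO, so none is shown here):
voms-proxy-info --all
When you are done working, the proxy can be destroyed with voms-proxy-destroy.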
Optional: if the certificate file is not yet in the .globus directory, you can copy it there using scp or any other equivalent file transfer utility. Beware of the certificate file permissions: they must allow read access by the owner only, otherwise voms-proxy-init won't work (see the permission check after the example below).
[egiorgio@ui2 ~]$ voms-proxy-init --voms gridit
Unable to find user certificate or key: /home/egiorgio/.globus/usercred.p12
[egiorgio@ui2 ~]$ mkdir .globus
[change machine...]
emidio@mybox: $ scp cert_INFN_2014.p12 egiorgio@ui2.grid.unipg.it:.globus/usercred.p12
cert_INFN_2014.p12
emidio@mybox: ~ $ ssh egiorgio@ui2.grid.unipg.it
Last login: Thu Jan 9 09:21:08 2014
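After copying the certificate, make sure its permissions are restrictive enough; a minimal check, assuming the usual usercred.p12 file name:
chmod 600 ~/.globus/usercred.p12
ls -l ~/.globus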
The Information System
You can see the computing and storage resources available for your VO with the command
lcg-infosites --vo yourVO (ce|se)
Tip: a computing resource is referred to, for short, as a ce (computing element), while a storage resource (storage element) is referred to as an se.
[egiorgio@ui2 ~]$ lcg-infosites --vo gridit ce
# CPU Free Total Jobs Running Waiting ComputingElement
----------------------------------------------------------------
8 8 0 0 0 atlasce2.lnf.infn.it:8443/cream-pbs-grid
1232 0 0 0 0 atlasce3.lnf.infn.it:8443/cream-pbs-grid
680 156 1 1 0 ce-01.roma3.infn.it:8443/cream-pbs-fastgrid
680 156 22 22 0 ce-01.roma3.infn.it:8443/cream-pbs-grid
40 39 0 0 0 ce-1.le.infn.it:8443/cream-lsf-gridit
[..cut]
[egiorgio@ui2 ~]$ lcg-infosites --vo gridit se
Avail Space(kB) Used Space(kB) Type SE
------------------------------------------
428613066 99825647 SRM aliserv6.ct.infn.it
3654311847 2067559635 SRM atlasse.lnf.infn.it
78100676 221899323 SRM darkstorm.cnaf.infn.it
79062315 220937684 SRM darkstorm.cnaf.infn.it
1402415 597584 SRM egee013.cnaf.infn.it
1999991 8 SRM egee013.cnaf.infn.it
[..cut]
Optional: lcg-infosites offers more options than the basic ones listed above. You can check them by reading the man page.
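For instance, the same command can list other service types as well; a sketch, keeping in mind that the exact option set depends on the middleware version installed on your UI:
lcg-infosites --vo gridit wms        # workload management services usable by the VO
lcg-infosites --vo gridit closeSE    # storage elements close to each computing element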
Job Submission
The command for job submission is
glite-wms-job-submit -a <jobname.jdl>
jobname.jdl is a file describing the computational task you want to execute. For instance:
[egiorgio@ui2 ~] cat hello.jdl
# the name of the file which will be actually executed
Executable = "hello.sh" ;
# the name of the file where standard output stream will be redirected
StdOutput = "stdout.txt";
# the name of the file where standard error stream will be redirected
StdError = "stderr.txt";
# list of local files which will be transferred to
# the remote resource before the execution
InputSandbox = {"hello.sh"};
# list of remote files which will be transferred from the resource
# after the execution
OutputSandbox = {"stdout.txt", "stderr.txt"};
[egiorgio@ui2 ~]$ cat hello.sh
#!/bin/sh
echo "Hello World - from $HOSTNAME"
echo -n "It's "
date
In this minimal example, a Bash script, hello.sh, is executed. The script is transferred to the remote resource and executed, and the standard output and error streams are redirected to the files stdout.txt and stderr.txt, which are included in the job Output Sandbox for later retrieval.
The .jdl extension is mandatory. JDL, which stands for Job Description Language, is a powerful descriptive language that allows you to specify many fine details of the task. See the full guide for more details. The -a switch performs an automatic delegation of your proxy to the WMS.
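As an illustration of what JDL can express beyond the minimal case, here is a sketch of a slightly richer description; the script analyse.sh, its input file and arguments, and the chosen CE are hypothetical, while the attributes (Arguments, Requirements, Rank, RetryCount) are standard JDL:
# hypothetical script, called with two command-line arguments
Executable = "analyse.sh";
Arguments = "input.dat 10";
StdOutput = "stdout.txt";
StdError = "stderr.txt";
# ship the script and its input file along with the job
InputSandbox = {"analyse.sh", "input.dat"};
OutputSandbox = {"stdout.txt", "stderr.txt", "result.dat"};
# steer the match-making: pin the job to one CE and prefer sites with free CPUs
Requirements = other.GlueCEUniqueID == "ce-1.le.infn.it:8443/cream-lsf-gridit";
Rank = other.GlueCEStateFreeCPUs;
# resubmit at most once in case of failure
RetryCount = 1;
The rest of this section keeps using the minimal hello.jdl.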
[egiorgio@ui2 ~]$ glite-wms-job-submit -a hello.jdl
Connecting to the service https://gridrb.fe.infn.it:7443/glite_wms_wmproxy_server
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://gridrb.fe.infn.it:9000/nW8jSAEeZYKtO0j91ZU8YQ
==========================================================================
The submission command forwards the execution request to a service, the WMS, which selects a resource for the actual execution. The output of the submission command contains a job identifier, in this example
https://gridrb.fe.infn.it:9000/nW8jSAEeZYKtO0j91ZU8YQ
This identifier is needed to monitor the job status and to retrieve the output. Depending on the status of the resources, the job execution could take some time. Eventually, the job status command (glite-wms-job-status <job-id>) reports a Done status, which means you can download the output with glite-wms-job-output <job-id>.
You might not wish to handle job identifiers directly, as they are quite unwieldy. In this case you might consider using the -o <filename> switch: this saves the job identifier in <filename>, which you can then pass to glite-wms-job-status through the -i <filename> switch.
[egiorgio@ui2 ~]$ glite-wms-job-submit -a -o jobid.txt hello.jdl
Connecting to the service https://wms005.cnaf.infn.it:7443/glite_wms_wmproxy_server
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://wms005.cnaf.infn.it:9000/a-UJrAT2A3zSkScTngAc6Q
The job identifier has been saved in the following file:
/home/egiorgio/jobid.txt
==========================================================================
[egiorgio@ui2 ~]$ glite-wms-job-status https://gridrb.fe.infn.it:9000/nW8jSAEeZYKtO0j91ZU8YQ
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job : https://gridrb.fe.infn.it:9000/nW8jSAEeZYKtO0j91ZU8YQ
Current Status: Scheduled
Status Reason: unavailable
Destination: gridce4.pi.infn.it:8443/cream-lsf-grid
Submitted: Tue Jan 14 10:00:19 2014 CET
==========================================================================
[egiorgio@ui2 ~]$ glite-wms-job-status -i jobid.txt
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job : https://wms005.cnaf.infn.it:9000/a-UJrAT2A3zSkScTngAc6Q
Current Status: Running
Status Reason: unavailable
Destination: ce-1.le.infn.it:8443/cream-lsf-gridit
Submitted: Tue Jan 14 11:34:42 2014 CET
==========================================================================
[egiorgio@ui2 ~]$ glite-wms-job-status -i jobid.txt
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job : https://wms005.cnaf.infn.it:9000/a-UJrAT2A3zSkScTngAc6Q
Current Status: Done(Success)
Logged Reason(s):
- job completed
- Job Terminated Successfully
Exit code: 0
Status Reason: Job Terminated Successfully
Destination: ce-1.le.infn.it:8443/cream-lsf-gridit
Submitted: Tue Jan 14 11:34:42 2014 CET
==========================================================================
Once you get the Done status, you can download the job output with the glite-wms-job-output command. You can provide as input either the job id or the text file where you saved it. Notice that, depending on the UI settings, you might have to specify the download directory via the --dir switch. In the example below, we store the output in the current directory (.).
[egiorgio@ui2 ~]$ glite-wms-job-output -i jobid.txt --dir .
Connecting to the service https://wms005.cnaf.infn.it:7443/glite_wms_wmproxy_server
================================================================================
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
https://wms005.cnaf.infn.it:9000/a-UJrAT2A3zSkScTngAc6Q
have been successfully retrieved and stored in the directory:
/home/egiorgio/egiorgio_a-UJrAT2A3zSkScTngAc6Q
================================================================================
[egiorgio@ui2 ~]$ cat /home/egiorgio/egiorgio_a-UJrAT2A3zSkScTngAc6Q/std
stderr.txt stdout.txt
[egiorgio@ui2 ~]$ cat /home/egiorgio/egiorgio_a-UJrAT2A3zSkScTngAc6Q/stdout.txt
Hello World - from lx25.le.infn.it
It's Tue Jan 14 11:35:01 CET 2014
Data Management
In this last part, we will see how to manage files across grid storage elements. The EMI/gLite middleware uses a File Catalog (LFC) to keep track of the different file locations: once you store a file on a grid SE, a URL that depends on the physical location of the file (SURL) and a unique file identifier (GUID) are created for it. The LFC also lets you choose a mnemonic identifier (a logical file name) for the file under a virtual filesystem, thus providing a coherent representation of grid storage that is independent of both the file locations and the actual storage implementation.
It is better to check on the UI whether an LFC server is set or not.
[egiorgio@ui2 ~]$ echo $LFC_HOST
lfcserver.cnaf.infn.it
[egiorgio@ui2 ~]$
Should the variable $LFC_HOST be unset, just set it with export LFC_HOST=<lfc-server>. You can retrieve a list of LFC servers suitable for your VO with
lcg-infosites --vo yourVO lfc
Users manage catalog entries with the lfc-* commands. For instance, you can browse the catalog with lfc-ls /. Typically, an LFC server is shared among several VOs. Under your VO area, you should be able to create your own directory. For this tutorial, we have created the gtw directory, under which you will create a directory named after your username:
[egiorgio@ui2 ~]$ lcg-infosites --vo gridit lfc
lfcserver.cnaf.infn.it
[egiorgio@ui2 ~]$ lfc-ls /
grid
[egiorgio@ui2 ~]$ lfc-ls /grid
ams02.cern.ch
argo
babar
bio
cdf
compassit
....
[egiorgio@ui2 ~]$ lfc-mkdir /grid/gridit/gtw/$USER
We are ready to store a file on the grid. Create a simple text file locally:
[egiorgio@ui2 ~]$
[egiorgio@ui2 ~]$ echo "My first grid file" > file.txt
[egiorgio@ui2 ~]$ cat file.txt
My first grid file
To upload a file to an SE, we first need to choose an SE among those available to our VO. It is possible to set an environment variable, VO_<VONAME>_DEFAULT_SE, so as to avoid specifying the chosen storage every time. Similarly, it can be useful to set an environment variable (LFC_HOME) to avoid specifying the file path prefix within the LFC every time:
[egiorgio@ui2 ~]$ export LFC_HOME=/grid/gridit/gtw/$USER
[egiorgio@ui2 ~]$ lfc-ls
[egiorgio@ui2 ~]$
[egiorgio@ui2 ~]$ lcg-cr -d recasna-se01.unina.it -l test01 file.txt
guid:5ee288d8-b0a4-4f51-ae4a-07753a391577
[egiorgio@ui2 ~]$ lfc-ls
test01
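Optionally, instead of passing -d on every upload, you can point lcg-cr to a default storage element; a sketch, assuming your UI honours the standard VO_<VONAME>_DEFAULT_SE variable (the SE name is just the one used above, and test02 is a new logical name):
export VO_GRIDIT_DEFAULT_SE=recasna-se01.unina.it
lcg-cr -l test02 file.txt    # no -d needed: the default SE is used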
The registration command, upon success, creates a (Grid) Unique Identifier (GUID), which is printed as output for future handling of the file. The GUID is neither the only nor the easiest grid file identifier: the registration command also registers the given logical file name (test01 here) on the LFC. To download the file back, use
lcg-cp <grid-identifier> <local-file>
Notice that we have to specify the protocol (lfn:), as the command also supports guid:.
[egiorgio@ui2 ~]$ lcg-cp lfn:test01 grid-file.txt
[egiorgio@ui2 ~]$ cat grid-file.txt
My first grid file
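To inspect the other identifiers of the file, the lcg-lr and lcg-lg commands can be used; a sketch, using the same lfn: form as above:
lcg-lr lfn:test01    # list the replicas (SURLs) of the file
lcg-lg lfn:test01    # print the GUID associated with the logical file name
Downloading by GUID should work as well, e.g. lcg-cp guid:<guid-printed-above> another-copy.txt.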
Once the file is no longer needed, we can delete it from the storage element:
lcg-del -a lfn:test01
We need to add the -a (all) option to state our intention of deleting all the copies (replicas) of the file. The entry on the LFC is deleted as well.
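A quick check that the deletion worked, assuming the LFC_HOME set earlier so that lfc-ls lists your own directory:
lcg-del -a lfn:test01
lfc-ls    # test01 should no longer be listed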
Further Material