Difference: DjsCreamProbeNew (1 vs. 5)

Revision 52014-01-31 - MarcoVerlato

Line: 1 to 1
 
META TOPICPARENT name="NagiosProbes"

New CREAM-CE direct job submission metrics and WN probes

Line: 14 to 14
 
  1. WN-softver probe - check middleware version on WN (via cream_jobSubmit.py)
  2. WN-csh probe - check if WN has csh (via cream_jobSubmit.py)
Changed:
<
<
All of them have been implemented in python and are based on the cream-cli commands. They share the same logic structure and provide useful information about their version, usage (i.e. help) including the options list and their meaning.
>
>
All of them have been implemented in python and are based on the cream-cli commands. They share the same logic structure and provide useful information about their version, usage (i.e. help) including the options list and their meaning, according to the guide Probes Development.
 For example:
Line: 80 to 80
 details: ['2014-01-16 15:59:57,085 FATAL - Received NULL fault; the error is due to another cause: FaultString=[connection error] - FaultCode=[SOAP-ENV:Client] - FaultSubCode=[SOAP-ENV:Client] - FaultDetail=[Connection refused]\n']
Added:
>
>
Sources are available here.
 

cream_serviceInfo.py

The serviceInfo.py retrieves information about the status of the CREAM CE. The help shows how the probe must be invoked:

Revision 42014-01-31 - MarcoVerlato

Line: 1 to 1
 
META TOPICPARENT name="NagiosProbes"
Changed:
<
<

New CREAM-CE direct job submission metrics

>
>

New CREAM-CE direct job submission metrics and WN probes

 
Changed:
<
<
The following metrics are a restructured version of the existing ones and provide a better approach for probing a CREAM CE.
>
>
The following metrics are a restructured version of the existing ones and provide a better approach for probing a CREAM CE and its WNs
 
  1. cream_serviceInfo.py - get CREAM CE service info
  2. cream_allowedSubmission.py - check if the submission to the selected CREAM CE is allowed
  3. cream_jobSubmit.py - submit a job directly to the selected CREAM CE
  4. cream_jobCancel.py - cancel an active job.
  5. cream_jobPurge.py - purge a terminted job.
Added:
>
>
  1. WN-softver probe - check middleware version on WN (via cream_jobSubmit.py)
  2. WN-csh probe - check if WN has csh (via cream_jobSubmit.py)
  All of them have been implemented in python and are based on the cream-cli commands. They share the same logic structure and provide useful information about their version, usage (i.e. help) including the options list and their meaning. For example:
Line: 30 to 32
  -H HOSTNAME, --hostname=HOSTNAME The hostname of the CREAM service. -p PORT, --port=PORT The port of the service. [default: none]
Deleted:
<
<
-u URL, --url=URL The status endpoint URL of the service. Example: https://[:]
  -x PROXY, --proxy=PROXY The proxy path
Added:
>
>
-t TIMEOUT, --timeout=TIMEOUT Probe execution time limit. [default:
                1. sec]
  -v, --verbose verbose mode [default: False]
Added:
>
>
-u URL, --url=URL The status endpoint URL of the service. Example: https://[:]
  $ ./cream_serviceInfo.py --version cream_serviceInfo v.1.0
Line: 50 to 55
 The verbose mode (--verbose) could be enabled to each metric. It provides several details about the probe execution itself by highlighting the internal commands:
Changed:
<
<
$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 --verbose
>
>
$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 -x /tmp/dteam.proxy --verbose
 executing command: /usr/bin/voms-proxy-info -timeleft invoking service info executing command: /usr/bin/glite-ce-service-info cream-41.pd.infn.it:8443
Changed:
<
<
Interface Version = [2.1] Service Version = [1.16.1 - EMI version: 3.5.1-1.el6] Description = [CREAM 2] Started at = [Thu Jan 2 19:06:51 2014] Submission enabled = [YES] Status = [RUNNING]
>
>
OK: Service Version = [1.16.1 - EMI version: 3.5.1-1.el6]
 

In case of mistakes on the selected options or on their values, the probe tries to explain what is wrong. For example the cream_serviceInfo doesn't support the --queue option:

Changed:
<
<
$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 --queue creamtest1 --verbose
>
>
$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 --queue creamtest1 -x /tmp/dteam.proxy --verbose
 Usage: cream_serviceInfo.py [options]

cream_serviceInfo.py: error: no such option: --queue

Line: 75 to 75
 In case of the errors in interacting with the CREAM CE, useful details will be provided about the failure:
Changed:
<
<
$ ./cream_allowedSubmission.py --url https://cream-43.pd.infn.it:8443
>
>
$ ./cream_allowedSubmission.py --url https://cream-43.pd.infn.it:8443 -x /tmp/dteam.proxy
 command '/usr/bin/glite-ce-allowed-submission cream-43.pd.infn.it:8443' failed: return_code=1 details: ['2014-01-16 15:59:57,085 FATAL - Received NULL fault; the error is due to another cause: FaultString=[connection error] - FaultCode=[SOAP-ENV:Client] - FaultSubCode=[SOAP-ENV:Client] - FaultDetail=[Connection refused]\n']
Line: 95 to 95
  -p PORT, --port=PORT The port of the service. [default: none] -x PROXY, --proxy=PROXY The proxy path
Added:
>
>
-t TIMEOUT, --timeout=TIMEOUT Probe execution time limit. [default:
                1. sec]
  -v, --verbose verbose mode [default: False] -u URL, --url=URL The status endpoint URL of the service. Example: https://[:]
Line: 103 to 106
 In order to get information about the CREAM service on the host https://cream-41.pd.infn.it:8443, use the following command:
Changed:
<
<
$ ./cream_serviceInfo.py --url https://cream-41.pd.infn.it:8443 Interface Version = [2.1] Service Version = [1.16.1 - EMI version: 3.5.1-1.el6] Description = [CREAM 2] Started at = [Thu Jan 2 19:06:51 2014] Submission enabled = [YES] Status = [RUNNING]
>
>
$ ./cream_serviceInfo.py --url https://cream-41.pd.infn.it:8443 -x /tmp/dteam.proxy OK: Service Version = [1.16.1 - EMI version: 3.5.1-1.el6]
 

or similary:

Changed:
<
<
$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 Interface Version = [2.1] Service Version = [1.16.1 - EMI version: 3.5.1-1.el6] Description = [CREAM 2] Started at = [Thu Jan 2 19:06:51 2014] Submission enabled = [YES] Status = [RUNNING]
>
>
$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 -x /tmp/dteam.proxy OK: Service Version = [1.16.1 - EMI version: 3.5.1-1.el6]
 

cream_allowedSubmission

Line: 137 to 130
  -H HOSTNAME, --hostname=HOSTNAME The hostname of the CREAM service. -p PORT, --port=PORT The port of the service. [default: none]
Deleted:
<
<
-u URL, --url=URL The status endpoint URL of the service. Example: https://[:]
  -x PROXY, --proxy=PROXY The proxy path
Added:
>
>
-t TIMEOUT, --timeout=TIMEOUT Probe execution time limit. [default:
                1. sec]
  -v, --verbose verbose mode [default: False]
Added:
>
>
-u URL, --url=URL The status endpoint URL of the service. Example: https://[:]
 

Notice: the use of the --url option is equivalent to specify both the options: --hostname and --port:

Changed:
<
<
$ ./cream_allowedSubmission.py --hostname cream-41.pd.infn.it --port 8443 ENABLED
>
>
$ ./cream_allowedSubmission.py --hostname cream-41.pd.infn.it --port 8443 -x /tmp/dteam.proxy OK: ENABLED
 
Changed:
<
<
$ ./cream_allowedSubmission.py --url https://cream-41.pd.infn.it:8443 ENABLED
>
>
$ ./cream_allowedSubmission.py --url https://cream-41.pd.infn.it:8443 -x /tmp/dteam.proxy OK: ENABLED
 

The verbose mode highlights the internal commands:

Changed:
<
<
$ ./cream_allowedSubmission.py --url https://cream-41.pd.infn.it:8443 --verbose
>
>
$ ./cream_allowedSubmission.py --url https://cream-41.pd.infn.it:8443 -x /tmp/dteam.proxy --verbose
 executing command: /usr/bin/voms-proxy-info -timeleft invoking allowedSubmission executing command: /usr/bin/glite-ce-allowed-submission cream-41.pd.infn.it:8443
Changed:
<
<
ENABLED
>
>
OK: ENABLED
 
Line: 178 to 174
  -H HOSTNAME, --hostname=HOSTNAME The hostname of the CREAM service. -p PORT, --port=PORT The port of the service. [default: none]
Deleted:
<
<
-u URL, --url=URL The status endpoint URL of the service. Example: https://[:]/cream--
  -x PROXY, --proxy=PROXY The proxy path
Added:
>
>
-t TIMEOUT, --timeout=TIMEOUT Probe execution time limit. [default:
                1. sec]
  -v, --verbose verbose mode [default: False]
Added:
>
>
-u URL, --url=URL The status endpoint URL of the service. Example: https://[:]/cream--
  -l LRMS, --lrms=LRMS The LRMS name (e.g.: 'lsf', 'pbs' etc) -q QUEUE, --queue=QUEUE The queue name (e.g.: 'creamtest') -j JDL, --jdl=JDL The jdl path
Added:
>
>
-d DIR, --dir=DIR The output sandbox path
 

The --url (-u) directive must be used to target the probe to a specific CREAM CE identified by its identifier (i.e. CREAM CE ID). Alternatively is it possible to specify the CREAM CE identifier by using the --hostname , --port, --lrms and --queue options which are mutually exclusive with respect to the --url option.

Line: 199 to 199
 Type="Job"; JobType="Normal"; Executable = "/bin/hostname";
Added:
>
>
Arguments = "-s";
 StdOutput = "std.out"; StdError = "std.err"; OutputSandbox = {"std.out","std.err"};
Changed:
<
<
OutputSandboxBaseDestUri="gsiftp://localhost"
>
>
OutputSandboxBaseDestUri="gsiftp://localhost";
 ]

If verbose mode is disabled, the output should look like this:

Changed:
<
<
$ ./cream_jobSubmit.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl Job terminated with status DONE-OK
>
>
$ ./cream_jobSubmit.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy --jdl ./hostname.jdl DONE-OK: prod-wn-001
 

Notice: the use of the --url option is equivalent to specify both the options: --hostname, --port --lrms and --queue:

Changed:
<
<
$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl Job terminated with status DONE-OK
>
>
$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 -x /tmp/dteam.proxy --jdl ./hostname.jdl DONE-OK: prod-wn-001
 

If the verbose mode is enabled, the output of the above command should be like this:

Changed:
<
<
$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl --verbose
>
>
$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 -x /tmp/dteam.proxy --jdl ./hostname.jdl --verbose
 executing command: /usr/bin/voms-proxy-info -timeleft
Changed:
<
<
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./hostname.jdl ['2014-01-16 13:52:57,305 DEBUG - Using certificate proxy file [/tmp/x509up_u0]\n', '2014-01-16 13:52:57,324 DEBUG - VO from certificate=[dteam]\n', '2014-01-16 13:52:57,324 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-16 13:52:57,324 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140116-135257.log]\n', '2014-01-16 13:52:57,328 INFO - certUtil::generateUniqueID() - Generated DelegationID: [99b19aafc98e11bb7956cc58d901bd860697227d]\n', '2014-01-16 13:52:59,645 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/hostname"; Type = "Job"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; StdError = "std.err" ] - JDL File=[./hostname.jdl]\n', '2014-01-16 13:53:00,022 DEBUG - Will invoke JobStart for JobID [CREAM304354901]\n', 'https://cream-41.pd.infn.it:8443/CREAM304354901\n'] job id: https://cream-41.pd.infn.it:8443/CREAM304354901
>
>
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 hostname.jdl ['2014-01-31 12:18:12,740 DEBUG - Using certificate proxy file [/tmp/dteam.proxy]\n', '2014-01-31 12:18:12,756 DEBUG - VO from certificate=[dteam]\n', '2014-01-31 12:18:12,756 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-31 12:18:12,756 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140131-121812.log]\n', '2014-01-31 12:18:12,760 INFO - certUtil::generateUniqueID() - Generated DelegationID: [e0b6d023c766400e8b27e51cf5a2b1fa179d78f9]\n', '2014-01-31 12:18:14,606 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/hostname"; Type = "Job"; Arguments = "-s"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; StdError = "std.err" ] - JDL File=[hostname.jdl]\n', '2014-01-31 12:18:14,942 DEBUG - Will invoke JobStart for JobID [CREAM446576112]\n', 'https://cream-41.pd.infn.it:8443/CREAM446576112\n'] job id: https://cream-41.pd.infn.it:8443/CREAM446576112
 invoking jobStatus
Changed:
<
<
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901 job status: REALLY-RUNNING invoking jobStatus executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901
>
>
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM446576112 ['\n', '****** JobID=[https://cream-41.pd.infn.it:8443/CREAM446576112]\n', '\tStatus = [DONE-OK]\n', '\tExitCode = [0]\n', '\n', '\n'] exitCode= ExitCode = [0]
 job status: DONE-OK invoking getOutputSandbox
Changed:
<
<
executing command: /usr/bin/glite-ce-job-output --noint https://cream-41.pd.infn.it:8443/CREAM187996258 output sandbox dir: ./cream-41.pd.infn.it_8443_CREAM187996258
>
>
executing command: /usr/bin/glite-ce-job-output --noint --dir /tmp https://cream-41.pd.infn.it:8443/CREAM446576112 output sandbox dir: /tmp/cream-41.pd.infn.it_8443_CREAM446576112
 invoking jobPurge
Changed:
<
<
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM187996258 Job terminated with status DONE-OK
>
>
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM446576112 DONE-OK: prod-wn-001
 
Changed:
<
<
Notice the output sandbox dir: ./cream-41.pd.infn.it_8443_CREAM187996258. This is the output sandbox directory containing all the produced files:
>
>
Notice the output sandbox dir: ./cream-41.pd.infn.it_8443_CREAM446576112. This is the output sandbox directory containing all the produced files:
 
Changed:
<
<
$ ls -la ./cream-41.pd.infn.it_8443_CREAM187996258
>
>
$ ls -la ./cream-41.pd.infn.it_8443_CREAM446576112
 total 12 drwxr-xr-x 2 root root 4096 17 gen 16:20 . dr-xr-x---. 23 root root 4096 17 gen 16:20 ..
Line: 267 to 271
  -p PORT, --port=PORT The port of the service. [default: none] -x PROXY, --proxy=PROXY The proxy path
Added:
>
>
-t TIMEOUT, --timeout=TIMEOUT Probe execution time limit. [default:
                1. sec]
  -v, --verbose verbose mode [default: False] -u URL, --url=URL The status endpoint URL of the service. Example: https://[:]/cream--
Line: 280 to 287
 For example consider the job (i.e. hostnane.jdl) of the above metric. In this case the probe will fail because the job already terminated before the execution of the e glite-ce-job-cancel command:
Changed:
<
<
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl
>
>
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy --jdl ./hostname.jdl
 job already terminated
Line: 300 to 307
  The output of the probe should be like:
Changed:
<
<
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./sleep.jdl job cancelled
>
>
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy --jdl ./sleep.jdl OK: job cancelled
 

or like this with --verbose option specified:

Changed:
<
<
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./sleep.jdl --verbose
>
>
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy --jdl ./sleep.jdl --verbose
 executing command: /usr/bin/voms-proxy-info -timeleft executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./sleep.jdl
Changed:
<
<
['2014-01-16 17:22:49,728 DEBUG - Using certificate proxy file [/tmp/x509up_u0]\n', '2014-01-16 17:22:49,744 DEBUG - VO from certificate=[dteam]\n', '2014-01-16 17:22:49,745 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-16 17:22:49,745 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140116-172249.log]\n', '2014-01-16 17:22:49,747 INFO - certUtil::generateUniqueID() - Generated DelegationID: [69e3bdaf4e818e1f71f1b7ff442f74583c869b84]\n', '2014-01-16 17:22:52,165 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "cream.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/sleep"; Type = "Job"; Arguments = "200"; JobType = "Normal"; StdError = "cream.out" ] - JDL File=[./sleep.jdl]\n', '2014-01-16 17:22:52,513 DEBUG - Will invoke JobStart for JobID [CREAM437649288]\n', 'https://cream-41.pd.infn.it:8443/CREAM437649288\n'] job id: https://cream-41.pd.infn.it:8443/CREAM437649288
>
>
['2014-01-31 12:30:42,469 DEBUG - Using certificate proxy file [/tmp/dteam.proxy]\n', '2014-01-31 12:30:42,489 DEBUG - VO from certificate=[dteam]\n', '2014-01-31 12:30:42,489 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-31 12:30:42,489 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140131-123042.log]\n', '2014-01-31 12:30:42,493 INFO - certUtil::generateUniqueID() - Generated DelegationID: [59882bed5243b5788a83404ad027cf571319ef79]\n', '2014-01-31 12:30:44,059 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "cream.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/sleep"; Type = "Job"; Arguments = "200"; JobType = "Normal"; StdError = "cream.out" ] - JDL File=[./sleep.jdl]\n', '2014-01-31 12:30:44,368 DEBUG - Will invoke JobStart for JobID [CREAM076606856]\n', 'https://cream-41.pd.infn.it:8443/CREAM076606856\n'] job id: https://cream-41.pd.infn.it:8443/CREAM076606856
 invoking jobStatus
Changed:
<
<
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM437649288
>
>
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM076606856 ['\n', '****** JobID=[https://cream-41.pd.infn.it:8443/CREAM076606856]\n', '\tStatus = [REALLY-RUNNING]\n', '\n', '\n']
 job status: REALLY-RUNNING invoking jobCancel
Changed:
<
<
executing command: /usr/bin/glite-ce-job-cancel --noint https://cream-41.pd.infn.it:8443/CREAM437649288
>
>
executing command: /usr/bin/glite-ce-job-cancel --noint https://cream-41.pd.infn.it:8443/CREAM076606856
 invoking jobStatus
Changed:
<
<
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM437649288
>
>
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM076606856 ['\n', '****** JobID=[https://cream-41.pd.infn.it:8443/CREAM076606856]\n', '\tStatus = [REALLY-RUNNING]\n', '\n', '\n']
 job status: REALLY-RUNNING invoking jobStatus
Changed:
<
<
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM437649288
>
>
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM076606856 ['\n', '****** JobID=[https://cream-41.pd.infn.it:8443/CREAM076606856]\n', '\tStatus = [CANCELLED]\n', '\tExitCode = []\n', '\tDescription = [Cancelled by user]\n', '\n', '\n'] exitCode= ExitCode = []
 job status: CANCELLED invoking jobPurge
Changed:
<
<
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM437649288 job cancelled
>
>
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM076606856 OK: job cancelled
 
Line: 344 to 356
  -p PORT, --port=PORT The port of the service. [default: none] -x PROXY, --proxy=PROXY The proxy path
Added:
>
>
-t TIMEOUT, --timeout=TIMEOUT Probe execution time limit. [default:
                1. sec]
  -v, --verbose verbose mode [default: False] -u URL, --url=URL The status endpoint URL of the service. Example: https://[:]/cream--
Line: 354 to 369
 
Changed:
<
<
$ ./cream_jobPurge.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl job purged
>
>
$ ./cream_jobPurge.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy --jdl ./hostname.jdl OK: job purged
 
Added:
>
>
 
Changed:
<
<
$ ./cream_jobPurge.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl --verbose
>
>
$ ./cream_jobPurge.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy --jdl ./hostname.jdl --verbose
 executing command: /usr/bin/voms-proxy-info -timeleft
Changed:
<
<
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./hostname.jdl ['2014-01-16 14:02:48,282 DEBUG - Using certificate proxy file [/tmp/x509up_u0]\n', '2014-01-16 14:02:48,301 DEBUG - VO from certificate=[dteam]\n', '2014-01-16 14:02:48,301 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-16 14:02:48,302 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140116-140248.log]\n', '2014-01-16 14:02:48,305 INFO - certUtil::generateUniqueID() - Generated DelegationID: [85a80615fc8046baf6dce2a27b708fc82ecbce55]\n', '2014-01-16 14:02:51,105 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/hostname"; Type = "Job"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; StdError = "std.err" ] - JDL File=[./hostname.jdl]\n', '2014-01-16 14:02:51,504 DEBUG - Will invoke JobStart for JobID [CREAM691625071]\n', 'https://cream-41.pd.infn.it:8443/CREAM691625071\n'] job id: https://cream-41.pd.infn.it:8443/CREAM691625071
>
>
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 hostname.jdl ['2014-01-31 12:27:50,347 DEBUG - Using certificate proxy file [/tmp/dteam.proxy]\n', '2014-01-31 12:27:50,364 DEBUG - VO from certificate=[dteam]\n', '2014-01-31 12:27:50,364 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-31 12:27:50,364 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140131-122750.log]\n', '2014-01-31 12:27:50,367 INFO - certUtil::generateUniqueID() - Generated DelegationID: [5978b3f267779dbf4d691889ea316a14dafeb4bb]\n', '2014-01-31 12:27:51,780 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/hostname"; Type = "Job"; Arguments = "-s"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; StdError = "std.err" ] - JDL File=[hostname.jdl]\n', '2014-01-31 12:27:52,109 DEBUG - Will invoke JobStart for JobID [CREAM973322659]\n', 'https://cream-41.pd.infn.it:8443/CREAM973322659\n'] job id: https://cream-41.pd.infn.it:8443/CREAM973322659
 invoking jobStatus
Changed:
<
<
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM691625071
>
>
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM973322659 ['\n', '****** JobID=[https://cream-41.pd.infn.it:8443/CREAM973322659]\n', '\tStatus = [DONE-OK]\n', '\tExitCode = [0]\n', '\n', '\n'] exitCode= ExitCode = [0]
 job status: DONE-OK invoking jobPurge
Changed:
<
<
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM691625071
>
>
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM973322659
 invoking jobStatus
Changed:
<
<
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM691625071 job purged
>
>
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM973322659 OK: job purged
 
Added:
>
>

WN-softver probe

This probe checks the middleware version on a WN managed by the CREAM-CE. It makes use of cream_jobSubmit.py in the following way:
$ ./cream_jobSubmit.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy -j ./WN-softver.jdl
DONE-OK: prod-wn-002 has EMI 1.11.0-1
where
$ cat WN-softver.jdl
[
Type="Job";
JobType="Normal";
Executable = "WN-softver.sh";
#Arguments = "a b c";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"WN-softver.sh"};
OutputSandbox = {"std.out","std.err"};
OutputSandboxBaseDestUri="gsiftp://localhost";
]
and WN-softver.sh is attached.

The verbose option enabled gives the following output:

$ ./cream_jobSubmit.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy -j ./WN-softver.jdl --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./WN-softver.jdl
['2014-01-31 13:05:05,802 DEBUG - Using certificate proxy file [/tmp/dteam.proxy]\n', '2014-01-31 13:05:05,823 DEBUG - VO from certificate=[dteam]\n', '2014-01-31 13:05:05,824 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-31 13:05:05,824 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140131-130505.log]\n', '2014-01-31 13:05:05,824 DEBUG - Processing file [WN-softver.sh]...\n', '2014-01-31 13:05:05,824 DEBUG - Adding absolute path [/root/WN-softver.sh]...\n', '2014-01-31 13:05:05,824 DEBUG - Inserting mangled InputSandbox in JDL: [{"/root/WN-softver.sh"}]...\n', '2014-01-31 13:05:05,827 INFO - certUtil::generateUniqueID() - Generated DelegationID: [b18342097b01b309cc9112a683d4c6bd15d35796]\n', '2014-01-31 13:05:07,517 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "WN-softver.sh"; Type = "Job"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; InputSandbox = { "/root/WN-softver.sh" }; StdError = "std.err" ] - JDL File=[./WN-softver.jdl]\n', '2014-01-31 13:05:07,870 DEBUG - JobID=[https://cream-41.pd.infn.it:8443/CREAM680089855]\n', '2014-01-31 13:05:07,871 DEBUG - UploadURL=[gsiftp://cream-41.pd.infn.it/var/glite/cream_sandbox/dteam/CN_Marco_Verlato_L_Padova_OU_Personal_Certificate_O_INFN_C_IT_dteam_Role_NULL_Capability_NULL_dteam019/68/CREAM680089855/ISB]\n', '2014-01-31 13:05:07,873 INFO - Sending file [gsiftp://cream-41.pd.infn.it/var/glite/cream_sandbox/dteam/CN_Marco_Verlato_L_Padova_OU_Personal_Certificate_O_INFN_C_IT_dteam_Role_NULL_Capability_NULL_dteam019/68/CREAM680089855/ISB/WN-softver.sh]\n', '2014-01-31 13:05:08,044 DEBUG - Will invoke JobStart for JobID [CREAM680089855]\n', 'https://cream-41.pd.infn.it:8443/CREAM680089855\n']
job id: https://cream-41.pd.infn.it:8443/CREAM680089855
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM680089855
['\n', '******  JobID=[https://cream-41.pd.infn.it:8443/CREAM680089855]\n', '\tStatus        = [DONE-OK]\n', '\tExitCode      = [0]\n', '\n', '\n']
exitCode=       ExitCode      = [0]

job status: DONE-OK
invoking getOutputSandbox
executing command: /usr/bin/glite-ce-job-output --noint https://cream-41.pd.infn.it:8443/CREAM680089855
output sandbox dir: ./cream-41.pd.infn.it_8443_CREAM680089855
invoking jobPurge
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM680089855
DONE-OK: prod-wn-002 has EMI 1.11.0-1
 
Changed:
<
<
-- LisaZangrando - 2014-01-16
>
>

WN-csh probe

This probe checks that csh is there on a WN managed by the CREAM-CE. It makes use of cream_jobSubmit.py in the following way:
$ ./cream_jobSubmit.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 -x /tmp/dteam.proxy -j ./WN-csh.jdl
DONE-OK: prod-wn-002 has csh
where
$ cat WN-csh.jdl
[
Type="Job";
JobType="Normal";
Executable = "WN-csh.sh";
#Arguments = "a b c";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"WN-csh.sh"};
OutputSandbox = {"std.out","std.err"};
OutputSandboxBaseDestUri="gsiftp://localhost";
]
and WN-csh.sh is attached.

Deployment example

In a Nagios server version 3.5.0 testing instance, we deployed the files needed to execute the probes described above in the following directories:
$ ls -l /usr/libexec/grid-monitoring/probes/emi.cream/
total 48
-rwxr-xr-x 1 root root  1361 Jan 30 16:58 cream_allowedSubmission.py
-rwxr-xr-x 1 root root  2494 Jan 30 17:00 cream_jobCancel.py
-rwxr-xr-x 1 root root  2103 Jan 30 17:01 cream_jobPurge.py
-rwxr-xr-x 1 root root  2972 Jan 31 12:42 cream_jobSubmit.py
-rwxr-xr-x 1 root root 15527 Jan 30 16:29 cream.py
-rwxr-xr-x 1 root root  1416 Jan 31 12:42 cream_serviceInfo.py
-rw-r--r-- 1 root root   213 Jan 29 14:26 hostname.jdl
-rw-r--r-- 1 root root   129 Jan 30 16:21 sleep.jdl
drwxr-xr-x 2 root root  4096 Jan 31 11:34 wn
and
$ ls -l /usr/libexec/grid-monitoring/probes/emi.cream/wn
total 16
-rw-r--r-- 1 root root  292 Jan 31 11:34 WN-csh.jdl
-rwxr-xr-x 1 root root  603 Jan 31 11:34 WN-csh.sh
-rw-r--r-- 1 root root  300 Jan 31 11:34 WN-softver.jdl
-rwxr-xr-x 1 root root 1144 Jan 31 11:34 WN-softver.sh

and defined the new services adding in the file /etc/nagios/objects/services.cfg the following lines:

define service{
        use                             local-service
        host_name                       cream-41.pd.infn.it
        service_description             emi.cream.CEDIRECT-AllowedSubmission
        check_command                   ncg_check_native!/usr/libexec/grid-monit
oring/probes/emi.cream/cream_allowedSubmission.py!60!-x /tmp/dteam.proxy -p 8443
        normal_check_interval           6
        retry_check_interval            3
        max_check_attempts              2
        obsess_over_service             0
}
define service{
        use                             local-service
        host_name                       cream-41.pd.infn.it
        service_description             emi.cream.CEDIRECT-JobCancel
        check_command                   ncg_check_native!/usr/libexec/grid-monit
oring/probes/emi.cream/cream_jobCancel.py!60!-x /tmp/dteam.proxy -p 8443 -l lsf 
-q creamtest1 -j /usr/libexec/grid-monitoring/probes/emi.cream/sleep.jdl
        normal_check_interval           6
        retry_check_interval            3
        max_check_attempts              2
        obsess_over_service             0
}
define service{
        use                             local-service
        host_name                       cream-41.pd.infn.it
        service_description             emi.cream.CEDIRECT-JobPurge
        check_command                   ncg_check_native!/usr/libexec/grid-monit
oring/probes/emi.cream/cream_jobPurge.py!60!-x /tmp/dteam.proxy -p 8443 -l lsf -
q creamtest1 -j /usr/libexec/grid-monitoring/probes/emi.cream/hostname.jdl
        normal_check_interval           6
        retry_check_interval            3
        max_check_attempts              2
        obsess_over_service             0
}

define service{
        use                             local-service
        host_name                       cream-41.pd.infn.it
        service_description             emi.cream.CEDIRECT-ServiceInfo
        check_command                   ncg_check_native!/usr/libexec/grid-monit
oring/probes/emi.cream/cream_serviceInfo.py!60!-x /tmp/dteam.proxy -p 8443
        normal_check_interval           6
        retry_check_interval            3
        max_check_attempts              2
        obsess_over_service             0
}
define service{
        use                             local-service
        host_name                       cream-41.pd.infn.it
        service_description             emi.cream.CEDIRECT-JobSubmit
        check_command                   ncg_check_native!/usr/libexec/grid-monit
oring/probes/emi.cream/cream_jobSubmit.py!60!-x /tmp/dteam.proxy -p 8443 -l lsf 
-q creamtest1 -j /usr/libexec/grid-monitoring/probes/emi.cream/hostname.jdl --di
r /tmp 
        normal_check_interval           6
        retry_check_interval            3
        max_check_attempts              2
        obsess_over_service             0
}
define service{
        use                             local-service
        host_name                       cream-41.pd.infn.it
        service_description             emi.cream.WN-Softver
        check_command                   ncg_check_native!/usr/libexec/grid-monit
oring/probes/emi.cream/cream_jobSubmit.py!60!-x /tmp/dteam.proxy -p 8443 -l lsf 
-q creamtest1 -j /usr/libexec/grid-monitoring/probes/emi.cream/wn/WN-softver.jdl
 --dir /tmp
        normal_check_interval           6
        retry_check_interval            3
        max_check_attempts              2
        obsess_over_service             0
}
define service{
        use                             local-service
        host_name                       cream-41.pd.infn.it
        service_description             emi.cream.WN-Csh
        check_command                   ncg_check_native!/usr/libexec/grid-monit
oring/probes/emi.cream/cream_jobSubmit.py!60!-x /tmp/dteam.proxy -p 8443 -l lsf 
-q creamtest1 -j /usr/libexec/grid-monitoring/probes/emi.cream/wn/WN-csh.jdl --d
ir /tmp

        normal_check_interval           6
        retry_check_interval            3
        max_check_attempts              2
        obsess_over_service             0
}

The check_command ncg_check_native was defined in the file /etc/nagios/objects/commands.cfg as below:

define command{
        command_name                    ncg_check_native
        command_line                    $ARG1$ -H $HOSTNAME$ -t $ARG2$ $ARG3$
}

When the CREAM-CE is properly working the Nagios output is shown below:

CEDIRECT-probes.jpg

If instead the CREAM-CE is not reachable (e.g. tomcat down in the CE), you will see:

CREAM-failure.jpg

and in the detail of the error message you will recognize that the connection was refused:

CREAM-failure-d.jpg

If instead e.g. the GRIDFTP server of the CREAM-CE is down, you will see:

GRIDFTP-failure.jpg

i.e. only the probes not involving file transfers via GRIDFTP are still OK. Among the others, the ones waiting for the job reaching a terminal state will time out, while the WN-* will fail because these try to transfer the sandbox: in fact in the status information field you will recognize the GRIDFTP problem (ERROR - data_cb_read() - globus_xio: Unable to connect to cream-41.pd.infn.it:2811 )

-- LisaZangrando - 2014-01-16
-- MarcoVerlato - 2014-01-31

META FILEATTACHMENT attachment="WN-softver.sh" attr="" comment="" date="1391171091" name="WN-softver.sh" path="WN-softver.sh" size="1144" user="MarcoVerlato" version="1"
META FILEATTACHMENT attachment="WN-csh.sh" attr="" comment="" date="1391171091" name="WN-csh.sh" path="WN-csh.sh" size="603" user="MarcoVerlato" version="1"
META FILEATTACHMENT attachment="CEDIRECT-probes.jpg" attr="" comment="new CREAM-CE Nagios probes in action" date="1391174324" name="CEDIRECT-probes.jpg" path="CEDIRECT-probes.jpg" size="309312" user="MarcoVerlato" version="1"
META FILEATTACHMENT attachment="CREAM-failure.jpg" attr="" comment="Screenshots when CREAM-CE is down" date="1391175849" name="CREAM-failure.jpg" path="CREAM-failure.jpg" size="420872" user="MarcoVerlato" version="1"
META FILEATTACHMENT attachment="CREAM-failure-d.jpg" attr="" comment="Screenshots when CREAM-CE is down" date="1391175849" name="CREAM-failure-d.jpg" path="CREAM-failure-d.jpg" size="338847" user="MarcoVerlato" version="1"
META FILEATTACHMENT attachment="GRIDFTP-failure.jpg" attr="" comment="Screenshot when GRIDFTP is down" date="1391178142" name="GRIDFTP-failure.jpg" path="GRIDFTP-failure.jpg" size="357316" user="MarcoVerlato" version="1"

Revision 32014-01-17 - LisaZangrando

Line: 1 to 1
 
META TOPICPARENT name="NagiosProbes"

New CREAM-CE direct job submission metrics

Line: 165 to 165
 

cream_jobSubmit.py

Changed:
<
<
This metric submits a job directly to the selected CREAM CE and waits until the job termination by providing the final status:
>
>
This metric submits a job directly to the selected CREAM CE and waits until the job termination by providing the final status. Finally the job is purged. Moreover the stage-in and stage-out phases are both performed automatically by the CE. In particular the stage-out needs the OutputSandboxBaseDestUri="gsiftp://localhost" set in the JDL.
 
$ ./cream_jobSubmit.py --help
Line: 218 to 219
 Job terminated with status DONE-OK
Changed:
<
<
If the verbose mode is enabled, the output of the above command should be like to:
>
>
If the verbose mode is enabled, the output of the above command should be like this:
 
$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
Line: 231 to 232
 invoking jobStatus executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901 job status: DONE-OK
Added:
>
>
invoking getOutputSandbox executing command: /usr/bin/glite-ce-job-output --noint https://cream-41.pd.infn.it:8443/CREAM187996258 output sandbox dir: ./cream-41.pd.infn.it_8443_CREAM187996258
 invoking jobPurge
Changed:
<
<
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM304354901
>
>
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM187996258
 Job terminated with status DONE-OK
Added:
>
>
Notice the output sandbox dir: ./cream-41.pd.infn.it_8443_CREAM187996258. This is the output sandbox directory containing all the produced files:

$ ls -la ./cream-41.pd.infn.it_8443_CREAM187996258
total 12
drwxr-xr-x   2 root root 4096 17 gen 16:20 .
dr-xr-x---. 23 root root 4096 17 gen 16:20 ..
-rw-------   1 root root    0 17 gen 16:20 std.err
-rw-------   1 root root   26 17 gen 16:20 std.out
 

cream_jobCancel.py

This metric submits a job directly to the selected CREAM CE, waits until the job gain the REALLY-RUNNING state and then tries to cancel it. Finally it checks if the job has been correctly canceled.

Revision 22014-01-17 - LisaZangrando

Line: 1 to 1
 
META TOPICPARENT name="NagiosProbes"

New CREAM-CE direct job submission metrics

Changed:
<
<
The following metrics are a restructured version of the existing ones and try to provide a better approach for probing a CREAM CE.
>
>
The following metrics are a restructured version of the existing ones and provide a better approach for probing a CREAM CE.
 
Added:
>
>
  1. cream_serviceInfo.py - get CREAM CE service info
 
  1. cream_allowedSubmission.py - check if the submission to the selected CREAM CE is allowed
Added:
>
>
  1. cream_jobSubmit.py - submit a job directly to the selected CREAM CE
 
  1. cream_jobCancel.py - cancel an active job.
  2. cream_jobPurge.py - purge a terminted job.
Deleted:
<
<
  1. cream_jobSubmit.py - submit a job directly to the selected CREAM CE
  2. cream_serviceInfo.py - get CREAM CE service info
 
Changed:
<
<
All of them have been implemented in python and are based on the cream-cli commands. They share the same logic structure and provide useful information about the their version, usage (i.e. help) including the options list and their meaning. For example:
>
>
All of them have been implemented in python and are based on the cream-cli commands. They share the same logic structure and provide useful information about their version, usage (i.e. help) including the options list and their meaning. For example:
 
$ ./cream_serviceInfo.py 
Line: 79 to 80
 details: ['2014-01-16 15:59:57,085 FATAL - Received NULL fault; the error is due to another cause: FaultString=[connection error] - FaultCode=[SOAP-ENV:Client] - FaultSubCode=[SOAP-ENV:Client] - FaultDetail=[Connection refused]\n']
Added:
>
>

cream_serviceInfo.py

The serviceInfo.py retrieves information about the status of the CREAM CE. The help shows how the probe must be invoked:

$ ./cream_serviceInfo.py --help
Usage: cream_serviceInfo.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]

In order to get information about the CREAM service on the host https://cream-41.pd.infn.it:8443, use the following command:

$ ./cream_serviceInfo.py --url https://cream-41.pd.infn.it:8443
Interface Version  = [2.1]
Service Version    = [1.16.1 - EMI version: 3.5.1-1.el6]
Description        = [CREAM 2]
Started at         = [Thu Jan  2 19:06:51 2014]
Submission enabled = [YES]
Status             = [RUNNING]

or similary:

$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 
Interface Version  = [2.1]
Service Version    = [1.16.1 - EMI version: 3.5.1-1.el6]
Description        = [CREAM 2]
Started at         = [Thu Jan  2 19:06:51 2014]
Submission enabled = [YES]
Status             = [RUNNING]
 

cream_allowedSubmission

Changed:
<
<
This metric check if the submission to the selected CREAM CE is allowed. The help show how the probe should be invoked:
>
>
This is a simple metric which checks if the submission to the selected CREAM CE is allowed. Its usage is analogous to the above metric:
 
$ ./cream_allowedSubmission.py --help
Line: 99 to 144
  -v, --verbose verbose mode [default: False]
Changed:
<
<
Notice: the use of the --url option is equivalent of specifying both the options: --hostname and --port:
>
>
Notice: the use of the --url option is equivalent to specify both the options: --hostname and --port:
 
$ ./cream_allowedSubmission.py --hostname cream-41.pd.infn.it --port 8443
Line: 119 to 164
 
Added:
>
>

cream_jobSubmit.py

This metric submits a job directly to the selected CREAM CE and waits until the job termination by providing the final status:

$ ./cream_jobSubmit.py --help
Usage: cream_jobSubmit.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]/cream-<lrms>-<queue>
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]
  -l LRMS, --lrms=LRMS  The LRMS name (e.g.: 'lsf', 'pbs' etc)
  -q QUEUE, --queue=QUEUE
                        The queue name (e.g.: 'creamtest')
  -j JDL, --jdl=JDL     The jdl path

The --url (-u) directive must be used to target the probe to a specific CREAM CE identified by its identifier (i.e. CREAM CE ID). Alternatively is it possible to specify the CREAM CE identifier by using the --hostname , --port, --lrms and --queue options which are mutually exclusive with respect to the --url option.

Consider the JDL file hostname.jdl with the following content:

$ cat ./hostname.jdl
[
Type="Job";
JobType="Normal";
Executable = "/bin/hostname";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out","std.err"};
OutputSandboxBaseDestUri="gsiftp://localhost"
]

If verbose mode is disabled, the output should look like this:

$ ./cream_jobSubmit.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl 
Job terminated with status DONE-OK

Notice: the use of the --url option is equivalent to specify both the options: --hostname, --port --lrms and --queue:

$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl 
Job terminated with status DONE-OK

If the verbose mode is enabled, the output of the above command should be like to:

$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./hostname.jdl
['2014-01-16 13:52:57,305 DEBUG - Using certificate proxy file [/tmp/x509up_u0]\n', '2014-01-16 13:52:57,324 DEBUG - VO from certificate=[dteam]\n', '2014-01-16 13:52:57,324 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-16 13:52:57,324 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140116-135257.log]\n', '2014-01-16 13:52:57,328 INFO - certUtil::generateUniqueID() - Generated DelegationID: [99b19aafc98e11bb7956cc58d901bd860697227d]\n', '2014-01-16 13:52:59,645 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/hostname"; Type = "Job"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; StdError = "std.err" ] - JDL File=[./hostname.jdl]\n', '2014-01-16 13:53:00,022 DEBUG - Will invoke JobStart for JobID [CREAM304354901]\n', 'https://cream-41.pd.infn.it:8443/CREAM304354901\n']
job id: https://cream-41.pd.infn.it:8443/CREAM304354901
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901
job status: REALLY-RUNNING
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901
job status: DONE-OK
invoking jobPurge
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM304354901
Job terminated with status DONE-OK
 

cream_jobCancel.py

Added:
>
>
This metric submits a job directly to the selected CREAM CE, waits until the job gain the REALLY-RUNNING state and then tries to cancel it. Finally it checks if the job has been correctly canceled.
 
$ ./cream_jobCancel.py --help
Line: 142 to 261
  -j JDL, --jdl=JDL The jdl path
Added:
>
>
The job must be enough long in terms of execution time, in order to allow the probe to check the current job status and invoke the glite-ce-job-cancel. For example consider the job (i.e. hostnane.jdl) of the above metric. In this case the probe will fail because the job already terminated before the execution of the e glite-ce-job-cancel command:
 
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl 
job already terminated
Added:
>
>
Now consider the following job:
 
$ cat ./sleep.jdl 
[
Line: 159 to 283
 ]
Added:
>
>
The output of the probe should be like:
 
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./sleep.jdl 
job cancelled
Added:
>
>
or like this with --verbose option specified:
 
$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./sleep.jdl --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
Line: 188 to 315
 

cream_jobPurge.py

Added:
>
>
This metric is analogous of cream_jobCancel.py. It submits a short job (e.g. hostname.jdl), waits its termination (e.g DONE-OK) and then it tries to purge it. Finally, in order to verify the purging operation was successfully executed, the probe checks the job status by executing the glite-ce-job-status command which just in this scenario, must fail because the job doesn't exist anymore.
 
$ ./cream_jobPurge.py --help
Line: 232 to 359
 job purged
Deleted:
<
<

cream_jobSubmit.py

This metric submits a job directly to the selected CREAM CE and waits until the job termination by providing the final status:

$ ./cream_jobSubmit.py --help
Usage: cream_jobSubmit.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]/cream-<lrms>-<queue>
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]
  -l LRMS, --lrms=LRMS  The LRMS name (e.g.: 'lsf', 'pbs' etc)
  -q QUEUE, --queue=QUEUE
                        The queue name (e.g.: 'creamtest')
  -j JDL, --jdl=JDL     The jdl path

The JDL used for this example asks about the CREAM hostname:

$ cat ./hostname.jdl
[
Type="Job";
JobType="Normal";
Executable = "/bin/hostname";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out","std.err"};
OutputSandboxBaseDestUri="gsiftp://localhost"
]

The probe execution with the verbose mode disabled:

$ ./cream_jobSubmit.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl 
Job terminated with status DONE-OK

Notice: the use of the --url option is equivalent of specifying both the options: --hostname, --port --lrms and --queue:

$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl 
Job terminated with status DONE-OK

The execution of the above command with the verbose mode enabled:

$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./hostname.jdl
['2014-01-16 13:52:57,305 DEBUG - Using certificate proxy file [/tmp/x509up_u0]\n', '2014-01-16 13:52:57,324 DEBUG - VO from certificate=[dteam]\n', '2014-01-16 13:52:57,324 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-16 13:52:57,324 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140116-135257.log]\n', '2014-01-16 13:52:57,328 INFO - certUtil::generateUniqueID() - Generated DelegationID: [99b19aafc98e11bb7956cc58d901bd860697227d]\n', '2014-01-16 13:52:59,645 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/hostname"; Type = "Job"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; StdError = "std.err" ] - JDL File=[./hostname.jdl]\n', '2014-01-16 13:53:00,022 DEBUG - Will invoke JobStart for JobID [CREAM304354901]\n', 'https://cream-41.pd.infn.it:8443/CREAM304354901\n']
job id: https://cream-41.pd.infn.it:8443/CREAM304354901
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901
job status: REALLY-RUNNING
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901
job status: DONE-OK
invoking jobPurge
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM304354901
Job terminated with status DONE-OK

cream_serviceInfo.py

$ ./cream_serviceInfo.py --help
Usage: cream_serviceInfo.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]

$ ./cream_serviceInfo.py --url https://cream-41.pd.infn.it:8443
Interface Version  = [2.1]
Service Version    = [1.16.1 - EMI version: 3.5.1-1.el6]
Description        = [CREAM 2]
Started at         = [Thu Jan  2 19:06:51 2014]
Submission enabled = [YES]
Status             = [RUNNING]
  -- LisaZangrando - 2014-01-16

Revision 12014-01-16 - LisaZangrando

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="NagiosProbes"

New CREAM-CE direct job submission metrics

The following metrics are a restructured version of the existing ones and try to provide a better approach for probing a CREAM CE.

  1. cream_allowedSubmission.py - check if the submission to the selected CREAM CE is allowed
  2. cream_jobCancel.py - cancel an active job.
  3. cream_jobPurge.py - purge a terminted job.
  4. cream_jobSubmit.py - submit a job directly to the selected CREAM CE
  5. cream_serviceInfo.py - get CREAM CE service info

All of them have been implemented in python and are based on the cream-cli commands. They share the same logic structure and provide useful information about the their version, usage (i.e. help) including the options list and their meaning. For example:

$ ./cream_serviceInfo.py 
Usage: cream_serviceInfo.py [options]

cream_serviceInfo.py: error: Specify either option -u URL or option -H HOSTNAME (and -p PORT) or read the help (-h)

$ ./cream_serviceInfo.py --help
Usage: cream_serviceInfo.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]

$ ./cream_serviceInfo.py --version
cream_serviceInfo v.1.0

The interaction with the CREAM CE requires the use of a valid VOMS proxy expressed by the X509_USER_PROXY env variable or through the --proxy option. All metrics check the existence of the proxy file and calculate the time left. In case of error, the related error message will be thrown:

$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 --proxy /tmp/x509up_u0 --verbose
Proxy file not found or not readable

The verbose mode (--verbose) could be enabled to each metric. It provides several details about the probe execution itself by highlighting the internal commands:

$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
invoking service info
executing command: /usr/bin/glite-ce-service-info cream-41.pd.infn.it:8443
Interface Version  = [2.1]
Service Version    = [1.16.1 - EMI version: 3.5.1-1.el6]
Description        = [CREAM 2]
Started at         = [Thu Jan  2 19:06:51 2014]
Submission enabled = [YES]
Status             = [RUNNING]

In case of mistakes on the selected options or on their values, the probe tries to explain what is wrong. For example the cream_serviceInfo doesn't support the --queue option:

$ ./cream_serviceInfo.py --hostname cream-41.pd.infn.it --port 8443 --queue creamtest1 --verbose
Usage: cream_serviceInfo.py [options]

cream_serviceInfo.py: error: no such option: --queue

In case of the errors in interacting with the CREAM CE, useful details will be provided about the failure:

$ ./cream_allowedSubmission.py --url https://cream-43.pd.infn.it:8443
command '/usr/bin/glite-ce-allowed-submission cream-43.pd.infn.it:8443' failed: return_code=1
details: ['2014-01-16 15:59:57,085 FATAL - Received NULL fault; the error is due to another cause: FaultString=[connection error] - FaultCode=[SOAP-ENV:Client] - FaultSubCode=[SOAP-ENV:Client] - FaultDetail=[Connection refused]\n']

cream_allowedSubmission

This metric check if the submission to the selected CREAM CE is allowed. The help show how the probe should be invoked:

$ ./cream_allowedSubmission.py --help
Usage: cream_allowedSubmission.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]

Notice: the use of the --url option is equivalent of specifying both the options: --hostname and --port:

$ ./cream_allowedSubmission.py --hostname cream-41.pd.infn.it --port 8443
ENABLED

$ ./cream_allowedSubmission.py --url https://cream-41.pd.infn.it:8443
ENABLED

The verbose mode highlights the internal commands:

$ ./cream_allowedSubmission.py --url https://cream-41.pd.infn.it:8443 --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
invoking allowedSubmission
executing command: /usr/bin/glite-ce-allowed-submission cream-41.pd.infn.it:8443
ENABLED

cream_jobCancel.py

$ ./cream_jobCancel.py --help
Usage: cream_jobCancel.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]/cream-<lrms>-<queue>
  -l LRMS, --lrms=LRMS  The LRMS name (e.g.: 'lsf', 'pbs' etc)
  -q QUEUE, --queue=QUEUE
                        The queue name (e.g.: 'creamtest')
  -j JDL, --jdl=JDL     The jdl path

$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl 
job already terminated

$ cat ./sleep.jdl 
[
Type="Job";
JobType="Normal";
Executable = "/bin/sleep";
Arguments = "200";
StdOutput = "cream.out";
StdError = "cream.out";
]

$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./sleep.jdl 
job cancelled

$ ./cream_jobCancel.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./sleep.jdl --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./sleep.jdl
['2014-01-16 17:22:49,728 DEBUG - Using certificate proxy file [/tmp/x509up_u0]\n', '2014-01-16 17:22:49,744 DEBUG - VO from certificate=[dteam]\n', '2014-01-16 17:22:49,745 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-16 17:22:49,745 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140116-172249.log]\n', '2014-01-16 17:22:49,747 INFO - certUtil::generateUniqueID() - Generated DelegationID: [69e3bdaf4e818e1f71f1b7ff442f74583c869b84]\n', '2014-01-16 17:22:52,165 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "cream.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/sleep"; Type = "Job"; Arguments = "200"; JobType = "Normal"; StdError = "cream.out" ] - JDL File=[./sleep.jdl]\n', '2014-01-16 17:22:52,513 DEBUG - Will invoke JobStart for JobID [CREAM437649288]\n', 'https://cream-41.pd.infn.it:8443/CREAM437649288\n']
job id: https://cream-41.pd.infn.it:8443/CREAM437649288
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM437649288
job status: REALLY-RUNNING
invoking jobCancel
executing command: /usr/bin/glite-ce-job-cancel --noint https://cream-41.pd.infn.it:8443/CREAM437649288
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM437649288
job status: REALLY-RUNNING
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM437649288
job status: CANCELLED
invoking jobPurge
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM437649288
job cancelled

cream_jobPurge.py

$ ./cream_jobPurge.py --help
Usage: cream_jobPurge.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]/cream-<lrms>-<queue>
  -l LRMS, --lrms=LRMS  The LRMS name (e.g.: 'lsf', 'pbs' etc)
  -q QUEUE, --queue=QUEUE
                        The queue name (e.g.: 'creamtest')
  -j JDL, --jdl=JDL     The jdl path

$ ./cream_jobPurge.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl 
job purged

$ ./cream_jobPurge.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./hostname.jdl
['2014-01-16 14:02:48,282 DEBUG - Using certificate proxy file [/tmp/x509up_u0]\n', '2014-01-16 14:02:48,301 DEBUG - VO from certificate=[dteam]\n', '2014-01-16 14:02:48,301 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-16 14:02:48,302 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140116-140248.log]\n', '2014-01-16 14:02:48,305 INFO - certUtil::generateUniqueID() - Generated DelegationID: [85a80615fc8046baf6dce2a27b708fc82ecbce55]\n', '2014-01-16 14:02:51,105 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/hostname"; Type = "Job"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; StdError = "std.err" ] - JDL File=[./hostname.jdl]\n', '2014-01-16 14:02:51,504 DEBUG - Will invoke JobStart for JobID [CREAM691625071]\n', 'https://cream-41.pd.infn.it:8443/CREAM691625071\n']
job id: https://cream-41.pd.infn.it:8443/CREAM691625071
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM691625071
job status: DONE-OK
invoking jobPurge
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM691625071
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM691625071
job purged

cream_jobSubmit.py

This metric submits a job directly to the selected CREAM CE and waits until the job termination by providing the final status:

$ ./cream_jobSubmit.py --help
Usage: cream_jobSubmit.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]/cream-<lrms>-<queue>
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]
  -l LRMS, --lrms=LRMS  The LRMS name (e.g.: 'lsf', 'pbs' etc)
  -q QUEUE, --queue=QUEUE
                        The queue name (e.g.: 'creamtest')
  -j JDL, --jdl=JDL     The jdl path

The JDL used for this example asks about the CREAM hostname:

$ cat ./hostname.jdl
[
Type="Job";
JobType="Normal";
Executable = "/bin/hostname";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out","std.err"};
OutputSandboxBaseDestUri="gsiftp://localhost"
]

The probe execution with the verbose mode disabled:

$ ./cream_jobSubmit.py --url https://cream-41.pd.infn.it:8443/cream-lsf-creamtest1 --jdl ./hostname.jdl 
Job terminated with status DONE-OK

Notice: the use of the --url option is equivalent of specifying both the options: --hostname, --port --lrms and --queue:

$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl 
Job terminated with status DONE-OK

The execution of the above command with the verbose mode enabled:

$ ./cream_jobSubmit.py --hostname cream-41.pd.infn.it --port 8443 --lrms lsf --queue creamtest1 --jdl ./hostname.jdl --verbose
executing command: /usr/bin/voms-proxy-info -timeleft
executing command: /usr/bin/glite-ce-job-submit -d -a -r cream-41.pd.infn.it:8443/cream-lsf-creamtest1 ./hostname.jdl
['2014-01-16 13:52:57,305 DEBUG - Using certificate proxy file [/tmp/x509up_u0]\n', '2014-01-16 13:52:57,324 DEBUG - VO from certificate=[dteam]\n', '2014-01-16 13:52:57,324 WARN - No configuration file suitable for loading. Using built-in configuration\n', '2014-01-16 13:52:57,324 DEBUG - Logfile is [/tmp/glite_cream_cli_logs/glite-ce-job-submit_CREAM_root_20140116-135257.log]\n', '2014-01-16 13:52:57,328 INFO - certUtil::generateUniqueID() - Generated DelegationID: [99b19aafc98e11bb7956cc58d901bd860697227d]\n', '2014-01-16 13:52:59,645 DEBUG - Registering to [https://cream-41.pd.infn.it:8443/ce-cream/services/CREAM2] JDL=[ StdOutput = "std.out"; BatchSystem = "lsf"; QueueName = "creamtest1"; Executable = "/bin/hostname"; Type = "Job"; JobType = "Normal"; OutputSandboxBaseDestUri = "gsiftp://localhost"; OutputSandbox = { "std.out","std.err" }; StdError = "std.err" ] - JDL File=[./hostname.jdl]\n', '2014-01-16 13:53:00,022 DEBUG - Will invoke JobStart for JobID [CREAM304354901]\n', 'https://cream-41.pd.infn.it:8443/CREAM304354901\n']
job id: https://cream-41.pd.infn.it:8443/CREAM304354901
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901
job status: REALLY-RUNNING
invoking jobStatus
executing command: /usr/bin/glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM304354901
job status: DONE-OK
invoking jobPurge
executing command: /usr/bin/glite-ce-job-purge --noint https://cream-41.pd.infn.it:8443/CREAM304354901
Job terminated with status DONE-OK

cream_serviceInfo.py

$ ./cream_serviceInfo.py --help
Usage: cream_serviceInfo.py [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname=HOSTNAME
                        The hostname of the CREAM service.
  -p PORT, --port=PORT  The port of the service. [default: none]
  -x PROXY, --proxy=PROXY
                        The proxy path
  -v, --verbose         verbose mode [default: False]
  -u URL, --url=URL     The status endpoint URL of the service. Example:
                        https://<host>[:<port>]

$ ./cream_serviceInfo.py --url https://cream-41.pd.infn.it:8443
Interface Version  = [2.1]
Service Version    = [1.16.1 - EMI version: 3.5.1-1.el6]
Description        = [CREAM 2]
Started at         = [Thu Jan  2 19:06:51 2014]
Submission enabled = [YES]
Status             = [RUNNING]

-- LisaZangrando - 2014-01-16

 
This site is powered by the TWiki collaboration platformCopyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback