---+ CREAM-CE direct job submission metrics

%TOC%

These metrics probe a CREAM CE using the CREAM CLI commands.

   1 *emi.cream.CREAMCEDJS-DirectJobState*. Direct job submission to CREAM-CE.
   1 *emi.cream.CREAMCEDJS-DirectJobMonit*. Babysit submitted grid jobs.
   1 *emi.cream.CREAMCEDJS-DelegateProxy*. Delegate proxy to CREAM CE.
   1 *emi.cream.CREAMCEDJS-DirectJobCancel*. Cancel active job.
   1 *emi.cream.CREAMCEDJS-ServiceInfo*. Get CREAM CE service info.
   1 *emi.cream.CREAMCEDJS-SubmitAllowed*. Check if submission to the CREAM CE is allowed.
   1 *emi.cream.CREAMCEDJS-DirectJobSubmit*. Passive. Final status of direct job submission to CREAM CE.

---++ emi.cream.CREAMCEDJS-DirectJobState

Direct submission to a CREAM-CE, which can be chosen using these parameters:

| --resource <URI> | CREAM CE to send job to. Format: <host>[:<port>]/cream-<lrms-system-name>-<queue-name> <br />If not given, resource discovery will be performed. |
| --ldap-uri <URI> | Format [ldap://]hostname[:port[/]] (Default: ldap://sam-bdii.cern.ch:2170) |
| --prev-status <0-3> | Previous Nagios status of the metric. |

As noted in the table, if the destination CREAM-CE is not given explicitly, resource discovery is performed using the given LDAP server.

At the moment the JDL template used for submission is very simple:

<verbatim>
Type="Job";
JobType="Normal";
Executable = "<jdlExecutable>";
Arguments = "<jdlArguments>";
StdOutput = "cream.out";
StdError = "cream.out";
OutputSandbox = {"cream.out"};
OutputSandboxBaseDestUri="gsiftp://localhost";
</verbatim>

where the Executable is the command "/bin/hostname".

---++ emi.cream.CREAMCEDJS-DirectJobMonit

Monitors submitted grid jobs. The implementation is threaded: one thread per monitored resource, up to a maximum of 10 threads.

While the job is not in a terminal state, the metric passively updates emi.cream.CREAMCEDJS-DirectJobState with the latest state of the job according to CREAM.
When the job enters a terminal state or is cancelled, the metric updates both emi.cream.CREAMCEDJS-DirectJobState and emi.cream.CREAMCEDJS-DirectJobSubmit with the final job status. The latter metrics are updated (as passive checks) either via the Nagios command file or NSCA. emi.cream.CREAMCEDJS-DirectJobSubmit is the metric that goes to the Metric Store Database.

---++ Test

To test the probe you have to create a valid proxy.

---+++ JobState + JobMonit

First you have to "submit" a job:

*/usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo <vo> -x <path of the proxy> -H <CREAM hostname> -m emi.cream.CREAMCEDJS-DirectJobState --resource <CREAM CE url>*

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-DirectJobState --resource cream-30.pd.infn.it:8443/cream-pbs-cert
OK: Job was submitted [https://cream-30.pd.infn.it:8443/CREAM126240562].
OK: Job was submitted [https://cream-30.pd.infn.it:8443/CREAM126240562].
Testing from: cream-48.pd.infn.it
DN: /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy
VOMS FQANs: /dteam/Role=NULL/Capability=NULL, /dteam/NGI_IT/Role=NULL/Capability=NULL
https://cream-30.pd.infn.it:8443/CREAM126240562
</verbatim>

Then you can monitor the job:

*/usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo <vo> -x <path of the proxy> -H <CREAM hostname> -m emi.cream.CREAMCEDJS-DirectJobMonit --pass-check-dest active*

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-DirectJobMonit --pass-check-dest active
OK: DONE.
metric results >>> <cream-30.pd.infn.it,emi.cream.CREAMCEDJS-DirectJobSubmit-dteam>
metric results >>> <cream-30.pd.infn.it,emi.cream.CREAMCEDJS-DirectJobState-dteam>
OK: Jobs processed - 1
OK: Jobs processed - 1
DONE : 1|jobs_processed=1;; DONE=1;; REALLY-RUNNING=0;; RUNNING=0;; REGISTERED=0;; PENDING=0;; IDLE=0;; HELD=0;; CANCELLED=0;; ABORTED=0;; UNKNOWN=0;; MISSED=0;; UNDETERMINED=0;; unknown=0;1;2
</verbatim>

When the job finishes, the output file is retrieved and stored in /var/lib/gridprobes/<VO or FQAN>/emi.cream/CREAMCEDJS/<hostname>/jobOutput. The output file should contain the hostname of the worker node where the job ran.

<verbatim>
[ale@cream-48 ~]$ cat /var/lib/gridprobes/dteam/emi.cream/CREAMCEDJS/cream-30.pd.infn.it/jobOutput/cream-30.pd.infn.it_8443_CREAM126240562/cream.out
cream-wn-030.pn.pd.infn.it
</verbatim>

---+++ JobState + JobMonit + JobCancel

To test the "Cancel" metric easily, modify the JDL template so that the job runs longer:

<verbatim>
[ale@cream-48 ~]$ cat /usr/libexec/grid-monitoring/probes/emi.cream/CREAMDJS-jdl.template
[
Type="Job";
JobType="Normal";
#Executable = "<jdlExecutable>";
Executable = "/bin/sleep";
#Arguments = "<jdlArguments>";
Arguments = "100";
StdOutput = "cream.out";
StdError = "cream.out";
OutputSandbox = {"cream.out"};
OutputSandboxBaseDestUri="gsiftp://localhost";
]
</verbatim>

Then you have to "submit" the job:

*/usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo <vo> -x <path of the proxy> -H <CREAM hostname> -m emi.cream.CREAMCEDJS-DirectJobState --resource <CREAM CE url>*

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-DirectJobState --resource cream-30.pd.infn.it:8443/cream-pbs-cert
OK: Job was submitted [https://cream-30.pd.infn.it:8443/CREAM226348631].
OK: Job was submitted [https://cream-30.pd.infn.it:8443/CREAM226348631].
Testing from: cream-48.pd.infn.it
DN: /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy
VOMS FQANs: /dteam/Role=NULL/Capability=NULL, /dteam/NGI_IT/Role=NULL/Capability=NULL
https://cream-30.pd.infn.it:8443/CREAM226348631
</verbatim>

Monitor the job until it reaches the _RUNNING_ state:

*/usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo <vo> -x <path of the proxy> -H <CREAM hostname> -m emi.cream.CREAMCEDJS-DirectJobMonit --pass-check-dest active*

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-DirectJobMonit --pass-check-dest active
metric results >>> <cream-30.pd.infn.it,emi.cream.CREAMCEDJS-DirectJobState-dteam>
OK: [RUNNING] https://cream-30.pd.infn.it:8443/CREAM226348631
OK: [RUNNING] https://cream-30.pd.infn.it:8443/CREAM226348631
glite-ce-job-status https://cream-30.pd.infn.it:8443/CREAM226348631
****** JobID=[https://cream-30.pd.infn.it:8443/CREAM226348631]
Status = [RUNNING]
OK: Jobs processed - 1
OK: Jobs processed - 1
[RUNNING] : 1|jobs_processed=1;; DONE=0;; REALLY-RUNNING=0;; RUNNING=1;; REGISTERED=0;; PENDING=0;; IDLE=0;; HELD=0;; CANCELLED=0;; ABORTED=0;; UNKNOWN=0;; MISSED=0;; UNDETERMINED=0;; unknown=0;1;2
</verbatim>

Then you can cancel it:

*/usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo <vo> -x <path of the proxy> -H <CREAM hostname> -m emi.cream.CREAMCEDJS-DirectJobCancel*

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-DirectJobCancel
OK: job cancelled
OK: job cancelled
Testing from: cream-48.pd.infn.it
DN: /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy
VOMS FQANs: /dteam/Role=NULL/Capability=NULL, /dteam/NGI_IT/Role=NULL/Capability=NULL
Job cancellation request sent:
glite-ce-job-cancel --noint https://cream-30.pd.infn.it:8443/CREAM226348631
Job bookkeeping files deleted.
</verbatim>

You can then manually check that the final status of the job is _CANCELLED_, as expected:

<verbatim>
[ale@cream-48 ~]$ glite-ce-job-status https://cream-30.pd.infn.it:8443/CREAM226348631
****** JobID=[https://cream-30.pd.infn.it:8443/CREAM226348631]
Status = [CANCELLED]
ExitCode = []
Description = [Cancelled by user]
</verbatim>

---+++ DelegateProxy

To test delegation use this command:

*/usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo <vo> -x <path of the proxy> -H <CREAM hostname> -m emi.cream.CREAMCEDJS-DelegateProxy*

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-DelegateProxy
OK: [Delegated]
OK: [Delegated]
glite-ce-delegate-proxy -e cream-30.pd.infn.it:8443 dteam-551a6
2011-11-10 13:40:24,178 NOTICE - Proxy with delegation id [dteam-551a6] succesfully delegated to endpoint [https://cream-30.pd.infn.it:8443//ce-cream/services/gridsite-delegation]
</verbatim>

You can check that the delegation is correct by submitting a job with the returned delegation id:

<verbatim>
[ale@cream-48 ~]$ glite-ce-job-submit -D dteam-551a6 -r cream-30.pd.infn.it:8443/cream-pbs-cert test.jdl
https://cream-30.pd.infn.it:8443/CREAM290551353
[ale@cream-48 ~]$ glite-ce-job-status https://cream-30.pd.infn.it:8443/CREAM290551353
****** JobID=[https://cream-30.pd.infn.it:8443/CREAM290551353]
Status = [DONE-OK]
ExitCode = [0]
</verbatim>

---+++ ServiceInfo

To verify this metric simply issue this command:

*/usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo <vo> -x <path of the proxy> -H <CREAM hostname> -m emi.cream.CREAMCEDJS-ServiceInfo*

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-ServiceInfo
OK: success
OK: success
Testing from: cream-48.pd.infn.it
DN: /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy
VOMS FQANs: /dteam/Role=NULL/Capability=NULL, /dteam/NGI_IT/Role=NULL/Capability=NULL
success
description = CREAM 2
doesAcceptNewJobSubmissions = True
interfaceVersion = 2.1
property = [(Property){ name = "cemon_url" value = "NA" }]
serviceVersion = 1.13
startupTime = 2011-10-10 14:44:12.000638
status = RUNNING
</verbatim>

---+++ SubmitAllowed

To verify this metric simply issue this command:

*/usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo <vo> -x <path of the proxy> -H <CREAM hostname> -m emi.cream.CREAMCEDJS-SubmitAllowed*

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-SubmitAllowed
OK: [Submission Allowed]
OK: [Submission Allowed]
glite-ce-allowed-submission cream-30.pd.infn.it:8443
Job Submission to this CREAM CE is enabled
</verbatim>

You can also disable submission to the CREAM CE (you MUST be an _admin_ for this CE):

<verbatim>
[ale@cream-48 ~]$ glite-ce-disable-submission cream-30.pd.infn.it
Operation for disabling new submissions succeeded
</verbatim>

Then verify that the metric returns the correct message:

<verbatim>
[ale@cream-48 ~]$ /usr/libexec/grid-monitoring/probes/emi.cream/CREAMCEDJS-probe --vo dteam -x /tmp/x509up_u501 -H cream-30.pd.infn.it -m emi.cream.CREAMCEDJS-SubmitAllowed
OK: [Submission Allowed]
OK: [Submission Allowed]
glite-ce-allowed-submission cream-30.pd.infn.it:8443
Job Submission to this CREAM CE is disabled
</verbatim>
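
When running these tests repeatedly, it can help to read the job counters from the DirectJobMonit output automatically rather than by eye. The following is a minimal sketch, not part of the probe: it assumes the summary and the performance data appear on a single line in the usual Nagios layout (summary, then "|", then label=value;warn;crit tokens), as in the sample output in the JobState + JobMonit test. The function name =parse_probe_perfdata= is hypothetical.

```python
def parse_probe_perfdata(line):
    """Split 'summary|label=value;warn;crit ...' into (summary, counters).

    Assumption: the layout matches the DirectJobMonit sample output on
    this page; only the integer value of each token is kept.
    """
    summary, _, perfdata = line.partition("|")
    counters = {}
    for token in perfdata.split():
        label, _, rest = token.partition("=")
        # The value is the first ';'-separated field, e.g. "1" in "1;;".
        counters[label] = int(rest.split(";")[0])
    return summary.strip(), counters


# Sample line copied from the DirectJobMonit test output on this page.
sample = (
    "DONE : 1|jobs_processed=1;; DONE=1;; REALLY-RUNNING=0;; RUNNING=0;; "
    "REGISTERED=0;; PENDING=0;; IDLE=0;; HELD=0;; CANCELLED=0;; ABORTED=0;; "
    "UNKNOWN=0;; MISSED=0;; UNDETERMINED=0;; unknown=0;1;2"
)

summary, counters = parse_probe_perfdata(sample)
print(summary)
print(counters["jobs_processed"], counters["DONE"], counters["RUNNING"])
```

Note that labels are case-sensitive here, so =UNKNOWN= (a job state counter) and =unknown= (the last token, which carries thresholds) are kept as separate entries.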
Topic revision: r5 - 2011-11-10 - AlessioGianelle
Copyright © 2008-2019 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.