Normal jobs through
Dag jobs through:
Collection jobs through:
Parametric jobs through:
[
  JobType = "parametric";
  Executable = "/usr/bin/env";
  Environment = {"MYPATH_PARAM_=$PATH:/bin:/usr/bin:$HOME"};
  StdOutput = "echo_PARAM_.out";
  StdError = "echo_PARAM_.err";
  OutputSandbox = {"echo_PARAM_.out","echo_PARAM_.err"};
  Parameters = 5;
  usertags = [ jdl = "parametric" ];
]
Bulk jobs sent both through ICE and JC and RetryCount = 0; :
Perusal jobs through:
MPICH jobs:
BrokerInfo
Resubmission
Job Recovery
Prologue and Epilogue jobs

[root@wms007 jobdir]# service gLite stop
[...]
[root@wms007 workload_manager]# pwd
/var/glite/workload_manager
[root@wms007 workload_manager]# ls
ismdump.fl jobdir
[root@wms007 workload_manager]# rm -rf jobdir
[root@wms007 workload_manager]# ls
ismdump.fl

[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm start
Starting LogMonitor... [ OK ]
[root@wms007 workload_manager]# ls
ismdump.fl jobdir
[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm status
Logmonitor running...

[root@wms007 jobcontrol]# pwd
/var/glite/jobcontrol
[root@wms007 jobcontrol]# rm -rf jobdir
[root@wms007 jobcontrol]# ls
condorio submit

[root@wms007 jobcontrol]# /opt/glite/etc/init.d/glite-wms-jc start JobController
Starting JobController daemon(s)
Starting JobController... [ OK ]
[root@wms007 jobcontrol]# ls
condorio jobdir lock submit
[root@wms007 ice]# /opt/glite/etc/init.d/glite-wms-jc status JobController
JobController running in pid: 3625

[root@wms007 ice]# pwd
/var/glite/ice
[root@wms007 ice]# ls
jobdir persist_dir
[root@wms007 ice]# rm -rf jobdir/
[root@wms007 ice]# ls
persist_dir

[root@wms007 ice]# /opt/glite/etc/init.d/glite-wms-ice start
starting ICE... ok
[root@wms007 ice]# ls
jobdir persist_dir
[root@wms007 ice]# /opt/glite/etc/init.d/glite-wms-ice status
/opt/glite/bin/glite-wms-ice-safe (pid 22783) is running...

[root@wms007 glite]# ls workload_manager/ jobcontrol/ ice/
ice/:
persist_dir
jobcontrol/:
condorio submit
workload_manager/:
ismdump.fl

[root@wms007 glite]# /opt/glite/etc/init.d/glite-wms-wm start
starting workload manager... ok
[root@wms007 glite]# ls workload_manager/ jobcontrol/ ice/
ice/:
jobdir persist_dir
jobcontrol/:
condorio jobdir submit
workload_manager/:
ismdump.fl jobdir
[root@wms007 glite]# /opt/glite/etc/init.d/glite-wms-wm status
/opt/glite/bin/glite-wms-workload_manager (pid 23259) is running...
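
The listings above show `jobdir` reappearing under `workload_manager/`, `jobcontrol/` and `ice/` once the daemons are started. A minimal shell sketch for checking this after a restart follows. `check_jobdir` is a hypothetical helper, not part of gLite, and it assumes that every jobdir contains the `new`, `old` and `tmp` subdirectories that this report lists for one jobdir; the paths are the ones used in the sessions above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with gLite): report whether a jobdir
# contains the new/, old/ and tmp/ subdirectories.
# Assumption: all three jobdirs share this layout.
check_jobdir() {
    dir="$1"
    for sub in new old tmp; do
        if [ ! -d "$dir/$sub" ]; then
            echo "MISSING $dir/$sub"
            return 1
        fi
    done
    echo "OK $dir"
}

# Paths as they appear in the sessions above; adjust for the local install.
for d in /var/glite/workload_manager/jobdir \
         /var/glite/jobcontrol/jobdir \
         /var/glite/ice/jobdir; do
    check_jobdir "$d" || true
done
```

The loop only reports and never stops at the first missing directory, so one run shows the state of all three jobdirs.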

[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-jc start JobController
Starting JobController daemon(s)
Please set Input parameter in glite_wms.conf - JC section [FAILED]
[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-jc status JobController
JobController stopped.

[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm start
Starting LogMonitor...Please set Input parameter in glite_wms.conf - WM section [FAILED]
[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm status
LogMonitor stopped.

[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-ice start
starting ICE... failure
[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-ice status
/opt/glite/bin/glite-wms-ice-safe is not running

[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-wm start
starting workload manager... Please set Input parameter in - WM section
Please set DispatcherType parameter in - WM section
Please set Input parameter in - JC section
Please set InputType parameter in - JC section
Please set Input parameter in - ICE section
Please set InputType parameter in - ICE section
failure
[root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-wm status
/opt/glite/bin/glite-wms-workload_manager is not running

2010-03-22 16:47:50,496 INFO - scoped_timer iceCommandEventQuery::execute() - SOAP Connection for QueryEvent - TID=[150673032] 1269272870.288498 1269272870.496129 0.207631
2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::execute() - TID=[150673032] There're [2] event(s) for the couple DN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] CEUrl [https://cream-30.pd.infn.it:8443/ce-cream/services/CREAM2]
2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::execute() - TID=[150673032] Database ID=[1261041182000]
2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::execute() - TID=[150673032] Exec time ID=[3]
2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::processEventsForJob() - TID=[150673032] Processing [2] event(s) for Job [gridJobID="https://devel17.cnaf.infn.it:9000/uKbQNcbh7kIohBz6bDMNZQ" CREAMJobID="https://cream-30.pd.infn.it:8443/CREAM396193798"] userdn [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] and ce url [https://cream-30.pd.infn.it:8443/ce-cream/services/CREAM2].
2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::processEventsForJob() - TID=[150673032] EventID [685143] timestsamp [1269272804]
2010-03-22 16:47:50,496 INFO - scoped_timer iceCommandEventQuery::processSingleEvent - TID=[150673032] InsertStat 1269272870.496682 1269272870.496864 0.000182

2010-03-22 16:55:55,397 INFO - scoped_timer iceCommandEventQuery::execute() - SOAP Connection for QueryEvent - TID=[150673032] 1269273355.242918 1269273355.397806 0.154888
2010-03-22 16:55:55,397 ERROR - iceCommandEventQuery::execute() - TID=[150673032] Cannot query events for UserDN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] CEUrl [https://cream-34.pd.infn.it:8443/ce-cream/services/CREAM2]. Exception Internal ex is [Received NULL fault; the error is due to another cause: FaultString=[No such operation 'QueryEventRequest'] - FaultCode=["http://xml.apache.org/axis/":Client] - FaultSubCode=["http://xml.apache.org/axis/":Client] - FaultDetail=[<ns2:hostname>cream-34.pd.infn.it</ns2:hostname>]]
2010-03-22 16:55:55,398 WARN - iceCommandEventQuery::execute() - TID=[150673032] Not present QueryEvent on CE [https://cream-34.pd.infn.it:8443/ce-cream/services/CREAM2]. Falling back to old-style StatusPoller.
2010-03-22 16:55:55,398 INFO - iceCommandStatusPoller::execute() - Getting [100] jobs to poll for user [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] creamurl [https://cream-34.pd.infn.it:8443/ce-cream/services/CREAM2]
2010-03-22 16:55:55,398 DEBUG - iceCommandStatusPoller::get_jobs_to_poll() - Collecting jobs to poll for userdn=[/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] creamurl=[https://cream-34.pd.infn.it:8443/ce-cream/services/CREAM2]. LIMIT set to [100]...

[root@wms007 ~]# ps ax | grep ice
1283 pts/2 S+ 0:00 grep ice
30985 ? Ss 0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
30989 ? S 0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
30990 ? Sl 0:15 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
[root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice stop
stopping ICE... ok
[root@wms007 ~]# ps ax | grep ice
1321 pts/2 S+ 0:00 grep ice
[root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice start
starting ICE... ok
[root@wms007 ~]# ps ax | grep ice
1353 ? Ss 0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
1357 ? S 0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
1358 ? Sl 0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
1398 pts/2 S+ 0:00 grep ice
[root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice restart
stopping ICE... ok
starting ICE... ok
[root@wms007 ~]# ps ax | grep ice
1433 ? Ss 0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
1437 ? S 0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
1438 ? Sl 0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
1470 pts/2 S+ 0:00 grep ice

[root@wms007 ~]# ll /var/glite/ice/persist_dir/ice.db
-rw-r--r-- 1 glite glite 1280000 Mar 22 17:05 /var/glite/ice/persist_dir/ice.db
[root@wms007 ~]# /opt/glite/bin/queryDb -c glite_wms.conf -s RUNNING,REALLY_RUNNING
0 item(s) found
[root@wms007 ~]# ll /var/glite/ice/persist_dir/ice.db
-rw-r--r-- 1 glite glite 1280000 Mar 22 17:05 /var/glite/ice/persist_dir/ice.db

[root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice status
/opt/glite/bin/glite-wms-ice-safe (pid 1433) is running...
[root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice stop
stopping ICE... ok
[root@wms007 ~]# ps ax | grep ice
19866 pts/2 S+ 0:00 grep ice
[root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice start
starting ICE... ok
[root@wms007 ~]# ps ax | grep ice
19899 ? Ss 0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
19903 ? S 0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
19904 ? Sl 0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
19932 pts/2 S+ 0:00 grep ice
[root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice stop
stopping ICE... ok
[root@wms007 ~]# ps ax | grep ice
19978 pts/2 S+ 0:00 grep ice
[root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice start
starting ICE... ok
[root@wms007 ~]# ps ax | grep ice
20009 ? Ss 0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
20013 ? S 0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
20014 ? Sl 0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
20046 pts/2 S+ 0:00 grep ice

2010-03-23 10:20:37,696 INFO - iceLBLogger::logEvent() - Job Done Failed Event, ExitCode=[0], FailureReason=[Proxy is expired; /opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent(): LB server (bkserver,lbproxy) store protocol error;; Logging library ERROR: LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error Transport endpoint is not connected;; edg_wll_gss_connect();; System Error: Connection refused) /opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent(): LB server (bkserver,lbproxy) store protocol error;; Logging library ERROR: LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error Transport endpoint is not connected;; edg_wll_gss_connect();; System Error: Connection refused) Proxy expired: job killed Terminated Master process killed] - [gridJobID="https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA" CREAMJobID="https://ce202.cern.ch:8443/CREAM030114428"]
2010-03-23 10:20:37,817 DEBUG - iceLBContext::testCode() - L&B call succeeded.
2010-03-23 10:20:37,828 ERROR - Ice::resubmit_job() - Will NOT resubmit job [gridJobID="https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA" CREAMJobID="https://ce202.cern.ch:8443/CREAM030114428"] because it's Input Sandbox proxy file is not valid: The proxy is EXPIRED!
2010-03-23 10:20:37,828 INFO - iceLBContext::setLoggingJob - Setting log job to jobid=[https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA] LB server=[devel17.cnaf.infn.it:9000] (port is not used, actually...)
2010-03-23 10:20:37,828 INFO - iceLBLogger::logEvent() - Job Aborted Event, reason=[Input sandbox's proxy is missing. Cannot resubmit job] - [gridJobID="https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA" CREAMJobID="https://ce202.cern.ch:8443/CREAM030114428"]

*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA
Current Status: Aborted
Logged Reason(s):
- Proxy is expired; /opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent(): LB server (bkserver,lbproxy) store protocol error;; Logging library ERROR: LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error Transport endpoint is not connected;; edg_wll_gss_connect();; System Error: Connection refused) /opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent(): LB server (bkserver,lbproxy) store protocol error;; Logging library ERROR: LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error Transport endpoint is not connected;; edg_wll_gss_connect();; System Error: Connection refused) Proxy expired: job killed Terminated Master process killed
Status Reason: Input sandbox's proxy is missing. Cannot resubmit job
Destination: ce202.cern.ch:8443/cream-lsf-grid_dteam
Submitted: Tue Mar 23 09:49:42 2010 CET
*************************************************************

2010-03-23 15:55:43,604 DEBUG - iceCommandSubmit::try_to_submit() - TID=[168434952] Going to START CreamJobID [https://cream-32.pd.infn.it:8443/CREAM036926381] related to GridJobID [https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q]...

2010-03-23 15:55:45,760 DEBUG - ICE VersionID is [Fri Mar 19 13:53:17 CET 2010] ProcessID=[23579]
2010-03-23 15:55:45,760 INFO - glite-wms-ice::main() - Host certificate is [/home/glite/.certs/hostcert.pem]
2010-03-23 15:55:45,817 DEBUG - iceThreadPool::iceThreadPool(ICE Submission Pool) - Creating 10 worker threads
2010-03-23 15:55:45,819 DEBUG - iceThreadPool::iceThreadPool(ICE Poller Pool) - Creating 5 worker threads
[...]
2010-03-23 15:55:48,967 INFO - iceCommandSubmit::execute() - TID=[144321160] This request is a Submission...
2010-03-23 15:55:48,968 INFO - iceCommandSubmit::try_to_submit() - TID=[144321160] GridJobID [https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q] has already been REGISTERED. Will only START it...
2010-03-23 15:55:48,968 DEBUG - iceCommandSubmit::try_to_submit() - TID=[144321160] Going to START CreamJobID [https://cream-32.pd.infn.it:8443/CREAM036926381] related to GridJobID [https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q]...
2010-03-23 15:55:49,154 INFO - iceLBContext::setLoggingJob - Setting log job to jobid=[https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q] LB server=[devel17.cnaf.infn.it:9000] (port is not used, actually...)
2010-03-23 15:55:49,155 INFO - iceLBLogger::logEvent() - Cream Transfer OK Event - [gridJobID="https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q" CREAMJobID="https://cream-32.pd.infn.it:8443/CREAM036926381"]

mysql> select * from events where jobid="YFyqjw3FF-BO-0U5BxCOtA";
+------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
| jobid                  | event | code | prog            | host                | time_stamp          | userid                           | usec   | level | arrived             |
+------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
| YFyqjw3FF-BO-0U5BxCOtA |     0 |    5 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:04:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 394848 |     8 | 2010-03-24 12:04:39 |
| YFyqjw3FF-BO-0U5BxCOtA |     1 |   15 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:04:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 548652 |     8 | 2010-03-24 12:04:39 |
| YFyqjw3FF-BO-0U5BxCOtA |     2 |    4 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:04:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 608084 |     8 | 2010-03-24 12:04:39 |
| YFyqjw3FF-BO-0U5BxCOtA |     3 |    4 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:04:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 657231 |     8 | 2010-03-24 12:04:39 |
+------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
4 rows in set (0.00 sec)

mysql> select * from events where jobid="SlKOGSnaW0oKO3TJqw9tbA";
+------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
| jobid                  | event | code | prog            | host                | time_stamp          | userid                           | usec   | level | arrived             |
+------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
| SlKOGSnaW0oKO3TJqw9tbA |     0 |   17 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:53 | 3f82b966e8a77413044be1a9144a4af4 | 342720 |     8 | 2010-03-24 12:09:53 |
| SlKOGSnaW0oKO3TJqw9tbA |     1 |   21 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:53 | 3f82b966e8a77413044be1a9144a4af4 | 470416 |     8 | 2010-03-24 12:09:53 |
| SlKOGSnaW0oKO3TJqw9tbA |     2 |   21 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:53 | 3f82b966e8a77413044be1a9144a4af4 | 526402 |     8 | 2010-03-24 12:09:53 |
| SlKOGSnaW0oKO3TJqw9tbA |     3 |    2 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:54 | 3f82b966e8a77413044be1a9144a4af4 | 606511 |     8 | 2010-03-24 12:09:54 |
| SlKOGSnaW0oKO3TJqw9tbA |     4 |    4 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:54 | 3f82b966e8a77413044be1a9144a4af4 | 712100 |     8 | 2010-03-24 12:09:54 |
| SlKOGSnaW0oKO3TJqw9tbA |     5 |    4 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | 3f82b966e8a77413044be1a9144a4af4 |  43631 |     8 | 2010-03-24 12:09:55 |
| SlKOGSnaW0oKO3TJqw9tbA |     6 |    5 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 167414 |     8 | 2010-03-24 12:09:55 |
| SlKOGSnaW0oKO3TJqw9tbA |     7 |   15 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 297333 |     8 | 2010-03-24 12:09:55 |
| SlKOGSnaW0oKO3TJqw9tbA |     8 |    4 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 369636 |     8 | 2010-03-24 12:09:55 |
| SlKOGSnaW0oKO3TJqw9tbA |     9 |    4 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 431565 |     8 | 2010-03-24 12:09:55 |
| SlKOGSnaW0oKO3TJqw9tbA |    10 |    5 | JobController   | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 745052 |     8 | 2010-03-24 12:09:55 |
| SlKOGSnaW0oKO3TJqw9tbA |    11 |    1 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 846002 |     8 | 2010-03-24 12:09:55 |
| SlKOGSnaW0oKO3TJqw9tbA |    12 |    1 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:10:04 | bdd27610035bb0ec9287e2ecaa3da2eb | 869424 |     8 | 2010-03-24 12:10:04 |
| SlKOGSnaW0oKO3TJqw9tbA |    13 |    8 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:11:39 | bdd27610035bb0ec9287e2ecaa3da2eb |  94855 |     8 | 2010-03-24 12:11:39 |
| SlKOGSnaW0oKO3TJqw9tbA |    14 |   25 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:11:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 181448 |     8 | 2010-03-24 12:11:39 |
| SlKOGSnaW0oKO3TJqw9tbA |    15 |   10 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:11:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 250291 |     8 | 2010-03-24 12:11:39 |
+------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
16 rows in set (0.00 sec)

2010-03-24 13:40:38,128 ERROR - iceCommandEventQuery::execute() - TID=[159321352] Cannot query events for UserDN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] CEUrl [https://cream-33.pd.infn.it:8443/ce-cream/services/CREAM2]. Exception Internal ex is [Received NULL fault; the error is due to another cause: FaultString=[No such operation 'QueryEventRequest'] - FaultCode=["http://xml.apache.org/axis/":Client] - FaultSubCode=["http://xml.apache.org/axis/":Client] - FaultDetail=[<ns2:hostname>cream-33.pd.infn.it</ns2:hostname>]]
2010-03-24 13:40:38,128 WARN - iceCommandEventQuery::execute() - TID=[159321352] Not present QueryEvent on CE [https://cream-33.pd.infn.it:8443/ce-cream/services/CREAM2]. Falling back to old-style StatusPoller.
2010-03-24 13:40:38,128 INFO - iceCommandStatusPoller::execute() - Getting [100] jobs to poll for user [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] creamurl [https://cream-33.pd.infn.it:8443/ce-cream/services/CREAM2]
2010-03-24 13:40:38,128 DEBUG - iceCommandStatusPoller::get_jobs_to_poll() - Collecting jobs to poll for userdn=[/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] creamurl=[https://cream-33.pd.infn.it:8443/ce-cream/services/CREAM2]. LIMIT set to [100]...
2010-03-24 13:40:38,129 DEBUG - iceCommandStatusPoller::get_jobs_to_poll() - Finished collecting jobs to poll. [5] jobs are to poll
[...]

[ale@cream-15 UI]$ glite-wms-job-status -v 0 -i testo --noint
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://devel17.cnaf.infn.it:9000/tt3GLYuIiHuwrmnl7fGVtA
Current Status: Done (Success)
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://devel17.cnaf.infn.it:9000/lY9fdOgQk5RcaH99g23z5g
Current Status: Done (Success)
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://devel17.cnaf.infn.it:9000/jta5KlBZEP-r2KbE0SB0Vw
Current Status: Done (Success)
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://devel17.cnaf.infn.it:9000/TNqI_PbRyqgFAN3L52IpKQ
Current Status: Done (Success)
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://devel17.cnaf.infn.it:9000/V7Pnv2yE47CdHKgQmRaIvQ
Current Status: Done (Success)
*************************************************************

2010-03-24 15:58:40,753 ERROR - CreamProxyMethod::execute() - Connection timed out to CREAM: "EOF detected during communication. Probably service closed connection or SOCKET TIMEOUT occurred." on try 3/3. Blacklisting endpoint and giving up.
2010-03-24 15:58:40,753 DEBUG - CEBlackList::blacklist_endpoint() - Blacklisting CE https://cream-25.pd.infn.it:8443/ce-cream/services/gridsite-delegation until Wed Mar 24 16:08:40 2010

2010-03-24 16:05:28,952 DEBUG - eventStatusPoller::body() - Adding EventQuery command for couple (/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL, https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2) to the thread pool...

2010-03-24 15:58:43,265 DEBUG - Delegation_manager::delegate() - Creating new delegation with delegation id [12694427232E265116wms0072Ecnaf2Einfn2Eit] CREAM URL [https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2] Delegation URL [https://cream-25.pd.infn.it:8443/ce-cream/services/gridsite-delegation] user DN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] proxy hash [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] MyProxy Server [myproxy.cern.ch] Expiring on [Thu Mar 25 12:54:02 2010]
2010-03-24 15:58:43,265 DEBUG - CEBlackList::is_blacklisted() - CE https://cream-25.pd.infn.it:8443/ce-cream/services/gridsite-delegation is blacklisted until Wed Mar 24 16:08:40 2010
2010-03-24 15:58:43,265 ERROR - Delegation_manager::delegate() - FAILED Creation of a new delegation with delegation id [12694427232E265116wms0072Ecnaf2Einfn2Eit] CREAM URL [https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2] Delegation URL [https://cream-25.pd.infn.it:8443/ce-cream/services/gridsite-delegation] user DN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] proxy hash [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] MyProxy Server [myproxy.cern.ch] - ERROR is: [The endpoint is blacklisted]
2010-03-24 15:58:43,265 ERROR - iceCommandSubmit::execute() - TID=[159308760] Error during submission of jdl= Fatal Exception is:Failed to create a delegation id for job https://devel17.cnaf.infn.it:9000/UoVsvjIj1CPluHb81xM_pQ: reason is The endpoint is blacklisted

[root@wms007 jobdir]# chmod 111 new/
[root@wms007 jobdir]# ls -l
total 48
d--x--x--x 2 glite glite 40960 Mar 24 16:13 new
drwxr-xr-x 2 glite glite 4096 Mar 24 16:13 old
drwxr-xr-x 2 glite glite 4096 Mar 24 16:13 tmp

2010-03-25 09:45:39,545 ERROR - Request_source_jobdir::get_requests() - Error returned by method jobDir::new_entries(): boost::filesystem::directory_iterator constructor: "/var/glite/ice/jobdir/new": Permission denied
2010-03-25 09:45:40,546 ERROR - Request_source_jobdir::get_requests() - Error returned by method jobDir::new_entries(): boost::filesystem::directory_iterator constructor: "/var/glite/ice/jobdir/new": Permission denied