Difference: WmsTestsP3621 (1 vs. 75)

Revision 75 - 2010-06-21 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 18 to 18
 

Upgrade from production

Changed:
<
<
  • Starting from a Production WMS we update it.
>
>
  • Starting from a Production WMS we update it.
 

Test Report

The test report has been produced following the guidelines from here
Line: 1100 to 1100
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 1127 to 1129
 
META FILEATTACHMENT attachment="yum_install_wms" attr="" comment="yum_install_wms" date="1276856016" name="yum_install_wms" path="yum_install_wms" size="49684" stream="yum_install_wms" tmpFilename="/usr/tmp/CGItemp5115" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="yaim_conf_wms_log" attr="" comment="yaim_conf_wms" date="1276856163" name="yaim_conf_wms_log" path="yaim_conf_wms_log" size="9023" stream="yaim_conf_wms_log" tmpFilename="/usr/tmp/CGItemp5102" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="test_upgrade_wms" attr="" comment="test wms upgrade" date="1276867402" name="test_upgrade_wms" path="test_upgrade_wms" size="28824" stream="test_upgrade_wms" tmpFilename="/usr/tmp/CGItemp9416" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="yaim_conf_overwrite_true" attr="" comment="yaim_conf_wms_overwrite" date="1277116697" name="yaim_conf_overwrite_true" path="yaim_conf_overwrite_true" size="28672" stream="yaim_conf_overwrite_true" tmpFilename="/usr/tmp/CGItemp8283" user="ElisabettaMolinari" version="1"

Revision 74 - 2010-06-18 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 18 to 18
 

Upgrade from production

Changed:
<
<
  • Starting from a Production WMS we update it.
>
>
  • Starting from a Production WMS we update it.
 

Test Report

The test report has been produced following the guidelines from here
Line: 1098 to 1098
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 1124 to 1126
 
META FILEATTACHMENT attachment="test_recovery" attr="" comment="test recovery" date="1276681444" name="test_recovery" path="test_recovery" size="49001" stream="test_recovery" tmpFilename="/usr/tmp/CGItemp11264" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="yum_install_wms" attr="" comment="yum_install_wms" date="1276856016" name="yum_install_wms" path="yum_install_wms" size="49684" stream="yum_install_wms" tmpFilename="/usr/tmp/CGItemp5115" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="yaim_conf_wms_log" attr="" comment="yaim_conf_wms" date="1276856163" name="yaim_conf_wms_log" path="yaim_conf_wms_log" size="9023" stream="yaim_conf_wms_log" tmpFilename="/usr/tmp/CGItemp5102" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="test_upgrade_wms" attr="" comment="test wms upgrade" date="1276867402" name="test_upgrade_wms" path="test_upgrade_wms" size="28824" stream="test_upgrade_wms" tmpFilename="/usr/tmp/CGItemp9416" user="ElisabettaMolinari" version="1"

Revision 73 - 2010-06-18 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 1052 to 1052
 /opt/glite/yaim/functions/config_condor_wms: setValue NORDUGRID_GAHP "\$(SBIN)/nordugrid_gahp"
Added:
>
>
  • Bug #68891: ICE falls into an infinite loop when a job has expired proxy and has been submitted to a CREAM without EventQuery FIXED NOT CERTIFIED
    • It is not easy to reproduce the problem.
 -- AlessioGianelle - 2010-02-05

Revision 72 - 2010-06-18 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 8 to 8
 Outcome: Certified

Clean installation

Changed:
<
<
  • copied registered repo for wms_3_2_14_5 into '/etc/yum.repos.d':
    wget http://etics-repository.cern.ch/repository/pm/registered/repomd/id/7c25a3e0-f9aa-4ad0-9c74-e52ed3166562/slc4_ia32_gcc346
  • launched 'yum install glite-WMS', yum install log is here
>
>
  • copied registered repo for glite-wms_R_3_2_14_6 into '/etc/yum.repos.d':
    wget http://etics-repository.cern.ch/repository/pm/registered/repomd/id/5e0f9d1b-de35-48ec-ad71-3b32a55f2b46/slc4_ia32_gcc346/etics-registered-build-by-id.repo
  • launched 'yum install glite-WMS', yum install log is here
 
  • copied /opt/glite/yaim/examples/siteinfo/site-info.def into ~/siteinfo/site-info.def and /opt/glite/yaim/examples/siteinfo/services/glite-wms into ~/siteinfo/services/glite-wms
  • launched 'yum install lcg-CA'
  • copied host certificate and key into '/etc/grid-security'
  • launched yaim configuration '/opt/glite/yaim/bin/yaim -c -s site-info.def -n glite-WMS'
Changed:
<
<
  • yaim configuration log file is here
>
>
  • yaim configuration log file is here
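
The installation steps above can be collected into a dry-run script. This is only a sketch: the `DRY_RUN` switch, the `run` helper, the `mkdir -p` call and the use of an absolute path for `site-info.def` are assumptions added for illustration, while the repository URL, package names and yaim invocation are taken from the list. By default it prints the commands without executing them.

```shell
#!/bin/sh
# Dry-run sketch of the clean-installation steps listed above.
# DRY_RUN and run() are hypothetical helpers, not part of yaim or gLite.
# Set DRY_RUN=0 only on a disposable test node, as root.
DRY_RUN="${DRY_RUN:-1}"
REPO_URL="http://etics-repository.cern.ch/repository/pm/registered/repomd/id/5e0f9d1b-de35-48ec-ad71-3b32a55f2b46/slc4_ia32_gcc346/etics-registered-build-by-id.repo"

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

install_wms() {
    # Register the repository and install the WMS metapackage.
    run wget -P /etc/yum.repos.d "$REPO_URL"
    run yum install glite-WMS
    # Copy the example site configuration (mkdir -p is an added safeguard).
    run mkdir -p "$HOME/siteinfo/services"
    run cp /opt/glite/yaim/examples/siteinfo/site-info.def "$HOME/siteinfo/site-info.def"
    run cp /opt/glite/yaim/examples/siteinfo/services/glite-wms "$HOME/siteinfo/services/glite-wms"
    run yum install lcg-CA
    # The host certificate and key must already be in /etc/grid-security.
    run /opt/glite/yaim/bin/yaim -c -s "$HOME/siteinfo/site-info.def" -n glite-WMS
}

install_wms
```

Printing the commands first makes it easy to compare a planned run against the attached yum and yaim logs before touching a real node.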
 

Upgrade from production

Line: 1090 to 1090
 
Added:
>
>

 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 1114 to 1118
 
META FILEATTACHMENT attachment="deepresub.txt" attr="" comment="Deer resub repo" date="1276266541" name="deepresub.txt" path="deepresub.txt" size="5192" stream="deepresub.txt" tmpFilename="/usr/tmp/CGItemp10251" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="dag.txt" attr="" comment="Dag repo" date="1276610005" name="dag.txt" path="dag.txt" size="9411" stream="dag.txt" tmpFilename="/usr/tmp/CGItemp14958" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="test_recovery" attr="" comment="test recovery" date="1276681444" name="test_recovery" path="test_recovery" size="49001" stream="test_recovery" tmpFilename="/usr/tmp/CGItemp11264" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="yum_install_wms" attr="" comment="yum_install_wms" date="1276856016" name="yum_install_wms" path="yum_install_wms" size="49684" stream="yum_install_wms" tmpFilename="/usr/tmp/CGItemp5115" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="yaim_conf_wms_log" attr="" comment="yaim_conf_wms" date="1276856163" name="yaim_conf_wms_log" path="yaim_conf_wms_log" size="9023" stream="yaim_conf_wms_log" tmpFilename="/usr/tmp/CGItemp5102" user="ElisabettaMolinari" version="1"

Revision 71 - 2010-06-16 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Changed:
<
<
Outcome: in certification...
>
>
Outcome: Certified
 

Clean installation

  • copied registered repo for wms_3_2_14_5 into '/etc/yum.repos.d':
    wget http://etics-repository.cern.ch/repository/pm/registered/repomd/id/7c25a3e0-f9aa-4ad0-9c74-e52ed3166562/slc4_ia32_gcc346

Revision 70 - 2010-06-16 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 379 to 379
 
  • Job Recovery
    • Tested with a few collections, restarting the WM while some node jobs are still in 'submitted' or 'waiting' status: Yes / Done
Added:
>
>
 
  • Prologue and Epilogue jobs Yes / Done
Line: 1087 to 1088
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 1110 to 1113
 
META FILEATTACHMENT attachment="shallowresub.txt" attr="" comment="Shallow repo" date="1276258119" name="shallowresub.txt" path="shallowresub.txt" size="5100" stream="shallowresub.txt" tmpFilename="/usr/tmp/CGItemp13767" user="AlessioGianelle" version="2"
META FILEATTACHMENT attachment="deepresub.txt" attr="" comment="Deer resub repo" date="1276266541" name="deepresub.txt" path="deepresub.txt" size="5192" stream="deepresub.txt" tmpFilename="/usr/tmp/CGItemp10251" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="dag.txt" attr="" comment="Dag repo" date="1276610005" name="dag.txt" path="dag.txt" size="9411" stream="dag.txt" tmpFilename="/usr/tmp/CGItemp14958" user="AlessioGianelle" version="1"
Added:
>
>
META FILEATTACHMENT attachment="test_recovery" attr="" comment="test recovery" date="1276681444" name="test_recovery" path="test_recovery" size="49001" stream="test_recovery" tmpFilename="/usr/tmp/CGItemp11264" user="ElisabettaMolinari" version="1"

Revision 69 - 2010-06-15 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 821 to 821
 
  • Bug #59240: [ICE] abort reasons not always printed in its logfile FIXED NOT CERTIFIED
Changed:
<
<
  • Bug #59399: [ICE] doesn't correctly handle request in jobdir/old when it is restarted FIXED
>
>
  • Bug #59339: [ICE] doesn't correctly handle request in jobdir/old when it is restarted FIXED
 
    • Verified by submitting a big collection to CREAM CEs and then restarting ICE in the middle of the submission process:
      2010-03-23 15:55:43,604 DEBUG - iceCommandSubmit::try_to_submit() -  TID=[168434952] Going to START CreamJobID [https://cream-32.pd.infn.it:8443/CREAM036926381] related to GridJobID [https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q]...

Revision 68 - 2010-06-15 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 300 to 300
 

DAG jobs

  • Dag jobs through:
    • JC work: Yes / Done
Added:
>
>
 

Collection jobs

  • Collection jobs through:
Line: 379 to 380
 
  • Job Recovery
    • Tested with a few collections, restarting the WM while some node jobs are still in 'submitted' or 'waiting' status: Yes / Done
Changed:
<
<
  • Prologue and Epilogue jobs
    • ICE: Yes / Done
    • JC: Yes / Done
>
>
  • Prologue and Epilogue jobs Yes / Done
 


Line: 1110 to 1109
 
META FILEATTACHMENT attachment="epilogprolog.txt" attr="" comment="EpilogProlog" date="1276250095" name="epilogprolog.txt" path="epilogprolog.txt" size="5833" stream="epilogprolog.txt" tmpFilename="/usr/tmp/CGItemp10400" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="shallowresub.txt" attr="" comment="Shallow repo" date="1276258119" name="shallowresub.txt" path="shallowresub.txt" size="5100" stream="shallowresub.txt" tmpFilename="/usr/tmp/CGItemp13767" user="AlessioGianelle" version="2"
META FILEATTACHMENT attachment="deepresub.txt" attr="" comment="Deer resub repo" date="1276266541" name="deepresub.txt" path="deepresub.txt" size="5192" stream="deepresub.txt" tmpFilename="/usr/tmp/CGItemp10251" user="AlessioGianelle" version="1"
Added:
>
>
META FILEATTACHMENT attachment="dag.txt" attr="" comment="Dag repo" date="1276610005" name="dag.txt" path="dag.txt" size="9411" stream="dag.txt" tmpFilename="/usr/tmp/CGItemp14958" user="AlessioGianelle" version="1"

Revision 67 - 2010-06-14 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 386 to 386
 

Added:
>
>

Stress test

 
Added:
>
>

Description:

  • 1200 collections each of 20 jobs
  • One collection every 60 seconds
  • Four users
  • No requirements (so randomly submitted to all the production CEs)
  • Use automatic-delegation
  • The job is a "sleep random(172)"
  • Resubmission is enabled
  • Use 2 proxy renewal services (myproxy.cern.ch and myproxy.cnaf.infn.it) plus jobs without specification of the MyproxyServer (equally distributed)
  • Use a 3.2.0.12 LB Server (patch #4083)

Final results

  • 1172 collections submitted in 6935 seconds: 3/6/125 (min/avg/max)
    • 28 submissions failed (due to LB error)
  • Jobs correctly submitted: 23440 (90% to LCG CEs and 10% to CREAM CEs)
    • DONE OK: 23335 (99.52%)
    • ABORTED: 105 (0.48%)
    • Resubmitted: 3067 (13.08%)
  • Aborted reasons:
    • Job proxy is expired: 42 times (40%)
    • (LB query failed): 57 times (54.4%)
    • Removal retries exceeded: 4 times (3.8%)
    • hit job shallow retry count (3): 1 time (0.9%)
    • Submission to condor failed: 1 time (0.9%)
  • Failures (3377)
    • 7 an authentication operation failed (~63%)
    • File not available. Cannot read JobWrapper output, both from Condor and from Maradona. (~18%)
    • Transfer to CREAM failed due to exception (~12%)
    • Got a job held event (~5%)
    • ... others (~2%)
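
The percentages in the results above are ratios of the raw counts. A small helper such as the following can recompute them when a run is repeated. The function name `pct` is hypothetical, and the denominators used in the example calls (correctly submitted jobs for the resubmission rate, aborted jobs for the abort reasons) are an assumption about how the figures were derived.

```shell
#!/bin/sh
# pct COUNT TOTAL: print COUNT as a percentage of TOTAL with two decimals.
# LC_ALL=C keeps the decimal separator a dot regardless of the locale.
pct() {
    LC_ALL=C awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f\n", 100 * n / d }'
}

pct 3067 23440   # resubmitted jobs over correctly submitted jobs -> 13.08
pct 42 105       # "Job proxy is expired" aborts over all aborts   -> 40.00
```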



 

Check bugs

Revision 66 - 2010-06-11 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 374 to 374
 
    • Shallow: Yes / Done
    • Deep: Yes / Done
Added:
>
>
 
  • Job Recovery
    • Tested with a few collections, restarting the WM while some node jobs are still in 'submitted' or 'waiting' status: Yes / Done
Line: 1074 to 1075
 
META FILEATTACHMENT attachment="yum_install_wms_log_good" attr="" comment="yum_install_wms_log" date="1276088974" name="yum_install_wms_log_good" path="yum_install_wms_log_good" size="49684" stream="yum_install_wms_log_good" tmpFilename="/usr/tmp/CGItemp12430" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="epilogprolog.txt" attr="" comment="EpilogProlog" date="1276250095" name="epilogprolog.txt" path="epilogprolog.txt" size="5833" stream="epilogprolog.txt" tmpFilename="/usr/tmp/CGItemp10400" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="shallowresub.txt" attr="" comment="Shallow repo" date="1276258119" name="shallowresub.txt" path="shallowresub.txt" size="5100" stream="shallowresub.txt" tmpFilename="/usr/tmp/CGItemp13767" user="AlessioGianelle" version="2"
Added:
>
>
META FILEATTACHMENT attachment="deepresub.txt" attr="" comment="Deer resub repo" date="1276266541" name="deepresub.txt" path="deepresub.txt" size="5192" stream="deepresub.txt" tmpFilename="/usr/tmp/CGItemp10251" user="AlessioGianelle" version="1"

Revision 65 - 2010-06-11 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 372 to 372
 
  • Resubmission
    • Shallow: Yes / Done
Added:
>
>
 
    • Deep: Yes / Done

  • Job Recovery
Line: 1072 to 1073
 
META FILEATTACHMENT attachment="mpirepo.txt" attr="" comment="Mpi repo" date="1276082788" name="mpirepo.txt" path="mpirepo.txt" size="6509" stream="mpirepo.txt" tmpFilename="/usr/tmp/CGItemp13326" user="AlessioGianelle" version="2"
META FILEATTACHMENT attachment="yum_install_wms_log_good" attr="" comment="yum_install_wms_log" date="1276088974" name="yum_install_wms_log_good" path="yum_install_wms_log_good" size="49684" stream="yum_install_wms_log_good" tmpFilename="/usr/tmp/CGItemp12430" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="epilogprolog.txt" attr="" comment="EpilogProlog" date="1276250095" name="epilogprolog.txt" path="epilogprolog.txt" size="5833" stream="epilogprolog.txt" tmpFilename="/usr/tmp/CGItemp10400" user="AlessioGianelle" version="1"
Added:
>
>
META FILEATTACHMENT attachment="shallowresub.txt" attr="" comment="Shallow repo" date="1276258119" name="shallowresub.txt" path="shallowresub.txt" size="5100" stream="shallowresub.txt" tmpFilename="/usr/tmp/CGItemp13767" user="AlessioGianelle" version="2"

Revision 64 - 2010-06-11 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 380 to 380
 
  • Prologue and Epilogue jobs
    • ICE: Yes / Done
    • JC: Yes / Done
Added:
>
>
 

Line: 1070 to 1071
 
META FILEATTACHMENT attachment="updatelog.txt" attr="" comment="Update log" date="1276004495" name="updatelog.txt" path="updatelog.txt" size="23395" stream="updatelog.txt" tmpFilename="/usr/tmp/CGItemp7277" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="mpirepo.txt" attr="" comment="Mpi repo" date="1276082788" name="mpirepo.txt" path="mpirepo.txt" size="6509" stream="mpirepo.txt" tmpFilename="/usr/tmp/CGItemp13326" user="AlessioGianelle" version="2"
META FILEATTACHMENT attachment="yum_install_wms_log_good" attr="" comment="yum_install_wms_log" date="1276088974" name="yum_install_wms_log_good" path="yum_install_wms_log_good" size="49684" stream="yum_install_wms_log_good" tmpFilename="/usr/tmp/CGItemp12430" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="epilogprolog.txt" attr="" comment="EpilogProlog" date="1276250095" name="epilogprolog.txt" path="epilogprolog.txt" size="5833" stream="epilogprolog.txt" tmpFilename="/usr/tmp/CGItemp10400" user="AlessioGianelle" version="1"

Revision 63 - 2010-06-10 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 533 to 533
 
      • Owner = /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      • MyProxyServer = "myproxy.cnaf.infn.it";
Changed:
<
<
  • Bug #52937: ICE uses the wrong DN to log to LB TO VERIFY
>
>
  • Bug #52937: ICE uses the wrong DN to log to LB Hopefully FIXED
    • Submit a job to a CREAM CE and then check with glite-wms-job-logging-info -v 2:
      [ale@cream-15 UI]$ glite-wms-job-logging-info -v 2 https://devel15.cnaf.infn.it:9000/RITl2CsEH_KRk2nEusZVKw | grep User
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
      - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
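
Checking thirty User lines by eye is error-prone, so the inspection above could be automated along these lines. This is a sketch under an assumption: it treats the check as passed when every User field begins with the submitter's DN (optionally followed by /CN=proxy components), which is one reading of the output shown and not a documented acceptance criterion. The function name `check_user_dn` is hypothetical.

```shell
#!/bin/sh
# check_user_dn EXPECTED_DN < logging-info-output
# Succeeds only if at least one "- User = ..." line is present and every
# such line has a value that starts with EXPECTED_DN.
check_user_dn() {
    awk -v dn="$1" '
        /^[ \t]*- User[ \t]*=/ {
            seen++
            value = $0
            sub(/^[^=]*=[ \t]*/, "", value)      # drop everything up to the "="
            if (index(value, dn) != 1) bad++     # value must begin with the DN
        }
        END {
            if (seen > 0 && bad == 0) exit 0
            exit 1
        }
    '
}

# Intended use (requires a gLite UI, so it is left commented out):
# glite-wms-job-logging-info -v 2 "$JOBID" |
#     check_user_dn "/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle"
```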
 
Changed:
<
<
  • Bug #53297: [ yaim-wms ] glite_wms.conf hardcoded parameters FIXED
>
>
  • Bug #53297: [ yaim-wms ] glite_wms.conf hardcoded parameters FIXED
 
    • tested by setting the parameter 'WMS_CONF_FILE_OVERWRITE' in the ~/siteinfo/services/glite-wms file
      • set the parameter 'WMS_CONF_FILE_OVERWRITE' to true: a backup copy of the glite_wms.conf file gets created in /opt/glite/etc/glite_wms.conf.bkp_20100608_101305 and the glite_wms.conf file gets overwritten
      • set the parameter 'WMS_CONF_FILE_OVERWRITE' to false: a new copy of the glite_wms.conf file gets created in /opt/glite/etc/glite_wms.conf.yaimnew_20100608_101633
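
The two behaviours observed above can be mimicked with a few lines of shell. This is an illustration of the observed file handling only, not yaim's actual implementation: the function `apply_wms_conf` and its arguments are hypothetical, and the timestamp format is inferred from the file names reported in the test.

```shell
#!/bin/sh
# apply_wms_conf CONF_FILE NEW_FILE OVERWRITE
# Mimics the observed behaviour of WMS_CONF_FILE_OVERWRITE (true|false)
# and prints the path of the extra file that gets created.
apply_wms_conf() {
    conf="$1"
    new="$2"
    overwrite="$3"
    stamp="$(date +%Y%m%d_%H%M%S)"
    if [ "$overwrite" = "true" ]; then
        # Keep a backup of the live file, then replace it.
        cp "$conf" "${conf}.bkp_${stamp}"
        cp "$new" "$conf"
        echo "${conf}.bkp_${stamp}"
    else
        # Leave the live file untouched; store the new one beside it.
        cp "$new" "${conf}.yaimnew_${stamp}"
        echo "${conf}.yaimnew_${stamp}"
    fi
}
```

Trying this in a scratch directory with a dummy glite_wms.conf shows which file to inspect after each yaim run: the .bkp_ copy holds the old settings, the .yaimnew_ copy holds the proposed ones.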
Line: 978 to 1008
 ====================================================================================
Changed:
<
<
  • Bug #66986: ICE must be able to print out on file the stack trace trapping SIGSEGV, SIGILL, SIGABRT etc. TO VERIFY
>
>
  • Bug #66986: ICE must be able to print out on file the stack trace trapping SIGSEGV, SIGILL, SIGABRT etc. FIXED NOT CERTIFIED
 
  • Bug #67097: [yaim-wms] Removed lcg-condor-extra usage FIXED
    • checked NORDUGRID_GAHP is set to the right value:
      grep NORDUGRID_GAHP /opt/glite/yaim/functions/*

Revision 62 - 2010-06-09 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 9 to 9
 

Clean installation

  • copied registered repo for wms_3_2_14_5 into '/etc/yum.repos.d':
    wget http://etics-repository.cern.ch/repository/pm/registered/repomd/id/7c25a3e0-f9aa-4ad0-9c74-e52ed3166562/slc4_ia32_gcc346
Changed:
<
<
  • launched 'yum install glite-WMS', yum install log is here
>
>
  • launched 'yum install glite-WMS', yum install log is here
 
  • copied /opt/glite/yaim/examples/siteinfo/site-info.def into ~/siteinfo/site-info.def and /opt/glite/yaim/examples/siteinfo/services/glite-wms into ~/siteinfo/services/glite-wms
  • launched 'yum install lcg-CA'
Added:
>
>
  • copied host certificate and key into '/etc/grid-security'
 
  • launched yaim configuration '/opt/glite/yaim/bin/yaim -c -s site-info.def -n glite-WMS'
  • yaim configuration log file is here
Line: 1018 to 1019
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 1036 to 1039
 
META FILEATTACHMENT attachment="brokerinfo-ICE" attr="" comment="" date="1273138279" name="brokerinfo-ICE" path="brokerinfo-ICE" size="2561" stream="brokerinfo-ICE" tmpFilename="/usr/tmp/CGItemp12612" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="updatelog.txt" attr="" comment="Update log" date="1276004495" name="updatelog.txt" path="updatelog.txt" size="23395" stream="updatelog.txt" tmpFilename="/usr/tmp/CGItemp7277" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="mpirepo.txt" attr="" comment="Mpi repo" date="1276082788" name="mpirepo.txt" path="mpirepo.txt" size="6509" stream="mpirepo.txt" tmpFilename="/usr/tmp/CGItemp13326" user="AlessioGianelle" version="2"
Added:
>
>
META FILEATTACHMENT attachment="yum_install_wms_log_good" attr="" comment="yum_install_wms_log" date="1276088974" name="yum_install_wms_log_good" path="yum_install_wms_log_good" size="49684" stream="yum_install_wms_log_good" tmpFilename="/usr/tmp/CGItemp12430" user="ElisabettaMolinari" version="1"

Revision 61 - 2010-06-09 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 1035 to 1035
 
META FILEATTACHMENT attachment="brokerinfo" attr="" comment="" date="1273063204" name="brokerinfo" path="brokerinfo" size="3169" stream="brokerinfo" tmpFilename="/usr/tmp/CGItemp12650" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="brokerinfo-ICE" attr="" comment="" date="1273138279" name="brokerinfo-ICE" path="brokerinfo-ICE" size="2561" stream="brokerinfo-ICE" tmpFilename="/usr/tmp/CGItemp12612" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="updatelog.txt" attr="" comment="Update log" date="1276004495" name="updatelog.txt" path="updatelog.txt" size="23395" stream="updatelog.txt" tmpFilename="/usr/tmp/CGItemp7277" user="AlessioGianelle" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="mpirepo.txt" attr="" comment="Mpi repo" date="1276006777" name="mpirepo.txt" path="mpirepo.txt" size="4528" stream="mpirepo.txt" tmpFilename="/usr/tmp/CGItemp10372" user="AlessioGianelle" version="1"
>
>
META FILEATTACHMENT attachment="mpirepo.txt" attr="" comment="Mpi repo" date="1276082788" name="mpirepo.txt" path="mpirepo.txt" size="6509" stream="mpirepo.txt" tmpFilename="/usr/tmp/CGItemp13326" user="AlessioGianelle" version="2"

Revision 60 - 2010-06-08 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 17 to 17
 

Upgrade from production

Added:
>
>
  • Starting from a Production WMS we update it.
 

Test Report

The test report has been produced following the guidelines from here

List Match

Line: 339 to 341
 
    • ICE work: Yes / Done
Changed:
<
<
  • MPICH jobs: No
>
>
  • MPICH jobs: Yes / Done
 

Cancel

Line: 1030 to 1034
 
META FILEATTACHMENT attachment="bulk_jobs_toICE" attr="" comment="" date="1272977501" name="bulk_jobs_toICE" path="bulk_jobs_toICE" size="8004" stream="bulk_jobs_toICE" tmpFilename="/usr/tmp/CGItemp8332" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="brokerinfo" attr="" comment="" date="1273063204" name="brokerinfo" path="brokerinfo" size="3169" stream="brokerinfo" tmpFilename="/usr/tmp/CGItemp12650" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="brokerinfo-ICE" attr="" comment="" date="1273138279" name="brokerinfo-ICE" path="brokerinfo-ICE" size="2561" stream="brokerinfo-ICE" tmpFilename="/usr/tmp/CGItemp12612" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="updatelog.txt" attr="" comment="Update log" date="1276004495" name="updatelog.txt" path="updatelog.txt" size="23395" stream="updatelog.txt" tmpFilename="/usr/tmp/CGItemp7277" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="mpirepo.txt" attr="" comment="Mpi repo" date="1276006777" name="mpirepo.txt" path="mpirepo.txt" size="4528" stream="mpirepo.txt" tmpFilename="/usr/tmp/CGItemp10372" user="AlessioGianelle" version="1"

Revision 59 - 2010-06-08 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 8 to 8
 Outcome: in certification...

Clean installation

Added:
>
>
  • copied registered repo for wms_3_2_14_5 into '/etc/yum.repos.d':
    wget http://etics-repository.cern.ch/repository/pm/registered/repomd/id/7c25a3e0-f9aa-4ad0-9c74-e52ed3166562/slc4_ia32_gcc346
  • launched 'yum install glite-WMS', yum install log is here
  • copied /opt/glite/yaim/examples/siteinfo/site-info.def into ~/siteinfo/site-info.def and /opt/glite/yaim/examples/siteinfo/services/glite-wms into ~/siteinfo/services/glite-wms
  • launched 'yum install lcg-CA'
  • launched yaim configuration '/opt/glite/yaim/bin/yaim -c -s site-info.def -n glite-WMS'
  • yaim configuration log file is here
 

Upgrade from production

Line: 524 to 530
 
  • Bug #52937: ICE uses the wrong DN to log to LB TO VERIFY
Changed:
<
<
  • Bug #53297: [ yaim-wms ] glite_wms.conf hardcoded parameters TO VERIFY
>
>
  • Bug #53297: [ yaim-wms ] glite_wms.conf hardcoded parameters FIXED
    • tested by setting the parameter 'WMS_CONF_FILE_OVERWRITE' in the ~/siteinfo/services/glite-wms file
      • set the parameter 'WMS_CONF_FILE_OVERWRITE' to true: a backup copy of the glite_wms.conf file gets created in /opt/glite/etc/glite_wms.conf.bkp_20100608_101305 and the glite_wms.conf file gets overwritten
      • set the parameter 'WMS_CONF_FILE_OVERWRITE' to false: a new copy of the glite_wms.conf file gets created in /opt/glite/etc/glite_wms.conf.yaimnew_20100608_101633
 
  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved FIXED
    • Using a new CE (1.6), ICE's log shows:
Line: 925 to 934
 -rw-r--r-- 1 dteam008 dteam 9.6M Apr 7 16:26 bigfile.tail
 -rw-r--r-- 1 dteam008 dteam 637 Apr 7 16:26 ls.out
Changed:
<
<
  • Bug #66721: Ineffective and never removed Job cancels TO VERIFY
>
>
  • Bug #66721: Ineffective and never removed Job cancels FIXED
    • submitted a job with the option '--register-only' as in the following:
      glite-wms-job-submit --config ../glite_wms_devel14.conf --register-only -a ../myjob.jdl 
 
Changed:
<
<
  • Bug #66986: ICE must be able to print out on file the stack trace trapping SIGSEGV, SIGILL, SIGABRT etc. TO VERIFY
>
>
Connecting to the service https://devel14.cnaf.infn.it:7443/glite_wms_wmproxy_server

================== glite-wms-job-submit Success ==================

The job has been successfully registered to the WMProxy
Your job identifier is:

https://devel15.cnaf.infn.it:9000/mS8cXbg9szmcXsUETm8g2g

======================================================================

To complete the operation, the following file containing the InputSandbox of the job needs to be transferred:
======================================================================================================
ISB ZIP file : /tmp/ISBfiles_eCXKdK2egtqk7jZRlfFQpw_0.tar.gz
Destination : gsiftp://devel14.cnaf.infn.it:2811/var/glite/SandboxDir/mS/https_3a_2f_2fdevel15.cnaf.infn.it_3a9000_2fmS8cXbg9szmcXsUETm8g2g/input/ISBfiles_eCXKdK2egtqk7jZRlfFQpw_0.tar.gz


 
Changed:
<
<
  • Bug #67097: [yaim-wms] Removed lcg-condor-extra usage TO VERIFY
>
>
then start the job by issuing a submission with the option: --start https://devel15.cnaf.infn.it:9000/mS8cXbg9szmcXsUETm8g2g
    • cancel the previously submitted job as in the following:
      glite-wms-job-cancel https://devel15.cnaf.infn.it:9000/mS8cXbg9szmcXsUETm8g2g
      
      Are you sure you want to remove specified job(s) [y/n]y : y
      
      Connecting to the service https://devel14.cnaf.infn.it:7443/glite_wms_wmproxy_server
      
      
      ============================= glite-wms-job-cancel Success =============================
      
      The cancellation request has been successfully submitted for the following job(s):
      
      - https://devel15.cnaf.infn.it:9000/mS8cXbg9szmcXsUETm8g2g
      
      ========================================================================================
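
The register, start and cancel steps all reuse the job identifier printed by the first command. A minimal sketch of picking it out of saved output is given here; the helper name `extract_job_id` is hypothetical, and the pattern assumes the identifier always points at port 9000 as in this report. The commands are only echoed, not executed.

```shell
# Hypothetical helper: print the first job identifier found on stdin.
# Assumption: identifiers look like https://<host>:9000/<id>.
extract_job_id() {
    grep -o 'https://[^ ]*:9000/[A-Za-z0-9_-]*' | head -n 1
}

# Sample text copied from the --register-only output in this report.
submit_output='Connecting to the service https://devel14.cnaf.infn.it:7443/glite_wms_wmproxy_server
The job has been successfully registered to the WMProxy
Your job identifier is:

https://devel15.cnaf.infn.it:9000/mS8cXbg9szmcXsUETm8g2g'

job_id=$(printf '%s\n' "$submit_output" | extract_job_id)

# Show the follow-up commands that would be run with this identifier.
echo "glite-wms-job-submit --start $job_id"
echo "glite-wms-job-cancel $job_id"
```

The WMProxy endpoint on port 7443 is deliberately not matched, so only the job identifier is returned.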
      

  • Bug #66986: ICE must be able to print out on file the stack trace trapping SIGSEGV, SIGILL, SIGABRT etc. TO VERIFY
 
Added:
>
>
  • Bug #67097: [yaim-wms] Removed lcg-condor-extra usage FIXED
    • checked NORDUGRID_GAHP is set to the right value:
      grep NORDUGRID_GAHP /opt/glite/yaim/functions/*
      /opt/glite/yaim/functions/config_condor_wms:   setValue NORDUGRID_GAHP "\$(SBIN)/nordugrid_gahp"
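
The manual grep can also be turned into a pass/fail check. The sketch below is not part of yaim; the function name `check_gahp` is hypothetical, and the expected string is taken from the grep output in this report.

```shell
# Hypothetical check: succeed only if the given file sets NORDUGRID_GAHP
# to the value seen in the grep output of this report.
check_gahp() {
    # -F matches the string literally, so \$ and ( ) need no escaping.
    grep -qF 'setValue NORDUGRID_GAHP "\$(SBIN)/nordugrid_gahp"' "$1"
}
```

On a configured node it could be used as `check_gahp /opt/glite/yaim/functions/config_condor_wms && echo OK`.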
 

-- AlessioGianelle - 2010-02-05

Revision 58 2010-06-07 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 522 to 522
 
      • Owner = /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      • MyProxyServer = "myproxy.cnaf.infn.it";
Added:
>
>
  • Bug #52937: ICE uses the wrong DN to log to LB TO VERIFY
 
  • Bug #53297: [ yaim-wms ] glite_wms.conf hardcoded parameters TO VERIFY

  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved FIXED

Revision 57 2010-05-10 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 522 to 522
 
      • Owner = /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      • MyProxyServer = "myproxy.cnaf.infn.it";
Added:
>
>
  • Bug #53297: [ yaim-wms ] glite_wms.conf hardcoded parameters TO VERIFY
 
  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved FIXED
    • Using a new CE (1.6), the ICE log shows:
      2010-03-22 16:47:50,496 INFO - scoped_timer iceCommandEventQuery::execute() - SOAP Connection for QueryEvent - TID=[150673032] 1269272870.288498 1269272870.496129 0.207631
Line: 921 to 923
 -rw-r--r-- 1 dteam008 dteam 9.6M Apr 7 16:26 bigfile.tail
 -rw-r--r-- 1 dteam008 dteam 637 Apr 7 16:26 ls.out
Changed:
<
<
  • Bug #66721: Ineffective and never removed Job cancels TO VERIFY
>
>
  • Bug #66721: Ineffective and never removed Job cancels TO VERIFY

  • Bug #66986: ICE must be able to print out on file the stack trace trapping SIGSEGV, SIGILL, SIGABRT etc. TO VERIFY

  • Bug #67097: [yaim-wms] Removed lcg-condor-extra usage TO VERIFY
 

Revision 56 2010-05-06 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 356 to 356
 

Others

  • BrokerInfo
Changed:
<
<
    • ICE creation Yes / Done
>
>
    • ICE creation Yes / Done test report here
 
    • JC creation: Yes / Done test report here

  • Resubmission
Line: 957 to 957
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 972 to 974
 
META FILEATTACHMENT attachment="bulk_jobs" attr="" comment="bulk jobs to JC" date="1272963872" name="bulk_jobs" path="bulk_jobs" size="17035" stream="bulk_jobs" tmpFilename="/usr/tmp/CGItemp12989" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="bulk_jobs_toICE" attr="" comment="" date="1272977501" name="bulk_jobs_toICE" path="bulk_jobs_toICE" size="8004" stream="bulk_jobs_toICE" tmpFilename="/usr/tmp/CGItemp8332" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="brokerinfo" attr="" comment="" date="1273063204" name="brokerinfo" path="brokerinfo" size="3169" stream="brokerinfo" tmpFilename="/usr/tmp/CGItemp12650" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="brokerinfo-ICE" attr="" comment="" date="1273138279" name="brokerinfo-ICE" path="brokerinfo-ICE" size="2561" stream="brokerinfo-ICE" tmpFilename="/usr/tmp/CGItemp12612" user="ElisabettaMolinari" version="1"

Revision 55 2010-05-05 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 357 to 357
 
  • BrokerInfo
    • ICE creation Yes / Done
Changed:
<
<
    • JC creation: Yes / Done
>
>
    • JC creation: Yes / Done test report here
 
  • Resubmission
    • Shallow: Yes / Done
Line: 955 to 955
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 969 to 971
 
META FILEATTACHMENT attachment="listmatchdata.txt" attr="" comment="List Match with data" date="1272545642" name="listmatchdata.txt" path="listmatchdata.txt" size="3493" stream="listmatchdata.txt" tmpFilename="/usr/tmp/CGItemp10527" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="bulk_jobs" attr="" comment="bulk jobs to JC" date="1272963872" name="bulk_jobs" path="bulk_jobs" size="17035" stream="bulk_jobs" tmpFilename="/usr/tmp/CGItemp12989" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="bulk_jobs_toICE" attr="" comment="" date="1272977501" name="bulk_jobs_toICE" path="bulk_jobs_toICE" size="8004" stream="bulk_jobs_toICE" tmpFilename="/usr/tmp/CGItemp8332" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="brokerinfo" attr="" comment="" date="1273063204" name="brokerinfo" path="brokerinfo" size="3169" stream="brokerinfo" tmpFilename="/usr/tmp/CGItemp12650" user="ElisabettaMolinari" version="1"

Revision 54 2010-05-04 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 324 to 324
 
    • Submit a bulk of 500 jobs -> success 99.9% Yes / Done both to ICE and JC
    • Submit a bulk of 1000 jobs -> success 99.9% Yes / Done both to ICE and JC
      • bulk test report to JC here
Added:
>
>
      • bulk test report to ICE here
 

Perusal jobs

  • Perusal jobs through:
Line: 952 to 953
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 965 to 968
 
META FILEATTACHMENT attachment="cancel-collection-toJC" attr="" comment="" date="1272536482" name="cancel-collection-toJC" path="cancel-collection-toJC" size="3999" stream="cancel-collection-toJC" tmpFilename="/usr/tmp/CGItemp5031" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="listmatchdata.txt" attr="" comment="List Match with data" date="1272545642" name="listmatchdata.txt" path="listmatchdata.txt" size="3493" stream="listmatchdata.txt" tmpFilename="/usr/tmp/CGItemp10527" user="AlessioGianelle" version="1"
META FILEATTACHMENT attachment="bulk_jobs" attr="" comment="bulk jobs to JC" date="1272963872" name="bulk_jobs" path="bulk_jobs" size="17035" stream="bulk_jobs" tmpFilename="/usr/tmp/CGItemp12989" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="bulk_jobs_toICE" attr="" comment="" date="1272977501" name="bulk_jobs_toICE" path="bulk_jobs_toICE" size="8004" stream="bulk_jobs_toICE" tmpFilename="/usr/tmp/CGItemp8332" user="ElisabettaMolinari" version="1"

Revision 53 2010-05-04 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 323 to 323
 
    • Submit a bulk of 100 jobs -> success 100% Yes / Done both to ICE and JC
    • Submit a bulk of 500 jobs -> success 99.9% Yes / Done both to ICE and JC
    • Submit a bulk of 1000 jobs -> success 99.9% Yes / Done both to ICE and JC
Added:
>
>
      • bulk test report to JC here
 

Perusal jobs

  • Perusal jobs through:
Line: 949 to 950
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
Line: 961 to 964
 
META FILEATTACHMENT attachment="cancel-collection-toICE" attr="" comment="" date="1272535687" name="cancel-collection-toICE" path="cancel-collection-toICE" size="3387" stream="cancel-collection-toICE" tmpFilename="/usr/tmp/CGItemp4924" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="cancel-collection-toJC" attr="" comment="" date="1272536482" name="cancel-collection-toJC" path="cancel-collection-toJC" size="3999" stream="cancel-collection-toJC" tmpFilename="/usr/tmp/CGItemp5031" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="listmatchdata.txt" attr="" comment="List Match with data" date="1272545642" name="listmatchdata.txt" path="listmatchdata.txt" size="3493" stream="listmatchdata.txt" tmpFilename="/usr/tmp/CGItemp10527" user="AlessioGianelle" version="1"
Added:
>
>
META FILEATTACHMENT attachment="bulk_jobs" attr="" comment="bulk jobs to JC" date="1272963872" name="bulk_jobs" path="bulk_jobs" size="17035" stream="bulk_jobs" tmpFilename="/usr/tmp/CGItemp12989" user="ElisabettaMolinari" version="1"

Revision 52 2010-05-03 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 919 to 919
 -rw-r--r-- 1 dteam008 dteam 9.6M Apr 7 16:26 bigfile.tail
 -rw-r--r-- 1 dteam008 dteam 637 Apr 7 16:26 ls.out
Added:
>
>
  • Bug #66721: Ineffective and never removed Job cancels TO VERIFY
 

-- AlessioGianelle - 2010-02-05

Revision 51 2010-04-29 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 189 to 189
 

List match with data

  • with data: Yes / Done
Added:
>
>
 

Submission/GetOutput

Normal Jobs

Line: 957 to 958
 
META FILEATTACHMENT attachment="jobcancel-toLcg" attr="" comment="" date="1272534665" name="jobcancel-toLcg" path="jobcancel-toLcg" size="3424" stream="jobcancel-toLcg" tmpFilename="/usr/tmp/CGItemp4773" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="cancel-collection-toICE" attr="" comment="" date="1272535687" name="cancel-collection-toICE" path="cancel-collection-toICE" size="3387" stream="cancel-collection-toICE" tmpFilename="/usr/tmp/CGItemp4924" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="cancel-collection-toJC" attr="" comment="" date="1272536482" name="cancel-collection-toJC" path="cancel-collection-toJC" size="3999" stream="cancel-collection-toJC" tmpFilename="/usr/tmp/CGItemp5031" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="listmatchdata.txt" attr="" comment="List Match with data" date="1272545642" name="listmatchdata.txt" path="listmatchdata.txt" size="3493" stream="listmatchdata.txt" tmpFilename="/usr/tmp/CGItemp10527" user="AlessioGianelle" version="1"

Revision 50 2010-04-29 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 336 to 336
 
  • Normal jobs
    • ICE: Yes / Done
Added:
>
>
 
    • JC: Yes / Done
Added:
>
>
 
  • Dag: Yes / Done
Added:
>
>
 
    • Note that child nodes in status 'submitted' don't get cancelled
  • Collection
    • ICE: Yes / Done
Added:
>
>
 
    • JC: Yes / Done
Added:
>
>
 
  • Node of a collection: Yes / Done
    • Note: collections stay in status 'waiting' when all the nodes are Done (Success) except for one that is 'Cancelled'
Line: 931 to 936
 
Added:
>
>

 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toLcg" attr="" comment="" date="1272367999" name="parametric-toLcg" path="parametric-toLcg" size="9531" stream="parametric-toLcg" tmpFilename="/usr/tmp/CGItemp5115" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="perusal-toICE" attr="" comment="" date="1272372340" name="perusal-toICE" path="perusal-toICE" size="4906" stream="perusal-toICE" tmpFilename="/usr/tmp/CGItemp5018" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="perusal-toLcg" attr="" comment="" date="1272450102" name="perusal-toLcg" path="perusal-toLcg" size="5545" stream="perusal-toLcg" tmpFilename="/usr/tmp/CGItemp5152" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="cancel-dag" attr="" comment="" date="1272534427" name="cancel-dag" path="cancel-dag" size="3644" stream="cancel-dag" tmpFilename="/usr/tmp/CGItemp4877" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="jobcancel-toICE" attr="" comment="" date="1272534620" name="jobcancel-toICE" path="jobcancel-toICE" size="3320" stream="jobcancel-toICE" tmpFilename="/usr/tmp/CGItemp4925" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="jobcancel-toLcg" attr="" comment="" date="1272534665" name="jobcancel-toLcg" path="jobcancel-toLcg" size="3424" stream="jobcancel-toLcg" tmpFilename="/usr/tmp/CGItemp4773" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="cancel-collection-toICE" attr="" comment="" date="1272535687" name="cancel-collection-toICE" path="cancel-collection-toICE" size="3387" stream="cancel-collection-toICE" tmpFilename="/usr/tmp/CGItemp4924" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="cancel-collection-toJC" attr="" comment="" date="1272536482" name="cancel-collection-toJC" path="cancel-collection-toJC" size="3999" stream="cancel-collection-toJC" tmpFilename="/usr/tmp/CGItemp5031" user="ElisabettaMolinari" version="1"

Revision 49 2010-04-28 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 326 to 326
 

Perusal jobs

  • Perusal jobs through:
    • JC work: Yes / Done
Added:
>
>
 
    • ICE work: Yes / Done
Line: 928 to 929
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toLcg" attr="" comment="" date="1272367999" name="parametric-toLcg" path="parametric-toLcg" size="9531" stream="parametric-toLcg" tmpFilename="/usr/tmp/CGItemp5115" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="perusal-toICE" attr="" comment="" date="1272372340" name="perusal-toICE" path="perusal-toICE" size="4906" stream="perusal-toICE" tmpFilename="/usr/tmp/CGItemp5018" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="perusal-toLcg" attr="" comment="" date="1272450102" name="perusal-toLcg" path="perusal-toLcg" size="5545" stream="perusal-toLcg" tmpFilename="/usr/tmp/CGItemp5152" user="ElisabettaMolinari" version="1"

Revision 48 2010-04-27 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 323 to 323
 
    • Submit a bulk of 500 jobs -> success 99.9% Yes / Done both to ICE and JC
    • Submit a bulk of 1000 jobs -> success 99.9% Yes / Done both to ICE and JC
Added:
>
>

Perusal jobs

 
  • Perusal jobs through:
    • JC work: Yes / Done
    • ICE work: Yes / Done
Added:
>
>
 
  • MPICH jobs: No
Line: 924 to 926
 
Added:
>
>
 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toLcg" attr="" comment="" date="1272367999" name="parametric-toLcg" path="parametric-toLcg" size="9531" stream="parametric-toLcg" tmpFilename="/usr/tmp/CGItemp5115" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="perusal-toICE" attr="" comment="" date="1272372340" name="perusal-toICE" path="perusal-toICE" size="4906" stream="perusal-toICE" tmpFilename="/usr/tmp/CGItemp5018" user="ElisabettaMolinari" version="1"

Revision 47 2010-04-27 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 299 to 299
 
    • also job-output for collections works even though only the parent node is set to 'Cleared'
Added:
>
>

Parametric jobs

 
  • Parametric jobs through:
    • ICE work: Yes / Done
Added:
>
>
 
    • JC work: Yes / Done
Added:
>
>
 
      • tested with the following
         [
          JobType = "parametric";
          Executable = "/usr/bin/env";
Line: 917 to 920
 
Added:
>
>

 
META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"
Added:
>
>
META FILEATTACHMENT attachment="parametric-toICE" attr="" comment="" date="1272367585" name="parametric-toICE" path="parametric-toICE" size="9982" stream="parametric-toICE" tmpFilename="/usr/tmp/CGItemp5056" user="ElisabettaMolinari" version="1"
META FILEATTACHMENT attachment="parametric-toLcg" attr="" comment="" date="1272367999" name="parametric-toLcg" path="parametric-toLcg" size="9531" stream="parametric-toLcg" tmpFilename="/usr/tmp/CGItemp5115" user="ElisabettaMolinari" version="1"

Revision 46 2010-04-27 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 191 to 191
 
  • with data: Yes / Done

Submission/GetOutput

Changed:
<
<
>
>

Normal Jobs

 
  • Normal jobs through
    • ICE work: Yes / Done
      • glite-wms-job-submit --config glite_wms_devel20.conf -a myjob-toICE.jdl
Line: 287 to 287
 err.log message.txt

Added:
>
>

DAG jobs

 
  • Dag jobs through:
    • JC work: Yes / Done
Added:
>
>

Collection jobs

 
  • Collection jobs through:
    • ICE work: Yes / Done
Added:
>
>
 
    • JC work: Yes / Done
Added:
>
>
 
    • also job-output for collections works even though only the parent node is set to 'Cleared'

  • Parametric jobs through:
Line: 906 to 910
 

-- AlessioGianelle - 2010-02-05

Added:
>
>

META FILEATTACHMENT attachment="collection-toICE" attr="" comment="" date="1272363349" name="collection-toICE" path="collection-toICE" size="5756" stream="collection-toICE" tmpFilename="/usr/tmp/CGItemp8330" user="ElisabettaMolinari" version="2"
META FILEATTACHMENT attachment="collection-toLcg" attr="" comment="" date="1272362565" name="collection-toLcg" path="collection-toLcg" size="5232" stream="collection-toLcg" tmpFilename="/usr/tmp/CGItemp5069" user="ElisabettaMolinari" version="1"

Revision 45 2010-04-27 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 283 to 283
 /tmp/jobOutput/emolinari_k0ue-prhJAvtAvjL2x_7Qg

============================================================================

Added:
>
>
ls /tmp/jobOutput/emolinari_k0ue-prhJAvtAvjL2x_7Qg
err.log message.txt
 

  • Dag jobs through:

Revision 44 2010-04-26 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 12 to 12
 

Upgrade from production

Test Report

Changed:
<
<
>
>
The test report has been produced following the guidelines from here
 

List Match

Added:
>
>

List match without data

 
  • without data: Yes / Done
Added:
>
>
    • tried with the following
       cat myjob-toICE.jdl
      [
      Type = "Job";
      JobType = "normal";
      InputSandbox = { "file:///home/emolinari/test.sh"};
      VirtualOrganisation = "dteam";
      Executable="test.sh";
      Arguments="Hello ";
      Requirements = ( RegExp("/cream-",other.GlueCEUniqueID));
      Rank = 0;
      fuzzyrank = true;
      StdOutput="message.txt";
      StdError="err.log";
      OutputSandbox={"message.txt","err.log",".BrokerInfo"};
      usertags = [ jdl = "normal job to ICE" ];
      RetryCount = 0;
      ShallowRetryCount = 3;
      ]
       glite-wms-job-list-match --config glite_wms_devel20.conf -a myjob-toICE.jdl
      
      Connecting to the service https://devel20.cnaf.infn.it:7443/glite_wms_wmproxy_server
      
      ==========================================================================
      
                           COMPUTING ELEMENT IDs LIST
       The following CE(s) matching your job requirements have been found:
      
              *CEId*
       - atlas-creamce-01.roma1.infn.it:8443/cream-lsf-atlasgcert
       - bocecream.bo.infn.it:8443/cream-pbs-cert
       - bocecream.bo.infn.it:8443/cream-pbs-certSL5
       - cccreamceli01.in2p3.fr:8443/cream-bqs-medium
       - cccreamceli01.in2p3.fr:8443/cream-bqs-short
       - ce01-lcg.cr.cnaf.infn.it:8443/cream-lsf-dteam
       - ce07-lcg.cr.cnaf.infn.it:8443/cream-lsf-dteam
       - ce201.cern.ch:8443/cream-lsf-grid_2nh_dteam
       - ce201.cern.ch:8443/cream-lsf-grid_dteam
       - ce202.cern.ch:8443/cream-lsf-grid_2nh_dteam
       - ce202.cern.ch:8443/cream-lsf-grid_dteam
       - cert-15.pd.infn.it:8443/cream-lsf-cert
       - cream-38.pd.infn.it:8443/cream-pbs-creamtest1
       - cream-38.pd.infn.it:8443/cream-pbs-creamtest2
       - cream-ce.ct.infn.it:8443/cream-lsf-cert
       - cream-ce.pr.infn.it:8443/cream-pbs-cert
       - cream-ce.research-infrastructures.eu:8443/cream-pbs-cert
       - devce.cnaf.infn.it:8443/cream-pbs-cert
       - gridce0.pi.infn.it:8443/cream-lsf-cert
       - prod-ce-01.pd.infn.it:8443/cream-lsf-cert
       - t2-ce-01.to.infn.it:8443/cream-pbs-cert
       - t2-ce-01.to.infn.it:8443/cream-pbs-short
       - t2-ce-05.lnl.infn.it:8443/cream-lsf-cert1
      
      ==========================================================================
      
    • tried substituting the Requirement with
      Requirements = ( !RegExp("/cream-",other.GlueCEUniqueID));
      glite-wms-job-list-match --config glite_wms_devel20.conf -a myjob-toLcg.jdl
      
      Connecting to the service https://devel20.cnaf.infn.it:7443/glite_wms_wmproxy_server
      
      ==========================================================================
      
                           COMPUTING ELEMENT IDs LIST
       The following CE(s) matching your job requirements have been found:
      
              *CEId*
       - argoce01.na.infn.it:2119/jobmanager-lcgpbs-cert
       - atlas-ce-01.roma1.infn.it:2119/jobmanager-lcglsf-atlasgcert
       - atlas-ce-02.roma1.infn.it:2119/jobmanager-lcglsf-atlasgcert
       - atlasce01.na.infn.it:2119/jobmanager-lcgpbs-cert
       - boalice3.bo.infn.it:2119/jobmanager-lcgpbs-cert
       - boalice3.bo.infn.it:2119/jobmanager-lcgpbs-certSL5
       - cclcgceli01.in2p3.fr:2119/jobmanager-bqs-long
       - cclcgceli01.in2p3.fr:2119/jobmanager-bqs-medium
       - cclcgceli01.in2p3.fr:2119/jobmanager-bqs-short
       - cclcgceli02.in2p3.fr:2119/jobmanager-bqs-long
       - cclcgceli02.in2p3.fr:2119/jobmanager-bqs-medium
       - cclcgceli02.in2p3.fr:2119/jobmanager-bqs-short
       - cclcgceli03.in2p3.fr:2119/jobmanager-bqs-long
       - cclcgceli03.in2p3.fr:2119/jobmanager-bqs-medium
       - cclcgceli03.in2p3.fr:2119/jobmanager-bqs-short
       - cclcgceli04.in2p3.fr:2119/jobmanager-bqs-long
       - cclcgceli04.in2p3.fr:2119/jobmanager-bqs-medium
       - cclcgceli04.in2p3.fr:2119/jobmanager-bqs-short
       - cclcgceli07.in2p3.fr:2119/jobmanager-bqs-long
       - cclcgceli07.in2p3.fr:2119/jobmanager-bqs-medium
       - cclcgceli07.in2p3.fr:2119/jobmanager-bqs-short
       - cclcgceli08.in2p3.fr:2119/jobmanager-bqs-long
       - cclcgceli08.in2p3.fr:2119/jobmanager-bqs-medium
       - cclcgceli08.in2p3.fr:2119/jobmanager-bqs-short
       - ce-01.grid.sissa.it:2119/jobmanager-lcgpbs-cert
       - ce-01.roma3.infn.it:2119/jobmanager-lcgpbs-cert
       - ce01-lhcb-t2.cr.cnaf.infn.it:2119/jobmanager-lcglsf-cert_t2
       - ce02-lhcb-t2.cr.cnaf.infn.it:2119/jobmanager-lcglsf-cert_t2
       - ce04-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-dteam
       - ce05-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-dteam
       - ce06-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-dteam
       - ce103.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce104.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce104.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce105.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce105.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce106.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce106.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce107.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce107.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce112.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce112.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce113.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce113.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce114.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce114.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce124.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce124.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce125.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce125.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce126.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce126.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce127.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce127.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce128.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce128.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce129.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce129.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce130.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce130.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce131.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce131.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce132.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce132.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - ce133.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
       - ce133.cern.ch:2119/jobmanager-lcglsf-grid_dteam
       - cmsce01.na.infn.it:2119/jobmanager-lcgpbs-cert
       - grid-ce-01.ba.infn.it:2119/jobmanager-lcgpbs-cert
       - grid-ce.lns.infn.it:2119/jobmanager-lcgpbs-cert
       - grid-ce.lns.infn.it:2119/jobmanager-lcgpbs-infinite
       - grid-ce.lns.infn.it:2119/jobmanager-lcgpbs-long
       - grid-ce.lns.infn.it:2119/jobmanager-lcgpbs-short
       - grid-ce2.pr.infn.it:2119/jobmanager-pbs-cert
       - grid-eo-engine04.esrin.esa.int:2119/jobmanager-lcgpbs-cert
       - grid0.fe.infn.it:2119/jobmanager-lcgpbs-cert
       - grid001.ts.infn.it:2119/jobmanager-lcglsf-cert
       - grid002.ca.infn.it:2119/jobmanager-lcglsf-cert
       - grid01.ge.infn.it:2119/jobmanager-lcglsf-cert
       - grid012.ct.infn.it:2119/jobmanager-lcglsf-cert
       - gridce.ilc.cnr.it:2119/jobmanager-lcgpbs-cert
       - gridce.pg.infn.it:2119/jobmanager-lcgpbs-cert
       - gridce.sns.it:2119/jobmanager-lcgpbs-cert
       - gridce1.pi.infn.it:2119/jobmanager-lcglsf-cert
       - gridce2.pi.infn.it:2119/jobmanager-lcglsf-cert
       - gridit-ce-001.cnaf.infn.it:2119/jobmanager-lcgpbs-cert
       - griditce01.na.infn.it:2119/jobmanager-lcgpbs-cert
       - lcg-ce.research-infrastructures.eu:2119/jobmanager-lcgpbs-cert
       - linucs-ce-01.cs.infn.it:2119/jobmanager-lcgpbs-atlasgcert
       - pamelace01.na.infn.it:2119/jobmanager-lcgpbs-cert
       - pbs-enmr.cerm.unifi.it:2119/jobmanager-lcgpbs-cert
       - prod-ce-02.pd.infn.it:2119/jobmanager-lcglsf-cert
       - t2-ce-01.lnl.infn.it:2119/jobmanager-lcglsf-cert1
       - t2-ce-01.mi.infn.it:2119/jobmanager-lcgpbs-cert
       - t2-ce-02.lnl.infn.it:2119/jobmanager-lcglsf-cert1
       - t2-ce-02.mi.infn.it:2119/jobmanager-lcgcondor-cert
       - t2-ce-02.to.infn.it:2119/jobmanager-lcgpbs-cert
       - t2-ce-02.to.infn.it:2119/jobmanager-lcgpbs-short
       - t2-ce-03.lnl.infn.it:2119/jobmanager-lcglsf-cert1
       - t2-ce-04.lnl.infn.it:2119/jobmanager-lcglsf-cert1
       - t2-ce-06.lnl.infn.it:2119/jobmanager-lcglsf-cert1
       - test7200a.cnaf.infn.it:2119/jobmanager-lcgpbs-cert
       - test7200a.cnaf.infn.it:2119/jobmanager-lcgpbs-parallel
       - virgo-ce.roma1.infn.it:2119/jobmanager-lcgpbs-cert
      
      

List match with data

 
  • with data: Yes / Done

Submission/GetOutput

  • Normal jobs through
    • ICE work: Yes / Done
Added:
>
>
      • glite-wms-job-submit --config glite_wms_devel20.conf -a myjob-toICE.jdl
        
        Connecting to the service https://devel20.cnaf.infn.it:7443/glite_wms_wmproxy_server
        
        
        ====================== glite-wms-job-submit Success ======================
        
        The job has been successfully submitted to the WMProxy
        Your job identifier is:
        
        https://devel15.cnaf.infn.it:9000/Atu_PYr8SfD3C4VGCU_SjQ
        
        ==========================================================================
         glite-wms-job-status https://devel15.cnaf.infn.it:9000/Atu_PYr8SfD3C4VGCU_SjQ
        
        
        *************************************************************
        BOOKKEEPING INFORMATION:
        
        Status info for the Job : https://devel15.cnaf.infn.it:9000/Atu_PYr8SfD3C4VGCU_SjQ
        Current Status:     Done (Success)
        Exit code:          0
        Status Reason:      Job Terminated Successfully
        Destination:        ce202.cern.ch:8443/cream-lsf-grid_dteam
        Submitted:          Mon Apr 26 15:00:37 2010 CEST
        *************************************************************
        
         glite-wms-job-output https://devel15.cnaf.infn.it:9000/Atu_PYr8SfD3C4VGCU_SjQ
        
        Connecting to the service https://devel20.cnaf.infn.it:7443/glite_wms_wmproxy_server
        
        
        ================================================================================
        
                                JOB GET OUTPUT OUTCOME
        
        Output sandbox files for the job:
        https://devel15.cnaf.infn.it:9000/Atu_PYr8SfD3C4VGCU_SjQ
        have been successfully retrieved and stored in the directory:
        /tmp/jobOutput/emolinari_Atu_PYr8SfD3C4VGCU_SjQ
        --------------------------
        ls /tmp/jobOutput/emolinari_Atu_PYr8SfD3C4VGCU_SjQ/
        err.log  message.txt
        
        
 
    • JC work: Yes / Done
Added:
>
>
      • glite-wms-job-submit --config glite_wms_devel20.conf -a myjob-toLcg.jdl
        
        Connecting to the service https://devel20.cnaf.infn.it:7443/glite_wms_wmproxy_server
        
        
        ====================== glite-wms-job-submit Success ======================
        
        The job has been successfully submitted to the WMProxy
        Your job identifier is:
        
        https://devel15.cnaf.infn.it:9000/k0ue-prhJAvtAvjL2x_7Qg
        ----------------------------------------------------------------------
        glite-wms-job-status https://devel15.cnaf.infn.it:9000/k0ue-prhJAvtAvjL2x_7Qg
        
        
        *************************************************************
        BOOKKEEPING INFORMATION:
        
        Status info for the Job : https://devel15.cnaf.infn.it:9000/k0ue-prhJAvtAvjL2x_7Qg
        Current Status:     Done (Success)
        Logged Reason(s):
            -
            - Job terminated successfully
        Exit code:          0
        Status Reason:      Job terminated successfully
        Destination:        ce131.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
        Submitted:          Mon Apr 26 15:18:42 2010 CEST
        *************************************************************
        glite-wms-job-output https://devel15.cnaf.infn.it:9000/k0ue-prhJAvtAvjL2x_7Qg
        
        Connecting to the service https://devel20.cnaf.infn.it:7443/glite_wms_wmproxy_server
        
        
        ================================================================================
        
                                JOB GET OUTPUT OUTCOME
        
        Output sandbox files for the job:
        https://devel15.cnaf.infn.it:9000/k0ue-prhJAvtAvjL2x_7Qg
        have been successfully retrieved and stored in the directory:
        /tmp/jobOutput/emolinari_k0ue-prhJAvtAvjL2x_7Qg
        
        ================================================================================
        
 
  • Dag jobs through:
    • JC work: Yes / Done
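The status checks in the submission transcripts above can also be done mechanically. The following is a minimal sketch, not part of the certification procedure, assuming the `glite-wms-job-status` text output has already been captured into a string; `parse_status` is a hypothetical helper and the sample text is copied from the JC transcript above.

```python
def parse_status(text):
    """Collect 'Key: value' pairs from captured glite-wms-job-status output."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip separator rows and lines that carry no key/value pair.
        if not line or line.startswith(("*", "=")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:  # headings such as "Logged Reason(s):" have no value
            fields[key.strip()] = value
    return fields


SAMPLE = """
*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel15.cnaf.infn.it:9000/k0ue-prhJAvtAvjL2x_7Qg
Current Status:     Done (Success)
Logged Reason(s):
    -
    - Job terminated successfully
Exit code:          0
Status Reason:      Job terminated successfully
Destination:        ce131.cern.ch:2119/jobmanager-lcglsf-grid_2nh_dteam
Submitted:          Mon Apr 26 15:18:42 2010 CEST
*************************************************************
"""

fields = parse_status(SAMPLE)
job_ok = fields["Current Status"] == "Done (Success)" and fields["Exit code"] == "0"
```

Splitting on the first colon only keeps values that themselves contain colons (job identifiers, CE endpoints, timestamps) intact.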

Revision 43 - 2010-04-26 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 283 to 283
  1438 ? Sl 0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf 1470 pts/2 S+ 0:00 grep ice
Changed:
<
<
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed" FIX NOT CERTIFIED
>
>
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed" FIXED NOT CERTIFIED
 
  • Bug #56636: [ICE] statistics counters for monitoring FIXED
    • Verify the command and its options:
Line: 453 to 453
 
Changed:
<
<
  • Bug #59240: [ICE] abort reasons not always printed in its logfile NOT TESTED
>
>
  • Bug #59240: [ICE] abort reasons not always printed in its logfile FIXED NOT CERTIFIED
 
  • Bug #59399: [ICE] doesn't correctly handle request in jobdir/old when it is restarted FIXED
    • Verify submitting a big collection to cream CEs, and then restarting ICE in the middle of the submit process:

Revision 42 - 2010-04-15 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 94 to 94
 

Check bugs

Changed:
<
<
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE FIXED
>
>
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE FIXED
 
    • description of the problem --> "The parameters to be forwarded specified in the Requirements attribute of the .jdl classad are NOT considered and ICE does not send them to the CE, therefore the classad passed to BLAH does not contain them"
      • submitted the following .jdl via WMS:
         cat myjob_forwardReq.jdl
        [
Line: 283 to 283
  1438 ? Sl 0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf 1470 pts/2 S+ 0:00 grep ice
Changed:
<
<
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed" NOT TESTED
>
>
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed" FIX NOT CERTIFIED
 
  • Bug #56636: [ICE] statistics counters for monitoring FIXED
    • Verify the command and its options:
Line: 475 to 475
 2010-03-23 15:55:49,155 INFO - iceLBLogger::logEvent() - Cream Transfer OK Event - [gridJobID="https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q" CREAMJobID="https://cream-32.pd.infn.it:8443/CREAM036926381"]
Changed:
<
<
  • Bug #59453: [ICE] polling needs to be improved NOT TESTED
>
>
  • Bug #59453: [ICE] polling needs to be improved FIXED NOT CERTIFIED
 
  • Bug #60668: [ICE] does not respect LB server/proxy selection through the LBproxy attribute FIXED
    • Set LBProxy = false; in glite_wms.conf (section Common), restart ice and submit...

Revision 41 - 2010-04-14 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Line: 94 to 94
 

Check bugs

Changed:
<
<
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE NOT TESTED
>
>
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE FIXED
    • description of the problem --> "The parameters to be forwarded specified in the Requirements attribute of the .jdl classad are NOT considered and ICE does not send them to the CE, therefore the classad passed to BLAH does not contain them"
      • submitted the following .jdl via WMS:
         cat myjob_forwardReq.jdl
        [
        Type = "Job";
        JobType = "normal";
        InputSandbox = { "file:///home/emolinari/test.sh"};
        VirtualOrganisation = "dteam";
        Executable="test.sh";
        Arguments="Hello ";
        requirements = (other.GlueCEUniqueID == "cream-19.pd.infn.it:8443/cream-lsf-testbedB_1") && (other.GlueHostMainMemoryRAMSize >= 0) ;
        Rank = 0;
        myproxy = myproxy.cnaf.infn.it;
        fuzzyrank = true;
        StdOutput="message.txt";
        StdError="err.log";
        OutputSandbox={"message.txt","err.log",".BrokerInfo"};
        RetryCount = 0;
        ShallowRetryCount = 3;
        ]
      • checked in the ICE log file on the WMS, /var/log/glite/ice.log, that the CeRequirements field of the JDL is populated as follows:
         CeRequirements = "true && ( true && ( true && ( other.GlueHostMainMemoryRAMSize >= 0 ) ) )"; 
      • checked on the CE that blah generates the correct classad with the requirements to be forwarded, as in the following:
         cat /tmp/subfile
        #!/bin/bash
        # LSF job wrapper generated by lsf_submit.sh
        # on Wed Apr 14 19:13:10 CEST 2010
        #
        # LSF directives:
        #BSUB -L /bin/bash
        #BSUB -J cre19_725998524
        #BSUB -q testbedB_1
        #BSUB -R "select[mem>=0]"
        ......
 
  • Bug #48910: Failure starting LM if its output jobdir doesn't exist; unprotected chown in WM/LM/JC startup scripts FIXED
    • Stopped gLite services and deleted the jobdir under '/var/glite/workload_manager'

Revision 40 - 2010-04-14 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 524 to 524
 Current Status: Done (Success) *************************************************************
Changed:
<
<
  • Bug #61405: [ICE] Missing proxy validity evaluation in ICE NOT TESTED
>
>
  • Bug #61405: [ICE] Missing proxy validity evaluation in ICE FIXED
    • Submit this JDL with a 30-minute proxy that is NOT registered on the MyProxy server (myproxy.cnaf.infn.it):
       [
        executable = "/bin/sleep"; 
        arguments = "2000"; 
        MyProxyServer = "myproxy.cnaf.infn.it"; 
        requirements = ( other.GlueCEStateStatus == "testbedb" ); 
        DefaultRank =  -other.GlueCEStateEstimatedResponseTime; 
       ]
    • After a while, submit the same JDL with a fresh proxy and check in the ICE log whether this new proxy is used to refresh the delegation of the previous job:
      • First it should try to renew the proxy contacting the myproxy server:
        2010-04-14 11:47:40,622 DEBUG - iceCommandDelegationRenewal::renewAllDelegations() - Contacting MyProxy server [myproxy.cnaf.infn.it] for user dn [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] with proxy certificate [/var/glite/ice/persist_dir/2A9DAF04C398C21D6ADF7E884BC192ED95AF554C.betterproxy] to renew it...
        2010-04-14 11:47:40,622 DEBUG - iceCommandDelegationRenewal::renewAllDelegations() - Executing command [export X509_USER_CERT=/var/glite/wms.proxy; export X509_USER_KEY=/var/glite/wms.proxy; /opt/glite/bin/glite-wms-ice-proxy-renew -s myproxy.cnaf.infn.it -p /var/glite/ice/persist_dir/2A9DAF04C398C21D6ADF7E884BC192ED95AF554C.betterproxy -o /var/glite/ice/persist_dir/2A9DAF04C398C21D6ADF7E884BC192ED95AF554C.betterproxy.renewed]...
        2010-04-14 11:47:40,783 DEBUG - iceCommandDelegationRenewal::renewAllDelegations() - Command output is [/opt/glite/bin/glite-wms-ice-proxy-renew: glite_renewal_core_renew() failed: Error contacting MyProxy server for proxy /var/glite/ice/persist_dir/2A9DAF04C398C21D6ADF7E884BC192ED95AF554C.betterproxy: ERROR from myproxy-server (myproxy.cnaf.infn.it):
        X509_verify_cert() failed: certificate has expired
        
        ]
      • Then it should use the proxy of the last arrived job to renew the delegation:
        2010-04-14 11:47:40,783 DEBUG - iceCommandDelegationRenewal::renewAllDelegations() - Looking for the better proxy for DN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] MyProxy Server name [myproxy.cnaf.infn.it]...
        2010-04-14 11:47:40,783 INFO - iceCommandDelegationRenewal::renewAllDelegations() - Will Renew Delegation ID [12712381542E417936wms0072Ecnaf2Einfn2Eit] with BetterProxy [/var/glite/ice/persist_dir/2A9DAF04C398C21D6ADF7E884BC192ED95AF554C.betterproxy] that will expire on [Wed Apr 14 12:12:47 2010]
        2010-04-14 11:47:40,783 INFO - CreamProxy_DelegateRenew::execute() - Calling renewProxyReq to remote service [https://cream-39.pd.infn.it:8443/ce-cream/services/gridsite-delegation]

 
  • Bug #61413: [ICE] should not call EventQuery for a userDN if he/she doesn't have active jobs FIXED
    • Submit a job to a CreamCE and wait until it has finished.

Revision 39 - 2010-04-09 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 367 to 367
 Submitted: Tue Mar 23 09:49:42 2010 CET *************************************************************
Changed:
<
<
  • Bug #58977: [ICE] Wrong database colum name in ICE SQL query NOT TESTED
>
>
  • Bug #58977: [ICE] Wrong database colum name in ICE SQL query FIXED
    • Submit some jobs to a CE (e.g. cream-25.pd.infn.it:8443/cream-lsf-testbedB_2):
      [root@wms007 20100409]# queryDb -v -C -G
      [https://cream-25.pd.infn.it:8443/CREAM881525184]  [https://devel17.cnaf.infn.it:9000/V2Lj0_-XWkrjRKMaf3f6ng]
      [https://cream-25.pd.infn.it:8443/CREAM425827870]  [https://devel17.cnaf.infn.it:9000/4TKQu1U_daCMMb2mRDR2cA]
      [https://cream-25.pd.infn.it:8443/CREAM543141647]  [https://devel17.cnaf.infn.it:9000/4wqozVNHEVUXWx5UzUaXqA]
      [https://cream-25.pd.infn.it:8443/CREAM769586568]  [https://devel17.cnaf.infn.it:9000/XLcaGE3kR3h8-oj8cMWE_A]
      [https://cream-25.pd.infn.it:8443/CREAM192029588]  [https://devel17.cnaf.infn.it:9000/PuOGoOxMf-pbfFu-wSkACw]
      [https://cream-25.pd.infn.it:8443/CREAM378177464]  [https://devel17.cnaf.infn.it:9000/T8ZKSu5zZPZY-Ee1gLRX5A]
      [https://cream-25.pd.infn.it:8443/CREAM299069473]  [https://devel17.cnaf.infn.it:9000/Xh1AMEor9hWOx4picngYkA]
      [https://cream-25.pd.infn.it:8443/CREAM012571708]  [https://devel17.cnaf.infn.it:9000/YjpCU6dfrLsDBs6wU_D3Hg]
      [https://cream-25.pd.infn.it:8443/CREAM561236418]  [https://devel17.cnaf.infn.it:9000/00Qc7RnutRORYVOd0ShIKg]
      [https://cream-25.pd.infn.it:8443/CREAM972351884]  [https://devel17.cnaf.infn.it:9000/ksz80OflJnDE_ynWHmKTwQ]
      [https://cream-25.pd.infn.it:8443/CREAM827240561]  [https://devel17.cnaf.infn.it:9000/WwfftKdV6_5lSgihPOUsaA]
      [https://cream-25.pd.infn.it:8443/CREAM573497695]  [https://devel17.cnaf.infn.it:9000/S5zbkyK72hv2LXwUD1vAFw]
      [https://cream-25.pd.infn.it:8443/CREAM735112819]  [https://devel17.cnaf.infn.it:9000/0J9nTy1tJxkcuRJ9oTZACw]
      [https://cream-25.pd.infn.it:8443/CREAM526570551]  [https://devel17.cnaf.infn.it:9000/Rcl0TypyUTXMwtLk86R3yA]
      [https://cream-25.pd.infn.it:8443/CREAM992848449]  [https://devel17.cnaf.infn.it:9000/xfH1fkIroQwNvlVBdn8N5A]
      [https://cream-25.pd.infn.it:8443/CREAM944698480]  [https://devel17.cnaf.infn.it:9000/xjiyHJo3rkUsXXHHe0s6yg]
      [https://cream-25.pd.infn.it:8443/CREAM729677007]  [https://devel17.cnaf.infn.it:9000/FIZA1Mjb4moUNel1N7UXvw]
      [https://cream-25.pd.infn.it:8443/CREAM589660323]  [https://devel17.cnaf.infn.it:9000/5DJlLG7M0v3C_-WMDKSdXQ]
      [https://cream-25.pd.infn.it:8443/CREAM994745139]  [https://devel17.cnaf.infn.it:9000/T_UdwnjC55dIVrPxJOVvmg]
      [https://cream-25.pd.infn.it:8443/CREAM228224655]  [https://devel17.cnaf.infn.it:9000/URw39mrv7jj-buJ3KDza8w]
      [https://cream-25.pd.infn.it:8443/CREAM397635733]  [https://devel17.cnaf.infn.it:9000/f3FOGwNoWpHyWkxO_87AIg]
      [https://cream-25.pd.infn.it:8443/CREAM510341828]  [https://devel17.cnaf.infn.it:9000/vEfH5j5_5R_7jFNrntEsog] 
      [https://cream-25.pd.infn.it:8443/CREAM788890890]  [https://devel17.cnaf.infn.it:9000/y0IVYbdR_UWbTmrXY5O8fA]
      
      ------------------------------------------------
      23 item(s) found
    • Also check the db_id registered in ICE's database:
      [root@wms007 20100409]#  sqlite3 /var/glite/ice/persist_dir/ice.db "SELECT db_id, ceurl from ce_dbid;"
      1270820425000|https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2
    • Stop the cream CE. Drop its database. Create a new empty one. Restart the CE.
    • Check what happens in the ICE log file:
      2010-04-09 16:16:53,953 WARN - iceCommandEventQuery::execute() -  TID=[150150560] *** CREAM HAS PROBABLY BEEN SCRATCHED. GOING TO ERASE ALL JOBS RELATED TO OLD DB_ID [1270820425000] ***
    • Check whether any jobs remain in ICE's database:
      [root@wms007 persist_dir]# queryDb -v  -C -G
      
      ------------------------------------------------
      0 item(s) found
    • and whether the db_id has changed:
      [root@wms007 persist_dir]#  sqlite3 /var/glite/ice/persist_dir/ice.db "SELECT db_id, ceurl from ce_dbid;"
      1270822483000|https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2
    • Look at the status of a job that has been removed:
          Status info for the Job : https://devel17.cnaf.infn.it:9000/xjiyHJo3rkUsXXHHe0s6yg
          Current Status:     Aborted
          Logged Reason(s):
              - job completed
          Status Reason:      CREAM'S database has been scratched and all its jobs have been lost
          Destination:        cream-25.pd.infn.it:8443/cream-lsf-testbedB_2
          Submitted:          Fri Apr  9 16:11:34 2010 CEST

 
  • Bug #59240: [ICE] abort reasons not always printed in its logfile NOT TESTED
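For the Bug #58977 check above, the before/after comparison of `db_id` can be scripted. This is a minimal sketch, assuming only the `ce_dbid` table with the `db_id` and `ceurl` columns seen in the `sqlite3` query above; the in-memory database, the column types and the helper names `read_db_ids` and `scratched_ces` are hypothetical stand-ins for the real `/var/glite/ice/persist_dir/ice.db`.

```python
import sqlite3


def read_db_ids(conn):
    """Return {ceurl: db_id} using the same query as the manual check."""
    rows = conn.execute("SELECT db_id, ceurl FROM ce_dbid").fetchall()
    return {ceurl: db_id for db_id, ceurl in rows}


def scratched_ces(before, after):
    """List the CEs whose db_id differs between two snapshots."""
    return sorted(ce for ce, old in before.items()
                  if ce in after and after[ce] != old)


# Hypothetical stand-in for ice.db; the column types are an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ce_dbid (db_id INTEGER, ceurl TEXT)")

CE = "https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2"
conn.execute("INSERT INTO ce_dbid VALUES (?, ?)", (1270820425000, CE))
before = read_db_ids(conn)

# Simulate what the report observes after the CE database is recreated.
conn.execute("UPDATE ce_dbid SET db_id = ? WHERE ceurl = ?", (1270822483000, CE))
after = read_db_ids(conn)

changed = scratched_ces(before, after)
```

Against a real WMS the two snapshots would be taken before and after the CREAM database is dropped, instead of being simulated with an `UPDATE`.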
Line: 520 to 573
 
  • Bug #64698: jobwrapper max osb limit should be considered only if the gridftp server is the wms FIXED only for LCG-CE
    • Set MaxOutputSandboxSize = 10000000; in section WorkloadManager of file glite_wms.conf
Changed:
<
<
    • Submit a jdl with a file of more than 10Mb in the !OutputSandbox parameter and set also the corresponding OutputSandboxDestURI parameter
>
>
    • Submit a jdl with a file of more than 10Mb in the OutputSandbox parameter and set also the corresponding OutputSandboxDestURI parameter
 
    • Check the output dir in the SE:
      [root@devel18 tmp]# ls -lh
      -rw-r--r--  1 dteam044 dteam  50M Apr  7 16:23 bigfile

Revision 38 - 2010-04-09 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 254 to 254
 
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed" NOT TESTED
Changed:
<
<
  • Bug #56636: [ICE] statistics counters for monitoring NOT TESTED
>
>
  • Bug #56636: [ICE] statistics counters for monitoring FIXED
    • Verify the command and its options:
      [root@wms007 persist_dir]#  queryStats -t "2010-04-08 00:00:00"
      JOB_REGISTERED=2
      JOB_IDLE=2
      JOB_RUNNING=2
      JOB_REALLY-RUNNING=2
      JOB_DONE-OK=2
      
      [root@wms007 persist_dir]#  queryStats -f "2010-04-08 00:00:01" -t "2010-04-09 11:00:00"
      JOB_REGISTERED=4
      JOB_IDLE=4
      JOB_RUNNING=4
      JOB_REALLY-RUNNING=4
      JOB_DONE-OK=1
      JOB_DONE-FAILED=3
      
      [root@wms007 persist_dir]#  queryStats -f "2010-04-09 11:00:01"
      JOB_REGISTERED=255
      JOB_IDLE=255
      JOB_RUNNING=193
      JOB_REALLY-RUNNING=204
      JOB_DONE-OK=191
      JOB_ABORTED=6
      
      [root@wms007 persist_dir]#  queryStats
      JOB_REGISTERED=261
      JOB_IDLE=261
      JOB_RUNNING=199
      JOB_REALLY-RUNNING=210
      JOB_DONE-OK=194
      JOB_DONE-FAILED=3
      JOB_ABORTED=6
 
  • Bug #57295: [ICE] queryDb tool may create empty DB as root FIXED
    • Verify:
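For Bug #56636 above, the three `queryStats` time windows do not overlap, so their counters should add up to the unrestricted totals. The following sketch checks this using only the numbers printed in the report; `parse_stats` is a hypothetical helper, not part of the ICE tools.

```python
from collections import Counter


def parse_stats(text):
    """Parse KEY=VALUE tokens, as printed by queryStats, into a Counter."""
    stats = Counter()
    for token in text.split():
        key, _, value = token.partition("=")
        stats[key] = int(value)
    return stats


# queryStats -t "2010-04-08 00:00:00"
until_apr8 = parse_stats(
    "JOB_REGISTERED=2 JOB_IDLE=2 JOB_RUNNING=2 JOB_REALLY-RUNNING=2 JOB_DONE-OK=2")
# queryStats -f "2010-04-08 00:00:01" -t "2010-04-09 11:00:00"
apr8_to_apr9 = parse_stats(
    "JOB_REGISTERED=4 JOB_IDLE=4 JOB_RUNNING=4 JOB_REALLY-RUNNING=4 "
    "JOB_DONE-OK=1 JOB_DONE-FAILED=3")
# queryStats -f "2010-04-09 11:00:01"
since_apr9 = parse_stats(
    "JOB_REGISTERED=255 JOB_IDLE=255 JOB_RUNNING=193 JOB_REALLY-RUNNING=204 "
    "JOB_DONE-OK=191 JOB_ABORTED=6")
# queryStats (no time limits)
total = parse_stats(
    "JOB_REGISTERED=261 JOB_IDLE=261 JOB_RUNNING=199 JOB_REALLY-RUNNING=210 "
    "JOB_DONE-OK=194 JOB_DONE-FAILED=3 JOB_ABORTED=6")

combined = until_apr8 + apr8_to_apr9 + since_apr9
windows_match_total = combined == total
```

With the figures above, every counter in the combined windows equals the corresponding total, which is consistent with the `-f` and `-t` options selecting disjoint intervals.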

Revision 37 - 2010-04-07 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Line: 301 to 301
 2010-03-23 10:20:37,828 INFO - iceLBContext::setLoggingJob - Setting log job to jobid=[https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA] LB server=[devel17.cnaf.infn.it:9000] (port is not used, actually...) 2010-03-23 10:20:37,828 INFO - iceLBLogger::logEvent() - Job Aborted Event, reason=[Input sandbox's proxy is missing. Cannot resubmit job] - [gridJobID="https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA" CREAMJobID="https://ce202.cern.ch:8443/CREAM030114428"]
Changed:
<
<
  • Bug #58099: WMS purger forces purge of jobs if LB cannot be reached NOT TESTED
>
>
  • Bug #58099: WMS purger forces purge of jobs if LB cannot be reached FIXED
    • Stop the LBServer and then run the cron purger:
      07 Apr, 16:09:13 -E: [Error] query_job_status(purger.cpp:125): https://devel17.cnaf.infn.it:9000/yeoXs2eB1kvOaPp0Mtjthg:: edg_wll_JobStat [111] Connection refused(edg_wll_gss_connect())
      [glite@wms007 ~]$ 
    • Verify that the SandBox dir has not been removed:
       [glite@wms007 ~]$ ls -l /var/glite/SandboxDir/ye/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fyeoXs2eB1kvOaPp0Mtjthg/
      total 16
      drwxrwx---  2 dteam008 glite 4096 Apr  6 14:34 input
      drwxrwx---  2 dteam008 glite 4096 Apr  6 14:46 output
      drwxrwx---  2 dteam008 glite 4096 Apr  6 14:34 peek
      lrwxrwxrwx  1 glite    glite  102 Apr  6 14:34 user.proxy -> /var/glite/SandboxDir/Uo/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fUow8XY0NGbyumU3PPGMSng/user.proxy
      
    • Restart LBServer and verify that now the SBD of the job is purged:
      [glite@wms007 ~]$  /opt/glite/sbin/glite-wms-purgeStorage.sh  -p /var/glite/SandboxDir/ye  -t 10000
      07 Apr, 16:18:07 -I: [Info] operator()(purger.cpp:449): https://devel17.cnaf.infn.it:9000/yeoXs2eB1kvOaPp0Mtjthg: removed DONE job
      [glite@wms007 ~]$ ls -l /var/glite/SandboxDir/ye/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fyeoXs2eB1kvOaPp0Mtjthg/
      ls: /var/glite/SandboxDir/ye/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fyeoXs2eB1kvOaPp0Mtjthg/: No such file or directory
 
  • Bug #58387: [ICE] should log a job aborted when it cannot resubmit the job for missing user proxy FIXED
    • Verify:
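For the Bug #58099 check above, the sandbox directory to inspect can be derived from the job identifier. This is a sketch inferred purely from the paths shown in this report: `:`, `/` and `_` appear as `_3a`, `_2f` and `_5f`, and the directory sits under a subdirectory named after the first two characters of the unique part of the identifier. Whether other characters are also escaped is not shown here, so the helpers `encode_job_id` and `sandbox_dir` are hypothetical and may not match the WMS code.

```python
import posixpath


def encode_job_id(job_id):
    """Escape ':', '/' and '_' as '_' plus the character's hex code."""
    # Assumption: only these three characters are escaped, as seen in the report.
    return "".join("_%02x" % ord(ch) if ch in ":/_" else ch for ch in job_id)


def sandbox_dir(job_id, root="/var/glite/SandboxDir"):
    """Build the sandbox path for a job identifier."""
    unique = job_id.rsplit("/", 1)[1]  # part after the LB host and port
    return posixpath.join(root, unique[:2], encode_job_id(job_id))


path = sandbox_dir("https://devel17.cnaf.infn.it:9000/yeoXs2eB1kvOaPp0Mtjthg")
```

The result reproduces the directory listed in the purger test, so the same helper can be used to locate the sandbox of any job before and after running the purger.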
Line: 468 to 485
 2010-03-25 09:45:39,545 ERROR - Request_source_jobdir::get_requests() - Error returned by method jobDir::new_entries(): boost::filesystem::directory_iterator constructor: "/var/glite/ice/jobdir/new": Permission denied 2010-03-25 09:45:40,546 ERROR - Request_source_jobdir::get_requests() - Error returned by method jobDir::new_entries(): boost::filesystem::directory_iterator constructor: "/var/glite/ice/jobdir/new": Permission denied
Changed:
<
<
  • Bug #64698: jobwrapper max osb limit should be considered only if the gridftp server is the wms NOT TESTED
>
>
  • Bug #64698: jobwrapper max osb limit should be considered only if the gridftp server is the wms FIXED only for LCG-CE
    • Set MaxOutputSandboxSize = 10000000; in section WorkloadManager of file glite_wms.conf
    • Submit a jdl with a file of more than 10Mb in the !OutputSandbox parameter and set also the corresponding OutputSandboxDestURI parameter
    • Check the output dir in the SE:
      [root@devel18 tmp]# ls -lh
      -rw-r--r--  1 dteam044 dteam  50M Apr  7 16:23 bigfile
      -rw-r--r--  1 dteam044 dteam  646 Apr  7 16:23 ls.out
    • If you don't set OutputSandboxDestURI in the JDL, then the SandBox dir in the WMS should contain a .tail file of less than 10 MB:
      [root@wms007 persist_dir]#  ls -lh /var/glite/SandboxDir/A1/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fA1cdkhzepvrCjiU_5fTaKbpg/output/
      total 9.6M
      -rw-r--r--  1 dteam008 dteam 9.6M Apr  7 16:26 bigfile.tail
      -rw-r--r--  1 dteam008 dteam  637 Apr  7 16:26 ls.out
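The 9.6M shown for `bigfile.tail` is what one would expect from the configured limit. As a worked check, assuming the truncated file is at most `MaxOutputSandboxSize` bytes and that `ls -lh` reports sizes in units of 1024 × 1024 bytes rounded up to one decimal (the usual GNU `ls` behaviour, stated here as an assumption), the helper `ls_h_megabytes` below is a hypothetical illustration:

```python
import math

MAX_OUTPUT_SANDBOX_SIZE = 10000000  # bytes, as set in glite_wms.conf


def ls_h_megabytes(size_bytes):
    """Size in units of 2**20 bytes, rounded up to one decimal place."""
    return math.ceil(size_bytes / 2**20 * 10) / 10


limit_as_shown_by_ls = ls_h_megabytes(MAX_OUTPUT_SANDBOX_SIZE)
original_exceeds_limit = 50 * 2**20 > MAX_OUTPUT_SANDBOX_SIZE  # the 50M bigfile
```

A file of exactly 10000000 bytes is about 9.54 in these units, which rounds up to the 9.6M seen in the listing, while the original 50M file is well above the limit.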
 

Revision 36 - 2010-03-26 - AlessioGianelle

Line: 1 to 1
Added:
>
>
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 90 to 92
 
Changed:
<
<

Check bugs:

>
>

Check bugs

 
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE NOT TESTED

Revision 35 - 2010-03-25 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Changed:
<
<
Outcome: in certification...
>
>
Outcome: in certification...
 

Clean installation

Line: 22 to 22
 
    • JC work: Yes / Done

  • Dag jobs through:
Changed:
<
<
    • JC work: Yes / Done OK
>
>
    • JC work: Yes / Done
 
  • Collection jobs through:
    • ICE work: Yes / Done
Line: 67 to 67
 
    • ICE: Yes / Done
    • JC: Yes / Done
  • Node of a collection: Yes / Done
Changed:
<
<
Note: collections stay in status 'waiting' when all the nodes are Done (Success) except for one that is 'Cancelled'
>
>
    • Note: collections stay in status 'waiting' when all the nodes are Done (Success) except for one that is 'Cancelled'
 

Others

Line: 92 to 92
 

Check bugs:

Changed:
<
<
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE
>
>
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE NOT TESTED
 
  • Bug #48910: Failure starting LM if its output jobdir doesn't exist; unprotected chown in WM/LM/JC startup scripts FIXED
    • Stopped gLite services and deleted the jobdir under '/var/glite/workload_manager'
Line: 251 to 250
  1438 ? Sl 0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf 1470 pts/2 S+ 0:00 grep ice
Changed:
<
<
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"
>
>
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed" NOT TESTED
 
Changed:
<
<
  • Bug #56636: [ICE] statistics counters for monitoring
>
>
  • Bug #56636: [ICE] statistics counters for monitoring NOT TESTED
 
  • Bug #57295: [ICE] queryDb tool may create empty DB as root FIXED
    • Verify:
Line: 300 to 299
 2010-03-23 10:20:37,828 INFO - iceLBContext::setLoggingJob - Setting log job to jobid=[https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA] LB server=[devel17.cnaf.infn.it:9000] (port is not used, actually...) 2010-03-23 10:20:37,828 INFO - iceLBLogger::logEvent() - Job Aborted Event, reason=[Input sandbox's proxy is missing. Cannot resubmit job] - [gridJobID="https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA" CREAMJobID="https://ce202.cern.ch:8443/CREAM030114428"]
Added:
>
>
  • Bug #58099: WMS purger forces purge of jobs if LB cannot be reached NOT TESTED
 
  • Bug #58387: [ICE] should log a job aborted when it cannot resubmit the job for missing user proxy FIXED
    • Verify:
      *************************************************************
Line: 314 to 315
 Submitted: Tue Mar 23 09:49:42 2010 CET *************************************************************
Changed:
<
<
  • Bug #58977: [ICE] Wrong database colum name in ICE SQL query
>
>
  • Bug #58977: [ICE] Wrong database colum name in ICE SQL query NOT TESTED
 
Changed:
<
<
  • Bug #59240: [ICE] abort reasons not always printed in its logfile
>
>
  • Bug #59240: [ICE] abort reasons not always printed in its logfile NOT TESTED
 
  • Bug #59399: [ICE] doesn't correctly handle request in jobdir/old when it is restarted FIXED
    • Verify submitting a big collection to cream CEs, and then restarting ICE in the middle of the submit process:
Line: 338 to 339
 2010-03-23 15:55:49,155 INFO - iceLBLogger::logEvent() - Cream Transfer OK Event - [gridJobID="https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q" CREAMJobID="https://cream-32.pd.infn.it:8443/CREAM036926381"]
Changed:
<
<
  • Bug #59453: [ICE] polling needs to be improved
>
>
  • Bug #59453: [ICE] polling needs to be improved NOT TESTED
 
  • Bug #60668: [ICE] does not respect LB server/proxy selection through the LBproxy attribute FIXED
    • Set LBProxy = false; in glite_wms.conf (section Common), restart ice and submit...
Line: 418 to 419
 Current Status: Done (Success) *************************************************************
Changed:
<
<
  • Bug #61405: [ICE] Missing proxy validity evaluation in ICE
>
>
  • Bug #61405: [ICE] Missing proxy validity evaluation in ICE NOT TESTED
 
  • Bug #61413: [ICE] should not call EventQuery for a userDN if he/she doesn't have active jobs FIXED
    • Submit a job to a CreamCE and wait until it has finished.
Line: 465 to 466
 2010-03-25 09:45:39,545 ERROR - Request_source_jobdir::get_requests() - Error returned by method jobDir::new_entries(): boost::filesystem::directory_iterator constructor: "/var/glite/ice/jobdir/new": Permission denied 2010-03-25 09:45:40,546 ERROR - Request_source_jobdir::get_requests() - Error returned by method jobDir::new_entries(): boost::filesystem::directory_iterator constructor: "/var/glite/ice/jobdir/new": Permission denied
Added:
>
>
  • Bug #64698: jobwrapper max osb limit should be considered only if the gridftp server is the wms NOT TESTED

 -- AlessioGianelle - 2010-02-05

Revision 34 - 2010-03-25 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 112 to 112
 ismdump.fl jobdir [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm status Logmonitor running...
Added:
>
>
    • Stopped gLite services and deleted the jobdir under '/var/glite/jobcontrol'
      [root@wms007 jobcontrol]# pwd
      /var/glite/jobcontrol
      [root@wms007 jobcontrol]# rm -rf jobdir
      [root@wms007 jobcontrol]# ls
      condorio  submit
      • re-started the JC service checking that the jobdir gets recreated
        [root@wms007 jobcontrol]# /opt/glite/etc/init.d/glite-wms-jc start JobController
        Starting JobController daemon(s)
           Starting JobController...                          [  OK  ]
        [root@wms007 jobcontrol]# ls
        condorio  jobdir  lock  submit
        [root@wms007 ice]# /opt/glite/etc/init.d/glite-wms-jc status JobController
        JobController running in pid: 3625
    • Stopped gLite services and deleted the jobdir under '/var/glite/ice'
      [root@wms007 ice]# pwd
      /var/glite/ice
      [root@wms007 ice]# ls
      jobdir  persist_dir
      [root@wms007 ice]# rm -rf jobdir/
      [root@wms007 ice]# ls
      persist_dir
      • re-started the ICE service checking that the jobdir gets recreated
        [root@wms007 ice]# /opt/glite/etc/init.d/glite-wms-ice start
        starting ICE... ok
        [root@wms007 ice]# ls
        jobdir  persist_dir
        [root@wms007 ice]# /opt/glite/etc/init.d/glite-wms-ice status
        /opt/glite/bin/glite-wms-ice-safe (pid 22783) is running...
    • Stopped gLite services and deleted all the jobdirs
      [root@wms007 glite]# ls workload_manager/ jobcontrol/ ice/
      ice/:
      persist_dir
      
      jobcontrol/:
      condorio  submit
      
      workload_manager/:
      ismdump.fl
      • re-started the WM service checking that all the jobdirs get recreated
        [root@wms007 glite]# /opt/glite/etc/init.d/glite-wms-wm start
        starting workload manager... ok
        [root@wms007 glite]# ls workload_manager/ jobcontrol/ ice/
        ice/:
        jobdir  persist_dir
        
        jobcontrol/:
        condorio  jobdir  submit
        
        workload_manager/:
        ismdump.fl  jobdir
        [root@wms007 glite]# /opt/glite/etc/init.d/glite-wms-wm status
        /opt/glite/bin/glite-wms-workload_manager (pid 23259) is running...
 
    • Comment out the Input/InputType parameters in the WMS conf file (sections: ICE, WorkloadManager and JobController).
    • Try to start JobController:
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-jc start JobController

Revision 33 - 2010-03-25 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 95 to 95
 
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE

  • Bug #48910: Failure starting LM if its output jobdir doesn't exist; unprotected chown in WM/LM/JC startup scripts FIXED
Changed:
<
<
    • stopped gLite services
    • deleted the jobdir under '/var/glite/workload_manager'
    • re-started the LM service checking that the jobdir gets recreated
>
>
    • Stopped gLite services and deleted the jobdir under '/var/glite/workload_manager'
      [root@wms007 jobdir]# service gLite stop
      [...]
      [root@wms007 workload_manager]# pwd
      /var/glite/workload_manager
      [root@wms007 workload_manager]# ls
      ismdump.fl  jobdir
      [root@wms007 workload_manager]# rm -rf jobdir
      [root@wms007 workload_manager]# ls
      ismdump.fl 
    • re-started the LM service checking that the jobdir gets recreated
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm start
      Starting LogMonitor...                                     [  OK  ]
      [root@wms007 workload_manager]# ls
      ismdump.fl  jobdir
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm status
      Logmonitor running...
    • Comment out the Input/InputType parameters in the WMS conf file (sections: ICE, WorkloadManager and JobController).
    • Try to start JobController:
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-jc start JobController
      Starting JobController daemon(s)
       Please set Input parameter in glite_wms.conf - JC section [FAILED]
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-jc status JobController
      JobController stopped.
    • Try to start LogMonitor:
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm start
      Starting LogMonitor...Please set Input parameter in glite_wms.conf - WM section
                                                                 [FAILED]
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-lm status
      LogMonitor stopped.
    • Try to start ICE:
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-ice start
      starting ICE... failure
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-ice status
      /opt/glite/bin/glite-wms-ice-safe is not running
    • Try to start WorkloadManager:
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-wm start
      starting workload manager... Please set Input parameter in  - WM section
      Please set DispatcherType parameter in  - WM section
      Please set Input parameter in  - JC section
      Please set InputType parameter in  - JC section
      Please set Input parameter in  - ICE section
      Please set InputType parameter in  - ICE section
      failure
      [root@wms007 workload_manager]# /opt/glite/etc/init.d/glite-wms-wm status
      /opt/glite/bin/glite-wms-workload_manager is not running
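
The jobdir recreation check for Bug #48910 can also be scripted. This is a minimal sketch, assuming the three component directories listed in this report (ice, jobcontrol, workload_manager) sit under a common base such as /var/glite; the helper name missing_jobdirs is hypothetical and not part of gLite:

```shell
# Report which WMS components are missing their jobdir.
# missing_jobdirs is a hypothetical helper; the component names mirror
# the directories listed in the report.
missing_jobdirs() {
    base="$1"
    for comp in ice jobcontrol workload_manager; do
        # Print the component name only when its jobdir is absent.
        [ -d "$base/$comp/jobdir" ] || echo "$comp"
    done
}

# Example (on the WMS host, after restarting the services):
#   missing_jobdirs /var/glite
```

An empty output means every jobdir has been recreated; any printed name points at the component whose startup script did not recreate it.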
 

Revision 32 2010-03-25 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 357 to 357
 t is blacklisted
Changed:
<
<
  • Bug #63989: [ICE] doesn't handle exception raised by jobDir::new_entries()
>
>
  • Bug #63989: [ICE] doesn't handle exception raised by jobDir::new_entries() FIXED
    • Change the permissions of the 'new' directory in jobdir:
      [root@wms007 jobdir]# chmod 111 new/
      [root@wms007 jobdir]# ls -l
      total 48
      d--x--x--x  2 glite glite 40960 Mar 24 16:13 new
      drwxr-xr-x  2 glite glite  4096 Mar 24 16:13 old
      drwxr-xr-x  2 glite glite  4096 Mar 24 16:13 tmp
    • Look in ICE's log:
      2010-03-25 09:45:39,545 ERROR - Request_source_jobdir::get_requests() - Error returned by method jobDir::new_entries(): boost::filesystem::directory_iterator constructor: "/var/glite/ice/jobdir/new": Permission denied
      2010-03-25 09:45:40,546 ERROR - Request_source_jobdir::get_requests() - Error returned by method jobDir::new_entries(): boost::filesystem::directory_iterator constructor: "/var/glite/ice/jobdir/new": Permission denied
  -- AlessioGianelle - 2010-02-05

Revision 31 2010-03-24 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 328 to 328
 
    • Submit a job to a CreamCE and wait until it finishes.
    • Submit another job to a different CreamCE; you should not see any query to the previously used CreamCE.
Changed:
<
<
  • Bug #61748: [ICE] EventQuery/Polling must be done also to blacklisted CE
>
>
  • Bug #61748: [ICE] EventQuery/Polling must be done also to blacklisted CE FIXED
    • Submit some jobs to a CreamCE
    • Trigger a socket timeout so that ICE blacklists the CreamCE:
      2010-03-24 15:58:40,753 ERROR - CreamProxyMethod::execute() - Connection timed out to CREAM: "EOF detected during communication. Probably service closed connection or SOCKET TIMEOUT occurred." on try 3/3. Blacklisting endpoint and giving up.
      2010-03-24 15:58:40,753 DEBUG - CEBlackList::blacklist_endpoint() - Blacklisting CE https://cream-25.pd.infn.it:8443/ce-cream/services/gridsite-delegation until Wed Mar 24 16:08:40 2010
    • Verify that the QueryEvent command is called in any case:
      2010-03-24 16:05:28,952 DEBUG - eventStatusPoller::body() - Adding EventQuery command for couple (/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL, https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2) to the thread pool...
    • A new submission, instead, fails:
      2010-03-24 15:58:43,265 DEBUG - Delegation_manager::delegate() - Creating new delegation with delegation id [12694427232E265116wms0072Ecnaf2Einfn2Eit] CREAM URL [https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2] Delegation URL [https://cream-25.pd.infn.it:8443/ce-cream/services/gridsite-delegation] user DN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] proxy hash [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] MyProxy Server [myproxy.cern.ch] Expiring on [Thu Mar 25 12:54:02 2010]
      2010-03-24 15:58:43,265 DEBUG - CEBlackList::is_blacklisted() - CE https://cream-25.pd.infn.it:8443/ce-cream/services/gridsite-delegation is blacklisted until Wed Mar 24 16:08:40 2010
      2010-03-24 15:58:43,265 ERROR - Delegation_manager::delegate() - FAILED Creation of a new delegation with delegation id [12694427232E265116wms0072Ecnaf2Einfn2Eit] CREAM URL [https://cream-25.pd.infn.it:8443/ce-cream/services/CREAM2] Delegation URL [https://cream-25.pd.infn.it:8443/ce-cream/services/gridsite-delegation] user DN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] proxy hash [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] MyProxy Server [myproxy.cern.ch] - ERROR is: [The endpoint is blacklisted]
      2010-03-24 15:58:43,265 ERROR - iceCommandSubmit::execute() -  TID=[159308760] Error during submission of jdl= Fatal Exception is:Failed to create a delegation id for job https://devel17.cnaf.infn.it:9000/UoVsvjIj1CPluHb81xM_pQ: reason is The endpoint is blacklisted
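
The blacklisting step can be cross-checked against the ICE log without reading it by eye. This is a rough sketch, assuming the real log file keeps each entry on a single line (the wrapping in the excerpts comes from the wiki paste) and that its path is passed in by the tester; blacklisted_endpoints is a hypothetical helper name:

```shell
# List each endpoint ICE blacklisted, together with the expiry time.
# blacklisted_endpoints is a hypothetical helper; the pattern matches the
# "CEBlackList::blacklist_endpoint()" lines quoted in this report.
blacklisted_endpoints() {
    sed -n 's/.*CEBlackList::blacklist_endpoint() - Blacklisting CE \(.*\) until \(.*\)$/\1 (until \2)/p' "$1"
}

# Example: blacklisted_endpoints <path to the ICE log>
```

Any EventQuery entry for the same CE with a timestamp before the printed expiry time shows that polling continued while the endpoint was blacklisted.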
 
  • Bug #63989: [ICE] doesn't handle exception raised by jobDir::new_entries()

Revision 30 2010-03-24 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 244 to 244
 
  • Bug #59453: [ICE] polling needs to be improved
Changed:
<
<
  • Bug #60688: [ICE] does not respect LB server/proxy selection through the LBproxy attribute
>
>
  • Bug #60668: [ICE] does not respect LB server/proxy selection through the LBproxy attribute FIXED
    • Set LBProxy = false; in glite_wms.conf (section Common), restart ice and submit...
      mysql> select * from events where jobid="YFyqjw3FF-BO-0U5BxCOtA";
      +------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
      | jobid                  | event | code | prog            | host                | time_stamp          | userid                           | usec   | level | arrived             |
      +------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
      | YFyqjw3FF-BO-0U5BxCOtA |     0 |    5 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:04:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 394848 |     8 | 2010-03-24 12:04:39 |
      | YFyqjw3FF-BO-0U5BxCOtA |     1 |   15 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:04:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 548652 |     8 | 2010-03-24 12:04:39 |
      | YFyqjw3FF-BO-0U5BxCOtA |     2 |    4 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:04:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 608084 |     8 | 2010-03-24 12:04:39 |
      | YFyqjw3FF-BO-0U5BxCOtA |     3 |    4 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:04:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 657231 |     8 | 2010-03-24 12:04:39 |
      +------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
      4 rows in set (0.00 sec)
    • Set LBProxy = true; in glite_wms.conf (section Common), restart ice and submit...
      mysql> select * from events where jobid="SlKOGSnaW0oKO3TJqw9tbA";
      +------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
      | jobid                  | event | code | prog            | host                | time_stamp          | userid                           | usec   | level | arrived             |
      +------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
      | SlKOGSnaW0oKO3TJqw9tbA |     0 |   17 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:53 | 3f82b966e8a77413044be1a9144a4af4 | 342720 |     8 | 2010-03-24 12:09:53 |
      | SlKOGSnaW0oKO3TJqw9tbA |     1 |   21 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:53 | 3f82b966e8a77413044be1a9144a4af4 | 470416 |     8 | 2010-03-24 12:09:53 |
      | SlKOGSnaW0oKO3TJqw9tbA |     2 |   21 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:53 | 3f82b966e8a77413044be1a9144a4af4 | 526402 |     8 | 2010-03-24 12:09:53 |
      | SlKOGSnaW0oKO3TJqw9tbA |     3 |    2 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:54 | 3f82b966e8a77413044be1a9144a4af4 | 606511 |     8 | 2010-03-24 12:09:54 |
      | SlKOGSnaW0oKO3TJqw9tbA |     4 |    4 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:54 | 3f82b966e8a77413044be1a9144a4af4 | 712100 |     8 | 2010-03-24 12:09:54 |
      | SlKOGSnaW0oKO3TJqw9tbA |     5 |    4 | NetworkServer   | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | 3f82b966e8a77413044be1a9144a4af4 |  43631 |     8 | 2010-03-24 12:09:55 |
      | SlKOGSnaW0oKO3TJqw9tbA |     6 |    5 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 167414 |     8 | 2010-03-24 12:09:55 |
      | SlKOGSnaW0oKO3TJqw9tbA |     7 |   15 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 297333 |     8 | 2010-03-24 12:09:55 |
      | SlKOGSnaW0oKO3TJqw9tbA |     8 |    4 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 369636 |     8 | 2010-03-24 12:09:55 |
      | SlKOGSnaW0oKO3TJqw9tbA |     9 |    4 | WorkloadManager | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 431565 |     8 | 2010-03-24 12:09:55 |
      | SlKOGSnaW0oKO3TJqw9tbA |    10 |    5 | JobController   | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 745052 |     8 | 2010-03-24 12:09:55 |
      | SlKOGSnaW0oKO3TJqw9tbA |    11 |    1 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:09:55 | bdd27610035bb0ec9287e2ecaa3da2eb | 846002 |     8 | 2010-03-24 12:09:55 |
      | SlKOGSnaW0oKO3TJqw9tbA |    12 |    1 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:10:04 | bdd27610035bb0ec9287e2ecaa3da2eb | 869424 |     8 | 2010-03-24 12:10:04 |
      | SlKOGSnaW0oKO3TJqw9tbA |    13 |    8 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:11:39 | bdd27610035bb0ec9287e2ecaa3da2eb |  94855 |     8 | 2010-03-24 12:11:39 |
      | SlKOGSnaW0oKO3TJqw9tbA |    14 |   25 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:11:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 181448 |     8 | 2010-03-24 12:11:39 |
      | SlKOGSnaW0oKO3TJqw9tbA |    15 |   10 | LogMonitor      | wms007.cnaf.infn.it | 2010-03-24 12:11:39 | bdd27610035bb0ec9287e2ecaa3da2eb | 250291 |     8 | 2010-03-24 12:11:39 |
      +------------------------+-------+------+-----------------+---------------------+---------------------+----------------------------------+--------+-------+---------------------+
      16 rows in set (0.00 sec)
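
To compare the two runs at a glance, the event tables can be summarised by logging component. This is a small sketch that reads mysql's tabular output on stdin; events_per_prog is a hypothetical helper, and it assumes the column order shown above, with prog as the fourth column:

```shell
# Count events per logging component ("prog" column) in mysql's tabular output.
# events_per_prog is a hypothetical helper reading the table text on stdin.
events_per_prog() {
    awk -F'|' '
        NF > 5 && $2 !~ /jobid/ {      # data rows only: skip borders and header
            prog = $5
            gsub(/^ +| +$/, "", prog)  # trim the column padding
            count[prog]++
        }
        END { for (p in count) print p, count[p] }
    ' | sort
}

# Example: paste the table into a file and run
#   events_per_prog < table.txt
```

For the 16-row table above this gives JobController 1, LogMonitor 5, NetworkServer 6 and WorkloadManager 4, while the 4-row table contains WorkloadManager events only.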

  • Bug #61312: [ICE] Error in handling user dn in ICE's poller FIXED
    • Submit 5 jobs to an old CreamCE (CREAM 1.5) with the MyProxyServer attribute set:
      2010-03-24 13:40:38,128 ERROR - iceCommandEventQuery::execute() -  TID=[159321352] Cannot query events for UserDN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] CEUrl [https://cream-33.pd.infn.it:8443/ce-cream/services/CREAM2]. Exception Internal ex is [Received NULL fault; the error is due to another cause: FaultString=[No such operation 'QueryEventRequest'] - FaultCode=["http://xml.apache.org/axis/":Client] - FaultSubCode=["http://xml.apache.org/axis/":Client] - FaultDetail=[<ns2:hostname>cream-33.pd.infn.it</ns2:hostname>]]
      2010-03-24 13:40:38,128 WARN - iceCommandEventQuery::execute() -  TID=[159321352] Not present QueryEvent on CE [https://cream-33.pd.infn.it:8443/ce-cream/services/CREAM2]. Falling back to old-style StatusPoller.
      2010-03-24 13:40:38,128 INFO - iceCommandStatusPoller::execute() - Getting [100] jobs to poll for user [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] creamurl [https://cream-33.pd.infn.it:8443/ce-cream/services/CREAM2]
      2010-03-24 13:40:38,128 DEBUG - iceCommandStatusPoller::get_jobs_to_poll() - Collecting jobs to poll for userdn=[/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] creamurl=[https://cream-33.pd.infn.it:8443/ce-cream/services/CREAM2]. LIMIT set to [100]...
      2010-03-24 13:40:38,129 DEBUG - iceCommandStatusPoller::get_jobs_to_poll() - Finished collecting jobs to poll. [5] jobs are to poll
      [...]
  • And so:
    [ale@cream-15 UI]$ glite-wms-job-status -v 0 -i testo --noint
 
Changed:
<
<
  • Bug #61312: [ICE] Error in handling user dn in ICE's poller
>
>
***********************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/tt3GLYuIiHuwrmnl7fGVtA
Current Status: Done (Success)

***********************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/lY9fdOgQk5RcaH99g23z5g
Current Status: Done (Success)

***********************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/jta5KlBZEP-r2KbE0SB0Vw
Current Status: Done (Success)

***********************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/TNqI_PbRyqgFAN3L52IpKQ
Current Status: Done (Success)

***********************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/V7Pnv2yE47CdHKgQmRaIvQ
Current Status: Done (Success)
*************************************************************

 
  • Bug #61405: [ICE] Missing proxy validity evaluation in ICE
Changed:
<
<
  • Bug #61413: [ICE] should not call EventQuery for a userDN if he/she doesn't have active jobs
>
>
  • Bug #61413: [ICE] should not call EventQuery for a userDN if he/she doesn't have active jobs FIXED
    • Submit a job to a CreamCE and wait until it finishes.
    • Submit another job to a different CreamCE; you should not see any query to the previously used CreamCE.
 
  • Bug #61748: [ICE] EventQuery/Polling must be done also to blacklisted CE

Revision 29 2010-03-23 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 222 to 222
 
  • Bug #59240: [ICE] abort reasons not always printed in its logfile
Changed:
<
<
  • Bug #59399: [ICE] doesn't correctly handle request in jobdir/old when it is restarted
>
>
  • Bug #59399: [ICE] doesn't correctly handle request in jobdir/old when it is restarted FIXED
    • Verify by submitting a large collection to CREAM CEs and restarting ICE in the middle of the submission:
      2010-03-23 15:55:43,604 DEBUG - iceCommandSubmit::try_to_submit() -  TID=[168434952] Going to START CreamJobID [https://cream-32.pd.infn.it:8443/CREAM036926381] related to GridJobID [https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q]...
    • restarting ice...
      2010-03-23 15:55:45,760 DEBUG - ICE VersionID is [Fri Mar 19 13:53:17 CET 2010] ProcessID=[23579]
      2010-03-23 15:55:45,760 INFO - glite-wms-ice::main() - Host certificate is [/home/glite/.certs/hostcert.pem]
      2010-03-23 15:55:45,817 DEBUG - iceThreadPool::iceThreadPool(ICE Submission Pool) - Creating 10 worker threads
      2010-03-23 15:55:45,819 DEBUG - iceThreadPool::iceThreadPool(ICE Poller Pool) - Creating 5 worker threads
      [...]
      2010-03-23 15:55:48,967 INFO - iceCommandSubmit::execute() -  TID=[144321160] This request is a Submission...
      2010-03-23 15:55:48,968 INFO - iceCommandSubmit::try_to_submit() -  TID=[144321160] GridJobID [https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q] has already been REGISTERED. Will only START it...
      2010-03-23 15:55:48,968 DEBUG - iceCommandSubmit::try_to_submit() -  TID=[144321160] Going to START CreamJobID [https://cream-32.pd.infn.it:8443/CREAM036926381] related to GridJobID [https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q]...
      2010-03-23 15:55:49,154 INFO - iceLBContext::setLoggingJob - Setting log job to jobid=[https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q] LB server=[devel17.cnaf.infn.it:9000] (port is not used, actually...)
      2010-03-23 15:55:49,155 INFO - iceLBLogger::logEvent() - Cream Transfer OK Event - [gridJobID="https://devel17.cnaf.infn.it:9000/iM8C3YV12fwhvIG5mNip5Q" CREAMJobID="https://cream-32.pd.infn.it:8443/CREAM036926381"]
 
  • Bug #59453: [ICE] polling needs to be improved

Revision 28 2010-03-23 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 196 to 196
 20046 pts/2 S+ 0:00 grep ice
Changed:
<
<
  • Bug #57596: [ICE] non resubmission if job failed for proxy expiration
>
>
  • Bug #57596: [ICE] non resubmission if job failed for proxy expiration FIXED
    • Verify:
      2010-03-23 10:20:37,696 INFO - iceLBLogger::logEvent() - Job Done Failed Event, ExitCode=[0], FailureReason=[Proxy is expired; /opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent():  LB server (bkserver,lbproxy) store protocol error;; Logging library ERROR:  LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error Transport endpoint is not connected;; edg_wll_gss_connect();; System Error: Connection refused) /opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent():  LB server (bkserver,lbproxy) store protocol error;; Logging library ERROR:  LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error Transport endpoint is not connected;; edg_wll_gss_connect();; System Error: Connection refused) Proxy expired: job killed Terminated Master process killed] - [gridJobID="https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA" CREAMJobID="https://ce202.cern.ch:8443/CREAM030114428"]
      2010-03-23 10:20:37,817 DEBUG - iceLBContext::testCode() - L&B call succeeded.
      2010-03-23 10:20:37,828 ERROR - Ice::resubmit_job() - Will NOT resubmit job [gridJobID="https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA" CREAMJobID="https://ce202.cern.ch:8443/CREAM030114428"] because it's Input Sandbox proxy file is not valid: The proxy is EXPIRED!
      2010-03-23 10:20:37,828 INFO - iceLBContext::setLoggingJob - Setting log job to jobid=[https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA] LB server=[devel17.cnaf.infn.it:9000] (port is not used, actually...)
      2010-03-23 10:20:37,828 INFO - iceLBLogger::logEvent() - Job Aborted Event, reason=[Input sandbox's proxy is missing. Cannot resubmit job] - [gridJobID="https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA" CREAMJobID="https://ce202.cern.ch:8443/CREAM030114428"]

  • Bug #58387: [ICE] should log a job aborted when it cannot resubmit the job for missing user proxy FIXED
    • Verify:
      *************************************************************
      BOOKKEEPING INFORMATION:
 
Changed:
<
<
  • Bug #58387: [ICE] should log a job aborted when it cannot resubmit the job for missing user proxy
>
>
Status info for the Job : https://devel17.cnaf.infn.it:9000/jw2aeAy1skHY3mRJHCF8YA
Current Status: Aborted
Logged Reason(s):
    - Proxy is expired; /opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent(): LB server (bkserver,lbproxy) store protocol error;; Logging library ERROR: LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error Transport endpoint is not connected;; edg_wll_gss_connect();; System Error: Connection refused) /opt/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent(): LB server (bkserver,lbproxy) store protocol error;; Logging library ERROR: LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error Transport endpoint is not connected;; edg_wll_gss_connect();; System Error: Connection refused) Proxy expired: job killed Terminated Master process killed
Status Reason: Input sandbox's proxy is missing. Cannot resubmit job
Destination: ce202.cern.ch:8443/cream-lsf-grid_dteam
Submitted: Tue Mar 23 09:49:42 2010 CET
*************************************************************
 
  • Bug #58977: [ICE] Wrong database colum name in ICE SQL query

Revision 27 2010-03-23 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 159 to 159
 
  • Bug #56636: [ICE] statistics counters for monitoring
Changed:
<
<
  • Bug #57295: [ICE] queryDb tool may create empty DB as root
>
>
  • Bug #57295: [ICE] queryDb tool may create empty DB as root FIXED
    • Verify:
      [root@wms007 ~]#  ll /var/glite/ice/persist_dir/ice.db 
      -rw-r--r--  1 glite glite 1280000 Mar 22 17:05 /var/glite/ice/persist_dir/ice.db
      [root@wms007 ~]#  /opt/glite/bin/queryDb -c glite_wms.conf -s RUNNING,REALLY_RUNNING 
      0 item(s) found
      [root@wms007 ~]#  ll /var/glite/ice/persist_dir/ice.db 
      -rw-r--r--  1 glite glite 1280000 Mar 22 17:05 /var/glite/ice/persist_dir/ice.db
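
The before/after comparison of ice.db can be automated. This is a minimal sketch, assuming GNU coreutils stat is available on the host; file_fingerprint and unchanged_after are hypothetical helper names, not gLite tools:

```shell
# Print owner, group, size and mtime of a file (GNU coreutils stat assumed).
file_fingerprint() {
    stat -c '%U:%G %s %Y' "$1"
}

# Return 0 if running the given command leaves the file's fingerprint unchanged.
# unchanged_after is a hypothetical helper: first argument is the file,
# the remaining arguments are the command to run.
unchanged_after() {
    file="$1"
    shift
    before=$(file_fingerprint "$file") || return 1
    "$@" >/dev/null 2>&1 || true      # the command's own exit status is ignored
    after=$(file_fingerprint "$file") || return 1
    [ "$before" = "$after" ]
}

# Example (as root on the WMS host):
#   unchanged_after /var/glite/ice/persist_dir/ice.db \
#       /opt/glite/bin/queryDb -c glite_wms.conf -s RUNNING,REALLY_RUNNING
```

A non-zero return would indicate that queryDb changed the owner, size or modification time of the database, which is the symptom described in Bug #57295.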

  • Bug #57579: [ICE] Occasionally the ICE's start/stop script doesn't kill the ICE process HOPEFULLY FIXED
    • Verify:
      [root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice status
      /opt/glite/bin/glite-wms-ice-safe (pid 1433) is running...
      [root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice stop
      stopping ICE... ok
      [root@wms007 ~]# ps ax | grep ice
      19866 pts/2    S+     0:00 grep ice
      [root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice start
      starting ICE... ok
      [root@wms007 ~]# ps ax | grep ice
      19899 ?        Ss     0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
      19903 ?        S      0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
      19904 ?        Sl     0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
      19932 pts/2    S+     0:00 grep ice
      [root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice stop
      stopping ICE... ok
      [root@wms007 ~]# ps ax | grep ice
      19978 pts/2    S+     0:00 grep ice
      [root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice start
      starting ICE... ok
      [root@wms007 ~]# ps ax | grep ice
      20009 ?        Ss     0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
      20013 ?        S      0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
      20014 ?        Sl     0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
      20046 pts/2    S+     0:00 grep ice
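
The repeated "ps ax | grep ice" check always lists the grep process itself, which has to be ignored by eye. A small sketch that filters it out, with ice_processes as a hypothetical helper reading ps output on stdin:

```shell
# Keep only real ICE processes from "ps ax" output read on stdin.
# The bracket in the pattern stops a grep process from matching its own
# command line; ice_processes is a hypothetical helper.
ice_processes() {
    grep '[g]lite-wms-ice'
}

# Example:
#   /opt/glite/etc/init.d/glite-wms-ice stop
#   ps ax | ice_processes || echo "no ICE process left"
```

The function returns a non-zero status when nothing matches, so it can be used directly in a stop/start loop to detect a leftover ICE process.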
 
Deleted:
<
<
  • Bug #57579: [ICE] Occasionally the ICE's start/stop script doesn't kill the ICE process
 
  • Bug #57596: [ICE] non resubmission if job failed for proxy expiration

Revision 26 2010-03-22 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 111 to 111
 
      • Owner = /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
      • MyProxyServer = "myproxy.cnaf.infn.it";
Changed:
<
<
  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved

  • Bug #55103: [ICE] ICE port 7010 not cleaned up properly
>
>
  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved FIXED
    • Using a new CE (CREAM 1.6), ICE's log shows:
      2010-03-22 16:47:50,496 INFO - scoped_timer iceCommandEventQuery::execute() - SOAP Connection for QueryEvent - TID=[150673032] 1269272870.288498 1269272870.496129 0.207631
      2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::execute() -  TID=[150673032] There're [2] event(s) for the couple DN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] CEUrl [https://cream-30.pd.infn.it:8443/ce-cream/services/CREAM2]
      2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::execute() -  TID=[150673032] Database  ID=[1261041182000]
      2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::execute() -  TID=[150673032] Exec time ID=[3]
      2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::processEventsForJob() -  TID=[150673032] Processing [2] event(s) for Job [gridJobID="https://devel17.cnaf.infn.it:9000/uKbQNcbh7kIohBz6bDMNZQ" CREAMJobID="https://cream-30.pd.infn.it:8443/CREAM396193798"] userdn [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] and ce url [https://cream-30.pd.infn.it:8443/ce-cream/services/CREAM2].
      2010-03-22 16:47:50,496 DEBUG - iceCommandEventQuery::processEventsForJob() -  TID=[150673032] EventID [685143] timestsamp [1269272804]
      2010-03-22 16:47:50,496 INFO - scoped_timer iceCommandEventQuery::processSingleEvent - TID=[150673032] InsertStat 1269272870.496682 1269272870.496864 0.000182
    • With an "old" CE, the "poller" method is used instead:
      2010-03-22 16:55:55,397 INFO - scoped_timer iceCommandEventQuery::execute() - SOAP Connection for QueryEvent - TID=[150673032] 1269273355.242918 1269273355.397806 0.154888
      2010-03-22 16:55:55,397 ERROR - iceCommandEventQuery::execute() -  TID=[150673032] Cannot query events for UserDN [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] CEUrl [https://cream-34.pd.infn.it:8443/ce-cream/services/CREAM2]. Exception Internal ex is [Received NULL fault; the error is due to another cause: FaultString=[No such operation 'QueryEventRequest'] - FaultCode=["http://xml.apache.org/axis/":Client] - FaultSubCode=["http://xml.apache.org/axis/":Client] - FaultDetail=[<ns2:hostname>cream-34.pd.infn.it</ns2:hostname>]]
      2010-03-22 16:55:55,398 WARN - iceCommandEventQuery::execute() -  TID=[150673032] Not present QueryEvent on CE [https://cream-34.pd.infn.it:8443/ce-cream/services/CREAM2]. Falling back to old-style StatusPoller.
      2010-03-22 16:55:55,398 INFO - iceCommandStatusPoller::execute() - Getting [100] jobs to poll for user [/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] creamurl [https://cream-34.pd.infn.it:8443/ce-cream/services/CREAM2]
      2010-03-22 16:55:55,398 DEBUG - iceCommandStatusPoller::get_jobs_to_poll() - Collecting jobs to poll for userdn=[/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle-/dteam/Role=NULL/Capability=NULL] creamurl=[https://cream-34.pd.infn.it:8443/ce-cream/services/CREAM2]. LIMIT set to [100]...
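
To see which CEs triggered the fallback to the old poller, the WARN lines can be pulled out of the ICE log. This is a sketch only, assuming one log entry per line and a log path supplied by the tester; fallback_endpoints is a hypothetical helper name:

```shell
# List the CREAM endpoints for which ICE fell back to the old-style poller.
# fallback_endpoints is a hypothetical helper; the pattern matches the
# "Not present QueryEvent on CE [...]" WARN line quoted in this report.
fallback_endpoints() {
    sed -n 's/.*Not present QueryEvent on CE \[\([^]]*\)]\. Falling back to old-style StatusPoller\..*/\1/p' "$1" | sort -u
}

# Example: fallback_endpoints <path to the ICE log>
```

CEs running a CREAM version that supports QueryEvent should not appear in the output.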

  • Bug #55103: [ICE] ICE port 7010 not cleaned up properly FIXED
    • We try a stop/start/restart sequence:
      [root@wms007 ~]# ps ax | grep ice
       1283 pts/2    S+     0:00 grep ice
      30985 ?        Ss     0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
      30989 ?        S      0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
      30990 ?        Sl     0:15 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
      [root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice stop
      stopping ICE... ok
      [root@wms007 ~]# ps ax | grep ice
       1321 pts/2    S+     0:00 grep ice
      [root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice start
      starting ICE... ok
      [root@wms007 ~]# ps ax | grep ice
       1353 ?        Ss     0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
       1357 ?        S      0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
       1358 ?        Sl     0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
       1398 pts/2    S+     0:00 grep ice
      [root@wms007 ~]# /opt/glite/etc/init.d/glite-wms-ice restart
      stopping ICE... ok
      starting ICE... ok
      [root@wms007 ~]# ps ax | grep ice
       1433 ?        Ss     0:00 /opt/glite/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /var/glite/glite-wms-ice-safe.pid
       1437 ?        S      0:00 sh -c /opt/glite/bin/glite-wms-ice --conf glite_wms.conf > /var/log/glite/ice_console.log 2>&1
       1438 ?        Sl     0:00 /opt/glite/bin/glite-wms-ice --conf glite_wms.conf
       1470 pts/2    S+     0:00 grep ice
 
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"

Revision 25 2010-03-22 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 100 to 100
 
    • re-started the LM service checking that the jobdir gets recreated

  • Bug #52934: [ICE] Delegation in ICE doesn't refer to the myproxy server FIXED
Changed:
<
<
>
>
 
      • Deleg Proxy ID = [12692524052E32526wms0072Ecnaf2Einfn2Eit]
      • Destination: cream-30.pd.infn.it:8443/cream-pbs-cream_B
      • Owner = /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
Changed:
<
<
>
>
 
      • Deleg Proxy ID = [12692523642E948823wms0072Ecnaf2Einfn2Eit]
      • Destination: cream-30.pd.infn.it:8443/cream-pbs-cream_B
      • Owner = /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
Added:
>
>
      • MyProxyServer = "myproxy.cnaf.infn.it";
 
  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved

Revision 24 2010-03-22 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 99 to 99
 
    • deleted the jobdir under '/var/glite/workload_manager'
    • re-started the LM service checking that the jobdir gets recreated
Changed:
<
<
  • Bug #52934: [ICE] Delegation in ICE doesn't refer to the myproxy server
>
>
  • Bug #52934: [ICE] Delegation in ICE doesn't refer to the myproxy server FIXED
    • glite-ce-job-status -L 2 https://cream-30.pd.infn.it:8443/CREAM504042437
      • Deleg Proxy ID = [12692524052E32526wms0072Ecnaf2Einfn2Eit]
      • Destination: cream-30.pd.infn.it:8443/cream-pbs-cream_B
      • Owner = /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
    • glite-ce-job-status -L 2 https://cream-30.pd.infn.it:8443/CREAM456698107
      • Deleg Proxy ID = [12692523642E948823wms0072Ecnaf2Einfn2Eit]
      • Destination: cream-30.pd.infn.it:8443/cream-pbs-cream_B
      • Owner = /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle
 
  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved

Revision 23 - 2010-03-16 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 99 to 99
 
    • deleted the jobdir under '/var/glite/workload_manager'
    • re-started the LM service checking that the jobdir gets recreated
Added:
>
>
  • Bug #52934: [ICE] Delegation in ICE doesn't refer to the myproxy server
 
  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved
Added:
>
>
  • Bug #55103: [ICE] ICE port 7010 not cleaned up properly
 
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"

  • Bug #56636: [ICE] statistics counters for monitoring

Revision 22 - 2010-03-12 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 98 to 98
 
    • stopped gLite services
    • deleted the jobdir under '/var/glite/workload_manager'
    • re-started the LM service checking that the jobdir gets recreated
Added:
>
>
  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved
 
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"
Changed:
<
<
  • Bug #61405: Missing proxy validity evaluation in ICE
>
>
  • Bug #56636: [ICE] statistics counters for monitoring

  • Bug #57295: [ICE] queryDb tool may create empty DB as root

  • Bug #57579: [ICE] Occasionally the ICE's start/stop script doesn't kill the ICE process

  • Bug #57596: [ICE] non resubmission if job failed for proxy expiration

  • Bug #58387: [ICE] should log a job aborted when it cannot resubmit the job for missing user proxy

  • Bug #58977: [ICE] Wrong database colum name in ICE SQL query

  • Bug #59240: [ICE] abort reasons not always printed in its logfile

  • Bug #59399: [ICE] doesn't correctly handle request in jobdir/old when it is restarted

  • Bug #59453: [ICE] polling needs to be improved

  • Bug #60688: [ICE] does not respect LB server/proxy selection through the LBproxy attribute

  • Bug #61312: [ICE] Error in handling user dn in ICE's poller

  • Bug #61405: [ICE] Missing proxy validity evaluation in ICE

  • Bug #61413: [ICE] should not call EventQuery for a userDN if he/she doesn't have active jobs

  • Bug #61748: [ICE] EventQuery/Polling must be done also to blacklisted CE

  • Bug #63989: [ICE] doesn't handle exception raised by jobDir::new_entries()
  -- AlessioGianelle - 2010-02-05

Revision 21 - 2010-03-09 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 94 to 94
 
  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE
Changed:
<
<
  • Bug #48910: Failure starting LM if its output jobdir doesn't exist; unprotected chown in WM/LM/JC startup scripts
>
>
  • Bug #48910: Failure starting LM if its output jobdir doesn't exist; unprotected chown in WM/LM/JC startup scripts FIXED
    • stopped gLite services
    • deleted the jobdir under '/var/glite/workload_manager'
    • re-started the LM service checking that the jobdir gets recreated
 
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"

  • Bug #61405: Missing proxy validity evaluation in ICE

Revision 20 - 2010-03-08 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 72 to 72
 

Others

  • BrokerInfo
Changed:
<
<
    • ICE creation No
    • JC creation: No
    • Verify all the glite-brokerinfo functions with the generated file No
>
>
    • ICE creation Yes / Done
    • JC creation: Yes / Done
 
  • Resubmission
    • Shallow: Yes / Done

Revision 19 - 2010-03-08 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 81 to 81
 
    • Deep: Yes / Done

  • Job Recovery
Changed:
<
<
    • Tested with a few collections re-starting the wm while some node jobs are still in a 'submitted' or 'waiting' status No
>
>
    • Tested with a few collections re-starting the wm while some node jobs are still in a 'submitted' or 'waiting' status Yes / Done
 
  • Prologue and Epilogue jobs
    • ICE: Yes / Done

Revision 18 - 2010-03-05 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 47 to 47
 
    • Submit a bulk of 3 jobs -> success 100% Yes / Done both to ICE and JC
    • Submit a bulk of 50 jobs -> success 100% Yes / Done both to ICE and JC
    • Submit a bulk of 100 jobs -> success 100% Yes / Done both to ICE and JC
Changed:
<
<
    • Submit a bulk of 500 jobs -> success 99.99% Yes / Done both to ICE and JC
    • Submit a bulk of 1000 jobs -> success ???% No
>
>
    • Submit a bulk of 500 jobs -> success 99.9% Yes / Done both to ICE and JC
    • Submit a bulk of 1000 jobs -> success 99.9% Yes / Done both to ICE and JC
 
  • Perusal jobs through:
    • JC work: Yes / Done

Revision 17 - 2010-03-03 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 46 to 46
 
  • Bulk jobs sent both through ICE and JC and RetryCount = 0; :
    • Submit a bulk of 3 jobs -> success 100% Yes / Done both to ICE and JC
    • Submit a bulk of 50 jobs -> success 100% Yes / Done both to ICE and JC
Changed:
<
<
    • Submit a bulk of 100 jobs -> success 100% Yes / Done to ICE
    • Submit a bulk of 500 jobs -> success ???% No
>
>
    • Submit a bulk of 100 jobs -> success 100% Yes / Done both to ICE and JC
    • Submit a bulk of 500 jobs -> success 99.99% Yes / Done both to ICE and JC
 
    • Submit a bulk of 1000 jobs -> success ???% No

  • Perusal jobs through:

Revision 16 - 2010-03-02 - AlessioGianelle

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 13 to 13
 

List Match

  • without data: Yes / Done
Changed:
<
<
  • with data: No
>
>
  • with data: Yes / Done
 

Submission/GetOutput

Revision 15 - 2010-03-02 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 84 to 84
 
    • Tested with a few collections re-starting the wm while some node jobs are still in a 'submitted' or 'waiting' status No

  • Prologue and Epilogue jobs
Changed:
<
<
    • ICE: No
    • JC: No
>
>
    • ICE: Yes / Done
    • JC: Yes / Done
 

Revision 14 - 2010-03-01 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 46 to 46
 
  • Bulk jobs sent both through ICE and JC and RetryCount = 0; :
    • Submit a bulk of 3 jobs -> success 100% Yes / Done both to ICE and JC
    • Submit a bulk of 50 jobs -> success 100% Yes / Done both to ICE and JC
Changed:
<
<
    • Submit a bulk of 100 jobs -> success ???% No
>
>
    • Submit a bulk of 100 jobs -> success 100% Yes / Done to ICE
 
    • Submit a bulk of 500 jobs -> success ???% No
    • Submit a bulk of 1000 jobs -> success ???% No

Revision 13 - 2010-03-01 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 59 to 59
 

Cancel

  • Normal jobs
Changed:
<
<
    • ICE: No
    • JC: No
>
>
    • ICE: Yes / Done
    • JC: Yes / Done
 
  • Dag: Yes / Done
    • Note that children nodes in status 'submitted' don't get cancelled
  • Collection
    • ICE: Yes / Done
    • JC: Yes / Done
Changed:
<
<
  • Node of a collection: No
>
>
  • Node of a collection: Yes / Done
Note: collections stay in status 'waiting' when all the nodes are Done (Success) except for one that is 'Cancelled'
 

Others

Line: 76 to 77
 
    • Verify all the glite-brokerinfo functions with the generated file No

  • Resubmission
Changed:
<
<
    • Shallow: No
    • Deep: No
>
>
    • Shallow: Yes / Done
    • Deep: Yes / Done
 
  • Job Recovery
    • Tested with a few collections re-starting the wm while some node jobs are still in a 'submitted' or 'waiting' status No

Revision 12 - 2010-02-25 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 61 to 61
 
  • Normal jobs
    • ICE: No
    • JC: No
Changed:
<
<
  • Dag: No
  • Collection: No
>
>
  • Dag: Yes / Done
    • Note that children nodes in status 'submitted' don't get cancelled
  • Collection
    • ICE: Yes / Done
    • JC: Yes / Done
 
  • Node of a collection: No

Others

Revision 11 - 2010-02-25 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 45 to 45
 
  • Bulk jobs sent both through ICE and JC and RetryCount = 0; :
    • Submit a bulk of 3 jobs -> success 100% Yes / Done both to ICE and JC
Changed:
<
<
    • Submit a bulk of 50 jobs -> success ???% No
>
>
    • Submit a bulk of 50 jobs -> success 100% Yes / Done both to ICE and JC
 
    • Submit a bulk of 100 jobs -> success ???% No
    • Submit a bulk of 500 jobs -> success ???% No
    • Submit a bulk of 1000 jobs -> success ???% No

Revision 10 - 2010-02-25 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 51 to 51
 
    • Submit a bulk of 1000 jobs -> success ???% No

  • Perusal jobs through:
Changed:
<
<
    • JC work: No
    • ICE work: No
>
>
    • JC work: Yes / Done
    • ICE work: Yes / Done
 
  • MPICH jobs: No

Revision 9 - 2010-02-24 - ElisabettaMolinari

Line: 1 to 1
 

Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Line: 44 to 44
  ]

  • Bulk jobs sent both through ICE and JC and RetryCount = 0; :
Changed:
<
<
    • Submit a bulk of 3 jobs -> success ???% No
>
>
    • Submit a bulk of 3 jobs -> success 100% Yes / Done both to ICE and JC
 
    • Submit a bulk of 50 jobs -> success ???% No
    • Submit a bulk of 100 jobs -> success ???% No
    • Submit a bulk of 500 jobs -> success ???% No

Revision 8 - 2010-02-23 - AlessioGianelle

Line: 1 to 1
Changed:
<
<

PATCH 3621

>
>

Certification report patch 3621

 
Changed:
<
<

List Match

>
>
Author(s): Elisabetta Molinari & Alessio Gianelle

Outcome: in certification...

Clean installation

Upgrade from production

Test Report

List Match

 
  • without data: Yes / Done
  • with data: No
Changed:
<
<

Submission/GetOutput

>
>

Submission/GetOutput

 
  • Normal jobs through
    • ICE work: Yes / Done
    • JC work: Yes / Done
Deleted:
<
<
      • Sometimes the status of the job is not correctly computed due to a wrong WorkerNode SequenceCode:
        Event: ReallyRunning
        - Arrived                    =    Tue Feb 16 12:54:26 2010 CET
        - Host                       =    wms007.cnaf.infn.it
        - Level                      =    SYSTEM
        - Priority                   =    asynchronous
        - Seqcode                    =    UI=000000:NS=0000000005:WM=000004:BH=0000000000:JSS=000003:LM=000007:LRMS=000000:APP=000000:LBS=000000
        - Source                     =    LogMonitor
        - Src instance               =    unique
        - Timestamp                  =    Tue Feb 16 12:54:25 2010 CET
        - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
        - Wn seq                     =    UI=000000:NS=0000000000:WM=000000:BH=0000000000:JSS=000000:LM=000000:LRMS=000003:APP=000000:LBS=000000
This happens with jobs sent to cclcgceliXX.in2p3.fr:2119/jobmanager-bqs-xxx CEs
 
  • Dag jobs through:
    • JC work: Yes / Done OK
Deleted:
<
<
      • tested with the following:
        [
        Type = "dag";
        VirtualOrganisation = "dteam";
        Max_nodes_running = 10;
        InputSandbox = "test.sh";
        FuzzyRank = true;
        Nodes = [
        nodeA = [
        file= "test_dag.jdl";
        ];
        nodeB = [
        file= "test_dag.jdl";
        ];
        nodeC = [
        file= "test_dag.jdl";
        ];
        nodeD = [
        file= "test_dag.jdl";
        ];
        nodeE = [
        file= "test_dag.jdl";
        ];
        nodeF= [
        file= "test_dag.jdl";
        ];
        ];
        Dependencies = {
        {{nodeA, nodeB}, nodeC},{nodeD,nodeE.nodeF}
        }
        ]
 
  • Collection jobs through:
    • ICE work: Yes / Done
Line: 89 to 56
 
  • MPICH jobs: No
Changed:
<
<

Cancel

>
>

Cancel

 
  • Normal jobs
    • ICE: No
Line: 98 to 65
 
  • Collection: No
  • Node of a collection: No
Changed:
<
<

Others

>
>

Others

 
  • BrokerInfo
    • ICE creation No

Revision 7 - 2010-02-23 - ElisabettaMolinari

Line: 1 to 1
 

PATCH 3621

List Match

Line: 25 to 25
 This happens with jobs sent to cclcgceliXX.in2p3.fr:2119/jobmanager-bqs-xxx CEs

  • Dag jobs through:
Changed:
<
<
    • JC work: No FAILED
      • Failed with the following:
        glite-wms-job-status https://devel17.cnaf.infn.it:9000/Nf8jcFJuDKCKxB2eKuFo-w
         
        
        *************************************************************
        BOOKKEEPING INFORMATION:
        
        Status info for the Job : https://devel17.cnaf.infn.it:9000/Nf8jcFJuDKCKxB2eKuFo-w
        Current Status:     Done (Exit Code !=0)
        Exit code:          1
        Status Reason:      Warning: job exit code != 0
        Destination:        dagman
        Submitted:          Tue Feb 16 10:08:59 2010 CET
        *************************************************************
      • The "real" reason is: Unrecognized argument: -Condorlog. See also here.
>
>
    • JC work: Yes / Done OK
      • tested with the following:
        [
        Type = "dag";
        VirtualOrganisation = "dteam";
        Max_nodes_running = 10;
        InputSandbox = "test.sh";
        FuzzyRank = true;
        Nodes = [
        nodeA = [
        file= "test_dag.jdl";
        ];
        nodeB = [
        file= "test_dag.jdl";
        ];
        nodeC = [
        file= "test_dag.jdl";
        ];
        nodeD = [
        file= "test_dag.jdl";
        ];
        nodeE = [
        file= "test_dag.jdl";
        ];
        nodeF= [
        file= "test_dag.jdl";
        ];
        ];
        Dependencies = {
        {{nodeA, nodeB}, nodeC},{nodeD,nodeE.nodeF}
        }
        ]
 
  • Collection jobs through:
    • ICE work: Yes / Done
Line: 47 to 63
 
    • also job-output for collections works even though only the parent node is set to 'Cleared'

  • Parametric jobs through:
Changed:
<
<
    • ICE work: No
    • JC work: No
>
>
    • ICE work: Yes / Done
    • JC work: Yes / Done
 
      • tested with the following
         [
          JobType = "parametric";
          Executable = "/usr/bin/env";

Revision 6 - 2010-02-16 - AlessioGianelle

Line: 1 to 1
 

PATCH 3621

List Match

Line: 10 to 10
 
  • Normal jobs through
    • ICE work: Yes / Done
    • JC work: Yes / Done
Added:
>
>
      • Sometimes the status of the job is not correctly computed due to a wrong WorkerNode SequenceCode:
        Event: ReallyRunning
        - Arrived                    =    Tue Feb 16 12:54:26 2010 CET
        - Host                       =    wms007.cnaf.infn.it
        - Level                      =    SYSTEM
        - Priority                   =    asynchronous
        - Seqcode                    =    UI=000000:NS=0000000005:WM=000004:BH=0000000000:JSS=000003:LM=000007:LRMS=000000:APP=000000:LBS=000000
        - Source                     =    LogMonitor
        - Src instance               =    unique
        - Timestamp                  =    Tue Feb 16 12:54:25 2010 CET
        - User                       =    /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle/CN=proxy/CN=proxy
        - Wn seq                     =    UI=000000:NS=0000000000:WM=000000:BH=0000000000:JSS=000000:LM=000000:LRMS=000003:APP=000000:LBS=000000
This happens with jobs sent to cclcgceliXX.in2p3.fr:2119/jobmanager-bqs-xxx CEs
 
  • Dag jobs through:
    • JC work: No FAILED
Deleted:
<
<
      • tested with the following
        [
        Type = "dag";
        VirtualOrganisation = "dteam";
        Max_nodes_running = 10;
        InputSandbox = "test.sh";
        FuzzyRank = true;
        Nodes = [
        nodeA = [
        file= "test_dag.jdl";
        ];
        nodeB = [
        file= "test_dag.jdl";
        ];
        nodeC = [
        file= "test_dag.jdl";
        ];
        nodeD = [
        file= "test_dag.jdl";
        ];
        nodeE = [
        file= "test_dag.jdl";
        ];
        nodeF= [
        file= "test_dag.jdl";
        ];
        ];
        Dependencies = {
        {{nodeA, nodeB}, nodeC},{nodeD,nodeE.nodeF}
        }
        ]
 
    • Failed with the following:
      glite-wms-job-status https://devel17.cnaf.infn.it:9000/Nf8jcFJuDKCKxB2eKuFo-w
       
      
Line: 55 to 38
 Status Reason: Warning: job exit code != 0 Destination: dagman Submitted: Tue Feb 16 10:08:59 2010 CET
Changed:
<
<
***********************************************************
>
>
*************************************************************
      • The "real" reason is: Unrecognized argument: -Condorlog. See also here.
 
  • Collection jobs through:
    • ICE work: Yes / Done
    • JC work: Yes / Done
Changed:
<
<
    • also job-output for collections works even though only the parent node is set to 'Cleared'
>
>
    • also job-output for collections works even though only the parent node is set to 'Cleared'
 
  • Parametric jobs through:
    • ICE work: No
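
The `Seqcode` and `Wn seq` values in the ReallyRunning event quoted in this revision are colon-separated `NAME=counter` pairs, so they can be compared field by field. The sketch below only splits the two strings copied from that event and lists the counters that differ; the helper name `parse_seqcode` is hypothetical, and the sketch makes no assumption about which value the Logging and Bookkeeping service would consider correct.

```python
def parse_seqcode(seqcode):
    """Split 'UI=000000:NS=0000000005:...' into a dict such as {'UI': 0, 'NS': 5, ...}."""
    counters = {}
    for field in seqcode.split(":"):
        name, _, value = field.partition("=")
        counters[name] = int(value)
    return counters


# Strings copied verbatim from the event dump in the report.
seqcode = ("UI=000000:NS=0000000005:WM=000004:BH=0000000000:JSS=000003:"
           "LM=000007:LRMS=000000:APP=000000:LBS=000000")
wn_seq = ("UI=000000:NS=0000000000:WM=000000:BH=0000000000:JSS=000000:"
          "LM=000000:LRMS=000003:APP=000000:LBS=000000")

event = parse_seqcode(seqcode)
worker_node = parse_seqcode(wn_seq)

# Counters whose values differ between the event and the worker node code.
differing = {name: (event[name], worker_node[name])
             for name in event if event[name] != worker_node[name]}
print(differing)
# {'NS': (5, 0), 'WM': (4, 0), 'JSS': (3, 0), 'LM': (7, 0), 'LRMS': (0, 3)}
```

Printing the differing counters side by side may make it easier to spot the affected events when scanning the logging info of several jobs.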

Revision 5 - 2010-02-16 - ElisabettaMolinari

Line: 1 to 1
 

PATCH 3621

List Match

Line: 9 to 9
 
  • Normal jobs through
    • ICE work: Yes / Done
Changed:
<
<
    • JC work: No
>
>
    • JC work: Yes / Done
 
  • Dag jobs through:
Changed:
<
<
    • JC work: No
>
>
    • JC work: No FAILED
 
      • tested with the following
        [
        Type = "dag";
        VirtualOrganisation = "dteam";
Line: 43 to 43
 {{nodeA, nodeB}, nodeC},{nodeD,nodeE.nodeF} } ]
Added:
>
>
    • Failed with the following:
      glite-wms-job-status https://devel17.cnaf.infn.it:9000/Nf8jcFJuDKCKxB2eKuFo-w
       
      
      *************************************************************
      BOOKKEEPING INFORMATION:
      
      Status info for the Job : https://devel17.cnaf.infn.it:9000/Nf8jcFJuDKCKxB2eKuFo-w
      Current Status:     Done (Exit Code !=0)
      Exit code:          1
      Status Reason:      Warning: job exit code != 0
      Destination:        dagman
      Submitted:          Tue Feb 16 10:08:59 2010 CET
      *************************************************************
      
 
  • Collection jobs through:
Changed:
<
<
    • ICE work: No
    • JC work: No
>
>
    • ICE work: Yes / Done
    • JC work: Yes / Done
    • also job-output for collections works even though only the parent node is set to 'Cleared'
 
  • Parametric jobs through:
    • ICE work: No

Revision 4 - 2010-02-15 - ElisabettaMolinari

Line: 1 to 1
 

PATCH 3621

List Match

Line: 8 to 8
 

Submission/GetOutput

  • Normal jobs through
Changed:
<
<
    • ICE work: No
>
>
    • ICE work: Yes / Done
 
    • JC work: No

  • Dag jobs through:

Revision 3 - 2010-02-15 - AlessioGianelle

Line: 1 to 1
 

PATCH 3621

List Match

Line: 114 to 114
 
  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"
Changed:
<
<
>
>
  • Bug #61405: Missing proxy validity evaluation in ICE
  -- AlessioGianelle - 2010-02-05

Revision 2 - 2010-02-10 - ElisabettaMolinari

Line: 1 to 1
 

PATCH 3621

List Match

Changed:
<
<
  • without data: No
>
>
  • without data: Yes / Done
 
  • with data: No

Submission/GetOutput

Revision 1 - 2010-02-05 - AlessioGianelle

Line: 1 to 1
Added:
>
>

PATCH 3621

List Match

  • without data: No
  • with data: No

Submission/GetOutput

  • Normal jobs through
    • ICE work: No
    • JC work: No

  • Dag jobs through:
    • JC work: No
      • tested with the following
        [
        Type = "dag";
        VirtualOrganisation = "dteam";
        Max_nodes_running = 10;
        InputSandbox = "test.sh";
        FuzzyRank = true;
        Nodes = [
        nodeA = [
        file= "test_dag.jdl";
        ];
        nodeB = [
        file= "test_dag.jdl";
        ];
        nodeC = [
        file= "test_dag.jdl";
        ];
        nodeD = [
        file= "test_dag.jdl";
        ];
        nodeE = [
        file= "test_dag.jdl";
        ];
        nodeF= [
        file= "test_dag.jdl";
        ];
        ];
        Dependencies = {
        {{nodeA, nodeB}, nodeC},{nodeD,nodeE.nodeF}
        }
        ]

  • Collection jobs through:
    • ICE work: No
    • JC work: No

  • Parametric jobs through:
    • ICE work: No
    • JC work: No
      • tested with the following
         [
          JobType = "parametric";
          Executable = "/usr/bin/env";
          Environment = {"MYPATH_PARAM_=$PATH:/bin:/usr/bin:$HOME"};
          StdOutput = "echo_PARAM_.out";
          StdError = "echo_PARAM_.err";
          OutputSandbox = {"echo_PARAM_.out","echo_PARAM_.err"};
          Parameters =  5;
                usertags = [ jdl = "parametric" ];
         ]

  • Bulk jobs sent both through ICE and JC and RetryCount = 0; :
    • Submit a bulk of 3 jobs -> success ???% No
    • Submit a bulk of 50 jobs -> success ???% No
    • Submit a bulk of 100 jobs -> success ???% No
    • Submit a bulk of 500 jobs -> success ???% No
    • Submit a bulk of 1000 jobs -> success ???% No

  • Perusal jobs through:
    • JC work: No
    • ICE work: No

  • MPICH jobs: No
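
As an illustration of the parametric JDL above, the following sketch shows how the `_PARAM_` placeholder would expand into per-job file names. It assumes that an integer `Parameters = 5` with no explicit start or step yields the values 0 to 4; that default is an assumption to check against the JDL specification, and `expand_param` is a hypothetical helper, not part of any gLite tool.

```python
def expand_param(template, parameters, start=0, step=1):
    """Replace every _PARAM_ in the template with each generated parameter value.

    Assumption: values run from `start` up to, but not including, `parameters`.
    """
    return [template.replace("_PARAM_", str(value))
            for value in range(start, parameters, step)]


print(expand_param("echo_PARAM_.out", 5))
# ['echo0.out', 'echo1.out', 'echo2.out', 'echo3.out', 'echo4.out']
```

Under that assumption, each of the five generated jobs would return its own `echoN.out` and `echoN.err` pair in the output sandbox.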

Cancel

  • Normal jobs
    • ICE: No
    • JC: No
  • Dag: No
  • Collection: No
  • Node of a collection: No

Others

  • BrokerInfo
    • ICE creation No
    • JC creation: No
    • Verify all the glite-brokerinfo functions with the generated file No

  • Resubmission
    • Shallow: No
    • Deep: No

  • Job Recovery
    • Tested with a few collections re-starting the wm while some node jobs are still in a 'submitted' or 'waiting' status No

  • Prologue and Epilogue jobs
    • ICE: No
    • JC: No



Check bugs:

  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE

  • Bug #48910: Failure starting LM if its output jobdir doesn't exist; unprotected chown in WM/LM/JC startup scripts

  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"

-- AlessioGianelle - 2010-02-05

 