Difference: CreamTestsP3179 (1 vs. 50)

Revision 50 2010-03-10 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 407 to 407
 
    • temporarily rename the directory /opt/glite/var/cream_sandbox/<voname>
    • submit a job using a VOMS proxy issued by the given VO and verify that the job fails and that no directory /opt/glite/var/cream_sandbox/<voname> has been created.
Changed:
<
<
-- AlessioGianelle - 2010-02-05
>
>

Clean installation

 
Changed:
<
<
>
>
  • Installation steps:
    wget http://etics-repository.cern.ch:8080/repository/pm/registered/repomd/name/patch_3179/etics-registered-build-by-name.repo -O /etc/yum.repos.d/glite-CREAM.repo
    yum install xml-commons-apis
    yum install glite-CREAM
    wget http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/glite-TORQUE_utils.repo -O /etc/yum.repos.d/glite-TORQUE_utils.repo
    wget http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/glite-TORQUE_server.repo -O /etc/yum.repos.d/glite-TORQUE_server.repo
    yum install glite-TORQUE_utils glite-TORQUE_server
    /opt/glite/yaim/bin/yaim -c -s site-info.def -n creamCE -n TORQUE_server -n TORQUE_utils
    
  • View the log of yum for a clean installation
  • View the log of yaim for a clean installation (TORQUE is used)
 
Changed:
<
<
>
>

Upgrade from production

  • Upgrade steps:
    wget http://etics-repository.cern.ch:8080/repository/pm/registered/repomd/name/patch_3179/etics-registered-build-by-name.repo -O /etc/yum.repos.d/glite-CREAM.repo
    yum update
    /opt/glite/yaim/bin/yaim -c -s site-info.def -n creamCE -n TORQUE_server -n TORQUE_utils
    
  • View the log of yum for an upgrade
  • View the log of yaim for an upgrade (TORQUE is used)

-- AlessioGianelle - 2010-02-05

 
META FILEATTACHMENT attachment="reports_patch3179_01.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267618838" name="reports_patch3179_01.tar.gz" path="reports_patch3179_01.tar.gz" size="263482" stream="reports_patch3179_01.tar.gz" tmpFilename="/usr/tmp/CGItemp7655" user="PaoloAndreetto" version="1"
META FILEATTACHMENT attachment="reports_patch3179_02.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267701127" name="reports_patch3179_02.tar.gz" path="reports_patch3179_02.tar.gz" size="186358" stream="reports_patch3179_02.tar.gz" tmpFilename="/usr/tmp/CGItemp7389" user="PaoloAndreetto" version="1"
META FILEATTACHMENT attachment="ice.png" attr="" comment="WMS test" date="1268064941" name="ice.png" path="ice.png" size="4626" stream="ice.png" tmpFilename="/usr/tmp/CGItemp10483" user="AlessioGianelle" version="2"
Added:
>
>
META FILEATTACHMENT attachment="yum_installation_log.txt.gz" attr="" comment="Log from yum installation" date="1268228202" name="yum_installation_log.txt.gz" path="yum_installation_log.txt.gz" size="10295" stream="yum_installation_log.txt.gz" tmpFilename="/usr/tmp/CGItemp7897" user="PaoloAndreetto" version="1"
META FILEATTACHMENT attachment="yaim_installation_log.txt.gz" attr="" comment="Log from yaim installation" date="1268228230" name="yaim_installation_log.txt.gz" path="yaim_installation_log.txt.gz" size="20169" stream="yaim_installation_log.txt.gz" tmpFilename="/usr/tmp/CGItemp7830" user="PaoloAndreetto" version="1"
META FILEATTACHMENT attachment="yum_update_log.txt.gz" attr="" comment="Log from yum update" date="1268228254" name="yum_update_log.txt.gz" path="yum_update_log.txt.gz" size="2418" stream="yum_update_log.txt.gz" tmpFilename="/usr/tmp/CGItemp7844" user="PaoloAndreetto" version="1"
META FILEATTACHMENT attachment="yaim_update_log.txt.gz" attr="" comment="Log from yaim update" date="1268228280" name="yaim_update_log.txt.gz" path="yaim_update_log.txt.gz" size="16822" stream="yaim_update_log.txt.gz" tmpFilename="/usr/tmp/CGItemp7719" user="PaoloAndreetto" version="1"

Revision 49 2010-03-10 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 296 to 296
 
  • Bug #58423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials FIXED
    • tested using java-based UI
Added:
>
>
    • tested using the following JDL:
      [
      executable="/bin/ls";
      inputsandbox={"gsiftp://lxsgaravatto.pd.infn.it:6787/etc/fstab?DN=/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto"};
      stdoutput="out-gsi.out";
      stderror="err-gsi.err";
      outputsandbox={"out-gsi.out", "err-gsi.err"}
      outputsandboxbasedesturi="gsiftp://lxsgaravatto.pd.infn.it:6787/tmp?DN=/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto";
      ]
      
 
  • Bug #58659: NullPointerException from getStatus FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that the log of the testsuite does not report any NullPointerException
Line: 365 to 375
 
    • Not possible to reproduce it according to the developer (M. Mezzadri)

  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing FIXED
Changed:
<
<
    • send the following query to the CE BDII: ldapsearch -x -h $(hostname) -p 2170 -b o=glue and verify that it returns GLUE2 schema and information
>
>
 
  • Bug #62436: Possible problem with updater if job remain queued too long FIXED
    • Fixed as reported here: 3 jobs lasting 2 hours were submitted to a CREAM CE with only 2 job slots. For the third one the BNotifier logged the right events (i.e. it didn't log status=4 with failurereason=999)
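The GLUE 2.0 check quoted above for bug #62207 can be scripted along these lines. This is only a sketch: the helper name has_glue2_entries is hypothetical, and it assumes that GLUE 2.0 entries are recognisable by an objectClass value starting with "GLUE2".

```shell
#!/bin/sh
# Hypothetical helper: reads LDIF on stdin and succeeds if at least one
# objectClass value starts with "GLUE2" (assumed marker of GLUE 2.0 entries).
has_glue2_entries() {
    grep -Eiq '^objectClass:[[:space:]]*GLUE2'
}

# Intended use on the CE node (not executed here):
#   ldapsearch -x -h "$(hostname)" -p 2170 -b o=glue | has_glue2_entries \
#       && echo "GLUE2 information published" \
#       || echo "no GLUE2 entries found"
```

The helper only inspects the LDIF text, so it can also be run against a saved copy of the ldapsearch output.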

Revision 48 2010-03-10 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 295 to 295
 
    • verify with /opt/glite/libexec/glite-info-wrapper | grep -i gluecestatestatus

  • Bug #58423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials FIXED
Changed:
<
<
    • tested by the condor team, used java-based UI
>
>
    • tested using java-based UI
 
  • Bug #58659: NullPointerException from getStatus FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that the log of the testsuite does not report any NullPointerException
Line: 342 to 342
 
  • Bug #61407: Set CE_ID in the cream jw FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the CE_ID variable
Changed:
<
<
  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong INVALID
    • the structure of the file lcmaps-glexec.db has been changed so that it is compliant with the one used by gridftp
>
>
  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong FIXED
    • As reported in the bug, this was fixed by fixing bug #58941
 
  • Bug #61604: yaim-cream-ce should not install config_gip_software_plugin FIXED
    • verify that the glite-yaim-cream-ce package does not contain the file config_gip_software_plugin but it contains config_cream_gip_software_plugin instead
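The jobwrapper check described above for bug #61407 (the CE_ID variable) can be sketched as a small grep helper. The function name defines_var is hypothetical, and the sketch assumes the jobwrapper sets the variable with a plain shell assignment, optionally preceded by export; the path of the jobwrapper inside the sandbox area is not given in this page and must be supplied by the tester.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the script in $2 contains a line of the
# form NAME=<something> (optionally prefixed by "export") for the name in $1.
defines_var() {
    grep -Eq "^[[:space:]]*(export[[:space:]]+)?$1=.+" "$2"
}

# Intended use (the jobwrapper path is a placeholder):
#   defines_var CE_ID /path/to/jobwrapper && echo "CE_ID is defined"
```

The same helper can be pointed at other variable names when checking a jobwrapper.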

Revision 47 2010-03-10 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 294 to 294
 
  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus FIXED
    • verify with /opt/glite/libexec/glite-info-wrapper | grep -i gluecestatestatus
Changed:
<
<
  • Bug #58423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials HOPEFULLY FIXED
    • tested by the condor team
    • cannot test using the UI (see bug #59426)
>
>
  • Bug #58423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials FIXED
    • tested by the condor team, used java-based UI
 
  • Bug #58659: NullPointerException from getStatus FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that the log of the testsuite does not report any NullPointerException
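The log check for bug #58659 above amounts to searching the testsuite log for the exception name. A minimal sketch follows; the helper name count_npe is hypothetical and the location of the testsuite log depends on the log4py.conf in use, so it is passed as an argument.

```shell
#!/bin/sh
# Hypothetical helper: prints the number of lines in the log file $1 that
# mention NullPointerException.
count_npe() {
    # grep -c prints the count but exits with status 1 when it is zero,
    # so "|| true" keeps the helper usable under "set -e".
    grep -c 'NullPointerException' "$1" || true
}

# Intended use (the log path is a placeholder):
#   [ "$(count_npe /path/to/testsuite.log)" -eq 0 ] && echo "no NPE reported"
```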

Revision 44 2010-03-08 - AlessioGianelle

Line: 1 to 1
 

PATCH 3179

Line: 56 to 56
  ice.png
Deleted:
<
<
NOTE the WNs of two CEs are not able to log to LB.
 

Checked bugs

Line: 409 to 407
 
META FILEATTACHMENT attachment="reports_patch3179_01.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267618838" name="reports_patch3179_01.tar.gz" path="reports_patch3179_01.tar.gz" size="263482" stream="reports_patch3179_01.tar.gz" tmpFilename="/usr/tmp/CGItemp7655" user="PaoloAndreetto" version="1"
META FILEATTACHMENT attachment="reports_patch3179_02.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267701127" name="reports_patch3179_02.tar.gz" path="reports_patch3179_02.tar.gz" size="186358" stream="reports_patch3179_02.tar.gz" tmpFilename="/usr/tmp/CGItemp7389" user="PaoloAndreetto" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="ice.png" attr="" comment="WMS test" date="1268060894" name="ice.png" path="ice.png" size="4801" stream="ice.png" tmpFilename="/usr/tmp/CGItemp7995" user="AlessioGianelle" version="1"
>
>
META FILEATTACHMENT attachment="ice.png" attr="" comment="WMS test" date="1268064941" name="ice.png" path="ice.png" size="4626" stream="ice.png" tmpFilename="/usr/tmp/CGItemp10483" user="AlessioGianelle" version="2"

Revision 43 2010-03-08 - AlessioGianelle

Line: 1 to 1
 

PATCH 3179

Line: 54 to 54
 gsiftp://devel18.cnaf.infn.it:2811/var/glite/SandboxDir/P4/https_3a_2f_2fdevel15.cnaf.infn.it_3a9000_2fP4KWHXbaYyEAymru3kNugA/output/env.err): proxy expired All the errors occur in the old CEs (i.e. version 1.11).
Added:
>
>
ice.png
 
Added:
>
>
NOTE the WNs of two CEs are not able to log to LB.
 

Checked bugs

Line: 407 to 409
 
META FILEATTACHMENT attachment="reports_patch3179_01.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267618838" name="reports_patch3179_01.tar.gz" path="reports_patch3179_01.tar.gz" size="263482" stream="reports_patch3179_01.tar.gz" tmpFilename="/usr/tmp/CGItemp7655" user="PaoloAndreetto" version="1"
META FILEATTACHMENT attachment="reports_patch3179_02.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267701127" name="reports_patch3179_02.tar.gz" path="reports_patch3179_02.tar.gz" size="186358" stream="reports_patch3179_02.tar.gz" tmpFilename="/usr/tmp/CGItemp7389" user="PaoloAndreetto" version="1"
Added:
>
>
META FILEATTACHMENT attachment="ice.png" attr="" comment="WMS test" date="1268060894" name="ice.png" path="ice.png" size="4801" stream="ice.png" tmpFilename="/usr/tmp/CGItemp7995" user="AlessioGianelle" version="1"

Revision 42 2010-03-08 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 85 to 85
 
    • reconfigure the node with yaim and check whether the sandbox area is empty
Changed:
<
<
  • Bug #47070: [ yaim-cream ] yaim cream module should support remote mysql setup NOT TESTED
>
>
  • Bug #47070: [ yaim-cream ] yaim cream module should support remote mysql setup HOPEFULLY FIXED
 
  • Bug #47254: Possible problems if the proxy used to talk with CREAM is shorter than 10 minutes FIXED
    • create a voms-proxy whose lifetime is shorter than 10 minutes
Line: 295 to 295
 
  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus FIXED
    • verify with /opt/glite/libexec/glite-info-wrapper | grep -i gluecestatestatus
Changed:
<
<
  • Bug #58423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials NOT TESTED
>
>
  • Bug #58423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials HOPEFULLY FIXED
    • tested by the condor team
    • cannot test using the UI (see bug #59426)
 
  • Bug #58659: NullPointerException from getStatus FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that the log of the testsuite does not report any NullPointerException
Line: 318 to 320
 
    • Define the parameter pbs_spoolpath in the file /opt/glite/etc/blah.config
    • run the BUpdaterPBS daemon and verify its liveness
Changed:
<
<
  • Bug #59862: [ yaim-cream-ce ] broken -v functionality NOT TESTED
>
>
  • Bug #59862: [ yaim-cream-ce ] broken -v functionality FIXED
    • remove a mandatory variable from the site-info.def, for example JOB_MANAGER
    • run yaim configurator with option -v and verify that all the yaim functions are called.
 
  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation" CANNOT REPRODUCE
Line: 330 to 334
 
  • Bug #61322: CREAM jw doesn't set GLITE_WMS_RB_BROKERINFO FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the __brokerinfo variable
Changed:
<
<
  • Bug #61401: config_cream_blah and config_cream_clean don't take into account GLITE_LOCATION_LOG NOT TESTED
>
>
  • Bug #61401: config_cream_blah and config_cream_clean don't take into account GLITE_LOCATION_LOG FIXED
    • verify that the log files of blahp are saved into the directory specified by GLITE_LOCATION_LOG
 
Changed:
<
<
  • Bug #61402: [yaim-cream-ce] does not use GLITE_LOCATION_VAR/LOG is some cases. NOT TESTED
>
>
  • Bug #61402: [yaim-cream-ce] does not use GLITE_LOCATION_VAR/LOG is some cases FIXED
    • change the value of GLITE_LOCATION_VAR and GLITE_LOCATION_LOG and run the yaim configurator
    • verify that the new installation has been deployed into the new directory and that the log is written in the new location
 
  • Bug #61407: Set CE_ID in the cream jw FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the CE_ID variable
Changed:
<
<
  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong NOT TESTED
>
>
  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong INVALID
    • the structure of the file lcmaps-glexec.db has been changed so that it is compliant with the one used by gridftp
 
  • Bug #61604: yaim-cream-ce should not install config_gip_software_plugin FIXED
    • verify that the glite-yaim-cream-ce package does not contain the file config_gip_software_plugin but it contains config_cream_gip_software_plugin instead
Line: 387 to 395
 ] specifying an existing host and path first and verify that the job terminates successfully; the owner of the token must be the mapped-user.
    • submit the jdl above but specifying a fake host and/or path and verify that the job status reports 3 different failed attempts for taking the token:
      "/opt/edg/libexec/edg-gridftp-base-rm: error globus_ftp_client: the server responded with an error 500 500-Command failed : System error in unlink: No such file or directory 500-A system call failed: No such file or directory 500 End"
Changed:
<
<
  • Bug #63874: CREAM sandbox dir creation program should not attempt creation of parent directories. NOT TESTED
>
>
  • Bug #63874: CREAM sandbox dir creation program should not attempt creation of parent directories. FIXED
    • temporarily rename the directory /opt/glite/var/cream_sandbox/<voname>
    • submit a job using a VOMS proxy issued by the given VO and verify that the job fails and that no directory /opt/glite/var/cream_sandbox/<voname> has been created.
 -- AlessioGianelle - 2010-02-05

Revision 41 2010-03-08 - AlessioGianelle

Line: 1 to 1
Changed:
<
<

PATCH 3179

>
>
 
Added:
>
>

PATCH 3179

 
Changed:
<
<

Automatic tests:

>
>

Automatic tests

 
  • report #1:
    • CREAM UI version: 1.12.1; CREAM testsuite version: 1.0.7
Line: 18 to 19
  Since the current version of the CREAM CE does not enable CEMonitor for a standard installation, all the tests that make use of the notification mechanism have not been taken into account
Changed:
<
<

Checked bugs:

>
>

Test submission through a WMS (i.e. ICE)

Description:
  • 2880 collections each of 25 jobs
  • One collection every 60 seconds
  • Four users
  • We use these CEs located at Padua:
    • 6 CEs SL5/64b with cream version 1.12 (2 lsf + 4 torque)
    • 4 CEs SL4 with cream version 1.11 (2 lsf + 2 torque)
    • 11 CEs SL4 with cream version 1.12 (5 lsf + 6 torque)
  • Use automatic-delegation
  • The job is a "sleep random(7200)"
  • Resubmission is enabled
  • Use proxy renewal service (myproxy.cern.ch)
  • Lease mechanism is not used

Results

  • Collections correctly submitted: 2868 (71700 jobs)
    • DONE OK: 71663 (99.95%)
    • NOTDONE: 0 (0 %)
    • ABORTED: 0 (0 %)
    • CANCELLED: 37 (0.05 %) (stuck in TORQUE queues)
    • Resubmitted: 82 (0.11 %)

  • The 82 jobs were resubmitted because of an error like this one:
    Cannot move OSB (${globus_transfer_cmd} file:///tmp/CREAM954728532/env.err gsiftp://devel18.cnaf.infn.it:2811/var/glite/SandboxDir
    /P4/https_3a_2f_2fdevel15.cnaf.infn.it_3a9000_2fP4KWHXbaYyEAymru3kNugA/output/env.err): proxy expired; /opt/glite/bin/glite-lb-logevent: 
    edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent():  LB server (bkserver,lbproxy) store protocol error;; Logging library 
    ERROR:  LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error GSSAPI Error;; edg_wll_gss_connect();; GSS 
    Error: GSS failure occured: GSS Major Status: General failure  (GSS Minor Status Error Chain: globus_gsi_gssapi: Error with gss context globus_gsi_gssapi: 
    Error with GSI credential globus_gsi_gssapi: Error with gss credential handle globus_credential: Error with credential: The proxy credential: /home/dteam002
    /home_cre34_954728532/cre34_954728532.proxy       with subject: /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Luigi Zangrando/CN=proxy
    /CN=proxy/CN=proxy/CN=limited proxy       expired 42 minutes ago.  )) Cannot move OSB (${globus_transfer_cmd} file:///tmp/CREAM954728532/env.err 
    gsiftp://devel18.cnaf.infn.it:2811/var/glite/SandboxDir/P4/https_3a_2f_2fdevel15.cnaf.infn.it_3a9000_2fP4KWHXbaYyEAymru3kNugA/output/env.err): proxy 
    expired
    All the errors occur in the old CEs (i.e. version 1.11).
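The figures in the results above can be re-derived with a few lines of shell arithmetic. The sketch below uses only the counts reported in this page; the helper name pct is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints n as a percentage of total, rounded to 2 decimals.
pct() {
    awk -v n="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * n / total }'
}

# 2868 collections of 25 jobs each
total=$((2868 * 25))
echo "jobs submitted: $total"                      # 71700
echo "DONE OK:     $(pct 71663 "$total") %"        # 99.95
echo "CANCELLED:   $(pct 37 "$total") %"           # 0.05
echo "Resubmitted: $(pct 82 "$total") %"           # 0.11
```

DONE OK and CANCELLED add up to the full 71700 jobs (71663 + 37), which is consistent with NOTDONE and ABORTED both being 0.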

Checked bugs

 

  • Bug #37430: BLParser should properly filter it's log output FIXED

Revision 40 2010-03-08 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 53 to 53
 
    • create a voms-proxy whose lifetime is shorter than 10 minutes
    • submit a simple job whose lifetime is shorter than the voms-proxy one and verify its correct termination
Changed:
<
<
  • Bug #47804: Possible problems configuring blah in CREAM-CE for LSF NOT TESTED
>
>
  • Bug #47804: Possible problems configuring blah in CREAM-CE for LSF FIXED
    • copy the file profile.lsf from the LSF configuration directory into a new destination, for example /tmp
    • define in the site-info.def the variable LSFPROFILE_DIR=/tmp and reconfigure with yaim
    • verify that in the file /opt/glite/etc/blah.conf the profile is loaded from the new path
 
  • Bug #48786: Load should be one of the parameter of DISABLE_SUBMISSION_POLICY in CREAM FIXED
    • specify a low load level in the file /opt/glite/bin/glite_cream_load_monitor
Line: 130 to 132
 
  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid FIXED
    • stress the renewal mechanism with a single short delegated proxy, for example with the following test: cream-test-monitored-submit -r 30 -n 2000 -m 2000 -C 50 --sotimeout 60 -j long.jdl -R <ce_id> --vo <vo_name> --valid <00:20>
Changed:
<
<
  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging NOT TESTED
>
>
  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging FIXED
    • install from scratch, submit a job and verify that the operation has been logged into the syslog.
    • define the following variables in the site-info.def:
      GLEXEC_CREAM_LOG_DESTINATION=file
      GLEXEC_CREAM_LOG_DIR=/tmp/tests
      
      run yaim again, submit a new job and verify that the log is written into the specified directory.
 
  • Bug #52050: misleading error message "The problem seems to be related to glexec FIXED
    • The CREAM service does not make use of glexec anymore, and therefore this error message can't appear anymore
Line: 150 to 157
 
    • submit a job specifying a simple CE requirements (e.g. cerequirements="other.GlueHostMainMemoryRAMSize > 2000")
    • verify that, after the execution of the job, in the tmp directory no files ce-req-file-* are left
Deleted:
<
<
  • Bug #52577: [ yaim-cream-ce ] create CREAM_GLEXEC_USER_HOME variable NOT TESTED
 
  • Bug #52651: CREAM file descriptor overuse FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
    • seek "too many open files" in the CREAM log
Line: 261 to 266
 
    • temporarily rename the directory /opt/glite/var/cream_sandbox without turning off the service
    • submit a job and verify that the failure reports "cannot create the job's working directory!"
Changed:
<
<
  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized NOT TESTED
>
>
  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized FIXED
    • verify that the file /opt/glite/etc/lcmaps/lcmaps.db is compliant with the one attached to the bug.
 
  • Bug #59005: Possible problem with hold/resumed jobs in BUpdaterLSF FIXED
    • Verified as reported here
Line: 320 to 326
 
  • Bug #62436: Possible problem with updater if job remain queued too long FIXED
    • Fixed as reported here: 3 jobs lasting 2 hours were submitted to a CREAM CE with only 2 job slots. For the third one the BNotifier logged the right events (i.e. it didn't log status=4 with failurereason=999)
Changed:
<
<
  • Bug #62565: yaim-cream-ce requires BLPARSER_HOST even if the new blparser has to be configured NOT TESTED
>
>
  • Bug #62565: yaim-cream-ce requires BLPARSER_HOST even if the new blparser has to be configured FIXED
    • Install the CE node from scratch removing the BLPARSER_HOST definition from the site-info.def and defining BLPARSER_WITH_UPDATER_NOTIFIER=true.
    • verify that the yaim log does not report any error concerning the variable above and that the BNotifier and the BUpdater run correctly.
 
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group FIXED
    • Install the CE node from scratch and verify the following permissions:

Revision 39 2010-03-05 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 252 to 252
 
  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus FIXED
    • verify with /opt/glite/libexec/glite-info-wrapper | grep -i gluecestatestatus
Changed:
<
<
  • Bug #59423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials NOT TESTED
>
>
  • Bug #58423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials NOT TESTED
 
  • Bug #58659: NullPointerException from getStatus FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that the log of the testsuite does not report any NullPointerException
Line: 274 to 274
 
    • Define the parameter pbs_spoolpath in the file /opt/glite/etc/blah.config
    • run the BUpdaterPBS daemon and verify its liveness
Added:
>
>
  • Bug #59862: [ yaim-cream-ce ] broken -v functionality NOT TESTED
 
  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation" CANNOT REPRODUCE

  • Bug #60831: Error log message: "CREAM_JOB_SENSOR_HOST parameter not specified" FIXED

Revision 38 2010-03-05 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 284 to 284
 
  • Bug #61322: CREAM jw doesn't set GLITE_WMS_RB_BROKERINFO FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the __brokerinfo variable
Added:
>
>
  • Bug #61401: config_cream_blah and config_cream_clean don't take into account GLITE_LOCATION_LOG NOT TESTED

  • Bug #61402: [yaim-cream-ce] does not use GLITE_LOCATION_VAR/LOG is some cases. NOT TESTED
 
  • Bug #61407: Set CE_ID in the cream jw FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the CE_ID variable
Line: 314 to 318
 
  • Bug #62436: Possible problem with updater if job remain queued too long FIXED
    • Fixed as reported here: 3 jobs lasting 2 hours were submitted to a CREAM CE with only 2 job slots. For the third one the BNotifier logged the right events (i.e. it didn't log status=4 with failurereason=999)
Changed:
<
<
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group %green%FIXED
>
>
  • Bug #62565: yaim-cream-ce requires BLPARSER_HOST even if the new blparser has to be configured NOT TESTED

  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group FIXED
 
    • Install the CE node from scratch and verify the following permissions:
      -r--r----- 1 root glexec   535 Mar  1 10:47 /opt/glite/etc/glexec.conf
      -r-sr-sr-x 1 root glexec 79792 Jun 11  2009 /opt/glite/sbin/glexec
Line: 333 to 339
 ] specifying an existing host and path first and verify that the job terminates successfully; the owner of the token must be the mapped-user.
    • submit the jdl above but specifying a fake host and/or path and verify that the job status reports 3 different failed attempts for taking the token:
      "/opt/edg/libexec/edg-gridftp-base-rm: error globus_ftp_client: the server responded with an error 500 500-Command failed : System error in unlink: No such file or directory 500-A system call failed: No such file or directory 500 End"
Changed:
<
<
>
>
  • Bug #63874: CREAM sandbox dir creation program should not attempt creation of parent directories. NOT TESTED
 -- AlessioGianelle - 2010-02-05

Revision 37 2010-03-04 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 14 to 14
 
    • CREAM UI version: 1.11.1; CREAM testsuite version: 1.0.6
    • used direct polling for monitoring and BLParser for status change detection
    • Batch system: TORQUE
Added:
>
>
    • All the tests complete successfully, view the reports
  Since the current version of the CREAM CE does not enable CEMonitor for a standard installation, all the tests that make use of the notification mechanism have not been taken into account
Line: 313 to 314
 
  • Bug #62436: Possible problem with updater if job remain queued too long FIXED
    • Fixed as reported here: 3 jobs lasting 2 hours were submitted to a CREAM CE with only 2 job slots. For the third one the BNotifier logged the right events (i.e. it didn't log status=4 with failurereason=999)
Changed:
<
<
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group NOT TESTED
>
>
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group %green%FIXED
    • Install the CE node from scratch and verify the following permissions:
      -r--r----- 1 root glexec   535 Mar  1 10:47 /opt/glite/etc/glexec.conf
      -r-sr-sr-x 1 root glexec 79792 Jun 11  2009 /opt/glite/sbin/glexec
      
    • verify that the glexec.conf file contains the property: "user_white_list = tomcat"
 
  • Bug #62893: Possible proxy renewal problem in the CREAM jw FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R <ceID> --sotimeout 60 --vo dteam --valid 00:30
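The permission and whitelist checks listed above for bug #62776 can be sketched as two small helpers. The function names are hypothetical, and the sketch assumes GNU coreutils stat (the -c option); in octal, the listing -r--r----- corresponds to mode 440 and -r-sr-sr-x to mode 6555.

```shell
#!/bin/sh
# Hypothetical helper: prints "<octal mode> <group name>" for the file in $1.
# Assumes GNU coreutils stat; BSD stat uses different options.
mode_and_group() {
    stat -c '%a %G' "$1"
}

# Hypothetical helper: succeeds if the file in $2 contains the property
# "user_white_list = <user>" for the user in $1 (whitespace is tolerated).
has_whitelist_user() {
    grep -Eq "^[[:space:]]*user_white_list[[:space:]]*=[[:space:]]*$1[[:space:]]*\$" "$2"
}

# Intended use on the CE node (not executed here):
#   [ "$(mode_and_group /opt/glite/etc/glexec.conf)" = "440 glexec" ]
#   [ "$(mode_and_group /opt/glite/sbin/glexec)" = "6555 glexec" ]
#   has_whitelist_user tomcat /opt/glite/etc/glexec.conf
```

The helpers check the mode and group only; the root ownership shown in the listing would need an additional %U field.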
Line: 332 to 338
 
Added:
>
>
 
META FILEATTACHMENT attachment="reports_patch3179_01.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267618838" name="reports_patch3179_01.tar.gz" path="reports_patch3179_01.tar.gz" size="263482" stream="reports_patch3179_01.tar.gz" tmpFilename="/usr/tmp/CGItemp7655" user="PaoloAndreetto" version="1"
Added:
>
>
META FILEATTACHMENT attachment="reports_patch3179_02.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267701127" name="reports_patch3179_02.tar.gz" path="reports_patch3179_02.tar.gz" size="186358" stream="reports_patch3179_02.tar.gz" tmpFilename="/usr/tmp/CGItemp7389" user="PaoloAndreetto" version="1"

Revision 36 2010-03-03 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 155 to 155
 
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
    • seek "too many open files" in the CREAM log
Changed:
<
<
  • Bug #52719: Blah doesn't set the 'executable' flag if a local jobwrapper is found NOT TESTED
>
>
  • Bug #52719: Blah doesn't set the 'executable' flag if a local jobwrapper is found FIXED
    • Submitted a job to a CREAM CE
    • Checked the BLAH wrapper: the chmod u+x of the CREAM JobWrapper is done in all cases (even if the job is going to be run on the WN via a local jobwrapper)
 
  • Bug #52942: Missing description for ISB/OSB error in jobwrapper FIXED
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter

Revision 35 2010-03-03 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Automatic tests:

Added:
>
>
  • report #1:
    • CREAM UI version: 1.12.1; CREAM testsuite version: 1.0.7
    • Used event query for monitoring and BUpdater/BNotifier for status change detection
    • Batch system: LSF
    • All the tests complete successfully, view the reports

  • report #2
    • CREAM UI version: 1.11.1; CREAM testsuite version: 1.0.6
    • used direct polling for monitoring and BLParser for status change detection
    • Batch system: TORQUE
 
Added:
>
>
Since the current version of the CREAM CE does not enable CEMonitor for a standard installation, all the tests that make use of the notification mechanism have not been taken into account
 

Checked bugs:

Line: 316 to 327
 

-- AlessioGianelle - 2010-02-05

Added:
>
>

META FILEATTACHMENT attachment="reports_patch3179_01.tar.gz" attr="" comment="Testsuite reports for patch 3179" date="1267618838" name="reports_patch3179_01.tar.gz" path="reports_patch3179_01.tar.gz" size="263482" stream="reports_patch3179_01.tar.gz" tmpFilename="/usr/tmp/CGItemp7655" user="PaoloAndreetto" version="1"

Revision 34 2010-03-02 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 188 to 188
 
    • verify that the sandbox directory of that job has been removed from /opt/glite/var/cream_sandbox
    • remove manually the job from the batch system and reconnect all the WN
Changed:
<
<
  • Bug #55438: BUpdater problems in updating job state with AssignFinalState for all batch system NOT TESTED
>
>
  • Bug #55438: BUpdater problems in updating job state with AssignFinalState for all batch system FIXED
    • Submitted 3 jobs lasting 2 hours to a CREAM CE with only 2 job slots.
    • For all the jobs the right events were logged by the bnotifier (i.e. it didn't log status=4 with failurereason=999)
 

Revision 33 2010-03-02 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 46 to 46
 
  • Bug #48786: Load should be one of the parameter of DISABLE_SUBMISSION_POLICY in CREAM FIXED
    • specify a low load level in the file /opt/glite/bin/glite_cream_load_monitor
Changed:
<
<
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
>
>
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R <ceID> --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
 
  • Bug #49497: user proxies on CREAM do not get cleaned up FIXED
    • delegate a proxy whose lifetime is shorter than the parameter delegation_purge_rate of the CREAM configuration file
Line: 300 to 300
 
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group NOT TESTED
Changed:
<
<
  • Bug #62893: Possible proxy renewal problem in the CREAM jw NOT TESTED
>
>
  • Bug #62893: Possible proxy renewal problem in the CREAM jw FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R <ceID> --sotimeout 60 --vo dteam --valid 00:30
    • verify that no proxy related issues occur
 
  • Bug #63398: CREAM jw: removal of token should be retried in case of failure FIXED
    • submit the following jdl:
      [

Revision 322010-03-02 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 150 to 150
 
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
    • verify that the output of the glite-ce-job-status contains the full description of the failure
Changed:
<
<
  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE NOT TESTED
>
>
  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE FIXED
    • run the "monitored" part of the testsuite; the latest version of the testsuite uses the "event query" mechanism to keep track of the job status.
 
  • Bug #53499: CREAM job wrapper template should be put outside the jar FIXED
    • check whether the file /opt/glite/share/webapps/ce-cream.war contains the file WEB-INF/jobwrapper.tpl
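The check for Bug #53499 can be scripted. The sketch below is only an illustration: the function name war_has_jobwrapper is hypothetical (it is not part of CREAM), and it assumes that an archive listing, for instance the output of unzip -l, is piped into it.

```shell
# Hypothetical helper, not part of CREAM: reads an archive listing on stdin
# and prints "present" or "absent" for the job wrapper template entry.
war_has_jobwrapper() {
    if grep -q 'WEB-INF/jobwrapper\.tpl'; then
        echo "present"
    else
        echo "absent"
    fi
}

# Example (assumes the unzip utility is installed on the CE):
#   unzip -l /opt/glite/share/webapps/ce-cream.war | war_has_jobwrapper
```

Reading the listing from stdin keeps the helper independent of the tool used to list the archive.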

Revision 312010-03-02 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 169 to 169
 
  • Bug #54900: [ glite-yaim-cream-ce ] config_cream_tomcat_user should not add tomcat to VO FIXED
    • check the membership of any VO group
Changed:
<
<
  • Bug #54949: Some job can remain in running state when BLParser is restarted for both lsf and pbs NOT TESTED
>
>
  • Bug #54949: Some job can remain in running state when BLParser is restarted for both lsf and pbs HOPEFULLY FIXED
    • Not easy to reproduce
    • Submitted several jobs (logged in different batch system log files) to a CREAM CE configured with the old blparser
    • Restarted CREAM
    • Didn't notice problems in getting the status of these jobs
 
  • Bug #55078: Possible final state not considered in BLParserPBS and BUpdaterPBS CANNOT REPRODUCE
    • To test the fix it would be necessary to have a scenario for which in the Torque log file for a certain job the event "Job Run..." is followed by the event "dequeuing from"

Revision 302010-03-02 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 8 to 8
 

Checked bugs:

Deleted:
<
<
  • Bug #17949: BLAH should operate with no access whatsoever to the batch system logs NOT TESTED
 
  • Bug #37430: BLParser should properly filter it's log output FIXED
    • Not too clear what the fix is supposed to be
Line: 151 to 150
 
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
    • verify that the output of the glite-ce-job-status contains the full description of the failure
Deleted:
<
<
  • Bug #53124: blparser_master could crash if some variable in blparser.conf are not set NOT TESTED
 
  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE NOT TESTED

  • Bug #53499: CREAM job wrapper template should be put outside the jar FIXED

Revision 292010-03-01 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 220 to 220
 
    • Not possible to test the fix since we don't have CREAM based CEs with Condor as batch system
Changed:
<
<
  • Bug #57820: [yaim-cream-ce] CREAM-CE publishes GlueServiceDataValue incomplete NOT TESTED
>
>
  • Bug #57820: [yaim-cream-ce] CREAM-CE publishes GlueServiceDataValue incomplete FIXED
    • run the infoprovider: /opt/glite/etc/gip/provider/glite-info-provider-service-cream-wrapper | grep GlueServiceDataValue
    • verify that 3 different values are returned for the GlueServiceDataValue: the version, the DN and the host name of the CE
 
  • Bug #58103: Cream database Query performance FIXED
    • Internal improvement
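The verification for Bug #57820 (3 different values for GlueServiceDataValue) could be automated along these lines. This is a minimal sketch: the function name count_service_data_values is hypothetical, and it assumes the infoprovider prints LDIF-style lines of the form "GlueServiceDataValue: <value>".

```shell
# Hypothetical helper: counts the distinct GlueServiceDataValue lines read on stdin.
# Assumes LDIF-style "GlueServiceDataValue: <value>" lines (an assumption about
# the provider output, not confirmed by this page).
count_service_data_values() {
    grep -i '^GlueServiceDataValue:' | sort -u | wc -l | tr -d ' '
}

# Example; the expected count for the fixed CE is 3 (version, DN, host name):
#   /opt/glite/etc/gip/provider/glite-info-provider-service-cream-wrapper | count_service_data_values
```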
Line: 229 to 231
 
  • Bug #58109: Wrong value for the "service version" property FIXED
    • verify the property using the command glite-ce-service-info
Changed:
<
<
  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus NOT TESTED
>
>
  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus FIXED
    • verify with /opt/glite/libexec/glite-info-wrapper | grep -i gluecestatestatus
 
  • Bug #59423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials NOT TESTED
Line: 287 to 290
 
  • Bug #62070: Possible problem with notification time in BNotifier HOPEFULLY FIXED
    • Not possible to reproduce it according to the developer (M. Mezzadri)
Changed:
<
<
  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing NOT TESTED
>
>
  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing FIXED
    • send the following query to the CE BDII: ldapsearch -x -h $(hostname) -p 2170 -b o=glue and verify that it returns GLUE2 schema and information
 
  • Bug #62436: Possible problem with updater if job remain queued too long FIXED
    • Fixed as reported here: 3 jobs lasting 2 hours were submitted to a CREAM CE with only 2 job slots. For the third one the BNotifier logged the right events (i.e. it didn't log status=4 with failurereason=999)
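For Bug #62207, the check that the CE BDII returns GLUE2 information can be reduced to a pass/fail test. The sketch below uses a hypothetical function name, has_glue2_entries, and assumes that GLUE 2.0 attributes appear in the LDIF output with the GLUE2 prefix.

```shell
# Hypothetical helper: succeeds if the LDIF read on stdin contains at least one
# attribute line starting with "GLUE2" (case-insensitive).
has_glue2_entries() {
    grep -qi '^GLUE2'
}

# Example, using the query given for Bug #62207:
#   ldapsearch -x -h $(hostname) -p 2170 -b o=glue | has_glue2_entries && echo "GLUE2 published"
```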
Line: 296 to 300
 
  • Bug #62893: Possible proxy renewal problem in the CREAM jw NOT TESTED
Changed:
<
<
  • Bug #63398: CREAM jw: removal of token should be retried in case of failure NOT TESTED
>
>
  • Bug #63398: CREAM jw: removal of token should be retried in case of failure FIXED
    • submit the following jdl:
      [
      environment= {"__token_file=gsiftp://host/path"};
      executable="/bin/sleep";
      arguments="30";
      ]
      specifying an existing host and path first, and verify that the job terminates successfully; the owner of the token must be the mapped user.
    • submit the jdl above, this time specifying a fake host and/or path, and verify that the job status reports 3 different failed attempts for taking the token:
      "/opt/edg/libexec/edg-gridftp-base-rm: error globus_ftp_client: the server responded with an error 500 500-Command failed : System error in unlink: No such file or directory 500-A system call failed: No such file or directory 500 End"
  -- AlessioGianelle - 2010-02-05
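The two submissions described for Bug #63398 differ only in the token URL, so the JDL can be generated from a template. This is a sketch under stated assumptions: the function name make_token_jdl and the output file name token.jdl are hypothetical, and the attribute values are the ones quoted in the test description.

```shell
# Hypothetical helper: prints the test JDL for the token URL given as $1.
make_token_jdl() {
    token_url=$1
    cat <<EOF
[
environment = {"__token_file=${token_url}"};
executable = "/bin/sleep";
arguments = "30";
]
EOF
}

# Example: first with an existing host and path, then with a fake one.
#   make_token_jdl gsiftp://host/path > token.jdl
# The resulting file is then submitted with the usual CREAM CLI.
```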

Revision 282010-02-26 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 16 to 16
 
    • Verified in the old blparser log file

  • Bug #45364: BLAH_JOB_CANCEL should report failure reason FIXED
Changed:
<
<
    • submit and cancel a job using the LRMS command (e.g. qdel) and verify that the reason reports "Cancelled by CE admin"
>
>
    • submit a job to CREAM and then cancel it using the LRMS command (e.g. qdel). Before the blparser (and therefore CREAM) realizes that the job was cancelled, issue a glite-ce-job-cancel.
    • Issue a glite-ce-job-status -L 2. For the cancel command a failure (along with its reason) should be reported, such as:

   *** Command Name              = [JOB_CANCEL]
       Command Category          = [JOB_MANAGEMENT]
       Command Status            = [ERROR]
       Command Fail Reason       = [qdel: Unknown Job Id 45299.cream-38.pd.infn.it]
       Creation Time             = [Fri 26 Feb 2010 18:43:27] (1267206207)
       Start Scheduling Time     = [Fri 26 Feb 2010 18:43:27] (1267206207)
       Start Processing Time     = [Fri 26 Feb 2010 18:43:27] (1267206207)
       Execution Completed Time  = [Fri 26 Feb 2010 18:43:30] (1267206210)
 
  • Bug #46419: CREAM sandbox area should be scratched when the CREAM DB is scratched FIXED
    • Submit at least one job to the CE and wait for its termination, so that the sandbox area is not empty
Changed:
<
<
    • Increment the value of the parameters creamdb_database_version and/or delegationdb_database_version in the file /opt/glite/etc/glite-ce-cream/cream-config.xml.template
>
>
    • Increment the value of the parameter creamdb_database_version in the file /opt/glite/etc/glite-ce-cream/cream-config.xml.template
 
    • reconfigure the node with yaim and check whether the sandbox area is empty
Line: 57 to 69
 
    • verify that the failure reason reported by the job status contains the message: Problem to detect the lifetime of the proxy

  • Bug #51046: CREAM: DelegProxyInfo info sometimes is wrong FIXED
Changed:
<
<
    • submit a job, wait for its termination and verify the correct lifetime of the proxy
>
>
    • submit a job, wait for its termination and verify the correct lifetime of the proxy in the glite-ce-job-status output
 
  • Bug #51118: config_cream_glexec doesn't set glexec permissions right FIXED
    • install a CE node from scratch and verify the permissions for /opt/glite/sbin/glexec (6555) and /opt/glite/etc/glexec.conf (640)
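The permission check for Bug #51118 can be expressed as a small script. The following is a sketch only: check_mode is a hypothetical helper name, and it relies on the GNU coreutils form of stat (stat -c '%a'), which is an assumption about the platform of the CE node.

```shell
# Hypothetical helper: compares the octal mode of a file with an expected value.
# $1 = path, $2 = expected octal mode (e.g. 640).
# Assumes GNU stat; other platforms use a different option syntax.
check_mode() {
    actual=$(stat -c '%a' "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK $1 $actual"
    else
        echo "MISMATCH $1 expected=$2 actual=$actual"
        return 1
    fi
}

# Example, with the modes given for Bug #51118:
#   check_mode /opt/glite/sbin/glexec 6555
#   check_mode /opt/glite/etc/glexec.conf 640
```

The non-zero return code on a mismatch makes the helper usable inside a larger test script.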
Line: 87 to 99
 
    • check the content of glite-ce-cream rpm

  • Bug #51706: yaim-cream-ce: remove "lcg" prefix from JOB_MANAGER FIXED
Changed:
<
<
    • change the value of JOB_MANAGER in the siteinfo.def
    • configure the node with YAIM and verify that the resource BDII publishes this new value in the GlueCeUniqueids
>
>
    • change the value of JOB_MANAGER in site-info.def, e.g. from lsf to lcglsf
    • configure the node with YAIM and verify that in the resource BDII the string lsf (and not lcglsf) appears in the GlueCEUniqueIDs
 
  • Bug #51892: Exception when using java.text.DateFormat.parse FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
Line: 109 to 121
 
  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging NOT TESTED
Changed:
<
<
  • Bug #52050: misleading error message "The problem seems to be related to glexec INVALID
    • The CREAM service does not make use of glexec anymore
>
>
  • Bug #52050: misleading error message "The problem seems to be related to glexec FIXED
    • The CREAM service does not make use of glexec anymore, and therefore this error message can't appear anymore
 

  • Bug #52051: CEMon must remove all expired subscriptions on start-up FIXED
Line: 144 to 156
 
  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE NOT TESTED

  • Bug #53499: CREAM job wrapper template should be put outside the jar FIXED
Changed:
<
<
    • check wheter the file /opt/glite/share/webapps/ce-cream.war contains the file WEB-INF/jobwrapper.tpl
>
>
    • check whether the file /opt/glite/share/webapps/ce-cream.war contains the file WEB-INF/jobwrapper.tpl
 
  • Bug #54812: lsf_submit.sh job requirement FIXED
    • Created (and chmoded +x) the file /opt/glite/bin/lsf_local_submit_attributes.sh on the CREAM CE with the following content:
Line: 162 to 174
 
  • Bug #54949: Some job can remain in running state when BLParser is restarted for both lsf and pbs NOT TESTED
Changed:
<
<
  • Bug #55078: Possible final state not considered in BLParserPBS and BUpdaterPBS NOT TESTED
>
>
  • Bug #55078: Possible final state not considered in BLParserPBS and BUpdaterPBS CANNOT REPRODUCE
    • To test the fix it would be necessary to have a scenario for which in the Torque log file for a certain job the event "Job Run..." is followed by the event "dequeuing from"
    • Not able to reproduce such a scenario
 
  • Bug #55420: Allow admin to purge CREAM jobs in a non terminal status FIXED
    • temporarily disconnect any WN from the CE, e.g. by shutting down the mom server in a TORQUE installation

Revision 272010-02-26 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 10 to 10
 
  • Bug #17949: BLAH should operate with no access whatsoever to the batch system logs NOT TESTED
Changed:
<
<
  • Bug #37430: BLParser should properly filter it's log output NOT TESTED
>
>
  • Bug #37430: BLParser should properly filter it's log output FIXED
 
    • Not too clear what the fix is supposed to be
    • According to the developer (M. Mezzadri) the command received by the old blparser from CREAM should be reported in the blparser log file without an extra new-line
    • Verified in the old blparser log file
Line: 146 to 146
 
  • Bug #53499: CREAM job wrapper template should be put outside the jar FIXED
    • check wheter the file /opt/glite/share/webapps/ce-cream.war contains the file WEB-INF/jobwrapper.tpl
Changed:
<
<
  • Bug #54812: lsf_submit.sh job requirement NOT TESTED
>
>
  • Bug #54812: lsf_submit.sh job requirement FIXED
    • Created (and chmoded +x) the file /opt/glite/bin/lsf_local_submit_attributes.sh on the CREAM CE with the following content:

                     #!/bin/sh
                     echo "BSUB -n 2"

    • Submitted a job to that CE, without specifying in the JDL the cerequirements attribute
    • Checked (via bjobs -l) that the -n 2 directive was used (which means that the lsf_local_submit_attributes.sh was run)
 
  • Bug #54900: [ glite-yaim-cream-ce ] config_cream_tomcat_user should not add tomcat to VO FIXED
    • check the membership of any VO group
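The setup step for Bug #54812 above (creating the local submit attributes file and making it executable) can be sketched as follows. The function name write_local_submit_attributes is hypothetical; the file content is the one quoted in the test description, reproduced unchanged.

```shell
# Hypothetical helper: writes the local submit attributes script to the path
# given as $1 and makes it executable.
write_local_submit_attributes() {
    dest=$1
    cat > "$dest" <<'EOF'
#!/bin/sh
echo "BSUB -n 2"
EOF
    chmod +x "$dest"
}

# Example, on the CREAM CE:
#   write_local_submit_attributes /opt/glite/bin/lsf_local_submit_attributes.sh
```

After a job has been submitted without the cerequirements attribute, bjobs -l should show that the -n 2 directive was applied.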

Revision 262010-02-26 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 70 to 70
 
    • install a CE node from scratch
    • verify the existence of the files: /opt/glite/etc/lcas/lcas-glexec.db and /opt/glite/etc/lcmaps/lcmaps-glexec.db
Changed:
<
<
* Bug #51249: [ yaim-cream-ce ] refactor config_cream_db NOT TESTED * Bug #51249: [ yaim-cream-ce ] refactor config_cream_db FIXED * Install the node from scratch and verify all the basic operations of the CREAM service
>
>
  • Bug #51249: [ yaim-cream-ce ] refactor config_cream_db FIXED
    • Install the node from scratch and verify all the basic operations of the CREAM service
 
  • Bug #51310: Wrong event timestamp FIXED
    • run the consumer server (glite-ce-monitor-consumer) on the client machine
Line: 105 to 104
 
    • when all the jobs have been submitted restart the service and verify the startup time.
    • verify in the CREAM and BLAHP logs that the jobs are checked one by one at startup, instead of polling all jobs from a given timestamp
Changed:
<
<
  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid NOT TESTED
>
>
  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid FIXED
    • stress the renewal mechanism with a single short delegated proxy, for example with the following test: cream-test-monitored-submit -r 30 -n 2000 -m 2000 -C 50 --sotimeout 60 -j long.jdl -R <ce_id> --vo <vo_name> --valid <00:20>
 
  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging NOT TESTED

Revision 252010-02-26 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 10 to 10
 
  • Bug #17949: BLAH should operate with no access whatsoever to the batch system logs NOT TESTED
Changed:
<
<
* Bug #37430: BLParser should properly filter it's log output HOPEFULLY FIXED * Bug #37430: BLParser should properly filter it's log output NOT TESTED * Not too clear what the fix is supposed to be
>
>
  • Bug #37430: BLParser should properly filter it's log output NOT TESTED
    • Not too clear what the fix is supposed to be
 
    • According to the developer (M. Mezzadri) the command received by the old blparser from CREAM should be reported in the blparser log file without an extra new-line
    • Verified in the old blparser log file
Line: 192 to 191
 
    • force the service to fail a register operation, e.g. by temporarily renaming the sandbox directory
    • verify that the log reports at least the JobID and the reason of the failure
Changed:
<
<
  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. NOT TESTED
>
>
  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. CANNOT REPRODUCE
    • Not possible to test the fix since we don't have CREAM based CEs with Condor as batch system

  • Bug #57307: condor_submit.sh does not support the handling of "local" attributes CANNOT REPRODUCE
    • Not possible to test the fix since we don't have CREAM based CEs with Condor as batch system
 
Deleted:
<
<
  • Bug #57307: condor_submit.sh does not support the handling of "local" attributes NOT TESTED
 
  • Bug #57820: [yaim-cream-ce] CREAM-CE publishes GlueServiceDataValue incomplete NOT TESTED

Revision 242010-02-26 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 10 to 10
 
  • Bug #17949: BLAH should operate with no access whatsoever to the batch system logs NOT TESTED
Changed:
<
<
  • Bug #37430: BLParser should properly filter it's log output HOPEFULLY FIXED
    • Not too clear what the fix is supposed to be
>
>
* Bug #37430: BLParser should properly filter it's log output HOPEFULLY FIXED * Bug #37430: BLParser should properly filter it's log output NOT TESTED * Not too clear what the fix is supposed to be
 
    • According to the developer (M. Mezzadri) the command received by the old blparser from CREAM should be reported in the blparser log file without an extra new-line
    • Verified in the old blparser log file
Line: 70 to 71
 
    • install a CE node from scratch
    • verify the existence of the files: /opt/glite/etc/lcas/lcas-glexec.db and /opt/glite/etc/lcmaps/lcmaps-glexec.db
Changed:
<
<
  • Bug #51249: [ yaim-cream-ce ] refactor config_cream_db NOT TESTED
>
>
* Bug #51249: [ yaim-cream-ce ] refactor config_cream_db NOT TESTED * Bug #51249: [ yaim-cream-ce ] refactor config_cream_db FIXED * Install the node from scratch and verify all the basic operations of the CREAM service
 
  • Bug #51310: Wrong event timestamp FIXED
    • run the consumer server (glite-ce-monitor-consumer) on the client machine

Revision 232010-02-26 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 10 to 10
 
  • Bug #17949: BLAH should operate with no access whatsoever to the batch system logs NOT TESTED
Changed:
<
<
  • Bug #37430: BLParser should properly filter it's log output NOT TESTED
>
>
  • Bug #37430: BLParser should properly filter it's log output HOPEFULLY FIXED
    • Not too clear what the fix is supposed to be
    • According to the developer (M. Mezzadri) the command received by the old blparser from CREAM should be reported in the blparser log file without an extra new-line
    • Verified in the old blparser log file
 
  • Bug #45364: BLAH_JOB_CANCEL should report failure reason FIXED
    • submit and cancel a job using the LRMS command (e.g. qdel) and verify that the reason reports "Cancelled by CE admin"
Line: 164 to 167
 
Changed:
<
<
  • Bug #55565: BLAH configuration attribute blah_disable_wn_proxy_renewal fails to disable proxy renewal. NOT TESTED
>
>
  • Bug #55565: BLAH configuration attribute blah_disable_wn_proxy_renewal fails to disable proxy renewal. FIXED
    • Verified by issuing a BLAH_JOB_REFRESH proxy for a running job
    • Moreover the BLAH proxy renewal operation is not used anymore (the proxy on the CE is renewed by CREAM and no longer by BLAH)
 
  • Bug #56075: Job failure reasons missing in the CREAM log file FIXED
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
Line: 248 to 253
 
    • create one or more subscriptions to a non-existing consumer URL or to a fake blocking one (e.g. using nc -l -p <consumer port>) specifying the same rate as above
    • verify that the notification rate for the first consumer is correct
Changed:
<
<
  • Bug #61790: Problems in CREAM CE when there are "strange" characters in the subject certificate NOT TESTED
>
>
  • Bug #61790: Problems in CREAM CE when there are "strange" characters in the subject certificate FIXED
    • Verified by submitting a job to a Torque CREAM CE with a proxy with subject: /DC=gov/DC=fnal/O=Fermilab/OU=Robots/CN=lcgcaf/CN=cdf/CN=Donatella Lucchesi/CN=UID:lucchesi
    • With the same proxy there were problems before (see https://gus.fzk.de/ws/ticket_info.php?ticket=54767)
 
  • Bug #62070: Possible problem with notification time in BNotifier HOPEFULLY FIXED
    • Not possible to reproduce it according to the developer (M. Mezzadri)

  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing NOT TESTED
Changed:
<
<
  • Bug #62436: Possible problem with updater if job remain queued too long NOT TESTED
>
>
  • Bug #62436: Possible problem with updater if job remain queued too long FIXED
    • Fixed as reported here: 3 jobs lasting 2 hours were submitted to a CREAM CE with only 2 job slots. For the third one the BNotifier logged the right events (i.e. it didn't log status=4 with failurereason=999)
 
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group NOT TESTED

Revision 222010-02-26 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 95 to 95
 
    • submit a job specifying a malformed cerequirements parameter
    • verify that the job is executed and the parameter is ignored
Changed:
<
<
  • Bug #51978: CREAM can be slow to start TESTING NOW
    • submit a big bunch of long-lived jobs, for example using cream-test-monitored-submit -r 30 -n 2000 -m 2000 -C 100 -j long.jdl -R <ce_id> and long.jdl is "[executable="/bin/sleep";arguments="3600";]"
>
>
  • Bug #51978: CREAM can be slow to start FIXED
    • submit a big bunch of long-lived jobs, for example using cream-test-monitored-submit -r 30 -n 2000 -m 2000 -C 100 --sotimeout 60 -j long.jdl -R <ce_id> where long.jdl is "[executable="/bin/sleep";arguments="3600";]"
 
    • when all the jobs have been submitted restart the service and verify the startup time.
    • verify in the CREAM and BLAHP logs that the jobs are checked one by one at startup, instead of polling all jobs from a given timestamp
Line: 201 to 201
 
  • Bug #59423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials NOT TESTED
Changed:
<
<
  • Bug #58659: NullPointerException from getStatus TESTING NOW
>
>
  • Bug #58659: NullPointerException from getStatus FIXED
 
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that the log of the testsuite does not report any NullPointerException

  • Bug #58792: JobRegister fails, because cream_sandbox directory doesn't exist FIXED

Revision 212010-02-26 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 161 to 161
 
  • Bug #55438: BUpdater problems in updating job state with AssignFinalState for all batch system NOT TESTED
Changed:
<
<
  • Bug #55531: BUpdaterPBS should consider lines like "unable to run job" NOT TESTED
>
>
 
  • Bug #55565: BLAH configuration attribute blah_disable_wn_proxy_renewal fails to disable proxy renewal. NOT TESTED
Line: 250 to 250
 
  • Bug #61790: Problems in CREAM CE when there are "strange" characters in the subject certificate NOT TESTED
Changed:
<
<
  • Bug #62070: Possible problem with notification time in BNotifier NOT TESTED
>
>
  • Bug #62070: Possible problem with notification time in BNotifier HOPEFULLY FIXED
    • Not possible to reproduce it according to the developer (M. Mezzadri)
 
  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing NOT TESTED

Revision 202010-02-26 - MassimoSgaravatto

Line: 1 to 1
 

PATCH 3179

Line: 210 to 210
 
  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized NOT TESTED
Changed:
<
<
  • Bug #59005: Possible problem with hold/resumed jobs in BUpdaterLSF NOT TESTED
>
>
  • Bug #59005: Possible problem with hold/resumed jobs in BUpdaterLSF FIXED
    • Verified as reported here
 
  • Bug #59329: Proxy symlinks left in the registry area until purged FIXED
    • submit a job and verify the existence of the related symlink in the directory /opt/glite/var/blah/user_blah_job_registry.bjr/registry.proxydir

Revision 192010-02-23 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 260 to 260
 
  • Bug #62893: Possible proxy renewal problem in the CREAM jw NOT TESTED
Added:
>
>
  • Bug #63398: CREAM jw: removal of token should be retried in case of failure NOT TESTED
 -- AlessioGianelle - 2010-02-05

Revision 182010-02-23 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 8 to 8
 

Checked bugs:

Changed:
<
<
  • Bug #17949: BLAH should operate with no access whatsoever to the batch system logs
>
>
  • Bug #17949: BLAH should operate with no access whatsoever to the batch system logs NOT TESTED
 
Changed:
<
<
  • Bug #37430: BLParser should properly filter it's log output
>
>
  • Bug #37430: BLParser should properly filter it's log output NOT TESTED
 
  • Bug #45364: BLAH_JOB_CANCEL should report failure reason FIXED
    • submit and cancel a job using the LRMS command (e.g. qdel) and verify that the reason reports "Cancelled by CE admin"
Line: 21 to 21
 
    • reconfigure the node with yaim and check whether the sandbox area is empty
Changed:
<
<
  • Bug #47070: [ yaim-cream ] yaim cream module should support remote mysql setup
>
>
  • Bug #47070: [ yaim-cream ] yaim cream module should support remote mysql setup NOT TESTED
 
  • Bug #47254: Possible problems if the proxy used to talk with CREAM is shorter than 10 minutes FIXED
    • create a voms-proxy whose lifetime is shorter than 10 minutes
    • submit a simple job whose lifetime is shorter than the voms-proxy one and verify its correct termination
Changed:
<
<
  • Bug #47804: Possible problems configuring blah in CREAM-CE for LSF 
>
>
  • Bug #47804: Possible problems configuring blah in CREAM-CE for LSF NOT TESTED
 

  • Bug #48786: Load should be one of the parameter of DISABLE_SUBMISSION_POLICY in CREAM FIXED
Line: 49 to 49
 
    • submit and cancel a job using the CREAM CLI command and verify that the reason reports "Cancelled by user"
    • submit and cancel a job using the LRMS command (e.g. qdel) and verify that the reason reports "Cancelled by CE admin"
Changed:
<
<
  • Bug #50876: CREAM reports that the proxy expired even when the problem is in detecting the lifetime of the proxy 
>
>
  • Bug #50876: CREAM reports that the proxy expired even when the problem is in detecting the lifetime of the proxy FIXED
    • force a failure for the command grid-proxy-init in the jobwrapper, for example delegating a proxy on the CE, manually renaming the corresponding delegated proxy in the sandbox area and then submitting a job using the given delegation ID.
    • verify that the failure reason reported by the job status contains the message: Problem to detect the lifetime of the proxy
 
  • Bug #51046: CREAM: DelegProxyInfo info sometimes is wrong FIXED
    • submit a job, wait for its termination and verify the correct lifetime of the proxy
Line: 66 to 67
 
    • install a CE node from scratch
    • verify the existence of the files: /opt/glite/etc/lcas/lcas-glexec.db and /opt/glite/etc/lcmaps/lcmaps-glexec.db
Changed:
<
<
  • Bug #51249: [ yaim-cream-ce ] refactor config_cream_db 
>
>
  • Bug #51249: [ yaim-cream-ce ] refactor config_cream_db NOT TESTED
 
  • Bug #51310: Wrong event timestamp FIXED
    • run the consumer server (glite-ce-monitor-consumer) on the client machine
Line: 99 to 100
 
    • when all the jobs have been submitted restart the service and verify the startup time.
    • verify in the CREAM and BLAHP logs that the jobs are checked one by one at startup, instead of polling all jobs from a given timestamp
Changed:
<
<
  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid 
>
>
  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid NOT TESTED
 
Changed:
<
<
  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging
>
>
  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging NOT TESTED
 
  • Bug #52050: misleading error message "The problem seems to be related to glexec INVALID
    • The CREAM service does not make use of glexec anymore
Line: 122 to 123
 
    • submit a job specifying a simple CE requirements (e.g. cerequirements="other.GlueHostMainMemoryRAMSize > 2000")
    • verify that, after the execution of the job, in the tmp directory no files ce-req-file-* are left
Changed:
<
<
  • Bug #52577: [ yaim-cream-ce ] create CREAM_GLEXEC_USER_HOME variable  
>
>
  • Bug #52577: [ yaim-cream-ce ] create CREAM_GLEXEC_USER_HOME variable NOT TESTED
 
  • Bug #52651: CREAM file descriptor overuse FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
    • seek "too many open files" in the CREAM log
Changed:
<
<
  • Bug #52719: Blah doesn't set the 'executable' flag if a local jobwrapper is found 
>
>
  • Bug #52719: Blah doesn't set the 'executable' flag if a local jobwrapper is found NOT TESTED
 
  • Bug #52942: Missing description for ISB/OSB error in jobwrapper FIXED
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
    • verify that the output of the glite-ce-job-status contains the full description of the failure
Changed:
<
<
  • Bug #53124: blparser_master could crash if some variable in blparser.conf are not set 

  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE 
>
>
  • Bug #53124: blparser_master could crash if some variable in blparser.conf are not set NOT TESTED
 
Added:
>
>
  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE NOT TESTED
 
  • Bug #53499: CREAM job wrapper template should be put outside the jar FIXED
    • check wheter the file /opt/glite/share/webapps/ce-cream.war contains the file WEB-INF/jobwrapper.tpl
Changed:
<
<
  • Bug #54812: lsf_submit.sh job requirement 
>
>
  • Bug #54812: lsf_submit.sh job requirement NOT TESTED
 
  • Bug #54900: [ glite-yaim-cream-ce ] config_cream_tomcat_user should not add tomcat to VO FIXED
    • check the membership of any VO group
Changed:
<
<
  • Bug #54949: Some job can remain in running state when BLParser is restarted for both lsf and pbs 
>
>
  • Bug #54949: Some job can remain in running state when BLParser is restarted for both lsf and pbs NOT TESTED
 
Changed:
<
<
  • Bug #55078: Possible final state not considered in BLParserPBS and BUpdaterPBS 
>
>
  • Bug #55078: Possible final state not considered in BLParserPBS and BUpdaterPBS NOT TESTED
 
  • Bug #55420: Allow admin to purge CREAM jobs in a non terminal status FIXED
    • temporarily disconnect any WN from the CE, e.g. by shutting down the mom server in a TORQUE installation
Line: 159 to 159
 
    • verify that the sandbox directory of that job has been removed from /opt/glite/var/cream_sandbox
    • remove manually the job from the batch system and reconnect all the WN
Changed:
<
<
  • Bug #55438: BUpdater problems in updating job state with AssignFinalState for all batch systems 
>
>
  • Bug #55438: BUpdater problems in updating job state with AssignFinalState for all batch system NOT TESTED
 
Changed:
<
<
  • Bug #55531: BUpdaterPBS should consider lines like "unable to run job" 
>
>
  • Bug #55531: BUpdaterPBS should consider lines like "unable to run job" NOT TESTED
 
Changed:
<
<
  • Bug #55565: BLAH configuration attribute blah_disable_wn_proxy_renewal fails to disable proxy renewal. 
>
>
  • Bug #55565: BLAH configuration attribute blah_disable_wn_proxy_renewal fails to disable proxy renewal. NOT TESTED
 
  • Bug #56075: Job failure reasons missing in the CREAM log file FIXED
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
Line: 184 to 184
 
    • force the service to fail a register operation, e.g. by temporarily renaming the sandbox directory
    • verify that the log reports at least the JobID and the reason of the failure
Changed:
<
<
  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. 
>
>
  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. NOT TESTED
 
Changed:
<
<
  • Bug #57307: condor_submit.sh does not support the handling of "local" attributes 
>
>
  • Bug #57307: condor_submit.sh does not support the handling of "local" attributes NOT TESTED
 
Changed:
<
<
  • Bug #57820: [yaim-cream-ce] CREAM-CE publishes GlueServiceDataValue incomplete 
>
>
  • Bug #57820: [yaim-cream-ce] CREAM-CE publishes GlueServiceDataValue incomplete NOT TESTED
 
  • Bug #58103: Cream database Query performance FIXED
    • Internal improvement
Line: 197 to 197
 
  • Bug #58109: Wrong value for the "service version" property FIXED
    • verify the property using the command glite-ce-service-info
Changed:
<
<
  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus 

  • Bug #59423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials 
>
>
  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus NOT TESTED
 
Changed:
<
<
  • Bug #58659: NullPointerException from getStatus 
>
>
  • Bug #59423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials NOT TESTED
 
Added:
>
>
  • Bug #58659: NullPointerException from getStatus TESTING NOW
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that the log of the testsuite does not report any NullPointerException
 
  • Bug #58792: JobRegister fails, because cream_sandbox directory doesn't exist FIXED
    • temporarily rename the directory /opt/glite/var/cream_sandbox without turning off the service
    • submit a job and verify that the failure reports "cannot create the job's working directory!"
Changed:
<
<
  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized
>
>
  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized NOT TESTED
 
Changed:
<
<
  • Bug #59005: Possible problem with hold/resumed jobs in BUpdaterLSF 
>
>
  • Bug #59005: Possible problem with hold/resumed jobs in BUpdaterLSF NOT TESTED
 

  • Bug #59329: Proxy symlinks left in the registry area until purged FIXED
Line: 234 to 234
 
  • Bug #61407: Set CE_ID in the cream jw FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the CE_ID variable
Changed:
<
<
  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong 
>
>
  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong NOT TESTED
 
  • Bug #61604: yaim-cream-ce should not install config_gip_software_plugin FIXED
    • verify that the glite-yaim-cream-ce package does not contain the file config_gip_software_plugin but it contains config_cream_gip_software_plugin instead
Line: 242 to 242
 
  • Bug #61730: CREAM jw: GLITE_WMS_LOG_DESTINATION should always be set with the FQDN FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, defines an FQDN in the __ce_hostname variable
Changed:
<
<
  • Bug #61761: CEMon must guarantee the notification rate
>
>
  • Bug #61761: CEMon must guarantee the notification rate FIXED
    • enable the "CE Sensor" plugin
    • create a subscription for the topic published by the sensor above with a running consumer: glite-ce-monitor-subscribe --cert <user_proxy> --key <user_proxy> --topic CE_MONITOR --dialects ISM_CLASSAD_GLUE_1.2 --consumer-url <consumer_url> --rate 10 --duration 600 <cemonitor_url>
    • create one or more subscriptions to a non-existing consumer URL or to a fake blocking one (e.g. using nc -l -p <consumer port>), specifying the same rate as above
    • verify that the notification rate for the first consumer is correct
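
The rate check for Bug #61761 could be automated along the following lines. This is only a sketch: it assumes that the arrival time of each notification at the first consumer has already been recorded as epoch seconds, one per line (how that is recorded is left open), and that the value passed to --rate is expressed in seconds. The helper name max_gap is hypothetical.

```shell
#!/bin/sh
# max_gap: reads one epoch timestamp (seconds) per line on stdin, in arrival
# order, and prints the largest interval between consecutive notifications.
# Hypothetical helper, not part of the CEMon client tools.
max_gap() {
    awk 'NR > 1 { gap = $1 - prev; if (gap > max) max = gap }
         { prev = $1 }
         END { print max + 0 }'
}

# Intended use (arrivals.txt is an assumed file of recorded arrival times):
#   [ "$(max_gap < arrivals.txt)" -le 12 ] && echo "rate respected"
```

A largest gap close to the requested rate suggests that the blocked consumers are not delaying the first one; the tolerance to apply is a judgment call.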
 
Changed:
<
<
  • Bug #61790: Problems in CREAM CE when there are "strange" characters in the subject certificate
>
>
  • Bug #61790: Problems in CREAM CE when there are "strange" characters in the subject certificate NOT TESTED
 
Changed:
<
<
  • Bug #62070: Possible problem with notification time in BNotifier
>
>
  • Bug #62070: Possible problem with notification time in BNotifier NOT TESTED
 
Changed:
<
<
  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing
>
>
  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing NOT TESTED
 
Changed:
<
<
  • Bug #62436: Possible problem with updater if job remain queued too long
>
>
  • Bug #62436: Possible problem with updater if job remain queued too long NOT TESTED
 
Changed:
<
<
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group
>
>
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group NOT TESTED
 
Changed:
<
<
  • Bug #62893: Possible proxy renewal problem in the CREAM jw
>
>
  • Bug #62893: Possible proxy renewal problem in the CREAM jw NOT TESTED
  -- AlessioGianelle - 2010-02-05

Revision 17 2010-02-22 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 85 to 85
 
    • change the value of JOB_MANAGER in the site-info.def
    • configure the node with YAIM and verify that the resource BDII publishes this new value in the GlueCeUniqueids
Changed:
<
<
  • Bug #51892: Exception when using java.text.DateFormat.parse 
>
>
  • Bug #51892: Exception when using java.text.DateFormat.parse FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
    • verify the log of the CREAM service
 

  • Bug #51928: BLAH crashes if the cerequirements classad attribute is malformed FIXED

Revision 16 2010-02-18 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 95 to 95
 
  • Bug #51978: CREAM can be slow to start TESTING NOW
    • submit a large batch of long-lived jobs, for example using cream-test-monitored-submit -r 30 -n 2000 -m 2000 -C 100 -j long.jdl -R <ce_id> where long.jdl is "[executable="/bin/sleep";arguments="3600";]"
    • when all the jobs have been submitted, restart the service and verify the startup time.
Added:
>
>
    • verify in the CREAM and BLAHP logs that the jobs are checked one by one at startup, instead of polling all jobs from a given timestamp
 
  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid 

Revision 15 2010-02-18 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 92 to 92
 
    • submit a job specifying a malformed cerequirements parameter
    • verify that the job is executed and the parameter is ignored
Changed:
<
<
  • Bug #51978: CREAM can be slow to start 
>
>
  • Bug #51978: CREAM can be slow to start TESTING NOW
    • submit a large batch of long-lived jobs, for example using cream-test-monitored-submit -r 30 -n 2000 -m 2000 -C 100 -j long.jdl -R <ce_id> where long.jdl is "[executable="/bin/sleep";arguments="3600";]"
    • when all the jobs have been submitted, restart the service and verify the startup time.
 
  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid 

Revision 14 2010-02-17 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 81 to 81
 
  • Bug #51705: glexec-wrapper.sh should be removed from CREAM RPM FIXED
    • check the content of glite-ce-cream rpm
Changed:
<
<
  • Bug #51706: yaim-cream-ce: remove "lcg" prefix from JOB_MANAGER 
>
>
  • Bug #51706: yaim-cream-ce: remove "lcg" prefix from JOB_MANAGER FIXED
    • change the value of JOB_MANAGER in the site-info.def
    • configure the node with YAIM and verify that the resource BDII publishes this new value in the GlueCeUniqueids
 
  • Bug #51892: Exception when using java.text.DateFormat.parse 
Line: 137 to 139
 
  • Bug #54812: lsf_submit.sh job requirement 
Changed:
<
<
  • Bug #54900: [ glite-yaim-cream-ce ] config_cream_tomcat_user should not add tomcat to VO groups 
>
>
  • Bug #54900: [ glite-yaim-cream-ce ] config_cream_tomcat_user should not add tomcat to VO FIXED
    • check the membership of any VO group
 
  • Bug #54949: Some job can remain in running state when BLParser is restarted for both lsf and pbs 
Line: 210 to 212
 
    • submit a job and verify the existence of the related symlink in the directory /opt/glite/var/blah/user_blah_job_registry.bjr/registry.proxydir
    • when the job terminates verify that the symlink has been removed by blah.
Changed:
<
<
  • Bug #59686: Possible crash of BUpdarePBS due to wrong malloc 
>
>
  • Bug #59686: Possible crash of BUpdarePBS due to wrong malloc FIXED
    • Define the parameter pbs_spoolpath in the file /opt/glite/etc/blah.config
    • run the BUpdaterPBS daemon and verify its liveness
 
  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation" CANNOT REPRODUCE
Line: 219 to 223
 
    • submit several jobs
    • verify that the log of the CREAM service does not report the error above
Changed:
<
<
  • Bug #61322: CREAM jw doesn't set GLITE_WMS_RB_BROKERINFO 
>
>
  • Bug #61322: CREAM jw doesn't set GLITE_WMS_RB_BROKERINFO FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the __brokerinfo variable
 
  • Bug #61407: Set CE_ID in the cream jw FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the CE_ID variable

  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong 
Changed:
<
<
  • Bug #61604: yaim-cream-ce should not install config_gip_software_plugin
>
>
  • Bug #61604: yaim-cream-ce should not install config_gip_software_plugin FIXED
    • verify that the glite-yaim-cream-ce package does not contain the file config_gip_software_plugin but it contains config_cream_gip_software_plugin instead
 
  • Bug #61730: CREAM jw: GLITE_WMS_LOG_DESTINATION should always be set with the FQDN FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, defines an FQDN in the __ce_hostname variable
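
The jobwrapper checks for Bug #61407 and Bug #61730 can be sketched as a small script. It assumes that the variable appears in the jobwrapper as a plain shell-style NAME=value line, which the test plan does not state; extract_var and is_fqdn are hypothetical helper names, and the path of the jobwrapper inside the sandbox is left to the tester.

```shell
#!/bin/sh
# extract_var FILE NAME: prints the value of the first "NAME=value" line in
# FILE, with surrounding quotes removed. The assignment format is an assumption.
extract_var() {
    sed -n "s/^[[:space:]]*$2=//p" "$1" | head -n 1 | tr -d "\"'"
}

# is_fqdn NAME: succeeds when NAME has at least two dot-separated labels made
# of letters, digits and hyphens (a loose syntactic check, no DNS lookup).
is_fqdn() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$'
}

# Intended use (the jobwrapper path is a placeholder):
#   is_fqdn "$(extract_var <jobwrapper_script> __ce_hostname)" && echo "FQDN set"
```

The same extract_var call with CE_ID or __brokerinfo as the name covers the other jobwrapper checks, under the same assumption about the assignment format.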

Revision 13 2010-02-16 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 8 to 8
 

Checked bugs:

Added:
>
>
  • Bug #17949: BLAH should operate with no access whatsoever to the batch system logs
 
  • Bug #37430: BLParser should properly filter it's log output
Changed:
<
<
  • Bug #45364: BLAH_JOB_CANCEL should report failure reason
>
>
  • Bug #45364: BLAH_JOB_CANCEL should report failure reason FIXED
    • submit and cancel a job using the LRMS command (e.g. qdel) and verify that the reason reports "Cancelled by CE admin"
 
  • Bug #46419: CREAM sandbox area should be scratched when the CREAM DB is scratched FIXED
    • Submit at least one job to the CE and wait for its termination, so that the sandbox area is not empty
Line: 240 to 242
 
  • Bug #62436: Possible problem with updater if job remain queued too long
Added:
>
>
  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group

  • Bug #62893: Possible proxy renewal problem in the CREAM jw
 -- AlessioGianelle - 2010-02-05

Revision 12 2010-02-16 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 173 to 173
 
  • Bug #56697: CREAM logging must be improved when CREAM register operation fails FIXED
    • force the service to fail a register operation, e.g. by temporarily renaming the sandbox directory
Changed:
<
<
    • verify that the log reports at least the JobID and the reason of the failure
>
>
    • verify that the log reports at least the JobID and the reason of the failure
 
  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. 
Line: 210 to 210
 
  • Bug #59686: Possible crash of BUpdarePBS due to wrong malloc 
Changed:
<
<
  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation CANNOT REPRODUCE
>
>
  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation" CANNOT REPRODUCE
 
Changed:
<
<
  • Bug #60831: Error log message: "CREAM_JOB_SENSOR_HOST parameter not specified!" 
>
>
  • Bug #60831: Error log message: "CREAM_JOB_SENSOR_HOST parameter not specified" FIXED
    • verify that the parameter "CREAM_JOB_SENSOR_HOST" is not defined in the file /opt/glite/etc/glite-ce-cream/cream-config.xml
    • submit several jobs
    • verify that the log of the CREAM service does not report the error above
 
  • Bug #61322: CREAM jw doesn't set GLITE_WMS_RB_BROKERINFO 
Changed:
<
<
  • Bug #61407: Set CE_ID in the cream jw 
>
>
  • Bug #61407: Set CE_ID in the cream jw FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, correctly defines the CE_ID variable
 
  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong 
Added:
>
>
  • Bug #61604: yaim-cream-ce should not install config_gip_software_plugin
 
Added:
>
>
  • Bug #61730: CREAM jw: GLITE_WMS_LOG_DESTINATION should always be set with the FQDN FIXED
    • submit a job and verify that the jobwrapper script, contained in the sandbox area for that job, defines an FQDN in the __ce_hostname variable
 
Added:
>
>
  • Bug #61761: CEMon must guarantee the notification rate
 
Added:
>
>
  • Bug #61790: Problems in CREAM CE when there are "strange" characters in the subject certificate
 
Added:
>
>
  • Bug #62070: Possible problem with notification time in BNotifier

  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing

  • Bug #62436: Possible problem with updater if job remain queued too long
  -- AlessioGianelle - 2010-02-05

Revision 11 2010-02-15 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 53 to 53
 
  • Bug #51046: CREAM: DelegProxyInfo info sometimes is wrong FIXED
    • submit a job, wait for its termination and verify the correct lifetime of the proxy
Changed:
<
<
  • Bug #51118: config_cream_glexec doesn't set glexec permissions right 
>
>
  • Bug #51118: config_cream_glexec doesn't set glexec permissions right FIXED
    • install a CE node from scratch and verify the permissions for /opt/glite/sbin/glexec (6555) and /opt/glite/etc/glexec.conf (640)
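
The permission check for Bug #51118 could be scripted roughly as follows. This is a minimal sketch, assuming GNU stat is available on the CE node; the helper name check_mode is hypothetical and not part of gLite.

```shell
#!/bin/sh
# check_mode PATH EXPECTED_MODE
# Succeeds when the octal permission bits of PATH equal EXPECTED_MODE.
# Relies on GNU coreutils stat (-c option); returns 2 if stat itself fails.
check_mode() {
    actual=$(stat -c '%a' "$1") || return 2
    [ "$actual" = "$2" ]
}

# Intended use on a freshly installed CE (paths and modes from the test plan):
#   check_mode /opt/glite/sbin/glexec 6555 &&
#   check_mode /opt/glite/etc/glexec.conf 640 && echo "permissions OK"
```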
 
  • Bug #51124: catalina.out is clogged with grid-proxy-init warnings FIXED
    • submit a job and check the catalina.out file
Line: 73 to 74
 
  • Bug #51311: Wrong event timestamp generated by the CREAM Job Sensor FIXED
Changed:
<
<
  • Bug #51313: CEMon must not notify the expired events
>
>
  • Bug #51313: CEMon must not notify the expired events CANNOT REPRODUCE
 
  • Bug #51705: glexec-wrapper.sh should be removed from CREAM RPM FIXED
    • check the content of glite-ce-cream rpm
Line: 83 to 84
 
  • Bug #51892: Exception when using java.text.DateFormat.parse 
Changed:
<
<
  • Bug #51928: BLAH crashes if the cerequirements classad attribute is malformed 
>
>
  • Bug #51928: BLAH crashes if the cerequirements classad attribute is malformed FIXED
    • submit a job specifying a malformed cerequirements parameter
    • verify that the job is executed and the parameter is ignored
 
  • Bug #51978: CREAM can be slow to start 
Line: 158 to 161
 
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
    • verify that the following message appears in the log file: failureReason=Cannot move ISB (): error: globus_xio: Unable to connect to xxxx:2811 globus_xio: globus_libc_getaddrinfo failed.globus_common: Name or service not known
Changed:
<
<
  • Bug #56339: [blah] "service glite-ce-blparser restart" does not always work 
>
>
  • Bug #56339: [blah] "service glite-ce-blparser restart" does not always work FIXED
    • try the command /opt/glite/etc/init.d/glite-ce-blparser restart and verify the correct behaviour of the script
 
  • Bug #56367: CREAM RPM depends on C libs FIXED
    • check if the package of glite-ce-cream contains any ELF executable
Changed:
<
<
  • Bug #56518: BLAH blparser doesn't start after boot of the machine 

  • Bug #56697: CREAM logging must be improved when CREAM register operation fails 

  • Bug #56762: CREAM doesn't accept anymore jobs with NodeNumber and/or CpuNumber NOT FIXED
    • submit the following jdl
      [
      executable="/bin/sleep";
      arguments="60";
      jobtype="Normal";
      cpunumber=3;
      ]
      
    • verify that the submission has been rejected
>
>
  • Bug #56518: BLAH blparser doesn't start after boot of the machine FIXED
    • install the CE node from scratch specifying the parameter BLPARSER_WITH_UPDATER_NOTIFIER=false in the yaim configuration for creamCE
    • reboot the machine and verify that the blparser_master is running

  • Bug #56697: CREAM logging must be improved when CREAM register operation fails FIXED
    • force the service to fail a register operation, e.g. by temporarily renaming the sandbox directory
    • verify that the log reports at least the JobID and the reason of the failure
 
  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. 
Line: 205 to 199
 
    • temporarily rename the directory /opt/glite/var/cream_sandbox without turning off the service
    • submit a job and verify that the failure reports "cannot create the job's working directory!"
Changed:
<
<
  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized (TM) 
>
>
  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized
 
  • Bug #59005: Possible problem with hold/resumed jobs in BUpdaterLSF 
Changed:
<
<
  • Bug #59329: Proxy symlinks left in the registry area until purged 
>
>
  • Bug #59329: Proxy symlinks left in the registry area until purged FIXED
    • submit a job and verify the existence of the related symlink in the directory /opt/glite/var/blah/user_blah_job_registry.bjr/registry.proxydir
    • when the job terminates verify that the symlink has been removed by blah.
 
  • Bug #59686: Possible crash of BUpdarePBS due to wrong malloc 

Revision 10 2010-02-12 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 36 to 36
 
    • delegate a proxy whose lifetime is shorter than the parameter delegation_purge_rate of the CREAM configuration file
    • wait for the new proxy cleanup run (at least twice the delegation_purge_rate) and verify that the proxy file has been removed from the directory
Changed:
<
<
  • Bug #50226: yaim-cream-ce should use config_secure_tomcat  
>
>
  • Bug #50226: yaim-cream-ce should use config_secure_tomcat FIXED
    • install the CE node from scratch
    • verify the state of the trustmanager accessing the URL: https://ce-host:8443/ce-cream/services
 
  • Bug #50723: CREAM: check for the jobtype is not case insensitive FIXED
    • submit a job specifying the parameter "jobtype=Normal" in the JDL and verify the correct execution of the job
Line: 58 to 59
 
    • submit a job and check the catalina.out file
Changed:
<
<
  • Bug #51128: lcas-suexec.db on CREAM CE should be named lcas-glexec.db for consistency 
>
>
  • Bug #51128: lcas-suexec.db on CREAM CE should be named lcas-glexec.db for consistency FIXED
    • install a CE node from scratch
    • verify the existence of the files: /opt/glite/etc/lcas/lcas-glexec.db and /opt/glite/etc/lcmaps/lcmaps-glexec.db
 
  • Bug #51249: [ yaim-cream-ce ] refactor config_cream_db 
Line: 103 to 106
 
    • wait for cemonitor to reload the configuration (usually 10m)
    • verify the availability of the topic using the command glite-ce-monitor-gettopics
Changed:
<
<
  • Bug #52268: BLAH leaves files in /tmp when CErequirements is set 
>
>
  • Bug #52268: BLAH leaves files in /tmp when CErequirements is set FIXED
    • submit a job specifying a simple CE requirement (e.g. cerequirements="other.GlueHostMainMemoryRAMSize > 2000")
    • verify that, after the execution of the job, no ce-req-file-* files are left in the /tmp directory
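
The leftover-file check for Bug #52268 can be expressed as a one-line helper. This is a sketch only: count_leftovers is a hypothetical name, and the -maxdepth option assumes GNU or BSD find.

```shell
#!/bin/sh
# count_leftovers DIR: prints the number of ce-req-file-* entries found
# directly in DIR (subdirectories are not searched).
count_leftovers() {
    find "$1" -maxdepth 1 -name 'ce-req-file-*' | wc -l | tr -d ' '
}

# Intended use on the CE once the job has finished:
#   [ "$(count_leftovers /tmp)" -eq 0 ] && echo "no temporary files left"
```

Running it once before submission and once after the job ends helps to tell files left by this job apart from older ones.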
 
  • Bug #52577: [ yaim-cream-ce ] create CREAM_GLEXEC_USER_HOME variable  
Line: 165 to 169
 
  • Bug #56697: CREAM logging must be improved when CREAM register operation fails 
Changed:
<
<
  • Bug #56762: CREAM doesn't accept anymore jobs with NodeNumber and/or CpuNumber ! 
>
>
  • Bug #56762: CREAM doesn't accept anymore jobs with NodeNumber and/or CpuNumber NOT FIXED
    • submit the following jdl
      [
      executable="/bin/sleep";
      arguments="60";
      jobtype="Normal";
      cpunumber=3;
      ]
      
    • verify that the submission has been rejected
 
  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. 
Line: 174 to 187
 
  • Bug #57820: [yaim-cream-ce] CREAM-CE publishes GlueServiceDataValue incomplete 
Changed:
<
<
  • Bug #58103: Cream database Query performance. 
>
>
  • Bug #58103: Cream database Query performance FIXED
    • Internal improvement
    • run a set of stress-tests and verify the performance
 
  • Bug #58109: Wrong value for the "service version" property FIXED
    • verify the property using the command glite-ce-service-info
Line: 199 to 214
 
  • Bug #59686: Possible crash of BUpdarePBS due to wrong malloc 
Changed:
<
<
  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation!" error 
>
>
  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation CANNOT REPRODUCE
 
  • Bug #60831: Error log message: "CREAM_JOB_SENSOR_HOST parameter not specified!" 

Revision 9 2010-02-11 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 135 to 135
 
  • Bug #55078: Possible final state not considered in BLParserPBS and BUpdaterPBS 
Changed:
<
<
  • Bug #55420: Allow admin to purge CREAM jobs in a non terminal status 
>
>
  • Bug #55420: Allow admin to purge CREAM jobs in a non terminal status FIXED
    • temporarily disconnect all the WNs from the CE, e.g. by shutting down the mom server in a TORQUE installation
    • submit a job
    • on the CE with administrator privileges run the command: /opt/glite/sbin/JobDBAdminPurger.sh -u -p -s 2 as described in the wiki page
    • verify with glite-ce-job-list that the job has been purged from the database
    • verify that the sandbox directory of that job has been removed from /opt/glite/var/cream_sandbox
    • manually remove the job from the batch system and reconnect all the WNs
 
  • Bug #55438: BUpdater problems in updating job state with AssignFinalState for all batch systems 
Line: 145 to 150
 
  • Bug #55565: BLAH configuration attribute blah_disable_wn_proxy_renewal fails to disable proxy renewal. 
Changed:
<
<
  • Bug #56075: Job failure reasons missing in the CREAM log file 
>
>
  • Bug #56075: Job failure reasons missing in the CREAM log file FIXED
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
    • verify that the following message appears in the log file: failureReason=Cannot move ISB (): error: globus_xio: Unable to connect to xxxx:2811 globus_xio: globus_libc_getaddrinfo failed.globus_common: Name or service not known
 
  • Bug #56339: [blah] "service glite-ce-blparser restart" does not always work 
Line: 169 to 176
 
  • Bug #58103: Cream database Query performance. 
Changed:
<
<
  • Bug #58109: Wrong value for the "service version" property 
>
>
  • Bug #58109: Wrong value for the "service version" property FIXED
    • verify the property using the command glite-ce-service-info
 
  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus 
Line: 179 to 186
 
  • Bug #58659: NullPointerException from getStatus 
Changed:
<
<
  • Bug #58792: JobRegister fails, because cream_sandbox directory doesn't exist. 
>
>
  • Bug #58792: JobRegister fails, because cream_sandbox directory doesn't exist FIXED
    • temporarily rename the directory /opt/glite/var/cream_sandbox without turning off the service
    • submit a job and verify that the failure reports "cannot create the job's working directory!"
 
  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized (TM) 

Revision 8 2010-02-11 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 93 to 93
 
    • The CREAM service does not make use of glexec anymore
Changed:
<
<
  • Bug #52051: CEMon must remove all expired subscriptions on start-up 

  • Bug #52052: Sometimes the getInfo() operation does not report the right list of topics TO BE INVESTIGATED
>
>
  • Bug #52051: CEMon must remove all expired subscriptions on start-up FIXED
    • create a subscription for the topic CREAM_JOBS on the CE with a short lifetime
    • shut down the service and wait for the expiration of the subscription
    • restart the service and verify that the subscription does not exist anymore in the directory /opt/glite/var/cemonitor/subscription

  • Bug #52052: Sometimes the getInfo() operation does not report the right list of topics FIXED
    • enable or disable the CE sensor by adding or removing the corresponding tag in the file /opt/glite/etc/glite-ce-monitor/cemonitor-config.xml
    • wait for cemonitor to reload the configuration (usually 10m)
    • verify the availability of the topic using the command glite-ce-monitor-gettopics
 
  • Bug #52268: BLAH leaves files in /tmp when CErequirements is set 

Revision 7 2010-02-10 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 95 to 95
 
  • Bug #52051: CEMon must remove all expired subscriptions on start-up 
Changed:
<
<
  • Bug #52052: Sometimes the getInfo() operation does not report the right list of topics. 
>
>
  • Bug #52052: Sometimes the getInfo() operation does not report the right list of topics TO BE INVESTIGATED
 
  • Bug #52268: BLAH leaves files in /tmp when CErequirements is set 

Revision 6 2010-02-10 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 108 to 108
 
  • Bug #52719: Blah doesn't set the 'executable' flag if a local jobwrapper is found 
Changed:
<
<
  • Bug #52942: Missing description for ISB/OSB error in jobwrapper 
>
>
  • Bug #52942: Missing description for ISB/OSB error in jobwrapper FIXED
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
    • verify that the output of the glite-ce-job-status contains the full description of the failure
 
  • Bug #53124: blparser_master could crash if some variable in blparser.conf are not set 

Revision 5 2010-02-10 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 62 to 62
 
  • Bug #51249: [ yaim-cream-ce ] refactor config_cream_db 
Changed:
<
<
  • Bug #51310: Wrong event timestamp 
>
>
  • Bug #51310: Wrong event timestamp FIXED
    • run the consumer server (glite-ce-monitor-consumer) on the client machine
    • create a subscription for the topic CREAM_JOBS on the CE specifying the URL of the consumer server above
    • submit a job and verify the validity of the field TIMESTAMP of any event
 
Changed:
<
<
  • Bug #51311: Wrong event timestamp generated by the CREAM Job Sensor 
>
>
  • Bug #51311: Wrong event timestamp generated by the CREAM Job Sensor FIXED
 
Changed:
<
<
  • Bug #51313: CEMon must not notify the expired events. 
>
>
  • Bug #51313: CEMon must not notify the expired events
 
  • Bug #51705: glexec-wrapper.sh should be removed from CREAM RPM FIXED
    • check the content of glite-ce-cream rpm
Line: 83 to 87
 
  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid 
Changed:
<
<
  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging 
>
>
  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging
 
Changed:
<
<
  • Bug #52050: misleading error message "The problem seems to be related to glexec" 
>
>
  • Bug #52050: misleading error message "The problem seems to be related to glexec" INVALID
    • The CREAM service does not make use of glexec anymore
 

  • Bug #52051: CEMon must remove all expired subscriptions on start-up 
Line: 97 to 102
 
  • Bug #52577: [ yaim-cream-ce ] create CREAM_GLEXEC_USER_HOME variable  
Changed:
<
<
  • Bug #52651: CREAM file descriptor overuse 
>
>
  • Bug #52651: CREAM file descriptor overuse FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
    • search for "too many open files" in the CREAM log
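
The log inspection for Bug #52651 can be reduced to a count of matching lines. The sketch below assumes nothing about where the CREAM log lives, so the path is passed as an argument; count_matches is a hypothetical helper name.

```shell
#!/bin/sh
# count_matches FILE TEXT: prints the number of lines in FILE that contain
# the fixed string TEXT, ignoring case. Like grep itself, it exits with a
# non-zero status when there is no match.
count_matches() {
    grep -ciF -- "$2" "$1"
}

# Intended use after the stress test (the log path is a placeholder):
#   [ "$(count_matches <cream_log_file> 'too many open files')" -eq 0 ] &&
#       echo "no file descriptor exhaustion reported"
```

The same helper can be used with other strings, such as an exception name, when checking the logs for the other stress-test bugs.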
 
  • Bug #52719: Blah doesn't set the 'executable' flag if a local jobwrapper is found 

Revision 4 2010-02-09 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 68 to 68
 
  • Bug #51313: CEMon must not notify the expired events. 
Changed:
<
<
  • Bug #51705: glexec-wrapper.sh should be removed from CREAM RPM 
>
>
  • Bug #51705: glexec-wrapper.sh should be removed from CREAM RPM FIXED
    • check the content of glite-ce-cream rpm
 
  • Bug #51706: yaim-cream-ce: remove "lcg" prefix from JOB_MANAGER 
Line: 107 to 108
 
  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE 
Changed:
<
<
  • Bug #53499: CREAM job wrapper template should be put outside the jar 
>
>
  • Bug #53499: CREAM job wrapper template should be put outside the jar FIXED
    • check whether the file /opt/glite/share/webapps/ce-cream.war contains the file WEB-INF/jobwrapper.tpl
 
  • Bug #54812: lsf_submit.sh job requirement 
Line: 133 to 135
 
  • Bug #56339: [blah] "service glite-ce-blparser restart" does not always work 
Changed:
<
<
  • Bug #56367: CREAM RPM depends on C libs 
>
>
  • Bug #56367: CREAM RPM depends on C libs FIXED
    • check if the package of glite-ce-cream contains any ELF executable
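
One way to carry out the check for Bug #56367 is to test every installed file of the package for the ELF magic number. This is a sketch under the assumption that the package is installed and that rpm -ql lists its files; is_elf is a hypothetical helper name.

```shell
#!/bin/sh
# is_elf FILE: succeeds when FILE starts with the 4-byte ELF magic
# (0x7f followed by the letters E, L, F). Unreadable files count as non-ELF.
is_elf() {
    [ "$(head -c 4 "$1" 2>/dev/null | od -An -c | tr -d ' \n')" = '177ELF' ]
}

# Intended use on the CE: print every ELF file shipped by the package.
#   rpm -ql glite-ce-cream | while read -r f; do
#       is_elf "$f" && echo "$f"
#   done
```

An empty listing is the expected result for a package that no longer depends on C libraries.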
 
  • Bug #56518: BLAH blparser doesn't start after boot of the machine 

Revision 3 2010-02-09 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 54 to 54
 
  • Bug #51118: config_cream_glexec doesn't set glexec permissions right 
Changed:
<
<
  • Bug #51124: catalina.out is clogged with grid-proxy-init warnings 
>
>
  • Bug #51124: catalina.out is clogged with grid-proxy-init warnings FIXED
    • submit a job and check the catalina.out file
 

  • Bug #51128: lcas-suexec.db on CREAM CE should be named lcas-glexec.db for consistency 

Revision 2 2010-02-08 - PaoloAndreetto

Line: 1 to 1
 

PATCH 3179

Line: 13 to 13
 
  • Bug #45364: BLAH_JOB_CANCEL should report failure reason
Changed:
<
<
  • Bug #46419: CREAM sandbox area should be scratched when the CREAM DB is scratched
>
>
  • Bug #46419: CREAM sandbox area should be scratched when the CREAM DB is scratched FIXED
    • Submit at least one job to the CE and wait for its termination, so that the sandbox area is not empty
    • Increment the value of the parameters creamdb_database_version and/or delegationdb_database_version in the file /opt/glite/etc/glite-ce-cream/cream-config.xml.template
    • reconfigure the node with yaim and check whether the sandbox area is empty
 

  • Bug #47070: [ yaim-cream ] yaim cream module should support remote mysql setup
Changed:
<
<
  • Bug #47254: Possible problems if the proxy used to talk with CREAM is shorter than 10 minutes 
>
>
  • Bug #47254: Possible problems if the proxy used to talk with CREAM is shorter than 10 minutes FIXED
    • create a voms-proxy whose lifetime is shorter than 10 minutes
    • submit a simple job whose lifetime is shorter than the voms-proxy one and verify its correct termination
 
  • Bug #47804: Possible problems configuring blah in CREAM-CE for LSF 
Changed:
<
<
  • Bug #48786: Load should be one of the parameter of DISABLE_SUBMISSION_POLICY in CREAM 

  • Bug #49497: user proxies on CREAM do not get cleaned up 
>
>
  • Bug #48786: Load should be one of the parameter of DISABLE_SUBMISSION_POLICY in CREAM FIXED
    • specify a low load level in the file /opt/glite/bin/glite_cream_load_monitor
    • try to overload the node with the testsuite and verify that submissions are rejected while the load is high:
      cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R --sotimeout 60 --vo dteam --valid 02:00
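
While the testsuite is running for Bug #48786, it can help to confirm that the node load really is above the configured level. The sketch below is not the glite_cream_load_monitor script and does not reproduce its format; it only illustrates comparing the 1-minute load average with a threshold, using hypothetical helper names.

```shell
#!/bin/sh
# load_exceeds CURRENT THRESHOLD: succeed if CURRENT is strictly greater
# than THRESHOLD (both may be floating-point numbers).
load_exceeds() {
    awk -v cur="$1" -v max="$2" 'BEGIN { exit !(cur + 0 > max + 0) }'
}

# current_load: print the 1-minute load average (Linux only).
current_load() {
    cut -d' ' -f1 /proc/loadavg
}

# Possible usage while the testsuite is submitting (1.0 is an example value):
#   if load_exceeds "$(current_load)" 1.0; then
#       echo "load above threshold: new submissions should be rejected"
#   fi
```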

  • Bug #49497: user proxies on CREAM do not get cleaned up FIXED
    • delegate a proxy whose lifetime is shorter than the parameter delegation_purge_rate of the CREAM configuration file
    • wait for the next proxy cleanup run (at least twice the delegation_purge_rate) and verify that the proxy file has been removed from the directory
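
The waiting step for Bug #49497 could be automated with a small polling loop. In this sketch the helper name wait_until_removed is hypothetical; the caller is expected to pass the path of the delegated proxy file and a timeout in seconds equal to at least twice the delegation_purge_rate.

```shell
#!/bin/sh
# wait_until_removed FILE TIMEOUT [INTERVAL]: poll every INTERVAL seconds
# (default 5) until FILE no longer exists. Succeeds when the file is gone,
# fails if it is still present after TIMEOUT seconds.
wait_until_removed() {
    file=$1
    timeout=$2
    interval=${3:-5}
    elapsed=0
    while [ -e "$file" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Possible usage (PROXY_FILE and TIMEOUT are supplied by the tester):
#   wait_until_removed "$PROXY_FILE" "$TIMEOUT" 60 \
#       && echo "proxy cleaned up" || echo "proxy still present"
```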
 
  • Bug #50226: yaim-cream-ce should use config_secure_tomcat  
Changed:
<
<
  • Bug #50723: CREAM: check for the jobtype is not case insensitive  
>
>
  • Bug #50723: CREAM: check for the jobtype is not case insensitive FIXED
    • submit a job specifying the parameter "jobtype=Normal" in the JDL and verify the correct execution of the job
 
Changed:
<
<
  • Bug #50875: CREAM: reason for cancelled jobs should be reported 
>
>
  • Bug #50875: CREAM: reason for cancelled jobs should be reported FIXED
    • submit and cancel a job using the CREAM CLI command and verify that the reason reports "Cancelled by user"
    • submit and cancel a job using the LRMS command (e.g. qdel) and verify that the reason reports "Cancelled by CE admin"
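
The two expected reasons for Bug #50875 can be checked against the job status output with a small filter. This is a sketch: the helper names are hypothetical, and it assumes that the status text printed by the CLI contains the reason string verbatim.

```shell
#!/bin/sh
# expected_reason HOW: print the reason expected for a job cancelled
# through the CREAM CLI ("cli") or through the batch system ("lrms").
expected_reason() {
    case "$1" in
        cli)  echo "Cancelled by user" ;;
        lrms) echo "Cancelled by CE admin" ;;
        *)    echo "unknown cancel path: $1" >&2; return 1 ;;
    esac
}

# reason_matches HOW: read status text on stdin and succeed if it
# contains the reason expected for HOW.
reason_matches() {
    want=$(expected_reason "$1") || return 1
    grep -qF -- "$want"
}

# Possible usage (JOBID is the identifier returned at submission):
#   glite-ce-job-cancel "$JOBID"
#   glite-ce-job-status -L 2 "$JOBID" | reason_matches cli && echo OK
```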
 
  • Bug #50876: CREAM reports that the proxy expired even when the problem is in detecting the lifetime of the proxy 
Changed:
<
<
  • Bug #51046: CREAM: DelegProxyInfo info sometimes is wrong 
>
>
  • Bug #51046: CREAM: DelegProxyInfo info sometimes is wrong FIXED
    • submit a job, wait for its termination and verify the correct lifetime of the proxy
 
  • Bug #51118: config_cream_glexec doesn't set glexec permissions right 

Revision 1 2010-02-05 - AlessioGianelle

Line: 1 to 1
Added:
>
>

PATCH 3179

Automatic tests:

Checked bugs:

  • Bug #37430: BLParser should properly filter it's log output

  • Bug #45364: BLAH_JOB_CANCEL should report failure reason

  • Bug #46419: CREAM sandbox area should be scratched when the CREAM DB is scratched

  • Bug #47070: [ yaim-cream ] yaim cream module should support remote mysql setup

  • Bug #47254: Possible problems if the proxy used to talk with CREAM is shorter than 10 minutes 

  • Bug #47804: Possible problems configuring blah in CREAM-CE for LSF 

  • Bug #48786: Load should be one of the parameter of DISABLE_SUBMISSION_POLICY in CREAM 

  • Bug #49497: user proxies on CREAM do not get cleaned up 

  • Bug #50226: yaim-cream-ce should use config_secure_tomcat  

  • Bug #50723: CREAM: check for the jobtype is not case insensitive  

  • Bug #50875: CREAM: reason for cancelled jobs should be reported 

  • Bug #50876: CREAM reports that the proxy expired even when the problem is in detecting the lifetime of the proxy 

  • Bug #51046: CREAM: DelegProxyInfo info sometimes is wrong 

  • Bug #51118: config_cream_glexec doesn't set glexec permissions right 

  • Bug #51124: catalina.out is clogged with grid-proxy-init warnings 

  • Bug #51128: lcas-suexec.db on CREAM CE should be named lcas-glexec.db for consistency 

  • Bug #51249: [ yaim-cream-ce ] refactor config_cream_db 

  • Bug #51310: Wrong event timestamp 

  • Bug #51311: Wrong event timestamp generated by the CREAM Job Sensor 

  • Bug #51313: CEMon must not notify the expired events. 

  • Bug #51705: glexec-wrapper.sh should be removed from CREAM RPM 

  • Bug #51706: yaim-cream-ce: remove "lcg" prefix from JOB_MANAGER 

  • Bug #51892: Exception when using java.text.DateFormat.parse 

  • Bug #51928: BLAH crashes if the cerequirements classad attribute is malformed 

  • Bug #51978: CREAM can be slow to start 

  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid 

  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging 

  • Bug #52050: misleading error message "The problem seems to be related to glexec" 

  • Bug #52051: CEMon must remove all expired subscriptions on start-up 

  • Bug #52052: Sometimes the getInfo() operation does not report the right list of topics. 

  • Bug #52268: BLAH leaves files in /tmp when CErequirements is set 

  • Bug #52577: [ yaim-cream-ce ] create CREAM_GLEXEC_USER_HOME variable  

  • Bug #52651: CREAM file descriptor overuse 

  • Bug #52719: Blah doesn't set the 'executable' flag if a local jobwrapper is found 

  • Bug #52942: Missing description for ISB/OSB error in jobwrapper 

  • Bug #53124: blparser_master could crash if some variable in blparser.conf are not set 

  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE 

  • Bug #53499: CREAM job wrapper template should be put outside the jar 

  • Bug #54812: lsf_submit.sh job requirement 

  • Bug #54900: [ glite-yaim-cream-ce ] config_cream_tomcat_user should not add tomcat to VO groups 

  • Bug #54949: Some job can remain in running state when BLParser is restarted for both lsf and pbs 

  • Bug #55078: Possible final state not considered in BLParserPBS and BUpdaterPBS 

  • Bug #55420: Allow admin to purge CREAM jobs in a non terminal status 

  • Bug #55438: BUpdater problems in updating job state with AssignFinalState for all batch systems 

  • Bug #55531: BUpdaterPBS should consider lines like "unable to run job" 

  • Bug #55565: BLAH configuration attribute blah_disable_wn_proxy_renewal fails to disable proxy renewal. 

  • Bug #56075: Job failure reasons missing in the CREAM log file 

  • Bug #56339: [blah] "service glite-ce-blparser restart" does not always work 

  • Bug #56367: CREAM RPM depends on C libs 

  • Bug #56518: BLAH blparser doesn't start after boot of the machine 

  • Bug #56697: CREAM logging must be improved when CREAM register operation fails 

  • Bug #56762: CREAM doesn't accept anymore jobs with NodeNumber and/or CpuNumber ! 

  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. 

  • Bug #57307: condor_submit.sh does not support the handling of "local" attributes 

  • Bug #57820: [yaim-cream-ce] CREAM-CE publishes GlueServiceDataValue incomplete 

  • Bug #58103: Cream database Query performance. 

  • Bug #58109: Wrong value for the "service version" property 

  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus 

  • Bug #59423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials 

  • Bug #58659: NullPointerException from getStatus 

  • Bug #58792: JobRegister fails, because cream_sandbox directory doesn't exist. 

  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized (TM) 

  • Bug #59005: Possible problem with hold/resumed jobs in BUpdaterLSF 

  • Bug #59329: Proxy symlinks left in the registry area until purged 

  • Bug #59686: Possible crash of BUpdarePBS due to wrong malloc 

  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation!" error 

  • Bug #60831: Error log message: "CREAM_JOB_SENSOR_HOST parameter not specified!" 

  • Bug #61322: CREAM jw doesn't set GLITE_WMS_RB_BROKERINFO 

  • Bug #61407: Set CE_ID in the cream jw 

  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong 

-- AlessioGianelle - 2010-02-05

 