PATCH 3179

Automatic tests

  • report #1:
    • CREAM UI version: 1.12.1; CREAM testsuite version: 1.0.7
    • Used event query for monitoring and BUpdater/BNotifier for status change detection
    • Batch system: LSF
    • All the tests completed successfully (see the attached testsuite reports)

  • report #2:
    • CREAM UI version: 1.11.1; CREAM testsuite version: 1.0.6
    • Used direct polling for monitoring and BLParser for status change detection
    • Batch system: TORQUE
    • All the tests completed successfully (see the attached testsuite reports)

Since the current version of the CREAM CE does not enable CEMonitor in a standard installation, all the tests that make use of the notification mechanism have not been taken into account.

Test submission through a WMS (i.e. ICE)

Description:
  • 2880 collections each of 25 jobs
  • One collection every 60 seconds
  • Four users
  • We use these CEs located at Padua:
    • 6 CEs SL5/64b with cream version 1.12 (2 lsf + 4 torque)
    • 4 CEs SL4 with cream version 1.11 (2 lsf + 2 torque)
    • 11 CEs SL4 with cream version 1.12 (5 lsf + 6 torque)
  • Use automatic-delegation
  • The job is a "sleep random(7200)"
  • Resubmission is enabled
  • Use proxy renewal service (myproxy.cern.ch)
  • Lease mechanism is not used
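
The size of the planned workload follows directly from the figures in the description. A minimal Python sketch of the arithmetic (the function name is illustrative and not part of any CREAM or WMS tool):

```python
def planned_workload(collections, jobs_per_collection, interval_seconds):
    """Return (total jobs, submission window in hours) for a steady submission plan."""
    total_jobs = collections * jobs_per_collection
    window_hours = collections * interval_seconds / 3600
    return total_jobs, window_hours


# Figures from the description: 2880 collections of 25 jobs, one every 60 seconds.
jobs, hours = planned_workload(2880, 25, 60)
print(f"{jobs} jobs planned over about {hours:.0f} hours of submission")
```

With these inputs the plan amounts to 72000 jobs submitted over 48 hours.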

Results

  • Collections correctly submitted: 2868 (71700 jobs)
    • DONE OK: 71663 (99.95%)
    • NOTDONE: 0 (0 %)
    • ABORTED: 0 (0 %)
    • CANCELLED: 37 (0.05 %) (stuck in TORQUE queues)
    • Resubmitted: 82 (0.11 %)

  • The 82 jobs were resubmitted due to an error like this one:
    Cannot move OSB (${globus_transfer_cmd} file:///tmp/CREAM954728532/env.err gsiftp://devel18.cnaf.infn.it:2811/var/glite/SandboxDir
    /P4/https_3a_2f_2fdevel15.cnaf.infn.it_3a9000_2fP4KWHXbaYyEAymru3kNugA/output/env.err): proxy expired; /opt/glite/bin/glite-lb-logevent: 
    edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent():  LB server (bkserver,lbproxy) store protocol error;; Logging library 
    ERROR:  LB server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent(): edg_wll_log_connect error GSSAPI Error;; edg_wll_gss_connect();; GSS 
    Error: GSS failure occured: GSS Major Status: General failure  (GSS Minor Status Error Chain: globus_gsi_gssapi: Error with gss context globus_gsi_gssapi: 
    Error with GSI credential globus_gsi_gssapi: Error with gss credential handle globus_credential: Error with credential: The proxy credential: /home/dteam002
    /home_cre34_954728532/cre34_954728532.proxy       with subject: /C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Luigi Zangrando/CN=proxy
    /CN=proxy/CN=proxy/CN=limited proxy       expired 42 minutes ago.  )) Cannot move OSB (${globus_transfer_cmd} file:///tmp/CREAM954728532/env.err 
    gsiftp://devel18.cnaf.infn.it:2811/var/glite/SandboxDir/P4/https_3a_2f_2fdevel15.cnaf.infn.it_3a9000_2fP4KWHXbaYyEAymru3kNugA/output/env.err): proxy 
    expired
    All the errors occurred on the old CEs (i.e. version 1.11).
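
The percentages reported in the results can be cross-checked against the raw counts. The following Python sketch only uses the numbers listed in this section; the helper name `share` is an illustrative choice:

```python
def share(count, total):
    """Percentage of `count` over `total`, rounded to two decimals."""
    return round(100.0 * count / total, 2)


submitted_collections = 2868
jobs_per_collection = 25
total_jobs = submitted_collections * jobs_per_collection

done_ok, cancelled, resubmitted = 71663, 37, 82

# Every submitted job ended either DONE OK or CANCELLED.
assert done_ok + cancelled == total_jobs

for label, count in (("DONE OK", done_ok),
                     ("CANCELLED", cancelled),
                     ("Resubmitted", resubmitted)):
    print(f"{label}: {count} ({share(count, total_jobs)} %)")
```

The computed shares agree with the reported 99.95 %, 0.05 % and 0.11 %.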

Figure: ice.png (WMS test)

Checked bugs

  • Bug #37430: BLParser should properly filter it's log output FIXED
    • Not too clear what the fix is supposed to be
    • According to the developer (M. Mezzadri) the command received by the old blparser from CREAM should be reported in the blparser log file without an extra new-line
    • Verified in the old blparser log file

  • Bug #45364: BLAH_JOB_CANCEL should report failure reason FIXED
    • submit a job to CREAM and then cancel it using the LRMS command (e.g. qdel). Before the blparser (and therefore CREAM) realizes that the job was cancelled, issue a glite-ce-job-cancel.
    • Issue a glite-ce-job-status -L 2. For the cancel command a failure (along with its reason) should be reported, such as:

   *** Command Name              = [JOB_CANCEL]
       Command Category          = [JOB_MANAGEMENT]
       Command Status            = [ERROR]
       Command Fail Reason       = [qdel: Unknown Job Id 45299.cream-38.pd.infn.it]
       Creation Time             = [Fri 26 Feb 2010 18:43:27] (1267206207)
       Start Scheduling Time     = [Fri 26 Feb 2010 18:43:27] (1267206207)
       Start Processing Time     = [Fri 26 Feb 2010 18:43:27] (1267206207)
       Execution Completed Time  = [Fri 26 Feb 2010 18:43:30] (1267206210)
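
When this check is repeated many times, the command block printed by glite-ce-job-status -L 2 can be inspected programmatically. The Python sketch below assumes the `Key = [value]` layout shown in the sample above; the function names are hypothetical and not part of the CREAM CLI:

```python
import re

# One "Key = [value]" pair per line; leading spaces and "***" markers are skipped.
FIELD = re.compile(r"^\W*(?P<key>[A-Za-z][A-Za-z ]*?)\s*=\s*\[(?P<value>.*?)\]")

SAMPLE = """
   *** Command Name              = [JOB_CANCEL]
       Command Category          = [JOB_MANAGEMENT]
       Command Status            = [ERROR]
       Command Fail Reason       = [qdel: Unknown Job Id 45299.cream-38.pd.infn.it]
"""


def parse_command_block(text):
    """Map each field name of a command block to its bracketed value."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


def cancel_failed_with_reason(fields):
    """True when a JOB_CANCEL command ended in ERROR and carries a fail reason."""
    return (fields.get("Command Name") == "JOB_CANCEL"
            and fields.get("Command Status") == "ERROR"
            and bool(fields.get("Command Fail Reason")))
```

Calling `cancel_failed_with_reason(parse_command_block(SAMPLE))` returns True for the output shown above.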

  • Bug #46419: CREAM sandbox area should be scratched when the CREAM DB is scratched FIXED
    • Submit at least one job to the CE and wait for its termination, so that the sandbox area is not empty
    • Increment the value of the parameter creamdb_database_version in the file /opt/glite/etc/glite-ce-cream/cream-config.xml.template
    • reconfigure the node with yaim and check whether the sandbox area is empty

  • Bug #47070: [ yaim-cream ] yaim cream module should support remote mysql setup HOPEFULLY FIXED

  • Bug #47254: Possible problems if the proxy used to talk with CREAM is shorter than 10 minutes FIXED
    • create a voms-proxy whose lifetime is shorter than 10 minutes
    • submit a simple job whose lifetime is shorter than the voms-proxy one and verify its correct termination

  • Bug #47804: Possible problems configuring blah in CREAM-CE for LSF FIXED
    • copy the file profile.lsf from the LSF configuration directory into a new destination, for example /tmp
    • define in the site-info.def the variable LSFPROFILE_DIR=/tmp and reconfigure with yaim
    • verify that in the file /opt/glite/etc/blah.conf the profile is loaded from the new path

  • Bug #48786: Load should be one of the parameter of DISABLE_SUBMISSION_POLICY in CREAM FIXED
    • specify a low load level in the file /opt/glite/bin/glite_cream_load_monitor
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R <ceID> --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected

  • Bug #49497: user proxies on CREAM do not get cleaned up FIXED
    • delegate a proxy whose lifetime is shorter than the parameter delegation_purge_rate of the CREAM configuration file
    • wait for the new proxy cleanup run (at least twice the delegation_purge_rate) and verify that the proxy file has been removed from the directory

  • Bug #50226: yaim-cream-ce should use config_secure_tomcat FIXED
    • install the CE node from scratch
    • verify the state of the trustmanager accessing the URL: https://ce-host:8443/ce-cream/services

  • Bug #50723: CREAM: check for the jobtype is not case insensitive FIXED
    • submit a job specifying the parameter "jobtype=Normal" in the JDL and verify the correct execution of the job

  • Bug #50875: CREAM: reason for cancelled jobs should be reported FIXED
    • submit and cancel a job using the CREAM CLI command and verify that the reason reports "Cancelled by user"
    • submit and cancel a job using the LRMS command (e.g. qdel) and verify that the reason reports "Cancelled by CE admin"

  • Bug #50876: CREAM reports that the proxy expired even when the problem is in detecting the lifetime of the proxy FIXED
    • force a failure for the command grid-proxy-init in the jobwrapper, for example delegating a proxy on the CE, manually renaming the corresponding delegated proxy in the sandbox area and then submitting a job using the given delegation ID.
    • verify that the failure reason reported by the job status contains the message: Problem to detect the lifetime of the proxy

  • Bug #51046: CREAM: DelegProxyInfo info sometimes is wrong FIXED
    • submit a job, wait for its termination and verify the correct lifetime of the proxy in the glite-ce-job-status output

  • Bug #51118: config_cream_glexec doesn't set glexec permissions right FIXED
    • install a CE node from scratch and verify the permissions for /opt/glite/sbin/glexec (6555) and /opt/glite/etc/glexec.conf (640)
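
The permission check can be scripted. This is a minimal Python sketch, assuming it runs on the CE node with enough privileges to stat both files; the expected modes are the ones quoted above and the function names are illustrative:

```python
import os
import stat

# Expected permission bits, as stated in the verification step.
EXPECTED_MODES = {
    "/opt/glite/sbin/glexec": 0o6555,
    "/opt/glite/etc/glexec.conf": 0o640,
}


def permission_bits(path):
    """Return the permission bits of `path`, including setuid/setgid/sticky."""
    return stat.S_IMODE(os.stat(path).st_mode)


def check_modes(expected=EXPECTED_MODES):
    """Return {path: (expected, actual)} for every mismatch.

    A file that cannot be inspected is reported with actual set to None.
    """
    mismatches = {}
    for path, wanted in expected.items():
        try:
            actual = permission_bits(path)
        except OSError:
            actual = None
        if actual != wanted:
            mismatches[path] = (wanted, actual)
    return mismatches
```

An empty dictionary returned by `check_modes()` means that both files carry the expected permissions.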

  • Bug #51124: catalina.out is clogged with grid-proxy-init warnings FIXED
    • submit a job and check the catalina.out file

  • Bug #51128: lcas-suexec.db on CREAM CE should be named lcas-glexec.db for consistency FIXED
    • install a CE node from scratch
    • verify the existence of the files: /opt/glite/etc/lcas/lcas-glexec.db and /opt/glite/etc/lcmaps/lcmaps-glexec.db

  • Bug #51249: [ yaim-cream-ce ] refactor config_cream_db FIXED
    • Install the node from scratch and verify all the basic operations of the CREAM service

  • Bug #51310: Wrong event timestamp FIXED
    • run the consumer server (glite-ce-monitor-consumer) on the client machine
    • create a subscription for the topic CREAM_JOBS on the CE specifying the URL of the consumer server above
    • submit a job and verify the validity of the field TIMESTAMP of any event

  • Bug #51311: Wrong event timestamp generated by the CREAM Job Sensor FIXED

  • Bug #51313: CEMon must not notify the expired events CANNOT REPRODUCE

  • Bug #51705: glexec-wrapper.sh should be removed from CREAM RPM FIXED
    • check the content of glite-ce-cream rpm

  • Bug #51706: yaim-cream-ce: remove "lcg" prefix from JOB_MANAGER FIXED
    • change the value of JOB_MANAGER in the site-info.def, e.g. from lsf to lcglsf
    • configure the node with YAIM and verify that in the resource BDII the string lsf (and not lcglsf) appears in the GlueCEUniqueID values
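
The behaviour expected from the fix can be summarised as a small mapping. The Python sketch below only illustrates the expected result; it is not the yaim implementation, and the function name is hypothetical:

```python
def strip_lcg_prefix(job_manager):
    """Return the batch system name without a leading "lcg" (e.g. "lcglsf" -> "lsf")."""
    prefix = "lcg"
    if job_manager.startswith(prefix) and len(job_manager) > len(prefix):
        return job_manager[len(prefix):]
    return job_manager


for value in ("lcglsf", "lsf", "lcgpbs"):
    print(value, "->", strip_lcg_prefix(value))
```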

  • Bug #51892: Exception when using java.text.DateFormat.parse FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R <ceID> --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
    • verify the log of the CREAM service

  • Bug #51928: BLAH crashes if the cerequirements classad attribute is malformed FIXED
    • submit a job specifying a malformed cerequirements parameter
    • verify that the job is executed and the parameter is ignored

  • Bug #51978: CREAM can be slow to start FIXED
    • submit a big bunch of long-lived jobs, for example using cream-test-monitored-submit -r 30 -n 2000 -m 2000 -C 100 --sotimeout 60 -j long.jdl -R <ce_id> where long.jdl is "[executable="/bin/sleep";arguments="3600";]"
    • when all the jobs have been submitted restart the service and verify the startup time.
    • verify in the CREAM and BLAHP logs that the jobs are checked one by one at startup, instead of polling all jobs from a given timestamp

  • Bug #51993: Proxy renewal not very efficient for multiple jobs having the same delegationid FIXED
    • stress the renewal mechanism with a single short delegated proxy, for example with the following test: cream-test-monitored-submit -r 30 -n 2000 -m 2000 -C 50 --sotimeout 60 -j long.jdl -R <ce_id> --vo <vo_name> --valid <00:20>

  • Bug #52020: [ yaim-cream-ce ] Support use of file (besides syslog) for glexec logging FIXED
    • install from scratch, submit a job and verify that the operation has been logged into the syslog.
    • define the following variables in the site-info.def:
      GLEXEC_CREAM_LOG_DESTINATION=file
      GLEXEC_CREAM_LOG_DIR=/tmp/tests
      
      run yaim again, submit a new job and verify that the log is written into the specified directory.

  • Bug #52050: misleading error message "The problem seems to be related to glexec" FIXED
    • The CREAM service does not make use of glexec anymore, and therefore this error message can't appear anymore

  • Bug #52051: CEMon must remove all expired subscriptions on start-up FIXED
    • create a subscription for the topic CREAM_JOBS on the CE with a short lifetime
    • shutdown the service and wait for the expiration of the subscription
    • restart the service and verify that the subscription does not exist anymore in the directory /opt/glite/var/cemonitor/subscription

  • Bug #52052: Sometimes the getInfo() operation does not report the right list of topics FIXED
    • enable or disable the CE sensor removing or adding the corresponding tag in the file /opt/glite/etc/glite-ce-monitor/cemonitor-config.xml
    • wait for cemonitor to reload the configuration (usually 10 minutes)
    • verify the availability of the topic using the command glite-ce-monitor-gettopics

  • Bug #52268: BLAH leaves files in /tmp when CErequirements is set FIXED
    • submit a job specifying a simple CE requirements (e.g. cerequirements="other.GlueHostMainMemoryRAMSize > 2000")
    • verify that, after the execution of the job, in the tmp directory no files ce-req-file-* are left
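
The leftover-file check can be automated with a glob. A minimal Python sketch, assuming the files are created directly under /tmp with the ce-req-file- prefix mentioned above (the function name is illustrative):

```python
import glob
import os


def leftover_ce_req_files(directory="/tmp", pattern="ce-req-file-*"):
    """Return the sorted list of files in `directory` whose name matches `pattern`."""
    return sorted(glob.glob(os.path.join(directory, pattern)))
```

After the job has finished, `leftover_ce_req_files()` should return an empty list on a fixed CE.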

  • Bug #52651: CREAM file descriptor overuse FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R <ceID> --sotimeout 60 --vo dteam --valid 02:00 and verify that with high load the submissions are rejected
    • search the CREAM log for "too many open files"
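
Searching the log can be done with a few lines of Python. In this sketch the log path is supplied by the caller, since the location of the CREAM log is not given here; the match is case-insensitive because the exact capitalisation of the message is an assumption:

```python
def lines_matching(log_path, needle="too many open files"):
    """Return the log lines that contain `needle`, ignoring case."""
    needle = needle.lower()
    with open(log_path, errors="replace") as log:
        return [line.rstrip("\n") for line in log if needle in line.lower()]
```

An empty list means that the message was not found in the given log file.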

  • Bug #52719: Blah doesn't set the 'executable' flag if a local jobwrapper is found FIXED
    • Submitted a job to a CREAM CE
    • Checked the BLAH wrapper: the chmod u+x of the CREAM JobWrapper is done in all cases (even if the job is going to be run on the WN via a local jobwrapper)

  • Bug #52942: Missing description for ISB/OSB error in jobwrapper FIXED
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
    • verify that the output of the glite-ce-job-status contains the full description of the failure

  • Bug #53459: [CREAM] Provide method to improve the detection of job status changes by ICE FIXED
    • run the "monitored" part of the testsuite, the latest version of the testsuite makes use of the "event query" mechanism for keeping track of the job status.

  • Bug #53499: CREAM job wrapper template should be put outside the jar FIXED
    • check whether the file /opt/glite/share/webapps/ce-cream.war contains the file WEB-INF/jobwrapper.tpl
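
Since a .war file is a ZIP archive, its content can be listed without unpacking it. A minimal Python sketch using the standard zipfile module (the function name is illustrative):

```python
import zipfile


def archive_contains(archive_path, member):
    """True if the ZIP-based archive (e.g. a .war file) has an entry named `member`."""
    with zipfile.ZipFile(archive_path) as archive:
        return member in archive.namelist()
```

For this check the call would be `archive_contains("/opt/glite/share/webapps/ce-cream.war", "WEB-INF/jobwrapper.tpl")`.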

  • Bug #54812: lsf_submit.sh job requirement FIXED
    • Created (and chmoded +x) the file /opt/glite/bin/lsf_local_submit_attributes.sh on the CREAM CE with the following content:

                     #!/bin/sh
                     echo "BSUB -n 2"

    • Submitted a job to that CE, without specifying in the JDL the cerequirements attribute
    • Checked (via bjobs -l) that the -n 2 directive was used (which means that the lsf_local_submit_attributes.sh was run)

  • Bug #54900: [ glite-yaim-cream-ce ] config_cream_tomcat_user should not add tomcat to VO FIXED
    • check the membership of any VO group

  • Bug #54949: Some job can remain in running state when BLParser is restarted for both lsf and pbs HOPEFULLY FIXED
    • Not easy to reproduce
    • Submitted several jobs (logged in different batch system log files) to a CREAM CE configured with the old blparser
    • Restarted CREAM
    • Didn't notice problems in getting the status of these jobs

  • Bug #55078: Possible final state not considered in BLParserPBS and BUpdaterPBS CANNOT REPRODUCE
    • To test the fix it would be necessary to have a scenario for which in the Torque log file for a certain job the event "Job Run..." is followed by the event "dequeuing from"
    • Not able to reproduce such scenario

  • Bug #55420: Allow admin to purge CREAM jobs in a non terminal status FIXED
    • temporarily disconnect any WN from the CE, e.g. by shutting down the mom server in a TORQUE installation
    • submit a job
    • on the CE with administrator privileges run the command: /opt/glite/sbin/JobDBAdminPurger.sh -u -p -s 2 as described in the wiki page
    • verify with glite-ce-job-list that the job has been purged from the database
    • verify that the sandbox directory of that job has been removed from /opt/glite/var/cream_sandbox
    • manually remove the job from the batch system and reconnect all the WNs

  • Bug #55438: BUpdater problems in updating job state with AssignFinalState for all batch system FIXED
    • Submitted 3 jobs lasting 2 hours to a CREAM CE with only 2 job slots.
    • For all the jobs the right events were logged by the bnotifier (i.e. it didn't log status=4 with failurereason=999)

  • Bug #55565: BLAH configuration attribute blah_disable_wn_proxy_renewal fails to disable proxy renewal. FIXED
    • Verified issuing a BLAH_JOB_REFRESH proxy for a running job
    • Moreover the BLAH proxy renewal operation is not used anymore (the proxy on the CE is renewed by CREAM and no more by BLAH)

  • Bug #56075: Job failure reasons missing in the CREAM log file FIXED
    • submit a job with an unreachable host in the inputsandbox or in the outputsandboxbasedesturi parameter
    • verify that the following message appears in the log file: failureReason=Cannot move ISB (): error: globus_xio: Unable to connect to xxxx:2811 globus_xio: globus_libc_getaddrinfo failed.globus_common: Name or service not known

  • Bug #56339: [blah] "service glite-ce-blparser restart" does not always work FIXED
    • try the command /opt/glite/etc/init.d/glite-ce-blparser restart and verify the correct behaviour of the script

  • Bug #56367: CREAM RPM depends on C libs FIXED
    • check if the package of glite-ce-cream contains any elf executable
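
ELF binaries can be recognised by their four-byte magic number. The Python sketch below assumes that the payload of the glite-ce-cream package has already been unpacked into a directory; the function names are illustrative:

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def is_elf(path):
    """True if the file starts with the ELF magic number."""
    with open(path, "rb") as handle:
        return handle.read(4) == ELF_MAGIC


def elf_files(root):
    """Return the sorted paths of all regular ELF files found below `root`."""
    found = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            if os.path.isfile(path) and not os.path.islink(path) and is_elf(path):
                found.append(path)
    return sorted(found)
```

An empty list from `elf_files()` on the unpacked package means that no ELF executable or library is shipped.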

  • Bug #56518: BLAH blparser doesn't start after boot of the machine FIXED
    • install the CE node from scratch specifying the parameter BLPARSER_WITH_UPDATER_NOTIFIER=false in the yaim configuration for creamCE
    • reboot the machine and verify that the blparser_master is running

  • Bug #56697: CREAM logging must be improved when CREAM register operation fails FIXED
    • force the service to fail a register operation, e.g. by temporarily renaming the sandbox directory
    • verify that the log reports at least the JobID and the reason of the failure

  • Bug #57210: BLAH condor_submit script doesn't recognize certain options. CANNOT REPRODUCE
    • Not possible to test the fix since we don't have CREAM based CEs with Condor as batch system

  • Bug #57307: condor_submit.sh does not support the handling of "local" attributes CANNOT REPRODUCE
    • Not possible to test the fix since we don't have CREAM based CEs with Condor as batch system

  • Bug #57820: [yaim-cream-ce] CREAM-CE publishes GlueServiceDataValue incomplete FIXED
    • run the infoprovider: /opt/glite/etc/gip/provider/glite-info-provider-service-cream-wrapper | grep GlueServiceDataValue
    • verify that 3 different values are returned for the GlueServiceDataValue: the version, the DN and the host name of the CE
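
The count of distinct values can be checked on the infoprovider output. The Python sketch below assumes LDIF-style `attribute: value` lines; the function names are hypothetical and the input text is expected to be the captured output of the wrapper:

```python
def attribute_values(ldif_text, attribute="GlueServiceDataValue"):
    """Collect the value of every `attribute: value` line (attribute name is case-insensitive)."""
    prefix = attribute.lower() + ":"
    return [
        line[len(prefix):].strip()
        for line in ldif_text.splitlines()
        if line.lower().startswith(prefix)
    ]


def has_distinct_values(ldif_text, expected_count=3):
    """True if exactly `expected_count` values are published and they all differ."""
    values = attribute_values(ldif_text)
    return len(values) == expected_count and len(set(values)) == expected_count
```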

  • Bug #58103: Cream database Query performance FIXED
    • Internal improvement
    • run a set of stress-tests and verify the performance

  • Bug #58109: Wrong value for the "service version" property FIXED
    • verify the property using the command glite-ce-service-info

  • Bug #58119: CREAM CE: publish Production instead of Special as default value for GlueCEStateStatus FIXED
    • verify with /opt/glite/libexec/glite-info-wrapper | grep -i gluecestatestatus

  • Bug #58423: RFE: support for ISB/OSB transfers from/to gridftp servers running using credentials FIXED
    • tested using java-based UI
    • tested using the following JDL:
      [
      executable="/bin/ls";
      inputsandbox={"gsiftp://lxsgaravatto.pd.infn.it:6787/etc/fstab?DN=/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto"};
      stdoutput="out-gsi.out";
      stderror="err-gsi.err";
      outputsandbox={"out-gsi.out", "err-gsi.err"}
      outputsandboxbasedesturi="gsiftp://lxsgaravatto.pd.infn.it:6787/tmp?DN=/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto";
      ]
      

  • Bug #58659: NullPointerException from getStatus FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R <ceID> --sotimeout 60 --vo dteam --valid 02:00 and verify that the log of the testsuite does not report any NullPointerException

  • Bug #58792: JobRegister fails, because cream_sandbox directory doesn't exist FIXED
    • temporarily rename the directory /opt/glite/var/cream_sandbox without turning off the service
    • submit a job and verify that the failure reports "cannot create the job's working directory!"

  • Bug #58941: [yaim-cream-ce] lcmaps confs for glexec and gridftp are not fully synchronized FIXED
    • verify that the file /opt/glite/etc/lcmaps/lcmaps.db is compliant with the one attached to the bug.

  • Bug #59005: Possible problem with hold/resumed jobs in BUpdaterLSF FIXED
    • Verified as reported here

  • Bug #59329: Proxy symlinks left in the registry area until purged FIXED
    • submit a job and verify the existence of the related symlink in the directory /opt/glite/var/blah/user_blah_job_registry.bjr/registry.proxydir
    • when the job terminates verify that the symlink has been removed by blah.

  • Bug #59686: Possible crash of BUpdarePBS due to wrong malloc FIXED
    • Define the parameter pbs_spoolpath in the file /opt/glite/etc/blah.config
    • run the BUpdaterPBS daemon and verify its liveness

  • Bug #59862: [ yaim-cream-ce ] broken -v functionality FIXED
    • remove a mandatory variable from the site-info.def, for example JOB_MANAGER
    • run yaim configurator with option -v and verify that all the yaim functions are called.

  • Bug #59962: Sometimes the CREAM initialization fails with "UserId = ADMINISTRATOR is not enable for that operation" CANNOT REPRODUCE

  • Bug #60831: Error log message: "CREAM_JOB_SENSOR_HOST parameter not specified" FIXED
    • verify that the parameter "CREAM_JOB_SENSOR_HOST" is not defined in the file /opt/glite/etc/glite-ce-cream/cream-config.xml
    • submit several jobs
    • verify that the log of the CREAM service does not report the error above

  • Bug #61322: CREAM jw doesn't set GLITE_WMS_RB_BROKERINFO FIXED
    • submit a job and verify that the jobwrapper script, contained into the sandbox area for that job, defines correctly the __brokerinfo variable

  • Bug #61401: config_cream_blah and config_cream_clean don't take into account GLITE_LOCATION_LOG FIXED
    • verify that the log files of blahp are saved into the directory specified by GLITE_LOCATION_LOG

  • Bug #61402: [yaim-cream-ce] does not use GLITE_LOCATION_VAR/LOG is some cases FIXED
    • change the value of GLITE_LOCATION_VAR and GLITE_LOCATION_LOG and run the yaim configurator
    • verify that the new installation has been deployed into the new directory and the log is written in the new location

  • Bug #61407: Set CE_ID in the cream jw FIXED
    • submit a job and verify that the jobwrapper script, contained into the sandbox area for that job, defines correctly the CE_ID variable

  • Bug #61493: [ yaim-cream-ce ] glexec_get_account policy order is wrong FIXED
    • As reported in the bug, this was fixed by the fix for bug #58941

  • Bug #61604: yaim-cream-ce should not install config_gip_software_plugin FIXED
    • verify that the glite-yaim-cream-ce package does not contain the file config_gip_software_plugin but it contains config_cream_gip_software_plugin instead

  • Bug #61730: CREAM jw: GLITE_WMS_LOG_DESTINATION should always be set with the FQDN FIXED
    • submit a job and verify that the jobwrapper script, contained into the sandbox area for that job, defines a FQDN in the __ce_hostname variable

  • Bug #61761: CEMon must guarantee the notification rate FIXED
    • enable the "CE Sensor" plugin
    • create a subscription for the topic published by the sensor above with a running consumer: glite-ce-monitor-subscribe --cert <user_proxy> --key <user_proxy> --topic CE_MONITOR --dialects ISM_CLASSAD_GLUE_1.2 --consumer-url <consumer_url> --rate 10 --duration 600 <cemonitor_url>
    • create one or more subscriptions to a non-existing consumer URL or to a fake blocking one (e.g. using nc -l -p <consumer port>) specifying the same rate as above
    • verify that the notification rate for the first consumer is correct
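
One way to judge whether the rate is "correct" is to compare the gaps between consecutive notifications with the requested rate. The Python sketch below assumes that --rate is expressed in seconds and that the arrival times of the notifications have been recorded on the consumer side; the function name and the tolerance are illustrative choices:

```python
def rate_is_respected(arrival_times, rate_seconds, tolerance_seconds=1.0):
    """True if every gap between consecutive notifications is within tolerance of the rate.

    `arrival_times` is a list of timestamps in seconds, in arrival order.
    Fewer than two notifications give no gap to measure, so the result is False.
    """
    gaps = [later - earlier
            for earlier, later in zip(arrival_times, arrival_times[1:])]
    return bool(gaps) and all(
        abs(gap - rate_seconds) <= tolerance_seconds for gap in gaps
    )
```

For a subscription created with --rate 10, a delayed notification caused by a blocking consumer would show up as a gap well above 10 seconds and make the function return False.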

  • Bug #61790: Problems in CREAM CE when there are "strange" characters in the subject certificate FIXED
    • Verified submitting a job to a Torque CREAM CE with a proxy with subject: /DC=gov/DC=fnal/O=Fermilab/OU=Robots/CN=lcgcaf/CN=cdf/CN=Donatella Lucchesi/CN=UID:lucchesi
    • With the same proxy there were problems before (see https://gus.fzk.de/ws/ticket_info.php?ticket=54767)

  • Bug #62070: Possible problem with notification time in BNotifier HOPEFULLY FIXED
    • Not possible to reproduce it according to the developer (M. Mezzadri)

  • Bug #62207: [ yaim-cream ] Enable Glue 2.0 publishing FIXED

  • Bug #62436: Possible problem with updater if job remain queued too long FIXED
    • Fixed as reported here: 3 jobs lasting 2 hours were submitted to a CREAM CE with only 2 job slots. For the third one the BNotifier logged the right events (i.e. it didn't log status=4 with failurereason=999)

  • Bug #62565: yaim-cream-ce requires BLPARSER_HOST even if the new blparser has to be configured FIXED
    • Install the CE node from scratch removing the BLPARSER_HOST definition from the site-info.def and defining BLPARSER_WITH_UPDATER_NOTIFIER=true.
    • verify that the yaim log does not report any error concerning the variable above and that the BNotifier and the BUpdater run correctly.

  • Bug #62776: Yaim config for CREAM CE erroneously requires tomcat in glexec group FIXED
    • Install the CE node from scratch and verify the following permissions:
      -r--r----- 1 root glexec   535 Mar  1 10:47 /opt/glite/etc/glexec.conf
      -r-sr-sr-x 1 root glexec 79792 Jun 11  2009 /opt/glite/sbin/glexec
      
    • verify that the glexec.conf file contains the property: "user_white_list = tomcat"
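
The symbolic modes in the listing above can be translated to and from octal with the standard stat module. A minimal Python sketch (the function name is illustrative):

```python
import stat


def mode_string(octal_mode):
    """Render permission bits of a regular file the way `ls -l` prints them."""
    return stat.filemode(stat.S_IFREG | octal_mode)


# The two listing entries above correspond to these octal modes.
print(mode_string(0o440))   # glexec.conf
print(mode_string(0o6555))  # glexec (setuid and setgid)
```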

  • Bug #62893: Possible proxy renewal problem in the CREAM jw FIXED
    • try to overload the node with the testsuite: cream-test-monitored-submit -r 30 -n 20000 -m 2000 -C 50 -l log4py.conf -j test.jdl -R <ceID> --sotimeout 60 --vo dteam --valid 00:30
    • verify that no proxy related issues occur

  • Bug #63398: CREAM jw: removal of token should be retried in case of failure FIXED
    • submit the following jdl:
      [
      environment= {"__token_file=gsiftp://host/path"};
      executable="/bin/sleep";
      arguments="30";
      ]
      specifying an existing host and path first and verify that the job terminates successfully; the owner of the token must be the mapped user.
    • submit the jdl above but specifying a fake host and/or path and verify that the job status reports 3 different failed attempts for taking the token:
      "/opt/edg/libexec/edg-gridftp-base-rm: error globus_ftp_client: the server responded with an error 500 500-Command failed : System error in unlink: No such file or directory 500-A system call failed: No such file or directory 500 End"

  • Bug #63874: CREAM sandbox dir creation program should not attempt creation of parent directories. FIXED
    • temporarily rename the directory /opt/glite/var/cream_sandbox/<voname>
    • submit a job using a voms-proxy issued for the given VO and verify that the job fails and no directory /opt/glite/var/cream_sandbox/<voname> has been created.

Clean installation

  • Installation steps:
    wget http://etics-repository.cern.ch:8080/repository/pm/registered/repomd/name/patch_3179/etics-registered-build-by-name.repo -O /etc/yum.repos.d/glite-CREAM.repo
    yum install xml-commons-apis
    yum install glite-CREAM
    wget http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/glite-TORQUE_utils.repo -O /etc/yum.repos.d/glite-TORQUE_utils.repo
    wget http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/glite-TORQUE_server.repo -O /etc/yum.repos.d/glite-TORQUE_server.repo
    yum install glite-TORQUE_utils glite-TORQUE_server
    /opt/glite/yaim/bin/yaim -c -s site-info.def -n creamCE -n TORQUE_server -n TORQUE_utils
    
  • The yum log for a clean installation is attached as yum_installation_log.txt.gz
  • The yaim log for a clean installation (TORQUE is used) is attached as yaim_installation_log.txt.gz

Upgrade from production

  • Upgrade steps:
    wget http://etics-repository.cern.ch:8080/repository/pm/registered/repomd/name/patch_3179/etics-registered-build-by-name.repo -O /etc/yum.repos.d/glite-CREAM.repo
    yum update
    /opt/glite/yaim/bin/yaim -c -s site-info.def -n creamCE -n TORQUE_server -n TORQUE_utils
    
  • The yum log for an upgrade is attached as yum_update_log.txt.gz
  • The yaim log for an upgrade (TORQUE is used) is attached as yaim_update_log.txt.gz

-- AlessioGianelle - 2010-02-05

Topic attachments

  Attachment                    Size      Date              Who              Comment
  ice.png                       4.5 K     2010-03-08 16:15  AlessioGianelle  WMS test
  reports_patch3179_01.tar.gz   257.3 K   2010-03-03 12:20  PaoloAndreetto   Testsuite reports for patch 3179
  reports_patch3179_02.tar.gz   182.0 K   2010-03-04 11:12  PaoloAndreetto   Testsuite reports for patch 3179
  yaim_installation_log.txt.gz  19.7 K    2010-03-10 13:37  PaoloAndreetto   Log from yaim installation
  yaim_update_log.txt.gz        16.4 K    2010-03-10 13:38  PaoloAndreetto   Log from yaim update
  yum_installation_log.txt.gz   10.1 K    2010-03-10 13:36  PaoloAndreetto   Log from yum installation
  yum_update_log.txt.gz         2.4 K     2010-03-10 13:37  PaoloAndreetto   Log from yum update

Topic revision: r50 - 2010-03-10 - PaoloAndreetto
 