
Certification report patch 3621

Author(s): Elisabetta Molinari & Alessio Gianelle

Outcome: in certification...

Clean installation

Upgrade from production

Test Report

List Match

  • without data: Yes / Done
  • with data: Yes / Done

Submission/GetOutput

  • Normal jobs through:
    • ICE work: Yes / Done
    • JC work: Yes / Done

  • Dag jobs through:
    • JC work: Yes / Done

  • Collection jobs through:
    • ICE work: Yes / Done
    • JC work: Yes / Done
    • job-output for collections also works, even though only the parent node is set to 'Cleared'
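
For readers unfamiliar with collections, a minimal collection JDL might look like the sketch below. It is not the JDL used in this certification: the executables and file names are placeholders chosen for illustration, and only the general `Type`/`Nodes` layout is assumed from the gLite JDL conventions.

```
[
  Type = "collection";
  // Each entry in Nodes is an independent job; the WMS registers a parent
  // node that groups them under one job identifier.
  Nodes = {
    [
      Executable = "/bin/hostname";
      StdOutput = "node1.out";
      StdError = "node1.err";
      OutputSandbox = {"node1.out","node1.err"};
    ],
    [
      Executable = "/bin/date";
      StdOutput = "node2.out";
      StdError = "node2.err";
      OutputSandbox = {"node2.out","node2.err"};
    ]
  };
]
```

Retrieving the output with the parent identifier fetches the sandboxes of all nodes, which is the case the note above refers to.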

  • Parametric jobs through:
    • ICE work: Yes / Done
    • JC work: Yes / Done
      • tested with the following
         [
           JobType = "parametric";
           Executable = "/usr/bin/env";
           Environment = {"MYPATH_PARAM_=$PATH:/bin:/usr/bin:$HOME"};
           StdOutput = "echo_PARAM_.out";
           StdError = "echo_PARAM_.err";
           OutputSandbox = {"echo_PARAM_.out","echo_PARAM_.err"};
           Parameters = 5;
           usertags = [ jdl = "parametric" ];
         ]
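
To illustrate what the `_PARAM_` placeholder does, the sketch below shows roughly what one generated node would look like. It assumes the default `ParameterStart` of 0 and `ParameterStep` of 1, so that `Parameters = 5` yields the values 0 to 4, and it assumes plain textual substitution. The JDL the WMS actually generates for each node carries additional attributes and may differ in detail.

```
// Sketch of the node generated for _PARAM_ = 0 (values 1 to 4 are analogous).
[
  Executable = "/usr/bin/env";
  Environment = {"MYPATH0=$PATH:/bin:/usr/bin:$HOME"};
  StdOutput = "echo0.out";
  StdError = "echo0.err";
  OutputSandbox = {"echo0.out","echo0.err"};
]
```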

  • Bulk jobs sent through both ICE and JC, with RetryCount = 0:
    • Submit a bulk of 3 jobs -> success 100%: Yes / Done, both to ICE and JC
    • Submit a bulk of 50 jobs -> success 100%: Yes / Done, both to ICE and JC
    • Submit a bulk of 100 jobs -> success 100%: Yes / Done, both to ICE and JC
    • Submit a bulk of 500 jobs -> success 99.9%: Yes / Done, both to ICE and JC
    • Submit a bulk of 1000 jobs -> success 99.9%: Yes / Done, both to ICE and JC

  • Perusal jobs through:
    • JC work: Yes / Done
    • ICE work: Yes / Done
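
A perusal-enabled job is typically requested through the `PerusalFileEnable` and `PerusalTimeInterval` JDL attributes. The following is only a sketch of such a JDL, not the one used in these tests: `loop.sh` is a hypothetical long-running script and the interval value is arbitrary.

```
[
  // loop.sh is a hypothetical script that writes to stdout for a few minutes.
  Executable = "loop.sh";
  InputSandbox = {"loop.sh"};
  StdOutput = "perusal.out";
  StdError = "perusal.err";
  OutputSandbox = {"perusal.out","perusal.err"};
  // Enable file perusal and ask for the files to be uploaded every 120 seconds.
  PerusalFileEnable = true;
  PerusalTimeInterval = 120;
]
```

Once the job is running, the files to inspect are selected and fetched with the `glite-wms-job-perusal` command.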

  • MPICH jobs: No

Cancel

  • Normal jobs
    • ICE: Yes / Done
    • JC: Yes / Done
  • Dag: Yes / Done
    • Note that child nodes in status 'submitted' do not get cancelled
  • Collection
    • ICE: Yes / Done
    • JC: Yes / Done
  • Node of a collection: Yes / Done
Note: collections stay in status 'waiting' when all the nodes are Done (Success) except for one that is 'Cancelled'

Others

  • BrokerInfo
    • ICE creation: Yes / Done
    • JC creation: Yes / Done

  • Resubmission
    • Shallow: Yes / Done
    • Deep: Yes / Done
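
The two resubmission modes are controlled by separate JDL attributes: `RetryCount` for deep resubmission and `ShallowRetryCount` for shallow resubmission. The fragment below is an illustrative sketch with arbitrary values and a placeholder executable. It is not the JDL used for the tests above.

```
[
  Executable = "/bin/hostname";
  StdOutput = "retry.out";
  StdError = "retry.err";
  OutputSandbox = {"retry.out","retry.err"};
  // Deep resubmission: retry after the job has started running on a worker node.
  RetryCount = 3;
  // Shallow resubmission: retry when the job fails before it starts running.
  ShallowRetryCount = 3;
]
```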

  • Job Recovery
    • Tested with a few collections, restarting the WM while some node jobs were still in 'submitted' or 'waiting' status: Yes / Done

  • Prologue and Epilogue jobs
    • ICE: Yes / Done
    • JC: Yes / Done
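
For reference, a job with prologue and epilogue steps can be described with the `Prologue` and `Epilogue` JDL attributes, as in the sketch below. The script names `prologue.sh` and `epilogue.sh` and the argument string are hypothetical and are not taken from this test campaign.

```
[
  Executable = "/bin/hostname";
  StdOutput = "job.out";
  StdError = "job.err";
  // prologue.sh and epilogue.sh are hypothetical scripts shipped with the job.
  InputSandbox = {"prologue.sh","epilogue.sh"};
  OutputSandbox = {"job.out","job.err"};
  // Runs before the executable.
  Prologue = "prologue.sh";
  PrologueArguments = "setup";
  // Runs after the executable.
  Epilogue = "epilogue.sh";
]
```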



Check bugs:

  • Bug #42288: Problem in forwarding cerequirements to a CREAM CE

  • Bug #48910: Failure starting LM if its output jobdir doesn't exist; unprotected chown in WM/LM/JC startup scripts: FIXED
    • stopped gLite services
    • deleted the jobdir under '/var/glite/workload_manager'
    • restarted the LM service and checked that the jobdir gets recreated

  • Bug #52934: [ICE] Delegation in ICE doesn't refer to the myproxy server

  • Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved

  • Bug #55103: [ICE] ICE port 7010 not cleaned up properly

  • Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"

  • Bug #56636: [ICE] statistics counters for monitoring

  • Bug #57295: [ICE] queryDb tool may create empty DB as root

  • Bug #57579: [ICE] Occasionally the ICE's start/stop script doesn't kill the ICE process

  • Bug #57596: [ICE] non resubmission if job failed for proxy expiration

  • Bug #58387: [ICE] should log a job aborted when it cannot resubmit the job for missing user proxy

  • Bug #58977: [ICE] Wrong database colum name in ICE SQL query

  • Bug #59240: [ICE] abort reasons not always printed in its logfile

  • Bug #59399: [ICE] doesn't correctly handle request in jobdir/old when it is restarted

  • Bug #59453: [ICE] polling needs to be improved

  • Bug #60688: [ICE] does not respect LB server/proxy selection through the LBproxy attribute

  • Bug #61312: [ICE] Error in handling user dn in ICE's poller

  • Bug #61405: [ICE] Missing proxy validity evaluation in ICE

  • Bug #61413: [ICE] should not call EventQuery for a userDN if he/she doesn't have active jobs

  • Bug #61748: [ICE] EventQuery/Polling must be done also to blacklisted CE

  • Bug #63989: [ICE] doesn't handle exception raised by jobDir::new_entries()

-- AlessioGianelle - 2010-02-05

Topic revision: r23 - 2010-03-16 - AlessioGianelle
 