Certification report patch 3621
Author(s): Elisabetta Molinari & Alessio Gianelle
Outcome:
in certification...
Clean installation
Upgrade from production
Test Report
List Match
- without data:
- with data:
Submission/GetOutput
- Normal jobs through:
  - ICE work:
  - JC work:
- Dag jobs through:
  - JC work: OK
- Collection jobs through:
  - ICE work:
  - JC work:
  - job-output also works for collections, even though only the parent node is set to 'Cleared'
- Parametric jobs through:
  - ICE work:
  - JC work:
- Bulk jobs, sent through both ICE and JC with RetryCount = 0:
  - Submit a bulk of 3 jobs -> success 100%, both to ICE and JC
  - Submit a bulk of 50 jobs -> success 100%, both to ICE and JC
  - Submit a bulk of 100 jobs -> success 100%, both to ICE and JC
  - Submit a bulk of 500 jobs -> success 99.9%, both to ICE and JC
  - Submit a bulk of 1000 jobs -> success 99.9%, both to ICE and JC
- Perusal jobs through:
  - JC work:
  - ICE work:
- MPICH jobs:
Cancel
- Normal jobs
  - ICE:
  - JC:
- Dag:
  - Note that child nodes in status 'submitted' don't get cancelled
- Collection
  - ICE:
  - JC:
- Node of a collection:
  - Note: collections stay in status 'waiting' when all the nodes are Done (Success) except for one that is 'Cancelled'
Others
- BrokerInfo
  - ICE creation:
  - JC creation:
- Resubmission
  - Shallow:
  - Deep:
- Job Recovery
  - Tested with a few collections, restarting the WM while some node jobs were still in 'submitted' or 'waiting' status
- Prologue and Epilogue jobs
  - ICE:
  - JC:
Check bugs:
- Bug #42288: Problem in forwarding cerequirements to a CREAM CE
- Bug #48910: Failure starting LM if its output jobdir doesn't exist; unprotected chown in WM/LM/JC startup scripts FIXED
  - stopped gLite services
  - deleted the jobdir under '/var/glite/workload_manager'
  - re-started the LM service, checking that the jobdir gets recreated
- Bug #53460: [ICE] Detection of job status changes for CREAM jobs should be improved
- Bug #55452: CMS production struck by waves of "Globus error 10: data transfer to the server failed"
- Bug #56636: [ICE] statistics counters for monitoring
- Bug #57295: [ICE] queryDb tool may create empty DB as root
- Bug #57579: [ICE] Occasionally the ICE's start/stop script doesn't kill the ICE process
- Bug #57596: [ICE] non resubmission if job failed for proxy expiration
- Bug #58387: [ICE] should log a job aborted when it cannot resubmit the job for missing user proxy
- Bug #58977: [ICE] Wrong database colum name in ICE SQL query
- Bug #59240: [ICE] abort reasons not always printed in its logfile
- Bug #59399: [ICE] doesn't correctly handle request in jobdir/old when it is restarted
- Bug #59453: [ICE] polling needs to be improved
- Bug #60688: [ICE] does not respect LB server/proxy selection through the LBproxy attribute
- Bug #61312: [ICE] Error in handling user dn in ICE's poller
- Bug #61405: [ICE] Missing proxy validity evaluation in ICE
- Bug #61413: [ICE] should not call EventQuery for a userDN if he/she doesn't have active jobs
- Bug #61748: [ICE] EventQuery/Polling must be done also to blacklisted CE
- Bug #63989: [ICE] doesn't handle exception raised by jobDir::new_entries()
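The manual check listed under Bug #48910 (delete the jobdir, restart the LM, confirm the jobdir is recreated) could be scripted roughly as follows. This is only a sketch: the function name `check_jobdir_recreated` and the `restart_service` callable are hypothetical, and the report does not name the actual command used to restart the LM, so the caller has to supply it.

```python
import shutil
from pathlib import Path
from typing import Callable


def check_jobdir_recreated(jobdir: Path, restart_service: Callable[[], None]) -> bool:
    """Delete jobdir, restart the service, and report whether jobdir exists again.

    restart_service is a placeholder (assumption): it should wrap whatever
    command restarts the LM on the machine under test.
    """
    # Step 1: remove the jobdir, as in the manual test.
    if jobdir.exists():
        shutil.rmtree(jobdir)
    # Step 2: restart the service that is expected to recreate it.
    restart_service()
    # Step 3: the check passes only if the directory is back.
    return jobdir.is_dir()
```

On a real node the `jobdir` argument would point below '/var/glite/workload_manager', and the services would need to be stopped beforehand, as in the manual procedure.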
-- AlessioGianelle - 2010-02-05