WMS Test Plan
Unit tests
N/A
Deployment tests
Generic repository
- epel.repo
- EGI-trustanchors
- sl.repo
- sl-security.repo
Installation test
First of all, install the yum-protectbase rpm:
yum install yum-protectbase.noarch
Then proceed with the installation of the CA certificates by issuing:
yum install ca-policy-egi-core
Install the WMS metapackage:
yum install emi-wms
After defining the site-info.def file, configure the WMS:
/opt/glite/yaim/bin/yaim -c -s site-info.def -n WMS
Update test
Starting from a production WMS, add the patch repository, then issue:
yum update
If necessary reconfigure the WMS:
/opt/glite/yaim/bin/yaim -c -s site-info.def -n WMS
Functionality tests
Features/Scenarios to be tested
The WMS can be deployed in two modes:
- Using an LB server installed on the same machine (BOTH mode)
- Using an external LB server (PROXY mode)
Both scenarios should be tested.
Test job cycle (from submission to output retrieval)
Submit a job to the WMS service and, when it has finished, retrieve the output; at the end, the final status of the job should be Cleared.
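The cycle above can be sketched as a small driver. This is a minimal sketch, assuming the standard gLite WMS command-line clients (glite-wms-job-submit, glite-wms-job-status, glite-wms-job-output) are installed on the UI. The helper names, file names and the recorded state history are hypothetical. The code only builds the command lines and checks a recorded sequence of states, so it can be read and run without a WMS at hand.

```python
# Sketch of the job-cycle check; helper names and file names are illustrative.

def job_cycle_commands(jdl="job.jdl", id_file="jobid.txt", out_dir="output"):
    """Return the command lines for submission, status query and output retrieval."""
    return [
        # -a: automatic delegation, -o: store the job identifier in a file
        ["glite-wms-job-submit", "-a", "-o", id_file, jdl],
        # -i: read the job identifier from the file written at submission
        ["glite-wms-job-status", "-i", id_file],
        # --dir: directory where the output sandbox is stored
        ["glite-wms-job-output", "--dir", out_dir, "-i", id_file],
    ]


def cycle_succeeded(states):
    """True if the job reached Done at some point and ended in Cleared."""
    return len(states) >= 2 and states[-1] == "Cleared" and "Done" in states[:-1]


# Hypothetical state history collected by polling the job status.
history = ["Submitted", "Waiting", "Ready", "Scheduled", "Running", "Done", "Cleared"]
print(cycle_succeeded(history))
```

A real test would run each command, poll the status until a final state is reached, and apply cycle_succeeded to the states it observed.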
Submission can be tested using different types of proxy:
- Proxy from different VO (TBD)
- Proxy with different ROLE (TBD)
- Delegated proxy retrieved from a MyproxyServer (TBD)
- RFC 3820 compliant proxy (TBD)
Test job submission with the following types of jobs:
Normal Job
- Test the complete cycle submitting to the two types of CE: lcg and CREAM. Implemented.
More jdls can be added in the future. In particular these attributes should be tested:
- DataRequirements (with different DataCatalogType values) (TBD)
- OutputData (TBD)
- InputSandboxBaseURI, OutputSandboxDestURI and OutputSandboxBaseDestURI (TBD)
- AllowZippedISB and ZippedISB (TBD)
- ExpiryTime (TBD)
- ShortDeadlineJob (TBD)
Perusal job
Job perusal is the ability to view output from a job while it is running.
Implemented.
DAG job
Directed Acyclic Graphs (a set of jobs where the input/output/execution of one or more jobs may depend on one or more other jobs).
Implemented.
- The nodes should also end in state Cleared
More jdls can be added in the future.
Parametric Job
Multiple jobs with one parametrized description.
Implemented.
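To illustrate what "one parametrized description" means, the following sketch expands a template in the way a parametric jdl is expanded into nodes. It assumes the usual JDL convention that the placeholder _PARAM_ is replaced by values running from ParameterStart up to (but not including) Parameters in increments of ParameterStep; this convention and the function name are assumptions, not taken from this test plan.

```python
# Sketch of parametric expansion; expand_parametric is a hypothetical helper.

def expand_parametric(template, parameters, start=0, step=1):
    """Replace _PARAM_ with each value in range(start, parameters, step)."""
    return [template.replace("_PARAM_", str(value))
            for value in range(start, parameters, step)]


# One description, several generated nodes.
template = 'Arguments = "input_PARAM_.txt";'
nodes = expand_parametric(template, parameters=6, start=0, step=2)
for line in nodes:
    print(line)
```

A test can use the same arithmetic to predict how many nodes the submitted parametric job should contain and compare it with the number reported by the job status.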
Collection Job
Multiple jobs with a common description. There are two ways to submit a collection:
- create a single jdl containing the jdls of all the nodes (TBD)
- submit all the jdls stored in a directory (bulk submission). Implemented.
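The bulk-submission case can be prepared as sketched below: a directory is filled with one small jdl per node and then handed to the submission command. This is a minimal sketch; the node content, the file naming scheme and the helper names are hypothetical, and the --collection option of glite-wms-job-submit is used on the assumption that the standard gLite client is installed.

```python
import os
import tempfile

# Sketch of preparing a directory of jdls for bulk submission.

def write_collection_dir(directory, n_nodes):
    """Write n_nodes trivial jdl files into directory and return their paths."""
    paths = []
    for i in range(n_nodes):
        path = os.path.join(directory, "node_%03d.jdl" % i)
        with open(path, "w") as jdl:
            jdl.write('Executable = "/bin/hostname";\n')
            jdl.write('StdOutput = "std.out";\n')
            jdl.write('StdError = "std.err";\n')
            jdl.write('OutputSandbox = {"std.out", "std.err"};\n')
        paths.append(path)
    return paths


def bulk_submit_command(directory):
    """Command line that submits every jdl found in directory as one collection."""
    return ["glite-wms-job-submit", "-a", "--collection", directory]


with tempfile.TemporaryDirectory() as tmp:
    created = write_collection_dir(tmp, 5)
    n_created = len(os.listdir(tmp))
    print(n_created, bulk_submit_command(tmp))
```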
Parallel Job
Jobs that can run on one or more CPUs in parallel.
Implemented.
Delegation
- There are two types of delegation: automatic delegation at submission time, or explicit delegation before submission. Submit jdls using both methods. Implemented.
- Make a delegation with an expired proxy. The command should fail. Implemented.
- Submit with an expired delegation. The command should fail. Implemented.
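The two delegation methods differ only in the commands issued, which the following sketch makes explicit. It assumes the standard gLite clients, where -a requests automatic delegation and -d names a previously created delegation; the helper names and the delegation identifier are hypothetical.

```python
# Sketch of the command sequences for the two delegation methods.

def automatic_delegation(jdl):
    """Single step: the proxy is delegated as part of the submission."""
    return [["glite-wms-job-submit", "-a", jdl]]


def explicit_delegation(jdl, delegation_id):
    """Two steps: delegate first, then submit using the same identifier."""
    return [
        ["glite-wms-job-delegate-proxy", "-d", delegation_id],
        ["glite-wms-job-submit", "-d", delegation_id, jdl],
    ]


for step in explicit_delegation("job.jdl", "test-delegation"):
    print(" ".join(step))
```

For the two negative cases, a test would run the same sequences with an expired proxy or an expired delegation and check that the command exits with a non-zero status.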
Shallow and deep re-submission
There are two types of resubmission. The first, called deep, occurs when the user's job has started running on the WN and then the job itself or the WMS JobWrapper has failed. The second, called shallow, occurs when the WMS JobWrapper has failed before starting the actual user's job.
Implemented.
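The distinction between the two kinds of resubmission can be written down as a one-line rule. The sketch below also emits the jdl attributes that, to the best of our understanding of the JDL, bound the number of deep and shallow retries (RetryCount and ShallowRetryCount); that mapping is an assumption, not something stated in this plan, and the helper names are hypothetical.

```python
# Sketch: classify a failure and build the retry attributes for a test jdl.

def resubmission_type(user_job_started):
    """Deep if the failure happened after the user's job started on the WN."""
    return "deep" if user_job_started else "shallow"


def retry_jdl_lines(deep_retries, shallow_retries):
    """jdl lines limiting deep and shallow resubmissions (assumed attribute names)."""
    return [
        "RetryCount = %d;" % deep_retries,
        "ShallowRetryCount = %d;" % shallow_retries,
    ]


print(resubmission_type(user_job_started=False))
print("\n".join(retry_jdl_lines(2, 3)))
```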
Job List-match Testing
Test various matching requests
Implemented.
With data
Test matchmaking using data requests
TBD
- Register a file on an SE, then try a list-match using a jdl like this one (set InputData to the lfn(s) registered beforehand):
###########################################
# JDL with Data Requirements #
###########################################
Executable = "calc-pi.sh";
Arguments = "1000";
StdOutput = "std.out";
StdError = "std.err";
Prologue = "prologue.sh";
InputSandbox = {"calc-pi.sh", "fileA", "fileB","prologue.sh"};
OutputSandbox = {"std.out", "std.err","out-PI.txt","out-e.txt"};
Requirements = true;
DataRequirements = {
[
DataCatalogType = "DLI";
DataCatalog = "http://lfcserver.cnaf.infn.it:8085";
InputData = {"lfn:/grid/infngrid/cesini/PI_1M.txt","lfn:/grid/infngrid/cesini/e-2M.txt"};
]
};
DataAccessProtocol = "gsiftp";
The CEs listed by the match should be the ones "close" to the SE that holds the file.
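The expected result of this test can be verified with a simple set comparison, sketched below. The CE identifiers are made up, and the list of CEs close to the SE is assumed to have been obtained separately (for example from the information system); only the comparison itself is shown.

```python
# Sketch: every CE returned by the list-match must be close to the SE.

def unexpected_ces(matched_ces, close_ces):
    """Return the matched CEs that are not in the set of CEs close to the SE."""
    return sorted(set(matched_ces) - set(close_ces))


# Hypothetical data for illustration only.
close_to_se = ["ce01.example.org:8443/cream-pbs-grid",
               "ce02.example.org:8443/cream-lsf-grid"]
matched = ["ce01.example.org:8443/cream-pbs-grid"]

print(unexpected_ces(matched, close_to_se))
```

The test passes when the returned list is empty and the match itself returned at least one CE.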
Gang-Matching (TBD)
Consider, for example, a job that requires a CE and a given amount of free space on a close SE to run successfully. The matchmaking solution to this problem requires three participants in the match (i.e., job, CE and SE), which cannot be accommodated by conventional (bilateral) matchmaking. The gangmatching feature of the classads library provides a multilateral matchmaking formalism to address this deficiency.
Try some list-match requests using different Requirements expressions that use these built-in functions:
- anyMatch()
- whichMatch()
- allMatch()
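As an aid to writing the expected results, the behaviour of the three functions can be modelled in a few lines. This is only a model of their semantics as we understand them (a condition evaluated against each element of a list, such as the SEs close to a CE), not the classads implementation; the SE records and the free-space threshold are hypothetical.

```python
# Model of the three gangmatching functions over a list of SE records.

def any_match(ads, predicate):
    """True if at least one element satisfies the condition."""
    return any(predicate(ad) for ad in ads)


def all_match(ads, predicate):
    """True if every element satisfies the condition."""
    return all(predicate(ad) for ad in ads)


def which_match(ads, predicate):
    """Return the elements that satisfy the condition."""
    return [ad for ad in ads if predicate(ad)]


# Hypothetical close SEs of one CE, with their free space in GB.
close_ses = [
    {"name": "se1.example.org", "free_gb": 50},
    {"name": "se2.example.org", "free_gb": 5},
]


def enough_space(se):
    return se["free_gb"] > 20


print(any_match(close_ses, enough_space))
print(all_match(close_ses, enough_space))
print([se["name"] for se in which_match(close_ses, enough_space)])
```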
WMS Job Cancel Testing
Test the cancellation of these types of jobs (final status should be Cancelled):
- Submit and cancel a normal job Implemented.
- Submit a dag job and then cancel it (the parent) Implemented.
- Submit a dag job and then cancel some of its nodes Implemented.
- Submit a collection job and then cancel it (the parent) Implemented.
- Submit a collection job and then cancel some of its nodes Implemented.
- Cancellation of a Done job should fail. Implemented.
Prologue and Epilogue jobs
In the jdl you can specify two attributes, prologue and epilogue, which are scripts that are executed respectively before and after the user's job.
Implemented.
Proxy renewal
- Submit a long job with myproxyserver set, using a short proxy, to both CE types (lcg and CREAM). The job should finish Done (Success). Implemented.
- Submit a long job without setting myproxyserver, using a short proxy, to both CE types (lcg and CREAM). The job should finish Aborted with reason "proxy expired". Implemented.
WMS feedback (TBD)
This mechanism prevents a job from remaining stuck for a long time in a queue, waiting to be assigned to a worker node for execution. There are three parameters in the jdl that can be used to manage this mechanism:
- EnableWMSFeedback
- ReplanGracePeriod
- MaxRetryCount
The test should submit many long jobs with a short ReplanGracePeriod using a small number of resources. At the end of the test some jobs should have been replanned (i.e. reassigned to different CEs). This can be verified from the logging info of the jobs.
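The final check of this test can be sketched as follows. It assumes that the sequence of CEs each job was assigned to has already been extracted from the logging info; how that extraction is done is not shown, and the job identifiers and CE names are hypothetical.

```python
# Sketch: find the jobs that were reassigned to a different CE.

def replanned_jobs(ce_history_by_job):
    """Return the ids of jobs assigned to more than one distinct CE."""
    return sorted(job for job, ces in ce_history_by_job.items()
                  if len(set(ces)) > 1)


# Hypothetical assignments extracted from the logging info.
assignments = {
    "job-1": ["ce01.example.org"],
    "job-2": ["ce01.example.org", "ce02.example.org"],
    "job-3": ["ce02.example.org", "ce02.example.org"],
}

print(replanned_jobs(assignments))
```

The test is considered successful when the returned list is not empty.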
Performance tests
Collection of multiple nodes
Submit a collection of n nodes (1000 should be a good compromise). (TBD)
Stress test
Stress tests can be parametrized on several features (partially implemented):
- Type of submitted job (e.g. Collections, normal, dag, parametric)
- Submission frequency (i.e. number of submissions per minute)
- Number of submissions (i.e. duration of test)
- Number of parallel submission threads (i.e. each one with a different user proxy)
- With or without automatic delegation
- With or without resubmission enabled
- With or without proxy renewal enabled
- Jdl description
- with or without sandbox
- with or without cpu computation
- executable duration
- with or without data transfer
- etc...
The following could be an example of a stress test:
- 2880 collections each of 20 jobs (2 days of test)
- One collection every 60 seconds
- Four users
- Use LCG-CEs and CREAM-CEs (with different batch systems)
- Use automatic-delegation
- The job is a "sleep random(666)"
- Resubmission is enabled
- Enable proxy renewal
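The figures in this example are consistent with each other, as the following arithmetic shows: 2880 collections submitted one every 60 seconds take exactly two days. The function name is hypothetical; the numbers are the ones listed above.

```python
# Sketch: derive duration and load from the stress-test parameters.

def stress_test_figures(n_collections, jobs_per_collection, interval_s):
    """Return (duration in days, total jobs, jobs submitted per minute)."""
    duration_days = n_collections * interval_s / 86400.0
    total_jobs = n_collections * jobs_per_collection
    jobs_per_minute = jobs_per_collection * 60.0 / interval_s
    return duration_days, total_jobs, jobs_per_minute


days, total, rate = stress_test_figures(2880, 20, 60)
print(days, total, rate)  # 2.0 57600 20.0
```

The same function can be used to size other runs, for example to find how many collections are needed for a test of a given duration.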
Regression tests
Complete list of Rfc tests
Note
"Implemented." means that an automatic test exists. Otherwise the test must be developed and/or executed by hand.