
WMS Test Plan

Service Description

The Workload Management System (gLite WMS) is a software service of the gLite suite which is responsible for distributing and managing tasks across the computing and storage resources available on a Grid. WMS assigns user jobs to the Computing Elements (CEs) and Storage Elements (SEs) belonging to a Grid environment, so that:

  • jobs are always executed on resources that match the job requirements
  • grid-wide load balance is maintained, i.e. jobs are evenly and efficiently distributed across the entire Grid.

The WMS receives job execution requests from a client, finds appropriate resources, then dispatches the jobs and follows them until completion, handling failures along the way whenever possible. Besides single batch-like jobs, the WMS handles the following compound job types: Directed Acyclic Graphs (a set of jobs where the input/output/execution of one or more jobs may depend on one or more other jobs), Parametric Jobs (multiple jobs with one parametrized description), and Collections (multiple jobs with a common description). Jobs are described via a flexible, high-level Job Definition Language (JDL).

Deployment scenarios

TBD

Functionality tests

Features/Scenarios to be tested

YAIM-WMS Configuration Testing

  • Installation and configuration starting from a clean machine (i.e. only the OS installed)
  • Update and configuration from a previous version

WMS Job Submission/GetOutput Testing

Submit a job to the WMS service and, when it has finished, retrieve the output. Test job submission with the following types of jobs:

Normal Job
  • Test submission of normal jobs with different options and situations Implemented

  • Test the complete cycle with the two types of CEs: LCG and CREAM Implemented

More JDLs can be added in the future.

Perusal job

Job perusal is the ability to view output from a job while it is running. Implemented

DAG job

Directed Acyclic Graphs (a set of jobs where the input/output/execution of one or more jobs may depend on one or more other jobs).

  • Submit a jdl like this one:
[
  type = "dag";
  DefaultNodeShallowRetryCount = 3;
  nodes = [
    nodeA = [
      node_type = "edg-jdl";
      file ="jdl/arg.jdl" ; 
    ];
    nodeB = [
      node_type = "edg-jdl";
      file ="jdl/arg.jdl" ; 
    ];
    nodeC = [
      node_type = "edg-jdl";
      file ="jdl/arg.jdl" ; 
    ];
    dependencies = {
      { nodeA, nodeB },
      { nodeA, nodeC }
    }
  ];
]

  • When dag finishes retrieve the output files
  • Check the final status of the dag (all nodes and parent should be "Cleared")
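
The dependencies in the JDL above mean that nodeB and nodeC can only start after nodeA has completed. A minimal Python sketch of that ordering follows. It is an illustration only, not the WMS scheduling code, and the function name execution_waves is hypothetical.

```python
from collections import defaultdict

# Dependencies copied from the JDL above: each pair is (parent, child).
DEPENDENCIES = [("nodeA", "nodeB"), ("nodeA", "nodeC")]
NODES = ["nodeA", "nodeB", "nodeC"]


def execution_waves(nodes, dependencies):
    """Group nodes into waves; a node runs only after all of its parents."""
    children = defaultdict(list)
    pending_parents = {node: 0 for node in nodes}
    for parent, child in dependencies:
        children[parent].append(child)
        pending_parents[child] += 1

    # The first wave holds the nodes that have no parents at all.
    wave = sorted(n for n, count in pending_parents.items() if count == 0)
    waves = []
    seen = 0
    while wave:
        waves.append(wave)
        seen += len(wave)
        next_wave = []
        for node in wave:
            for child in children[node]:
                pending_parents[child] -= 1
                if pending_parents[child] == 0:
                    next_wave.append(child)
        wave = sorted(next_wave)
    if seen != len(nodes):
        raise ValueError("dependencies contain a cycle")
    return waves


print(execution_waves(NODES, DEPENDENCIES))
```

For this DAG the sketch yields two waves: nodeA alone, then nodeB and nodeC together.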

More JDLs can be added in the future.

Parametric Job

Multiple jobs with one parametrized description. Implemented

Collection Job

Multiple jobs with a common description. There are two ways to submit a collection: you can create a single JDL containing the JDLs of all the nodes, or you can submit all the JDLs stored in a directory (bulk submission).

  • Submit a jdl like this one:
[
nodes = {
   [
   file="jdl/arg.jdl";
   ],
   [
  executable="/bin/env";
  ShallowRetryCount = 0;
  RetryCount = 0;
  Stdoutput = "file.out" ;
  StdError =  "file.err" ;
  OutputSandbox ={ "file.out" ,"file.err"} ;
  FuzzyRank = true;
   ],
   [
   NodeName="nodeA";
   executable="/bin/ls" ;
  Stdoutput = "file.out" ;
  OutputSandbox ={ "file.out"} ;
   ]
};
Type = "Collection" ;
requirements =  other.GlueCEStateStatus == "Production" ;
rank = -other.GlueCEStateEstimatedResponseTime ;
]

  • When collection finishes retrieve the output files
  • Check the final status of the collection (all nodes and parent should be "Cleared")

  • To test bulk submission use option "--collection" of glite-wms-job-submit command.
  • When collection finishes retrieve the output files
  • Check the final status of the collection (all nodes and parent should be "Cleared")
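
For the bulk submission case, the directory of JDL files can be prepared by a small script. The Python sketch below writes one minimal JDL per executable and builds the argument list for glite-wms-job-submit without running it. The helper names and the node file names are hypothetical, and the flags (-a for automatic delegation, --collection for the directory) should be checked against the installed client.

```python
import tempfile
from pathlib import Path


def write_jdl_directory(directory, executables):
    """Write one minimal JDL file per executable into `directory`."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, executable in enumerate(executables):
        jdl = (
            "[\n"
            f'  Executable = "{executable}";\n'
            '  StdOutput = "file.out";\n'
            '  OutputSandbox = { "file.out" };\n'
            "]\n"
        )
        # File names are hypothetical; any *.jdl name would do here.
        path = directory / f"node{index}.jdl"
        path.write_text(jdl)
        paths.append(path)
    return paths


def bulk_submit_command(directory):
    """Build (but do not run) the bulk submission command line."""
    return ["glite-wms-job-submit", "-a", "--collection", str(directory)]


with tempfile.TemporaryDirectory() as tmp:
    files = write_jdl_directory(tmp, ["/bin/env", "/bin/ls"])
    command = bulk_submit_command(tmp)
    print(len(files), "JDL files;", " ".join(command))
```

The command is returned as a list so that it can be passed to subprocess.run on a machine where the gLite UI is installed.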

More JDLs can be added in the future.

Parallel Job

Jobs that can run on one or more CPUs in parallel.

  • Submit a jdl like this one:
[
Executable = "cpi";
CpuNumber = 2;
Stdoutput = "cpi.out" ;
StdError =  "cpi.err" ;
OutputSandbox = { "cpi.out" ,"cpi.err"} ;
InputSandbox = { "exe/cpi" };
FuzzyRank = true;
usertags = [ exe = "cpi" ];
]

  • When job finishes retrieve the output files
  • Check the final status of the job

WMS Job shallow and deep re-submission

There are two types of resubmission. The first, called deep, occurs when the user's job has started running on the WN and then the job itself or the WMS JobWrapper has failed. The second, called shallow, occurs when the WMS JobWrapper has failed before starting the actual user's job. Implemented
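
The distinction between the two types can be written down as a small decision function. The Python sketch below is a simplified model, not WMS code: it assumes that the JDL attribute RetryCount limits deep resubmissions, that ShallowRetryCount limits shallow ones, and that each is counted separately. The function names are hypothetical.

```python
def resubmission_type(user_job_started):
    """Classify a failure: deep if the user's job had started, else shallow."""
    return "deep" if user_job_started else "shallow"


def may_resubmit(user_job_started, retries_done, retry_count, shallow_retry_count):
    """Return True while the counter for this failure type is below its limit.

    Assumption: RetryCount limits deep resubmissions and ShallowRetryCount
    limits shallow ones, each counted separately.
    """
    if resubmission_type(user_job_started) == "deep":
        return retries_done < retry_count
    return retries_done < shallow_retry_count


# The JobWrapper failed before the user's job started; no retries used yet.
print(resubmission_type(False),
      may_resubmit(False, 0, retry_count=0, shallow_retry_count=3))
```

A test can use this model to decide which outcome to expect for a given pair of RetryCount and ShallowRetryCount values.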

WMS Job List-match Testing

Without data

Test the job-list-match command and its options Implemented

With data

  • Register a file on an SE, then submit a JDL like this one (set InputData to the LFN(s) registered before):
###########################################
#      JDL with Data Requirements         #
###########################################

Executable = "calc-pi.sh";
Arguments = "1000";
StdOutput = "std.out";
StdError = "std.err";
Prologue = "prologue.sh";
InputSandbox = {"calc-pi.sh", "fileA", "fileB","prologue.sh"};
OutputSandbox = {"std.out", "std.err","out-PI.txt","out-e.txt"};
Requirements = true;

DataRequirements = {
[
DataCatalogType = "DLI";
DataCatalog = "http://lfcserver.cnaf.infn.it:8085";
InputData = {"lfn:/grid/infngrid/cesini/PI_1M.txt","lfn:/grid/infngrid/cesini/e-2M.txt"};
]
};
DataAccessProtocol = "gsiftp";

  • Then try a list-match; the listed CEs should be the ones "close" to the SE used

WMS Job Cancel Testing

Test the cancellation of these types of jobs (final status should be "Cleared"):

Normal job

Submit and cancel a normal job Implemented

DAG job

Submit a dag job and then cancel it (the parent)

Collection

Submit a collection job and then cancel it (the parent)

Node of a collection

Submit a collection job and then cancel some of its nodes

Others

Delegation Testing

Test the delegation command and its options Implemented

Job-info Testing

Test the job-info command and its options Implemented

Logging-info Testing

Test the logging-info command and its options Implemented

Job Status Testing

Test the job-status command and its options Implemented

Prologue and Epilogue jobs

In the JDL you can specify two attributes, Prologue and Epilogue, which are scripts that are executed respectively before and after the user's job. Implemented
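
A simple pre-submission check for such jobs is that the scripts are actually shipped with the job. The Python sketch below assumes the scripts travel through the InputSandbox, as prologue.sh does in the data requirements JDL of this plan. The job is modelled as a plain dictionary, and the function name and the file name epilogue.sh are hypothetical.

```python
def scripts_missing_from_sandbox(job):
    """Return the Prologue/Epilogue scripts that are not in the InputSandbox."""
    sandbox = job.get("InputSandbox", [])
    missing = []
    for attribute in ("Prologue", "Epilogue"):
        script = job.get(attribute)
        if script is not None and script not in sandbox:
            missing.append(script)
    return missing


# Hypothetical job description: epilogue.sh was forgotten in the sandbox.
job = {
    "Executable": "calc-pi.sh",
    "Prologue": "prologue.sh",
    "Epilogue": "epilogue.sh",
    "InputSandbox": ["calc-pi.sh", "prologue.sh"],
}
print(scripts_missing_from_sandbox(job))
```

An empty list means both scripts are shipped; a script that is already installed on the WN would be reported here even though it does not need to be in the sandbox.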

Performance tests

Collection of 1000 nodes

Submit a collection of 1000 nodes.
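
Writing a 1000-node collection by hand is impractical, so the JDL can be generated. The Python sketch below follows the layout of the collection JDL shown earlier in this plan. The executable /bin/hostname and the node names are placeholders chosen for illustration, and the function name is hypothetical.

```python
def collection_jdl(node_count, executable="/bin/hostname"):
    """Return the text of a collection JDL with `node_count` identical nodes."""
    nodes = []
    for index in range(node_count):
        nodes.append(
            "  [\n"
            f'    NodeName = "node{index}";\n'
            f'    Executable = "{executable}";\n'
            '    StdOutput = "file.out";\n'
            '    OutputSandbox = { "file.out" };\n'
            "  ]"
        )
    # Nodes are separated by commas, as in the hand-written example.
    body = ",\n".join(nodes)
    return (
        "[\n"
        'Type = "Collection";\n'
        "nodes = {\n"
        f"{body}\n"
        "};\n"
        "]\n"
    )


jdl = collection_jdl(1000)
print(jdl.count("NodeName ="), "nodes generated")
```

The resulting text can be saved to a file and submitted like any other collection JDL.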

Stress test

A possible stress test:

  • 2880 collections each of 20 jobs
  • One collection every 60 seconds
  • Four users
  • Use LCG-CEs and CREAM-CEs (with different batch systems)
  • Use automatic-delegation
  • The job is a "sleep random(672)"
  • Resubmission is enabled
  • Enable proxy renewal
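
The figures above fix the overall size of the test. The Python sketch below works out the total number of jobs, the submission time span, and a rough estimate of how many jobs run at the same time. The estimate assumes that the sleep time is uniformly distributed between 0 and 672 seconds and ignores matchmaking, queueing and resubmission, so it is a lower bound on the load rather than a prediction.

```python
COLLECTIONS = 2880
JOBS_PER_COLLECTION = 20
SUBMIT_INTERVAL_S = 60
MAX_SLEEP_S = 672

# Total number of jobs submitted during the whole test.
total_jobs = COLLECTIONS * JOBS_PER_COLLECTION

# Time needed to submit every collection, one every 60 seconds.
submission_hours = COLLECTIONS * SUBMIT_INTERVAL_S / 3600

# Assumption: uniform sleep time, so the mean is half the maximum.
mean_sleep_s = MAX_SLEEP_S / 2

# Little's law: mean jobs running = arrival rate * mean run time.
mean_running_jobs = JOBS_PER_COLLECTION * mean_sleep_s / SUBMIT_INTERVAL_S

print("total jobs:", total_jobs)
print("submission span (hours):", submission_hours)
print("mean concurrently running jobs:", mean_running_jobs)
```

Under these assumptions the test submits 57600 jobs over 48 hours, with about 112 jobs sleeping at any given moment.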

-- ElisabettaMolinari - 2010-02-24

Topic revision: r18 - 2011-04-29 - AlessioGianelle