Difference: TestPlan (6 vs. 7)

Revision 7 2011-10-20 - MassimoSgaravatto

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Line: 156 to 156
 
Gang-Matching
Consider, for example, a job that needs a CE and a given amount of free space on a close SE in order to run successfully. Matching such a job involves three participants (i.e., job, CE and SE), which conventional (bilateral) matchmaking cannot accommodate. The gangmatching feature of the classads library provides a multilateral matchmaking formalism to address this deficiency.
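The three-party match can be illustrated with a small Python sketch. This is a toy model only, not the classads library or the WMS matchmaker: the dictionaries, the `any_match` helper and the attribute names `close_ses`, `free_space` and `required_space` are assumptions made for the example.

```python
# Toy model of a three-party (job, CE, SE) match.
# Not the classads API: every name below is hypothetical.

def any_match(candidates, predicate):
    """Return True if at least one candidate satisfies the predicate."""
    return any(predicate(c) for c in candidates)

def job_matches_ce(job, ce, storage_elements):
    """A CE matches when one of its close SEs has enough free space."""
    close_ses = [storage_elements[name] for name in ce["close_ses"]
                 if name in storage_elements]
    return any_match(close_ses,
                     lambda se: se["free_space"] >= job["required_space"])

# Made-up sample data: two SEs, two CEs, one job.
storage_elements = {
    "se1": {"free_space": 50},
    "se2": {"free_space": 500},
}
ces = {
    "ce-a": {"close_ses": ["se1"]},
    "ce-b": {"close_ses": ["se1", "se2"]},
}
job = {"required_space": 200}

matching = sorted(name for name, ce in ces.items()
                  if job_matches_ce(job, ce, storage_elements))
print(matching)
```

Only the CE that has a close SE with enough free space is selected, which is the behaviour a bilateral job-to-CE match cannot express on its own.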
Changed:
<
<
Try some listmatch using different expressions of Requirements which use these built-in functions: TBD
>
>
Try some listmatch using different expressions of Requirements which use the anyMatch() function: TBD
 
Deleted:
<
<
  • anyMatch()
  • whichMatch()
  • allMatch()
 

WMS Job Cancel Testing

Line: 226 to 224
 

Configuration file

The file /etc/glite-wms/glite_wms.conf configures all the daemons running on a WMS. Many parameters are set in this file, and almost all of them should be checked. (TBD)
Added:
>
>
It should be verified that the configuration file /etc/glite-wms/glite_wms.conf contains the following hard-coded values:

For the common section:

DGUser = "${GLITE_WMS_USER}"
HostProxyFile = "${WMS_LOCATION_VAR}/glite/wms.proxy"
LBProxy = true

For the JobController section:

CondorSubmit = "${CONDORG_INSTALL_PATH}/bin/condor_submit"
CondorRemove = "${CONDORG_INSTALL_PATH}/bin/condor_rm"
CondorQuery = "${CONDORG_INSTALL_PATH}/bin/condor_q"
CondorRelease = "${CONDORG_INSTALL_PATH}/bin/condor_release"
CondorDagman = "${CONDORG_INSTALL_PATH}/bin/condor_dagman"
DagmanMaxPre = 10
SubmitFileDir = "${WMS_LOCATION_VAR}/jobcontrol/submit"
OutputFileDir = "${WMS_LOCATION_VAR}/jobcontrol/condorio"
InputType = "jobdir"
Input = "${WMS_LOCATION_VAR}/jobcontrol/jobdir/"
LockFile = "${WMS_LOCATION_VAR}/jobcontrol/lock"
LogFile = "${WMS_LOCATION_LOG}/jobcontoller_events.log"
LogLevel = 5
MaximumTimeAllowedForCondorMatch = 1800
ContainerRefreshThreshold = 1000

For the NetworkServer section:

II_Port  = 2170
Gris_Port = 2170
II_Timeout = 100
Gris_Timeout = 20
II_DN = "mds-vo-name=local, o=grid"
Gris_DN = "mds-vo-name=local, o=grid"
BacklogSize = 64
ListeningPort = 7772
MasterThreads = 8
DispatcherThreads = 10
SandboxStagingPath = "${WMS_LOCATION_VAR}/SandboxDir"
LogFile = "${WMS_LOCATION_LOG}/networkserver_events.log"
LogLevel = 5
EnableQuotaManagement = false
MaxInputSandboxSize = 10000000
EnableDynamicQuotaAdjustment = false
QuotaAdjustmentAmount = 10000
QuotaInsensibleDiskPortion = 2.0
DLI_SI_CatalogTimeout = 60
ConnectionTimeout = 300

For the LogMonitor section:

JobsPerCondorLog = 1000
LockFile = "${WMS_LOCATION_VAR}/logmonitor/lock"
LogFile = "${WMS_LOCATION_LOG}/logmonitor_events.log"
LogLevel = 5
ExternalLogFile = "${WMS_LOCATION_LOG}/logmonitor_external.log"
MainLoopDuration = 5
CondorLogDir = "${WMS_LOCATION_VAR}/logmonitor/CondorG.log"
CondorLogRecycleDir = "${WMS_LOCATION_VAR}/logmonitor/CondorG.log/recycle"
MonitorInternalDir = "${WMS_LOCATION_VAR}/logmonitor/internal"
IdRepositoryName = "irepository.dat"
AbortedJobsTimeout = 600
GlobusDownTimeout = 7200
RemoveJobFiles = true
ForceCancellationRetries = 2

For the WorkloadManager section:

PipeDepth = 200
WorkerThreads = 5
DispatcherType = "jobdir"
Input = "${WMS_LOCATION_VAR}/workload_manager/jobdir"
LogLevel = 5
LogFile  = "${WMS_LOCATION_LOG}/workload_manager_events.log"
MaxRetryCount = 10
CeMonitorServices = {}
CeMonitorAsynchPort = 0
IsmBlackList = {}
IsmUpdateRate = 600
IsmIiPurchasingRate = 480
JobWrapperTemplateDir = "${WMS_JOBWRAPPER_TEMPLATE}"
IsmThreads = false
IsmDump = "${WMS_LOCATION_VAR}/workload_manager/ismdump.fl"
SiServiceName = "org.glite.SEIndex"
DliServiceName = "data-location-interface"
MaxRetryCount = 10
DisablePurchasingFromGris = true
EnableBulkMM = true
CeForwardParameters = {"GlueHostMainMemoryVirtualSize","GlueHostMainMemoryRAMSize","GlueCEPolicyMaxCPUTime"}
MaxOutputSandboxSize = -1
EnableRecovery = true
QueueSize = 1000
ReplanGracePeriod = 3600
MaxReplansCount = 5
WmsRequirements  = ((ShortDeadlineJob =?= TRUE) ? RegExp(".*sdj$", other.GlueCEUniqueID) : !RegExp(".*sdj$", other.GlueCEUniqueID)) && (other.GlueCEPolicyMaxTotalJobs == 0 || other.GlueCEStateTotalJobs < other.GlueCEPolicyMaxTotalJobs) && (EnableWmsFeedback =?= TRUE ? RegExp("cream", other.GlueCEImplementationName, "i") : true)

For the WorkloadManagerProxy section:

SandboxStagingPath = "${WMS_LOCATION_VAR}/SandboxDir"
LogFile = "${WMS_LOCATION_LOG}/wmproxy.log"
LogLevel = 5
MaxInputSandboxSize = 100000000
ListMatchRootPath = "/tmp"
GridFTPPort = 2811
LBLocalLogger = "localhost:9002"
MinPerusalTimeInterval = 1000
AsyncJobStart = true
EnableServiceDiscovery = false
LBServiceDiscoveryType = "org.glite.lb.server"
ServiceDiscoveryInfoValidity = 3600
WeightsCacheValidity = 86400
MaxServedRequests = 50
OperationLoadScripts = [
jobRegister = "${WMS_LOCATION_SBIN}/glite_wms_wmproxy_load_monitor --oper jobRegister --load1 22 --load5 20 --load15 18 --memusage 99 --diskusage 95 --fdnum 1000 --jdnum 1500 --ftpconn 300"
jobSubmit = "${WMS_LOCATION_SBIN}/glite_wms_wmproxy_load_monitor --oper jobSubmit --load1 22 --load5 20 --load15 18 --memusage 99 --diskusage 95 --fdnum 1000 --jdnum 1500  --ftpconn 300"
RuntimeMalloc = "/usr/lib64/libtcmalloc_minimal.so"
]

For the ICE section:

start_listener  =  false
start_lease_updater  =  false
logfile  =  "${WMS_LOCATION_LOG}/ice.log"
log_on_file = true
creamdelegation_url_prefix  =  "https://"
listener_enable_authz  =  true
poller_status_threshold_time  =  30*60
ice_topic  =  "CREAM_JOBS"
subscription_update_threshold_time  =  3600
lease_delta_time  =  0
notification_frequency  =  3*60
start_proxy_renewer  =  true
max_logfile_size  =  100*1024*1024
ice_host_cert  =  "${GLITE_HOST_CERT}"
Input  =  "${WMS_LOCATION_VAR}/ice/jobdir"
job_cancellation_threshold_time  =  300
poller_delay  =  2*60
persist_dir  =  "${WMS_LOCATION_VAR}/ice/persist_dir"
lease_update_frequency  =  20*60
log_on_console = false
cream_url_postfix  =  "/ce-cream/services/CREAM2"
subscription_duration  =  86400
bulk_query_size  =  100
purge_jobs  =  false
InputType  =  "jobdir"
listener_enable_authn  =  true
ice_host_key  =  "${GLITE_HOST_KEY}"
start_poller  =  true
creamdelegation_url_postfix  =  "/ce-cream/services/gridsite-delegation"
cream_url_prefix  =  "https://"
max_ice_threads  =  10
cemon_url_prefix  =  "https://"
start_subscription_updater  =  true
proxy_renewal_frequency  =  600
ice_log_level  =  700
soap_timeout  =  60
max_logfile_rotations  =  10
cemon_url_postfix  =  "/ce-monitor/services/CEMonitor"
max_ice_mem = 2096000
ice_empty_threshold = 600
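
The check of the hard-coded values could be automated along the following lines. This is a minimal sketch, assuming the text of one section at a time has already been extracted from the file and that each attribute sits on its own `Name = value` line, with or without a trailing semicolon. The helper names `parse_attributes` and `check_section` are hypothetical, and the sample text is made up to show a mismatch.

```python
import re

def parse_attributes(text):
    """Parse 'Name = value' lines into a dict of raw value strings."""
    attrs = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+)\s*=\s*(.*?)\s*;?\s*$", line)
        if m:
            attrs[m.group(1)] = m.group(2)
    return attrs

def check_section(text, expected):
    """Return {name: (expected, found)} for each wrong or missing attribute."""
    found = parse_attributes(text)
    return {name: (value, found.get(name))
            for name, value in expected.items()
            if found.get(name) != value}

# A few of the JobController values listed above (raw strings, quotes kept).
EXPECTED_JOBCONTROLLER = {
    "DagmanMaxPre": "10",
    "InputType": '"jobdir"',
    "MaximumTimeAllowedForCondorMatch": "1800",
}

# Made-up section text with one deliberately wrong value.
sample = '''
DagmanMaxPre = 5;
InputType = "jobdir";
MaximumTimeAllowedForCondorMatch = 1800;
'''

problems = check_section(sample, EXPECTED_JOBCONTROLLER)
print(problems)
```

Checking one section at a time matters because names such as LogFile and LogLevel appear in several sections with different values.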

It should then be verified that:

  • The attribute II_Contact of the NetworkServer section matches the value of the yaim variable BDII_HOST
  • The attribute WMExpiryPeriod of the WorkloadManager section matches the value of the yaim variable WMS_EXPIRY_PERIOD
  • The attribute MatchRetryPeriod of the WorkloadManager section matches the value of the yaim variable WMS_MATCH_RETRY_PERIOD
  • The attribute IsmIiLDAPCEFilterExt of the WorkloadManager section is (|(GlueCEAccessControlBaseRule=VO:vo1)(GlueCEAccessControlBaseRule=VOMS:/vo1/*)(GlueCEAccessControlBaseRule=VO=vo2...
  • The attribute LBServer of the WorkloadManagerProxy section matches the value of the yaim variable LB_HOST
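
The comparison against the yaim variables can be sketched in the same way. The sketch assumes both the configuration attributes and the yaim variables have already been loaded into dictionaries, and that "matches" means plain string equality once surrounding double quotes are removed. The function name `check_against_yaim`, the host names and the numeric values are made up for illustration. The IsmIiLDAPCEFilterExt check is left out because its expected value depends on the list of supported VOs.

```python
# (section, attribute) -> yaim variable, taken from the list above.
YAIM_MAPPING = {
    ("NetworkServer", "II_Contact"): "BDII_HOST",
    ("WorkloadManager", "WMExpiryPeriod"): "WMS_EXPIRY_PERIOD",
    ("WorkloadManager", "MatchRetryPeriod"): "WMS_MATCH_RETRY_PERIOD",
    ("WorkloadManagerProxy", "LBServer"): "LB_HOST",
}

def check_against_yaim(config, yaim):
    """Return (section, attribute, variable) for each value that differs."""
    mismatches = []
    for (section, attr), var in YAIM_MAPPING.items():
        found = config.get(section, {}).get(attr)
        expected = yaim.get(var)
        if (found is None or expected is None
                or str(found).strip('"') != str(expected).strip('"')):
            mismatches.append((section, attr, var))
    return mismatches

# Made-up sample data with one deliberate mismatch.
config = {
    "NetworkServer": {"II_Contact": '"bdii.example.org"'},
    "WorkloadManager": {"WMExpiryPeriod": "86400", "MatchRetryPeriod": "600"},
    "WorkloadManagerProxy": {"LBServer": '"lb.example.org:9000"'},
}
yaim = {
    "BDII_HOST": "bdii.example.org",
    "WMS_EXPIRY_PERIOD": "86400",
    "WMS_MATCH_RETRY_PERIOD": "1800",
    "LB_HOST": "lb.example.org:9000",
}

mismatches = check_against_yaim(config, yaim)
print(mismatches)
```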
 

Performance tests

Collection of multiple nodes

 