Algorithms for estimated and worst response time
Starting with version 2.5 of the package dynsched-generic, the dynamic scheduler makes use of a simple collection of job statistics in order to calculate the Estimated Response Time (ERT) and the Worst Response Time (WRT).
Statistics are stored in readable text files in a temporary directory, usually /var/tmp/info-dynamic-scheduler-generic, one file per batch system queue.
The dynamic scheduler registers an event, consisting of a timestamp, a job id and the elapsed time, for every job submitted; the information provider keeps a maximum number of events for each queue, removing the oldest ones in a round-robin scheme.
The maximum number of events registered in a file can be configured, see below. The files for statistics are created dynamically by the information provider; they can also be removed dynamically if the corresponding queue is not enabled.
Since the information provider runs without root privileges, the directory containing the collection of statistics must have the correct ownership and access permissions.
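The bounded event history described above can be illustrated with a minimal Python sketch. It is not the package's actual implementation: the names Event and QueueHistory are hypothetical, the on-disk file format is not reproduced, and only the behaviour of dropping the oldest event once the limit is reached is shown.

```python
from collections import deque
from typing import NamedTuple


class Event(NamedTuple):
    """One job record; hypothetical layout based on the three fields named above."""
    timestamp: int  # when the event was registered (seconds since the epoch)
    job_id: str     # batch system job identifier
    elapsed: int    # elapsed time of the job, in seconds


class QueueHistory:
    """Keeps at most `sample_number` events for one queue, dropping the oldest."""

    def __init__(self, sample_number: int = 1000) -> None:
        # A deque with maxlen silently discards the oldest item when full.
        self.events = deque(maxlen=sample_number)

    def register(self, event: Event) -> None:
        self.events.append(event)


history = QueueHistory(sample_number=3)
for i in range(5):
    history.register(Event(timestamp=1000 + i, job_id=f"job{i}", elapsed=60 * (i + 1)))

# Only the three most recent events remain.
print([e.job_id for e in history.events])  # ['job2', 'job3', 'job4']
```

A deque with a fixed maximum length is used here only because it gives the drop-the-oldest behaviour for free; the real provider may manage its files differently.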
1 Description of the algorithm
Given:
- N the number of queued jobs
- R the number of running jobs
- K the number of slots available in the batch system
- Savg the average job wall clock time
- Smax the maximum job wall clock time
we have:
- ERT = 0 if R < K
- ERT = ceiling((N / K) + 1) * Savg if R = K
and:
- WRT = 0 if R < K
- WRT = ceiling((N / K) + 1) * Smax if R = K
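The formulas above can be sketched in Python as follows. The function name response_times and its parameter names are hypothetical. The text defines only the cases R < K and R = K; treating R > K like R = K is an assumption of this sketch.

```python
import math
from typing import Tuple


def response_times(n_queued: int, n_running: int, k_slots: int,
                   s_avg: float, s_max: float) -> Tuple[float, float]:
    """Return (ERT, WRT) in the same time unit as s_avg and s_max."""
    if k_slots <= 0:
        raise ValueError("k_slots must be positive")
    # Free slots are available: both estimates are zero.
    if n_running < k_slots:
        return 0, 0
    # All slots busy (R = K; R > K is handled the same way, by assumption).
    rounds = math.ceil(n_queued / k_slots + 1)
    return rounds * s_avg, rounds * s_max


# Savg and Smax derived from a small, made-up list of wall clock times (seconds).
samples = [300, 600, 900]
s_avg = sum(samples) / len(samples)
s_max = max(samples)

ert, wrt = response_times(n_queued=10, n_running=4, k_slots=4,
                          s_avg=s_avg, s_max=s_max)
print(ert, wrt)  # 2400.0 3600
```

With N = 10 and K = 4, the factor is ceiling(2.5 + 1) = 4, so ERT = 4 * 600 and WRT = 4 * 900.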
2 Configuration of the algorithm
The configuration file for the estimator is /etc/lrms/scheduler.conf.
The configuration parameters required by the estimator are:
- sample_number is the maximum number of events registered for any given queue; the default is 1000
- sample_dir is the directory containing the files for statistics; the default is /var/tmp/info-dynamic-scheduler-generic
These parameters are contained in the "Main" section.
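As an illustration, the two parameters could be read as shown below. This sketch assumes that the configuration file uses the INI-style syntax understood by Python's configparser module; the text states only that the parameters live in the "Main" section. The sample content is parsed from a string, so the real /etc/lrms/scheduler.conf is not touched.

```python
import configparser

# Hypothetical file content: only the section name, the parameter names
# and the default values come from the description above.
SAMPLE_CONF = """\
[Main]
sample_number = 1000
sample_dir = /var/tmp/info-dynamic-scheduler-generic
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONF)

main = parser["Main"]
# Fall back to the documented defaults when a parameter is missing.
sample_number = main.getint("sample_number", fallback=1000)
sample_dir = main.get("sample_dir",
                      fallback="/var/tmp/info-dynamic-scheduler-generic")

print(sample_number, sample_dir)
```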
--
PaoloAndreetto - 2013-02-06