Difference: DistributedWMS (17 vs. 18)

Revision 18, 2009-11-09 - FabioCapannini

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 23 to 23
 Interoperation between the various WMS components running on two different hosts is guaranteed by exporting /var/glite on devel20 to the host gundam via NFS; this choice was made only for simplicity. gundam mounts the devel20 filesystem under /mnt/devel20. Since the gahp_server is CPU-bound as well as I/O-bound, this physical architecture should perform better than a WMS+LB on a single machine with two separately controlled disks.

devel20: NFS server configuration

Changed:
<
<
On devel20, as root, insert the following lines in /etc/hosts.deny:
>
>
On devel20, as root, insert the following lines in /etc/hosts.deny:
 portmap: ALL
 lockd: ALL
 statd: ALL
 mountd: ALL
 rquotad: ALL
Changed:
<
<
Insert the following line in /etc/hosts.allow:
>
>
Insert the following line in /etc/hosts.allow:
 portmap: gundam.cnaf.infn.it
 lockd: gundam.cnaf.infn.it
 statd: gundam.cnaf.infn.it
 mountd: gundam.cnaf.infn.it
 rquotad: gundam.cnaf.infn.it
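The two access-control files follow one pattern: deny every host for the five RPC services, then allow only the NFS client. A minimal sketch in shell that prints both sets of entries, assuming the client host name is passed as the first argument. The function names `nfs_deny_entries` and `nfs_allow_entries` are hypothetical helpers introduced here for illustration; they are not part of any gLite or NFS tool.

```shell
#!/bin/sh
# Services named in the hosts.deny / hosts.allow entries of this page.
NFS_SERVICES="portmap lockd statd mountd rquotad"

# Print the hosts.deny entries: block every host for each service.
nfs_deny_entries() {
    for svc in $NFS_SERVICES; do
        printf '%s: ALL\n' "$svc"
    done
}

# Print the hosts.allow entries for one client host (first argument).
nfs_allow_entries() {
    client=$1
    for svc in $NFS_SERVICES; do
        printf '%s: %s\n' "$svc" "$client"
    done
}

# Print both sets only when a client host name is given, e.g.
#   sh nfs_entries.sh gundam.cnaf.infn.it
if [ -n "${1:-}" ]; then
    echo "# /etc/hosts.deny"
    nfs_deny_entries
    echo "# /etc/hosts.allow"
    nfs_allow_entries "$1"
fi
```

The script only prints the entries; reviewing the output before appending it to the real files keeps the change deliberate.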
Changed:
<
<
There is no need to restart the portmap daemon.
>
>
There is no need to restart the portmap daemon.
  Start the NFS service:
Line: 51 to 47
  Insert the following line in /etc/exports:
Changed:
<
<
/var/glite  gundam.cnaf.infn.it(rw,sync,wdelay)
>
>
/var/glite gundam.cnaf.infn.it(rw,sync,wdelay,no_root_squash)
  Re-export the filesystem:

# exportfs -r

gundam: NFS client configuration

Changed:
<
<
In order to prevent any problems during the booting process, we don't mount the NFS filesystem at boot on gundam. Instead, we configure automount to mount the filesystem automatically at first access, and disable subsequent auto-unmount.
>
>
In order to prevent any problems during the booting process, we don't mount the NFS filesystem at boot on gundam. Instead, we configure automount to mount the filesystem automatically at first access, and disable subsequent auto-unmount.
  As root, insert the following line in /etc/auto.master:
Line: 83 to 78
 Of course, upon a subsequent access attempt, the filesystem is automatically remounted.

gundam: creation of the necessary links

Changed:
<
<
On gundam create the following symbolic links:
>
>
On gundam create the following symbolic links:
If necessary, rename the existing directories under /var/glite before creating the links.
 # ln -s /mnt/devel20/jobcontrol /var/glite/jobcontrol
 # ln -s /mnt/devel20/SandboxDir /var/glite/SandboxDir
 # ln -s /mnt/devel20/spool /var/glite/spool
Added:
>
>
# ln -s /mnt/devel20/workload_manager /var/glite/workload_manager
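The link creation, including the optional rename of existing directories, can be sketched as a small shell function. This is an illustration under stated assumptions rather than the procedure actually used on gundam: the function name `link_shared_dirs` and the `.orig` suffix for renamed directories are hypothetical choices, not taken from the original text.

```shell
#!/bin/sh
# Directories kept on the NFS share, as listed in the ln commands of this page.
SHARED_DIRS="jobcontrol SandboxDir spool workload_manager"

# link_shared_dirs REMOTE_ROOT LOCAL_ROOT
# For each shared directory, move aside any existing entry under LOCAL_ROOT
# and replace it with a symbolic link into REMOTE_ROOT.
link_shared_dirs() {
    remote_root=$1
    local_root=$2
    for d in $SHARED_DIRS; do
        target="$local_root/$d"
        if [ -L "$target" ]; then
            # Already a symbolic link: leave it alone.
            continue
        fi
        if [ -e "$target" ]; then
            # The ".orig" suffix is an assumption, not from the original text.
            mv "$target" "$target.orig" || return 1
        fi
        ln -s "$remote_root/$d" "$target" || return 1
    done
}

# Intended call on gundam, as root (left commented out on purpose):
# link_shared_dirs /mnt/devel20 /var/glite
```

Skipping entries that are already symbolic links makes the function safe to run more than once.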
 
Changed:
<
<
This is essential because ClassAds attributes will still point to the canonical "/var/glite/...."

configuration steps:

On devel20 no changes were made to the default configuration file glite_wms.conf.
On gundam it is necessary to update some entries for the JobController and LogMonitor to find the jobdir under /mnt/devel20.

  1. On gundam
>
>
Each component stores its logs locally; this is especially important for gundam, where the LM, JC and CondorG logs produce a huge amount of data.
 
Changed:
<
<
Configure LBproxy = false in the Common section of the WMS configuration file.

after "exporting"

       GLITE_LOCAL_LOCATION_LOG=/var/log/glite 
       GLITE_LOCAL_LOCATION_VAR=/var/glite 
       GLITE_REMOTE_LOCATION_VAR=/mnt/devel20 
modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:
       LogMonitor = [
       ...
       MonitorInternalDir  =  "${GLITE_LOCAL_LOCATION_VAR}/logmonitor/internal";
       ...
       ]
  1. Still on gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
    JobController = [
    ...
    Input  =  "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/jobdir";
    LockFile  =  "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/lock";
    SubmitFileDir  =  "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/submit";
    ...
    ]

Of course, each component stores its logs locally; this is especially important for gundam, where the LM, JC and CondorG logs produce a huge amount of data:

LogMonitor = [
...
CondorLogRecycleDir  =  "${GLITE_LOCAL_LOCATION_VAR}/logmonitor/CondorG.log/recycle";
LockFile  =  "${GLITE_LOCAL_LOCATION_LOG}/logmonitor/lock";
CondorLogDir  =  "${GLITE_LOCAL_LOCATION_LOG}/logmonitor/CondorG.log";
LogFile  =  "${GLITE_LOCAL_LOCATION_LOG}/logmonitor_events.log";
ExternalLogFile  =  "${GLITE_LOCAL_LOCATION_LOG}/logmonitor_external.log";
...
]

JobController = [
...
LogFile  =  "${GLITE_LOCAL_LOCATION_LOG}/jobcontoller_events.log";
OutputFileDir  =  "${GLITE_LOCAL_LOCATION_LOG}/jobcontrol/condorio";
...
]
>
>

configuration:

  1. Set LBproxy = false in the Common section of the WMS configuration file.
  2. The log_monitor daemon looks for X509 credentials under ~glite/.globus in order to authenticate with the LB logd. On gundam, create the following links to avoid authentication errors (as an alternative, a valid proxy for the user "glite" can be put in /tmp/x509up_uXYZ):
# ln -s /home/glite/.certs /home/glite/.globus
# ln -s /home/glite/.certs/hostcert.pem  /home/glite/.certs/usercert.pem
# ln -s /home/glite/.certs/hostkey.pem  /home/glite/.certs/userkey.pem
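After creating the links, it may help to confirm that both credential files resolve to readable files before starting log_monitor. A minimal sketch, assuming the file names usercert.pem and userkey.pem from the commands above; the function name `check_globus_links` is a hypothetical helper and is not part of gLite or Globus.

```shell
#!/bin/sh
# check_globus_links GLOBUS_DIR
# Report whether usercert.pem and userkey.pem under GLOBUS_DIR resolve to
# readable files. Returns 0 when both do, 1 otherwise.
check_globus_links() {
    globus_dir=$1
    status=0
    for f in usercert.pem userkey.pem; do
        if [ -r "$globus_dir/$f" ]; then
            echo "ok: $globus_dir/$f"
        else
            echo "missing or unreadable: $globus_dir/$f"
            status=1
        fi
    done
    return $status
}

# Intended call on gundam (left commented out on purpose):
# check_globus_links /home/glite/.globus
```

The check follows symbolic links, so a dangling link is reported as missing. It should be run as the user that owns the daemon, since readability depends on the caller.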
 
Changed:
<
<
TODO: Condor tweaks:
  1. Can't send RESCHEDULE command to condor scheduler: SUBMIT_SEND_RESCHEDULE = False
... GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
>
>
TODO: Condor tweaks:
  1. Can't send RESCHEDULE command to condor scheduler: SUBMIT_SEND_RESCHEDULE = False
... GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
 

scripts

Line: 172 to 121
  Gundam must be superuser for the LB@devel20
Changed:
<
<
A preview from stress tests recently run with CMS (thanks to Enzo Miccio): a stable rate of more than 1 Hz to Condor (blue line) whenever Grid resources were able to keep pace. These tests were made with an experimental version of the gLite WMS, which will be released after patch #1841. -- FabioCapannini - 02 Oct 2008
>
>
A preview from stress tests recently run with CMS (thanks to Enzo Miccio): a stable rate of more than 1 Hz to Condor (blue line) whenever Grid resources were able to keep pace. These tests were made with an experimental version of the gLite WMS, which will be released after patch #1841. -- FabioCapannini - 02 Oct 2008
 