%TOC%

---++ HOW-TO optimise performance by distributing WMS and LB on two hosts

---+++ WMS+LB physical architecture

In order to gain better performance, the components of a single WMS instance have been distributed on two hosts according to a layout different from the typical one. LBserver is hosted on one machine, in our case devel20, together with WMproxy and WM and without LBproxy, so as not to store the same events twice in the database (this issue will disappear with the advent of LB 2.0). The Job Submission Service is moved to another machine, 'gundam' in our case, so that JC+LM+CondorG are hosted by gundam. They connect to the LBserver at devel20 without using an LBproxy outpost on gundam.

COMPONENTS LAYOUT:

| *Components* | *host devel20* | *host gundam* |
| glite_wms_wmproxy | %ICON{choice-yes}% | %ICON{choice-no}% |
| glite-wms-workload_manager | %ICON{choice-yes}% | %ICON{choice-no}% |
| glite-proxy-renewd | %ICON{choice-yes}% | %ICON{choice-no}% |
| glite-wms-job_controller | %ICON{choice-no}% | %ICON{choice-yes}% |
| glite-wms-log_monitor | %ICON{choice-no}% | %ICON{choice-yes}% |
| <nop>CondorG | %ICON{choice-no}% | %ICON{choice-yes}% |
| glite-lb-logd | %ICON{choice-yes}% | %ICON{choice-yes}% |
| glite-lb-interlogd | %ICON{choice-yes}% | %ICON{choice-yes}% |
| glite-lb-bkserverd | %ICON{choice-yes}% | %ICON{choice-no}% |

---+++ Filesystem sharing

Interoperation between the various WMS components running on two different hosts is guaranteed by exporting /var/glite on devel20 to the host gundam via %RED% NFS%ENDCOLOR%; this choice was made only for simplicity. *gundam* mounts the devel20 filesystem under */mnt/devel20*. Since the gahp_server is CPU-bound as well as I/O-bound, this physical architecture should be better than just using a WMS+LB on a single machine with two separately controlled disks.

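Once the sharing is in place, it can be useful to confirm from *gundam* that */mnt/devel20* is really backed by NFS. The following is only a minimal sketch of such a check: the function name =is_nfs_mount= is hypothetical, and it assumes a Linux mount table in the usual =/proc/mounts= layout (device, mount point, filesystem type, options).

```shell
#!/bin/sh
# Sketch: report whether a mount point is currently backed by NFS.
# Assumptions (not from the original page): the function name is_nfs_mount is
# hypothetical, and the mount table uses the /proc/mounts layout
# "device mountpoint fstype options dump pass".

is_nfs_mount() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}   # optional second argument: alternative mount table

    # Field 2 is the mount point, field 3 the filesystem type (nfs or nfs4).
    awk -v mp="$mountpoint" '
        $2 == mp && ($3 == "nfs" || $3 == "nfs4") { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$mounts_file"
}

# Example usage on gundam. If the filesystem is mounted on demand, access the
# directory first so that the mount is triggered before checking:
#   ls /mnt/devel20 >/dev/null 2>&1
#   is_nfs_mount /mnt/devel20 && echo "mounted over NFS" || echo "not mounted"
```

The function returns 0 only when the exact mount point is listed with an NFS filesystem type, so it can be used directly in an =if= or before starting the WMS daemons.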
---++++ devel20: NFS server configuration

On *devel20*, as root, insert the following lines in */etc/hosts.deny*:

<verbatim>
portmap: ALL
lockd: ALL
statd: ALL
mountd: ALL
rquotad: ALL
</verbatim>

Insert the following lines in */etc/hosts.allow*:

<verbatim>
portmap: gundam.cnaf.infn.it
lockd: gundam.cnaf.infn.it
statd: gundam.cnaf.infn.it
mountd: gundam.cnaf.infn.it
rquotad: gundam.cnaf.infn.it
</verbatim>

There is no need to restart the portmap daemon.

Start the NFS service:

=# /etc/init.d/nfs start=

Make the NFS service start at boot:

=# chkconfig nfs on=

Insert the following line in */etc/exports*:

=/var/glite gundam.cnaf.infn.it(rw,sync,wdelay,no_root_squash)=

Re-export the filesystem:

=# exportfs -r=

---++++ gundam: NFS client configuration

In order to prevent any problems during the boot process, we don't mount the NFS filesystem at boot on *gundam*. Instead, we configure %RED% automount %ENDCOLOR% to mount the filesystem automatically at first access, and disable subsequent auto-unmount.

As root, insert the following line in */etc/auto.master*:

=/mnt /etc/auto.mnt --timeout=0=

Create the file */etc/auto.mnt* with the following line:

=devel20 -rw,hard,intr,nosuid,noauto,timeo=600,wsize=32768,rsize=32768,tcp devel20.cnaf.infn.it:/var/glite=

Start the *automount* daemon:

=# /etc/init.d/autofs start=

Make *automount* start at boot:

=# chkconfig autofs on=

The filesystem */mnt/devel20* gets mounted automatically at the first access attempt after boot, and is never automatically unmounted. If the filesystem is not busy, it can be manually unmounted by either:

   * issuing the usual command =umount /mnt/devel20=
   * sending the %BLUE% USR1 %ENDCOLOR% signal to the *automount* daemon

Of course, upon a subsequent access attempt, the filesystem gets automatically remounted.

---++++ gundam: creation of the necessary links

On *gundam* create the following symbolic links. If necessary, rename the existing directories under /var/glite before creating the links.

<verbatim>
# ln -s /mnt/devel20/jobcontrol /var/glite/jobcontrol
# ln -s /mnt/devel20/SandboxDir /var/glite/SandboxDir
# ln -s /mnt/devel20/spool /var/glite/spool
# ln -s /mnt/devel20/workload_manager /var/glite/workload_manager
</verbatim>

Each component stores its logs locally. This is especially important for *gundam*, where the *LM*, *JC* and *CondorG* logs produce a huge amount of data.

---+++ configuration:

   1 Set LBproxy = false in the Common section of the WMS configuration file.
   1 The log_monitor daemon looks under ~glite/.globus for the X509 credentials it needs to authenticate with LB logd. On *gundam* create the following links to avoid authentication errors (as an alternative, a valid proxy for the user "glite" can be put in /tmp/x509up_uXYZ):
<verbatim style="margin-left: 40px;">
# ln -s /home/glite/.certs /home/glite/.globus
# ln -s /home/glite/.certs/hostcert.pem /home/glite/.certs/usercert.pem
# ln -s /home/glite/.certs/hostkey.pem /home/glite/.certs/userkey.pem
</verbatim>

Useful Condor tweaks:

   * =SUBMIT_SEND_RESCHEDULE = False= (under high load, the error "Can't send RESCHEDULE command to condor scheduler" can otherwise occur)
   * =GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100=

---+++ scripts

*devel20*:

<verbatim>
# /opt/glite/etc/init.d/glite-wms-wm start/stop/status
# /opt/glite/etc/init.d/glite-wms-wmproxy start/stop/status
# /opt/glite/etc/init.d/glite-proxy-renewald start/stop/status
# /opt/glite/etc/init.d/glite-lb-locallogger start/stop/status
# /opt/glite/etc/init.d/glite-lb-bkserverd start/stop/status
</verbatim>

*gundam*:

<verbatim>
# /opt/glite/etc/init.d/glite-wms-lm start/stop/status
# /opt/glite/etc/init.d/glite-wms-jc start/stop/status
# /opt/glite/etc/init.d/glite-lb-locallogger start/stop/status
# /opt/glite/etc/init.d/glite-lb-bkserverd start/stop/status
</verbatim>

*gundam must be configured as a superuser for the LB server at devel20.*

A preview from stress tests recently made with CMS (thanks to Enzo Miccio): a >1Hz stable rate to Condor (blue line) whenever Grid resources were able to keep the pace.

These tests were made with an experimental version of the gLite WMS, which will be released after patch #1841.

-- Main.FabioCapannini - 02 Oct 2008
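
The symbolic links created on *gundam* in the "creation of the necessary links" step can be checked with a small script. This is only a sketch under stated assumptions: the function name =check_links= is hypothetical, and the two roots are passed as arguments rather than hard-coded, so that the check can also be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: verify that the four shared WMS directories are symbolic links
# pointing into the NFS-mounted tree.
# Assumption (not from the original page): check_links is a hypothetical helper.
# Usage: check_links LOCAL_ROOT SHARED_ROOT

check_links() {
    local_root=$1
    shared_root=$2
    status=0
    for name in jobcontrol SandboxDir spool workload_manager; do
        link="$local_root/$name"
        expected="$shared_root/$name"
        if [ ! -L "$link" ]; then
            # Either missing or a real directory that still has to be renamed.
            echo "MISSING: $link is not a symbolic link"
            status=1
        elif [ "$(readlink "$link")" != "$expected" ]; then
            echo "WRONG:   $link -> $(readlink "$link") (expected $expected)"
            status=1
        else
            echo "OK:      $link -> $expected"
        fi
    done
    return $status
}

# Example usage on gundam, with the paths used on this page:
#   check_links /var/glite /mnt/devel20
```

The function prints one line per directory and returns a non-zero status if any link is missing or points elsewhere, which makes it easy to call before starting the JC and LM daemons.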
Topic revision: r19 - 2009-11-10 - MarcoCecchi
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.