Difference: DistributedWMS (1 vs. 20)

Revision 20 - 2009-11-10 - FabioCapannini

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 20 to 20
 
glite-lb-bkserverd Yes / Done No

Filesystem sharing

Changed:
<
<
Interoperation between the various WMS components running on two different hosts is guaranteed by exporting /var/glite on devel20 to the host gundam via NFS, this choise is only done for simplicity. gundam mounts devel20 filesystem under /mnt/devel20. Since the gahp_server is also CPU-bound, other than I/O bound, this physical architecture should be better than just using a WMS+LB on a single machine with two separately controlled disks.
>
>
Interoperation between the various WMS components running on two different hosts is guaranteed by exporting /var/glite on devel20 to the host gundam via NFS; this choice was made purely for simplicity. gundam mounts the devel20 filesystem under /mnt/devel20. Since the gahp_server is CPU-bound as well as I/O-bound, this physical architecture should perform better than a WMS+LB on a single machine with two separately controlled disks.
 

devel20: NFS server configuration

On devel20, as root, insert the following lines in /etc/hosts.deny:
Line: 88 to 88
  Each component stores its logs locally; this is especially important for gundam, where the LM, JC and CondorG logs produce a huge amount of data.
Changed:
<
<

configuration:

  1. Set LBproxy = false in the Common section of the WMS configuration file.
  2. The log_monitor daemon looks for X509 credentials in order to authenticate with LB logd under ~glite/.globus. On gundam create the following links to avoid authentication errors (as an alternative, a valid proxy for the user "glite" can be put in /tmp/x509up_uXYZ):
>
>

Configuration

  • Set LBproxy = false in the Common section of the WMS configuration file.
  • The log_monitor daemon looks for X509 credentials under ~glite/.globus in order to authenticate with LB logd. On gundam, create the following links to avoid authentication errors (alternatively, a valid proxy for the user "glite" can be put in /tmp/x509up_uXYZ):
 
# ln -s /home/glite/.certs /home/glite/.globus
# ln -s /home/glite/.certs/hostcert.pem  /home/glite/.certs/usercert.pem
# ln -s /home/glite/.certs/hostkey.pem  /home/glite/.certs/userkey.pem
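
As a quick sanity check (a sketch, not a step from the original guide; it assumes openssl is installed), verify that the linked certificate and key actually match before starting the daemons — the two digests printed below must be identical:

# openssl x509 -noout -modulus -in /home/glite/.certs/usercert.pem | openssl md5
# openssl rsa -noout -modulus -in /home/glite/.certs/userkey.pem | openssl md5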
Changed:
<
<
Useful Condor tweaks:
>
>
  • Disable glite-wms-check-daemons.cron or modify /opt/glite/libexec/glite-wms-check-daemons.sh so that only the desired services are restarted
  • Useful Condor tweaks:
 # on high load it can happen to hit the error
 # "Can't send RESCHEDULE command to condor scheduler"
 SUBMIT_SEND_RESCHEDULE = False
 GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
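
A minimal sketch of how these knobs can be applied on gundam, assuming a standard Condor installation (condor_config_val and condor_reconfig on the PATH, and LOCAL_CONFIG_FILE naming a single writable file):

LOCAL_CONF=$(condor_config_val LOCAL_CONFIG_FILE)
# append the tweaks to the local configuration and reload without restarting
cat >> "$LOCAL_CONF" <<'EOF'
SUBMIT_SEND_RESCHEDULE = False
GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
EOF
condor_reconfig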
Added:
>
>
 
Changed:
<
<

scripts

>
>

Scripts

  devel20:
Line: 119 to 121
 # /opt/glite/etc/init.d/glite-lb-bkserverd start/stop/status
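
Since each host runs its own subset of daemons, a small wrapper script (hypothetical, not shipped with the release) can apply the same action to all of them; edit SERVICES to match the per-host lists above:

#!/bin/sh
# apply start/stop/status to every WMS/LB service configured on this host
ACTION=${1:?usage: $0 start|stop|status}
# devel20 list shown; on gundam use: glite-wms-lm glite-wms-jc glite-lb-locallogger
SERVICES="glite-wms-wm glite-wms-wmproxy glite-proxy-renewald glite-lb-locallogger glite-lb-bkserverd"
for s in $SERVICES; do
    /opt/glite/etc/init.d/$s "$ACTION"
done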
Changed:
<
<
Gundam must be superuser for the LB@devel20
>
>

 A preview from stress tests recently made with CMS (thanks to Enzo Miccio): a >1Hz stable rate to Condor (blue line) whenever Grid resources were able to keep the pace. These tests were made with an experimental version of the gLite WMS, which will be released after patch #1841. -- FabioCapannini - 02 Oct 2008

Revision 19 - 2009-11-10 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 96 to 96
 # ln -s /home/glite/.certs/hostkey.pem /home/glite/.certs/userkey.pem
Changed:
<
<
TODO: Condor tweaks:
  1. Can't send RESCHEDULE command to condor scheduler SUBMIT SEND RESCHEDULE = False
... GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
>
>
Useful Condor tweaks:
# on high load it can happen to hit the error
# "Can't send RESCHEDULE command to condor scheduler"
SUBMIT_SEND_RESCHEDULE = False
GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
 

scripts

Revision 18 - 2009-11-09 - FabioCapannini

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 23 to 23
 Interoperation between the various WMS components running on two different hosts is guaranteed by exporting /var/glite on devel20 to the host gundam via NFS, this choise is only done for simplicity. gundam mounts devel20 filesystem under /mnt/devel20. Since the gahp_server is also CPU-bound, other than I/O bound, this physical architecture should be better than just using a WMS+LB on a single machine with two separately controlled disks.

devel20: NFS server configuration

Changed:
<
<
On devel20, as root, insert the following lines in /etc/hosts.deny:
>
>
On devel20, as root, insert the following lines in /etc/hosts.deny:
 portmap: ALL
 lockd: ALL
 statd: ALL
 mountd: ALL
 rquotad: ALL
Changed:
<
<
Insert the following line in /etc/hosts.allow:
>
>
Insert the following line in /etc/hosts.allow:
 portmap: gundam.cnaf.infn.it
 lockd: gundam.cnaf.infn.it
 statd: gundam.cnaf.infn.it
 mountd: gundam.cnaf.infn.it
 rquotad: gundam.cnaf.infn.it
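
If the tcp_wrappers utilities are installed, tcpdmatch can predict how these rules will treat a given daemon/client pair before any NFS traffic is attempted (a sanity check, not a step from the original guide):

# tcpdmatch mountd gundam.cnaf.infn.it
# tcpdmatch portmap gundam.cnaf.infn.it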
Changed:
<
<
There is no need to restart the portmap daemon.
>
>
There is no need to restart the portmap daemon.
  Start the NFS service:
Line: 51 to 47
  Insert the following line in /etc/exports:
Changed:
<
<
/var/glite  gundam.cnaf.infn.it(rw,sync,wdelay)
>
>
/var/glite gundam.cnaf.infn.it(rw,sync,wdelay,no_root_squash)
  Re-export the filesystem:

# exportfs -r
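To confirm the export is actually visible, the standard NFS utilities can be used (a sanity check, not a step from the original guide): exportfs -v on devel20 lists what is currently exported, and showmount on gundam queries the server's export list.

# exportfs -v
# showmount -e devel20.cnaf.infn.it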

gundam: NFS client configuration

Changed:
<
<
In order to prevent any problems during the booting process, we don't mount the NFS filesystem at boot on gundam. Instead, we configure automount to mount the filesystem automatically at first access, and disable subsequent auto-unmount.
>
>
In order to prevent any problems during the booting process, we don't mount the NFS filesystem at boot on gundam. Instead, we configure automount to mount the filesystem automatically at first access, and disable subsequent auto-unmount.
  As root, insert the following line in /etc/auto.master:
Line: 83 to 78
 Of course, upon subsequent access attempt, the filesystem gets automatically remounted.
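
For example, with the stock autofs tools the mount can be cycled by hand (assuming no process is holding files open on it); sending USR1 asks the automount daemon to unmount all unused autofs mounts, and any subsequent access remounts the filesystem transparently:

# umount /mnt/devel20
# killall -USR1 automount
# ls /mnt/devel20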

gundam: creation of the necessary links

Changed:
<
<
On gundam create the following symbolic links:
>
>
On gundam create the following symbolic links:
If necessary rename the existing directories under /var/glite before creating the links.
 # ln -s /mnt/devel20/jobcontrol /var/glite/jobcontrol
 # ln -s /mnt/devel20/SandboxDir /var/glite/SandboxDir
 # ln -s /mnt/devel20/spool /var/glite/spool
Added:
>
>
# ln -s /mnt/devel20/workload_manager /var/glite/workload_manager
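Before starting the daemons it is worth checking that every shared directory really resolves through the NFS mount; a hypothetical snippet (readlink prints each link target, and the ls fails if the target is not reachable):

for d in jobcontrol SandboxDir spool workload_manager; do
    readlink /var/glite/$d            # should print /mnt/devel20/<dir>
    ls -ld /var/glite/$d/ || echo "WARNING: $d not reachable"
done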
 
Changed:
<
<
This is more than important because ClassAds attributes will still point to the canonical "/var/glite/...."

configuration steps:

On devel20 no changes were made to the default configuration file glite_wms.conf.
On gundam it is necessary to update some entries for the JobController and LogMonitor to find the jobdir under /mnt/devel20.

  1. On gundam
>
>
Each component stores its logs locally; this is especially important for gundam, where the LM, JC and CondorG logs produce a huge amount of data.
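
Given that volume, rotating the local logs is advisable; a minimal sketch using logrotate (the /etc/logrotate.d path and file patterns are assumptions based on the log locations configured in glite_wms.conf below; copytruncate avoids restarting daemons that keep the files open):

cat > /etc/logrotate.d/glite-wms <<'EOF'
/var/log/glite/*_events.log /var/log/glite/logmonitor_external.log {
    daily
    rotate 7
    compress
    copytruncate
    missingok
    notifempty
}
EOF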
 
Changed:
<
<
Configure LBproxy = false in the Common section of the WMS configuration file.

after "exporting"

       GLITE_LOCAL_LOCATION_LOG=/var/log/glite 
       GLITE_LOCAL_LOCATION_VAR=/var/glite 
       GLITE_REMOTE_LOCATION_VAR=/mnt/devel20 
modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:
       LogMonitor = [
       ...
       MonitorInternalDir  =  "${GLITE_LOCAL_LOCATION_VAR}/logmonitor/internal";
       ...
       ]
  1. Always on gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
    JobController = [
    ...
    Input  =  "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/jobdir";
    LockFile  =  "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/lock";
    SubmitFileDir  =  "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/submit";
    ...
    ]

Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data:

LogMonitor = [
...
CondorLogRecycleDir  =  "${GLITE_LOCAL_LOCATION_VAR}/logmonitor/CondorG.log/recycle";
LockFile  =  "${GLITE_LOCAL_LOCATION_LOG}/logmonitor/lock";
CondorLogDir  =  "${GLITE_LOCAL_LOCATION_LOG}/logmonitor/CondorG.log";
LogFile  =  "${GLITE_LOCAL_LOCATION_LOG}/logmonitor_events.log";
ExternalLogFile  =  "${GLITE_LOCAL_LOCATION_LOG}/logmonitor_external.log";
...
]

JobController = [
...
LogFile  =  "${GLITE_LOCAL_LOCATION_LOG}/jobcontoller_events.log";
OutputFileDir  =  "${GLITE_LOCAL_LOCATION_LOG}/jobcontrol/condorio";
...
]
>
>

configuration:

  1. Set LBproxy = false in the Common section of the WMS configuration file.
  2. The log_monitor daemon looks for X509 credentials in order to authenticate with LB logd under ~glite/.globus. On gundam create the following links to avoid authentication errors (as an alternative, a valid proxy for the user "glite" can be put in /tmp/x509up_uXYZ):
# ln -s /home/glite/.certs /home/glite/.globus
# ln -s /home/glite/.certs/hostcert.pem  /home/glite/.certs/usercert.pem
# ln -s /home/glite/.certs/hostkey.pem  /home/glite/.certs/userkey.pem
 
Changed:
<
<
TODO: Condor tweaks:
  1. Can't send RESCHEDULE command to condor scheduler SUBMIT SEND RESCHEDULE = False
... GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
>
>
TODO: Condor tweaks:
  1. Can't send RESCHEDULE command to condor scheduler SUBMIT SEND RESCHEDULE = False
... GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
 

scripts

Line: 172 to 121
  Gundam must be superuser for the LB@devel20
Changed:
<
<
A preview from stress tests recently made with CMS (thanks to Enzo Miccio): a >1Hz stable rate to Condor (blue line) whenever Grid resources were able to keep the pace: These test have been made with an experimental version for the gLite WMS which will be released after patch #1841. -- FabioCapannini - 02 Oct 2008
>
>
A preview from stress tests recently made with CMS (thanks to Enzo Miccio): a >1Hz stable rate to Condor (blue line) whenever Grid resources were able to keep the pace: These test have been made with an experimental version for the gLite WMS which will be released after patch #1841. -- FabioCapannini - 02 Oct 2008

Revision 17 - 2009-11-04 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"

HOW-TO optimise performance distributing WMS and LB on two hosts

WMS+LB physical architecture

Changed:
<
<
In order to gain better performance, the components of a single WMS instance have been distributed on two hosts according to a layout different from the typical one. LBserver is hosted on one machine, in our case devel20, together with WMproxy and WM and without LBproxy, not to store the same events twice on database (this issue will disappear with the advent of LB 2.0) . The Job Submission Service is moved to another machine, gundamfor us. JC+LM+CondorG are hosted by gundam. They connect to the LBserver at devel20 without using an LBproxy outpost on gundam.
>
>
In order to gain better performance, the components of a single WMS instance have been distributed on two hosts according to a layout different from the typical one. LBserver is hosted on one machine, in our case devel20, together with WMproxy and WM and without LBproxy, not to store the same events twice on database (this issue will disappear with the advent of LB 2.0) . The Job Submission Service is moved to another machine, 'gundam' in our case, so that JC+LM+CondorG are hosted by gundam. They connect to the LBserver at devel20 without using an LBproxy outpost on gundam.
 
Changed:
<
<
Configure LBproxy = false on gundam.
>
>
COMPONENTS LAYOUT:
 
Components host devel20 host gundam
glite_wms_wmproxy Yes / Done No
Line: 18 to 18
 
glite-lb-logd Yes / Done Yes / Done
glite-lb-interlogd Yes / Done Yes / Done
glite-lb-bkserverd Yes / Done No
Added:
>
>
 

Filesystem sharing

Changed:
<
<
Interoperation between the various WMS components running on two different hosts is (temporarily) guaranteed by exporting /var/glite on devel20 to the host gundam via NFS, this is done only for simplicity. gundam mounts the filesystem under /mnt/devel20. Since the gahp_server is highly CPU-bound this physical architecture should be better than just using a WMS+LB on a single machine with two separately controlled disks.
>
>
Interoperation between the various WMS components running on two different hosts is guaranteed by exporting /var/glite on devel20 to the host gundam via NFS, this choise is only done for simplicity. gundam mounts devel20 filesystem under /mnt/devel20. Since the gahp_server is also CPU-bound, other than I/O bound, this physical architecture should be better than just using a WMS+LB on a single machine with two separately controlled disks.
 

devel20: NFS server configuration

On devel20, as root, insert the following lines in /etc/hosts.deny:
Line: 91 to 92
  This is more than important because ClassAds attributes will still point to the canonical "/var/glite/...."
Changed:
<
<

configuration

>
>

configuration steps:

 On devel20 no changes were made to the default configuration file glite_wms.conf.
On gundam it is necessary to update some entries for the JobController and LogMonitor to find the jobdir under /mnt/devel20.

  1. On gundam
Added:
>
>
Configure LBproxy = false in the Common section of the WMS configuration file.
  after "exporting"
       GLITE_LOCAL_LOCATION_LOG=/var/log/glite 

Revision 16 - 2009-08-31 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Changed:
<
<

HOW-TO optimise performance distributing WMS+LB services on two different hosts

>
>

HOW-TO optimise performance distributing WMS and LB on two hosts

 

WMS+LB physical architecture

Changed:
<
<
In order to gain better performance, the components of a single WMS instance have been distributed on two hosts according to a pattern different from the typical one. LBserver (without LBproxy not to store the same events twice on database - this issue will disappear with the advent of LB 2.0) is hosted on devel20, whereby WMproxy and WM are also present. The Job Submission Service is totally moved to another machine so that JC+LM+CondorG are hosted by gundam. They connect to the LBserver at devel20 without using an LBproxy outpost on gundam.
>
>
In order to gain better performance, the components of a single WMS instance have been distributed on two hosts according to a layout different from the typical one. LBserver is hosted on one machine, in our case devel20, together with WMproxy and WM and without LBproxy, not to store the same events twice on database (this issue will disappear with the advent of LB 2.0) . The Job Submission Service is moved to another machine, gundamfor us. JC+LM+CondorG are hosted by gundam. They connect to the LBserver at devel20 without using an LBproxy outpost on gundam.
  Configure LBproxy = false on gundam.
Line: 165 to 165
 # /opt/glite/etc/init.d/glite-lb-bkserverd start/stop/status
Added:
>
>
Gundam must be superuser for the LB@devel20
 A preview from stress tests recently made with CMS (thanks to Enzo Miccio): a >1Hz stable rate to Condor (blue line) whenever Grid resources were able to keep the pace: These test have been made with an experimental version for the gLite WMS which will be released after patch #1841.
Deleted:
<
<
devel20gundam_stress_tests_CMS.jpg
 -- FabioCapannini - 02 Oct 2008
Deleted:
<
<
META FILEATTACHMENT attachment="devel20gundam_stress_tests_CMS.jpg" attr="" comment="" date="1223471791" name="devel20gundam_stress_tests_CMS.jpg" path="devel20gundam_stress_tests_CMS.jpg" size="109553" stream="devel20gundam_stress_tests_CMS.jpg" user="Main.MarcoCecchi" version="1"

Revision 15 - 2009-04-10 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 6 to 6
 

WMS+LB physical architecture

In order to gain better performance, the components of a single WMS instance have been distributed on two hosts according to a pattern different from the typical one. LBserver (without LBproxy not to store the same events twice on database - this issue will disappear with the advent of LB 2.0) is hosted on devel20, whereby WMproxy and WM are also present. The Job Submission Service is totally moved to another machine so that JC+LM+CondorG are hosted by gundam. They connect to the LBserver at devel20 without using an LBproxy outpost on gundam.
Added:
>
>
Configure LBproxy = false on gundam.
 
Components host devel20 host gundam
glite_wms_wmproxy Yes / Done No
glite-wms-workload_manager Yes / Done No

Revision 13 - 2008-10-16 - FabioCapannini

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 93 to 93
 On devel20 no changes were made to the default configuration file glite_wms.conf.
On gundam it is necessary to update some entries for the JobController and LogMonitor to find the jobdir under /mnt/devel20.
Changed:
<
<
1 On gundam
>
>
  1. On gundam
 after "exporting"
Changed:
<
<
GLITE_LOCAL_LOCATION_LOG=/var/log/glite
GLITE_LOCAL_LOCATION_VAR=/var/glite
GLITE_REMOTE_LOCATION_VAR=/mnt/devel20
>
>
       GLITE_LOCAL_LOCATION_LOG=/var/log/glite 
       GLITE_LOCAL_LOCATION_VAR=/var/glite 
       GLITE_REMOTE_LOCATION_VAR=/mnt/devel20 
 modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:
Deleted:
<
<
 
Deleted:
<
<
...
 LogMonitor = [
 ...
 MonitorInternalDir = "${GLITE_LOCAL_LOCATION_VAR}/logmonitor/internal";
 ...
Changed:
<
<
]
  1. Always on gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
>
>
]
  1. Always on gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
 
JobController = [
...
Line: 119 to 114
 LockFile = "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/lock"; SubmitFileDir = "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/submit"; ...
Changed:
<
<
]
>
>
]
  Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data:

Revision 12 - 2008-10-15 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Changed:
<
<

HOW-TO optimise performance distributing WMS+LB components on two different hosts

WMS+LB components distribution

>
>

HOW-TO optimise performance distributing WMS+LB services on two different hosts

WMS+LB physical architecture

 In order to gain better performance, the components of a single WMS instance have been distributed on two hosts according to a pattern different from the typical one. LBserver ( without LBproxy not to store the same events twice on database - this issue will disappear with the advent of LB 2.0) is hosted on devel20, whereby WMproxy and WM are also present. The Job Submission Service is totally moved to another machine so that JC+LM+CondorG are hosted by gundam. They connect to the LBserver at devel20 without using an LBproxy outpost on gundam.

Components host devel20 host gundam
Line: 106 to 106
 ... LogMonitor = [ ...
Changed:
<
<
MonitorInternalDir = "${GLITE_REMOTE_LOCATION_VAR}/logmonitor/internal";
>
>
MonitorInternalDir = "${GLITE_LOCAL_LOCATION_VAR}/logmonitor/internal";
 ... ]

Revision 11 - 2008-10-08 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 17 to 17
 
glite-lb-interlogd Yes / Done Yes / Done
glite-lb-bkserverd Yes / Done No

Filesystem sharing

Changed:
<
<
Interoperation between the various WMS components running on two different hosts is (temporarily) guaranteed by exporting /var/glite on devel20 to the host gundam via NFS. gundam mounts the filesystem under /mnt/devel20. Since the gahp_server is highly CPU-bound this physical architecture should be better than just using a WMS+LB on a single machine with two separately controlled disks.
>
>
Interoperation between the various WMS components running on two different hosts is (temporarily) guaranteed by exporting /var/glite on devel20 to the host gundam via NFS, this is done only for simplicity. gundam mounts the filesystem under /mnt/devel20. Since the gahp_server is highly CPU-bound this physical architecture should be better than just using a WMS+LB on a single machine with two separately controlled disks.
 

devel20: NFS server configuration

On devel20, as root, insert the following lines in /etc/hosts.deny:
Line: 90 to 90
 This is more than important because ClassAds attributes will still point to the canonical "/var/glite/...."

configuration

Changed:
<
<
On devel20 no changes are made to the configuration of WMS.
On gundam, on the other way, it is necessary to instruct the JobController to find the jobdir under the /mnt/devel20 filesystem.
>
>
On devel20 no changes were made to the default configuration file glite_wms.conf.
On gundam it is necessary to update some entries for the JobController and LogMonitor to find the jobdir under /mnt/devel20.
 
Changed:
<
<
  1. On gundam, modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:
>
>
1 On gundam
 
Changed:
<
<
after "exporting" GLITE_LOCAL_LOCATION_LOG=$GLITE_LOCATION_LOG
 and GLITE_REMOTE_LOCATION_VAR=/mnt/devel20
>
>
after "exporting" GLITE_LOCAL_LOCATION_LOG=/var/log/glite
GLITE_LOCAL_LOCATION_VAR=/var/glite
GLITE_REMOTE_LOCATION_VAR=/mnt/devel20

modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:

 
Added:
>
>
...
 LogMonitor = [
 ...
 MonitorInternalDir = "${GLITE_REMOTE_LOCATION_VAR}/logmonitor/internal";
 ...
 ]
Changed:
<
<
  1. On gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
>
>
  1. Always on gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
 
JobController = [
Line: 165 to 170
 

A preview from stress tests recently made with CMS (thanks to Enzo Miccio): a >1Hz stable rate to Condor (blue line) whenever Grid resources were able to keep the pace:

Added:
>
>
These test have been made with an experimental version for the gLite WMS which will be released after patch #1841.
  devel20gundam_stress_tests_CMS.jpg

Revision 10 - 2008-10-08 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Changed:
<
<

HOW-TO distribute WMS components on two different hosts

WMS Components distribution

In order to gain better performances the components of a single WMS instance have been distributed on two hosts according to the following table:
>
>

HOW-TO optimise performance distributing WMS+LB components on two different hosts

WMS+LB components distribution

In order to gain better performance, the components of a single WMS instance have been distributed on two hosts according to a pattern different from the typical one. LBserver ( without LBproxy not to store the same events twice on database - this issue will disappear with the advent of LB 2.0) is hosted on devel20, whereby WMproxy and WM are also present. The Job Submission Service is totally moved to another machine so that JC+LM+CondorG are hosted by gundam. They connect to the LBserver at devel20 without using an LBproxy outpost on gundam.
 
Components host devel20 host gundam
glite_wms_wmproxy Yes / Done No
Line: 12 to 12
 
glite-proxy-renewd Yes / Done No
glite-wms-job_controller No Yes / Done
glite-wms-log_monitor No Yes / Done
Changed:
<
<
CondorG No Yes / Done
>
>
CondorG No Yes / Done
 
glite-lb-logd Yes / Done Yes / Done
Changed:
<
<
glite-lb-iterlogd Yes / Done Yes / Done
>
>
glite-lb-interlogd Yes / Done Yes / Done
 
glite-lb-bkserverd Yes / Done No

Filesystem sharing

Changed:
<
<
The inter-operation of the WMS components running on two different hosts is guaranteed by exporting the filesystem devel20:/var/glite to the host gundam via NFS. The host gundam mounts the filesystem under /mnt/devel20.
>
>
Interoperation between the various WMS components running on two different hosts is (temporarily) guaranteed by exporting /var/glite on devel20 to the host gundam via NFS. gundam mounts the filesystem under /mnt/devel20. Since the gahp_server is highly CPU-bound this physical architecture should be better than just using a WMS+LB on a single machine with two separately controlled disks.
 

devel20: NFS server configuration

On devel20, as root, insert the following lines in /etc/hosts.deny:
Changed:
<
<
portmap: ALL
>
>
portmap: ALL
 lockd: ALL
 statd: ALL
 mountd: ALL
Changed:
<
<
rquotad: ALL
>
>
rquotad: ALL
 Insert the following line in /etc/hosts.allow:
Changed:
<
<
portmap: gundam.cnaf.infn.it
>
>
portmap: gundam.cnaf.infn.it
 lockd: gundam.cnaf.infn.it
 statd: gundam.cnaf.infn.it
 mountd: gundam.cnaf.infn.it
Changed:
<
<
rquotad: gundam.cnaf.infn.it
>
>
rquotad: gundam.cnaf.infn.it
 There is no need to restart the portmap daemon.

Start the NFS service:

Line: 77 to 81
 

gundam: creation of the necessary links

On gundam create the following symbolic links:
Changed:
<
<
# ln -s /mnt/devel20/jobcontrol /var/glite/jobcontrol
>
>
# ln -s /mnt/devel20/jobcontrol /var/glite/jobcontrol
 # ln -s /mnt/devel20/SandboxDir /var/glite/SandboxDir
 # ln -s /mnt/devel20/spool /var/glite/spool
Changed:
<
<

WMS configuration

>
>
This is more than important because ClassAds attributes will still point to the canonical "/var/glite/...."

configuration

 On devel20 no changes are made to the configuration of WMS.
On gundam, on the other way, it is necessary to instruct the JobController to find the jobdir under the /mnt/devel20 filesystem.
Changed:
<
<
  1. On gundam, modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:
>
>
  1. On gundam, modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:
 
Added:
>
>
after "exporting" GLITE_LOCAL_LOCATION_LOG=$GLITE_LOCATION_LOG and GLITE_REMOTE_LOCATION_VAR=/mnt/devel20
 LogMonitor = [ ...
Changed:
<
<
MonitorInternalDir = "/mnt/devel20/logmonitor/internal";
>
>
MonitorInternalDir = "${GLITE_REMOTE_LOCATION_VAR}/logmonitor/internal";
 ...
Changed:
<
<
]
  1. On gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
    JobController = [
>
>
]
  1. On gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:

JobController = [
 ...
Changed:
<
<
Input = "/mnt/devel20/jobcontrol/jobdir"; LockFile = "/mnt/devel20/jobcontrol/lock"; SubmitFileDir = "/mnt/devel20/jobcontrol/submit";
>
>
Input = "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/jobdir"; LockFile = "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/lock"; SubmitFileDir = "${GLITE_REMOTE_LOCATION_VAR}/jobcontrol/submit";
 ...
Changed:
<
<
]
>
>
]
  Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data:
Changed:
<
<
LogMonitor = [
>
>
LogMonitor = [
 ...
Changed:
<
<
CondorLogRecycleDir = "${GLITE_LOCATION_VAR}/logmonitor/CondorG.log/recycle"; LockFile = "${GLITE_LOCATION_LOG}/logmonitor/lock"; CondorLogDir = "${GLITE_LOCATION_LOG}/logmonitor/CondorG.log"; LogFile = "${GLITE_LOCATION_LOG}/logmonitor_events.log"; ExternalLogFile = "${GLITE_LOCATION_LOG}/logmonitor_external.log";
>
>
CondorLogRecycleDir = "${GLITE_LOCAL_LOCATION_VAR}/logmonitor/CondorG.log/recycle"; LockFile = "${GLITE_LOCAL_LOCATION_LOG}/logmonitor/lock"; CondorLogDir = "${GLITE_LOCAL_LOCATION_LOG}/logmonitor/CondorG.log"; LogFile = "${GLITE_LOCAL_LOCATION_LOG}/logmonitor_events.log"; ExternalLogFile = "${GLITE_LOCAL_LOCATION_LOG}/logmonitor_external.log";
 ... ]

JobController = [ ...

Changed:
<
<
LogFile = "${GLITE_LOCATION_LOG}/jobcontoller_events.log"; OutputFileDir = "${GLITE_LOCATION_LOG}/jobcontrol/condorio";
>
>
LogFile = "${GLITE_LOCAL_LOCATION_LOG}/jobcontoller_events.log"; OutputFileDir = "${GLITE_LOCAL_LOCATION_LOG}/jobcontrol/condorio";
 ... ]

TODO: Condor tweaks:

Changed:
<
<
  1. Can't send RESCHEDULE command to condor scheduler
>
>
  1. Can't send RESCHEDULE command to condor scheduler
  SUBMIT SEND RESCHEDULE = False
Added:
>
>
... GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 100
 
Changed:
<
<

WMS startup

>
>

scripts

 
Changed:
<
<
On devel20 start the following components:
# /opt/glite/etc/init.d/glite-wms-wm start
# /opt/glite/etc/init.d/glite-wms-wmproxy start
# /opt/glite/etc/init.d/glite-proxy-renewald start
# /opt/glite/etc/init.d/glite-lb-locallogger start 
# /opt/glite/etc/init.d/glite-lb-bkserverd start

On gundam start the following components:

# /opt/glite/etc/init.d/glite-wms-lm start
# /opt/glite/etc/init.d/glite-wms-jc start
# /opt/glite/etc/init.d/glite-lb-locallogger start
# /opt/glite/etc/init.d/glite-lb-bkserverd start 
>
>
devel20:
# /opt/glite/etc/init.d/glite-wms-wm start/stop/status
# /opt/glite/etc/init.d/glite-wms-wmproxy start/stop/status
# /opt/glite/etc/init.d/glite-proxy-renewald start/stop/status
# /opt/glite/etc/init.d/glite-lb-locallogger start/stop/status 
# /opt/glite/etc/init.d/glite-lb-bkserverd start/stop/status

gundam:

# /opt/glite/etc/init.d/glite-wms-lm start/stop/status
# /opt/glite/etc/init.d/glite-wms-jc start/stop/status
# /opt/glite/etc/init.d/glite-lb-locallogger start/stop/status
# /opt/glite/etc/init.d/glite-lb-bkserverd start/stop/status 

A preview from stress tests recently made with CMS (thanks to Enzo Miccio): a >1Hz stable rate to Condor (blue line) whenever Grid resources were able to keep the pace:

devel20gundam_stress_tests_CMS.jpg

  -- FabioCapannini - 02 Oct 2008
Added:
>
>
META FILEATTACHMENT attachment="devel20gundam_stress_tests_CMS.jpg" attr="" comment="" date="1223471791" name="devel20gundam_stress_tests_CMS.jpg" path="devel20gundam_stress_tests_CMS.jpg" size="109553" stream="devel20gundam_stress_tests_CMS.jpg" user="Main.MarcoCecchi" version="1"

Revision 8 - 2008-10-06 - FabioCapannini

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 75 to 75
 
  • sending the USR1 signal to the automount daemon
Of course, upon subsequent access attempt, the filesystem gets automatically remounted.

Changed:
<
<

Creation of the necessary links on gundam

>
>

gundam: creation of the necessary links

 On gundam create the following symbolic links:
# ln -s /mnt/devel20/jobcontrol /var/glite/jobcontrol
# ln -s /mnt/devel20/SandboxDir /var/glite/SandboxDir
Line: 102 to 102
 ... ]
Changed:
<
<
Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data. (...)
>
>
Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data:

LogMonitor = [
...
CondorLogRecycleDir  =  "${GLITE_LOCATION_VAR}/logmonitor/CondorG.log/recycle";
LockFile  =  "${GLITE_LOCATION_LOG}/logmonitor/lock";
CondorLogDir  =  "${GLITE_LOCATION_LOG}/logmonitor/CondorG.log";
LogFile  =  "${GLITE_LOCATION_LOG}/logmonitor_events.log";
ExternalLogFile  =  "${GLITE_LOCATION_LOG}/logmonitor_external.log";
...
]

JobController = [
...
LogFile  =  "${GLITE_LOCATION_LOG}/jobcontoller_events.log";
OutputFileDir  =  "${GLITE_LOCATION_LOG}/jobcontrol/condorio";
...
]
  TODO: Condor tweaks:
Changed:
<
<
1) Can't send RESCHEDULE command to condor scheduler
>
>
  1. Can't send RESCHEDULE command to condor scheduler
 SUBMIT SEND RESCHEDULE = False

WMS startup

Revision 7 - 2008-10-04 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 105 to 105
 Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data. (...)
Added:
>
>
TODO: Condor tweaks:
1) Can't send RESCHEDULE command to condor scheduler
SUBMIT SEND RESCHEDULE = False
 

WMS startup

On devel20 start the following components:

Revision 6 - 2008-10-03 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 117 to 117
 On gundam start the following components:
# /opt/glite/etc/init.d/glite-wms-lm start
# /opt/glite/etc/init.d/glite-wms-jc start
Changed:
<
<
# /opt/glite/etc/init.d/glite-lb-locallogger start
>
>
# /opt/glite/etc/init.d/glite-lb-locallogger start
# /opt/glite/etc/init.d/glite-lb-bkserverd start
  -- FabioCapannini - 02 Oct 2008

Revision 5 - 2008-10-03 - FabioCapannini

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 15 to 15
 
CondorG No Yes / Done
glite-lb-logd Yes / Done Yes / Done
glite-lb-iterlogd Yes / Done Yes / Done
Changed:
<
<
>
>
glite-lb-bkserverd Yes / Done No
 

Filesystem sharing

Changed:
<
<
The inter-operation of the WMS components running on two different hosts is guaranteed by exporting the filesystem devel20:/var/glite to the host gundam via NFS. The host gundam mounts the filesystem under /mnt/devel20.
>
>
The inter-operation of the WMS components running on two different hosts is guaranteed by exporting the filesystem devel20:/var/glite to the host gundam via NFS. The host gundam mounts the filesystem under /mnt/devel20.
 

devel20: NFS server configuration

On devel20, as root, insert the following lines in /etc/hosts.deny:
Line: 52 to 52
 

gundam: NFS client configuration

In order to prevent any problems during the booting process, we don't mount the NFS filesystem at boot on gundam.
Changed:
<
<
Instead, we configure automount to mount the filesystem automatically at first access.
>
>
Instead, we configure automount to mount the filesystem automatically at first access, and disable subsequent auto-unmount.
 
Changed:
<
<
As root, create the mount point:
>
>
As root, insert the following line in /etc/auto.master:
 
Changed:
<
<
# mkdir /mnt/gundam
>
>
/mnt   /etc/auto.mnt --timeout=0
 
Changed:
<
<
# chmod 700 /mnt/devel20
>
>
Create the file /etc/auto.mnt with the following line:
 
Changed:
<
<
Insert the following line in /etc/auto.master:
>
>
devel20   -rw,hard,intr,nosuid,noauto,timeo=600,wsize=32768,rsize=32768,tcp   devel20.cnaf.infn.it:/var/glite
 
Changed:
<
<
/mnt   /etc/auto.mnt
>
>
Start the automount daemon:
 
Changed:
<
<
Create the file /etc/auto.mnt with the following line:
>
>
# /etc/init.d/autofs start
 
Changed:
<
<
devel20   -rw,hard,intr,nosuid,noauto,timeo=600,wsize=32768,rsize=32768,tcp   devel20.cnaf.infn.it:/var/glite
>
>
Make automount start at boot:

# chkconfig autofs on

 
Changed:
<
<
The filesystem /mnt/devel20 gets mounted at first access attempt.
>
>
The filesystem /mnt/devel20 gets mounted automatically at first access attempt after boot, and is never automatically unmounted. If the filesystem is not busy, it can be manually unmounted either by:
  • issuing the usual command `umount /mnt/devel20`
  • sending the USR1 signal to the automount daemon
Of course, upon subsequent access attempt, the filesystem gets automatically remounted.
 

Creation of the necessary links on gundam

On gundam create the following symbolic links:
Line: 78 to 83
 

WMS configuration

Changed:
<
<
On devel20 no changes are made to the configuration of WMS.
>
>
On devel20 no changes are made to the configuration of WMS.
On gundam, on the other way, it is necessary to instruct the JobController to find the jobdir under the /mnt/devel20 filesystem.
 
Changed:
<
<
On gundam, on the other way, it is necessary to instruct the JobController to find the jobdir under the /mnt/devel20 filesystem.

On gundam, modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:

LogMonitor = [
>
>
  1. On gundam, modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:
    LogMonitor = [
 ...
 MonitorInternalDir = "/mnt/devel20/logmonitor/internal";
 ...
 ]
Changed:
<
<
On gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
>
>
  1. On gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:
 
JobController = [
...
Input  =  "/mnt/devel20/jobcontrol/jobdir";
Line: 98 to 102
 ... ]
Changed:
<
<
Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data.
>
>
Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data.
 (...)

WMS startup

Revision 4 - 2008-10-03 - MarcoCecchi

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Line: 98 to 98
 ... ]
Added:
>
>
Of course each component stores its logs locally, this is especially important for gundam where the LM, JC and CondorG logs produce a huge amount of data. (...)
 

WMS startup

Added:
>
>
 On devel20 start the following components:
# /opt/glite/etc/init.d/glite-wms-wm start
# /opt/glite/etc/init.d/glite-wms-wmproxy start
Line: 103 to 107
 
# /opt/glite/etc/init.d/glite-wms-wm start
# /opt/glite/etc/init.d/glite-wms-wmproxy start
# /opt/glite/etc/init.d/glite-proxy-renewald start
Changed:
<
<
# /opt/glite/etc/init.d/glite-lb-locallogger start
>
>
# /opt/glite/etc/init.d/glite-lb-locallogger start
# /opt/glite/etc/init.d/glite-lb-bkserverd start
  On gundam start the following components:
Changed:
<
<
# /opt/glite/etc/init.d/glite-wms-wm start
# /opt/glite/etc/init.d/glite-wms-lm start
>
>
# /opt/glite/etc/init.d/glite-wms-lm start
 # /opt/glite/etc/init.d/glite-wms-jc start
 # /opt/glite/etc/init.d/glite-lb-locallogger start

Revision 3 - 2008-10-02 - FabioCapannini

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Changed:
<
<
prova
>
>
 
Deleted:
<
<
-- MarcoCecchi - 02 Oct 2008
Added:
>
>

HOW-TO distribute WMS components on two different hosts

WMS Components distribution

In order to gain better performances the components of a single WMS instance have been distributed on two hosts according to the following table:

Components host devel20 host gundam
glite_wms_wmproxy Yes / Done No
glite-wms-workload_manager Yes / Done No
glite-proxy-renewd Yes / Done No
glite-wms-job_controller No Yes / Done
glite-wms-log_monitor No Yes / Done
CondorG No Yes / Done
glite-lb-logd Yes / Done Yes / Done
glite-lb-iterlogd Yes / Done Yes / Done

Filesystem sharing

The inter-operation of the WMS components running on two different hosts is guaranteed by exporting the filesystem devel20:/var/glite to the host gundam via NFS. The host gundam mounts the filesystem under /mnt/devel20.

devel20: NFS server configuration

On devel20, as root, insert the following lines in /etc/hosts.deny:
portmap: ALL
lockd: ALL
statd: ALL
mountd: ALL
rquotad: ALL 
Insert the following line in /etc/hosts.allow:
portmap: gundam.cnaf.infn.it
lockd: gundam.cnaf.infn.it 
statd: gundam.cnaf.infn.it
mountd: gundam.cnaf.infn.it
rquotad: gundam.cnaf.infn.it 
There is no need to restart the portmap daemon.

Start the NFS service:

# /etc/init.d/nfs start

Make the NFS service start at boot:

# chkconfig nfs on

Insert the following line in /etc/exports:

/var/glite   gundam.cnaf.infn.it(rw,sync,wdelay)

Re-export the filesystem:

# exportfs -r

gundam: NFS client configuration

In order to prevent any problems during the booting process, we don't mount the NFS filesystem at boot on gundam. Instead, we configure automount to mount the filesystem automatically at first access.

As root, create the mount point:

# mkdir /mnt/gundam

# chmod 700 /mnt/devel20

Insert the following line in /etc/auto.master:

/mnt   /etc/auto.mnt

Create the file /etc/auto.mnt with the following line:

devel20   -rw,hard,intr,nosuid,noauto,timeo=600,wsize=32768,rsize=32768,tcp   devel20.cnaf.infn.it:/var/glite

The filesystem /mnt/devel20 gets mounted at first access attempt.

Creation of the necessary links on gundam

On gundam create the following symbolic links:
# ln -s /mnt/devel20/jobcontrol /var/glite/jobcontrol
# ln -s /mnt/devel20/SandboxDir /var/glite/SandboxDir
# ln -s /mnt/devel20/spool /var/glite/spool

WMS configuration

On devel20 no changes are made to the configuration of WMS.

On gundam, on the other way, it is necessary to instruct the JobController to find the jobdir under the /mnt/devel20 filesystem.

On gundam, modify the following line in the LogMonitor [] section of /opt/glite/etc/glite_wms.conf:

LogMonitor = [
...
MonitorInternalDir  =  "/mnt/devel20/logmonitor/internal";
...
]

On gundam, modify the following lines in the JobController [] section of /opt/glite/etc/glite_wms.conf:

JobController = [
...
Input  =  "/mnt/devel20/jobcontrol/jobdir";
LockFile  =  "/mnt/devel20/jobcontrol/lock";
SubmitFileDir  =  "/mnt/devel20/jobcontrol/submit";
...
]

WMS startup

On devel20 start the following components:
# /opt/glite/etc/init.d/glite-wms-wm start
# /opt/glite/etc/init.d/glite-wms-wmproxy start
# /opt/glite/etc/init.d/glite-proxy-renewald start
# /opt/glite/etc/init.d/glite-lb-locallogger start 

On gundam start the following components:

# /opt/glite/etc/init.d/glite-wms-wm start
# /opt/glite/etc/init.d/glite-wms-lm start
# /opt/glite/etc/init.d/glite-wms-jc start
# /opt/glite/etc/init.d/glite-lb-locallogger start 

-- FabioCapannini - 02 Oct 2008

Revision 2 - 2008-10-02 - FabioCapannini

Line: 1 to 1
 
META TOPICPARENT name="WMS_guide"
Changed:
<
<
>
>
prova
  -- MarcoCecchi - 02 Oct 2008

Revision 1 - 2008-10-02 - MarcoCecchi

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WMS_guide"

-- MarcoCecchi - 02 Oct 2008

 