Introduction
This installation guide is divided as follows:
- Installation of the sensor on the machine to be monitored (WMS/LB/WMSLB)
- Installation of the data_collector server
This version of the server and of the sensor requires SL4 and snmp (installed by the installation scripts) on all the nodes.
Python, httpd and php are also needed on the collector machine.
If sensors and/or the data_collector are already running at your site, instructions on how to upgrade are given, highlighted in green.
It is advisable, though not required, to install the sensors before the collector.
In case of problems during installation please contact wms-support<at>cnaf.infn.it
NOTE: In release 2.0 the database schema has changed! If you are running an update, the database will be automatically ported to the new schema by the installation script. However, old data will disappear from the new VO statistics page.
If you want to port the old data to the new VO statistics page, please contact wms-support<at>cnaf.infn.it after the installation/configuration completes.
WMS/LB sensors installation:
Sensors MUST be installed on every machine that needs to be monitored, whether it is a WMS, an LB or a coupled WMSLB. The following procedure is identical for every kind of node, since all the sensors are installed on every node. This wastes a little disk space but simplifies the installation procedure.
(do all of the following as root)
- Download the following tar.gz in /root:
cd /root
wget http://grid-it.cnaf.infn.it/certification/downloads/wmsmon_sensors_v2.0.tgz
- If this is an upgrade, remove the existing /root/wmsmon directory, backing up the wmsmon_site-info.def file first if you want to keep it
- Unpack the archive:
tar -xvzf wmsmon_sensors_v2.0.tgz
- cd /root/wmsmon
- Edit the wmsmon_site-info.def (after having read the next few notes)
The wmsmon_site-info.def file contains all the information needed to set up the monitor components correctly. Edit it carefully, using the key = value notation.
A line is a comment if it begins with #.
You can leave the defaults where you see no need for a change. On the WMS/LB instances you certainly need to set the following keys:
- WMSMON_HOST = 'Set here the name of the wmsmon data collector host'
- LEMONFLAG = 1 'If you do not have a lemon tool on the wms/lb instance set to 0'
- LEMONURL = 'Set to the machine lemon url if any'
- SERVER_MYSQL_PASSWORD = 'If the node is an LB set the root mysql passwd here' !!! IT IS IMPORTANT TO HAVE THE RIGHT PASSWORD IN THE wmsmon_site-info.def OF LB NODES; it is not important on WMS nodes. If the node was installed using the gLite yaim tool, this password can be taken from the site_info.def file used at configuration time.
- SNMPPASSWD = 'The password you choose for snmp communication between the data collector and the wms instance' !! NOTE: it must be the same in every wmsmon_site-info.def
When setting a password, please read the comments in the wmsmon_site-info.def file to find out whether the password must be enclosed between ' '
- LB_PARA_HOST = Host publishing the LB_PARAMETER FILE that initializes lb queries. It is the protocol://host:port of the http service running on the collector instance.
If you follow the instructions below for the server installation without modifying the httpd configuration, it is
http://<WMSMON_HOST>
# Other examples in case of modified httpd conf
#Examples:
# http://host.domain
# https://host.domain
# https://host.domain:8443
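Before running setup.sh it can help to check the edited file mechanically. The following Python sketch parses key = value lines, skips # comments, and lists which of the keys named above are still missing or empty. It is only an illustration: the function names, the KEYS_TO_REVIEW list and the sample text are assumptions made here, and the real file's quoting rules (documented in the .def file itself) are deliberately not interpreted.

```python
# Keys the guide says must be reviewed on WMS/LB nodes.
# LEMONURL and SERVER_MYSQL_PASSWORD are left out because they may
# legitimately stay unset on some nodes (assumption of this sketch).
KEYS_TO_REVIEW = ["WMSMON_HOST", "LEMONFLAG", "SNMPPASSWD", "LB_PARA_HOST"]


def parse_site_info(text):
    """Return {key: raw value} for every 'key = value' line.

    Blank lines and lines starting with '#' are skipped. Values are kept
    verbatim (quotes included); quoting rules are not assumed here.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        values[key.strip()] = value.strip()
    return values


def unset_keys(values, keys=KEYS_TO_REVIEW):
    """List the reviewed keys that are missing or have an empty value."""
    return [key for key in keys if values.get(key, "") in ("", "''", '""')]


# Hypothetical excerpt, not the shipped file.
sample = """
# comment line
WMSMON_HOST = collector.example.org
LEMONFLAG = 0
SNMPPASSWD = ''
"""
print(unset_keys(parse_site_info(sample)))
```

To check a real file, read /root/wmsmon/wmsmon_site-info.def into a string and pass it to parse_site_info.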
- Run the setup.sh file in /root/wmsmon
At the end you should have the following rpms installed on the machine:
net-snmp-devel-5.1.2-11.el4_6.11.2
net-snmp-5.1.2-11.el4_6.11.2
net-snmp-utils-5.1.2-11.el4_6.11.2
net-snmp-libs-5.1.2-11.el4_6.11.2
net-snmp-perl-5.1.2-11.el4_6.11.2
(version numbers may vary)
Finally, the file /etc/rc.local should contain a line about snmp.
Please check all of the above.
- In /root/wmsmon run "python WMSLB_wmsmon_configuration.py"
If python is not present on the machine, try running the executable WMSLB_wmsmon_configuration in /root/wmsmon/snmpconf/
At the end, check that the following lines are present in the /etc/snmp/snmpd.conf file:
exec .1.3.6.1.4.1.10403.98 /bin/sh /root/wmsmon/bin/send_ce_stats.sh
exec .1.3.6.1.4.1.10403.97 /bin/sh /root/wmsmon/bin/send_users_stats.sh
exec .1.3.6.1.4.1.10403.96 /bin/sh /root/wmsmon/bin/sendLongFile/send_long_file /root/wmsmon/tmp/USERSMAPPING.txt 5
exec .1.3.6.1.4.1.10403.95 /bin/sh /root/wmsmon/bin/sendLongFile/send_long_file /root/wmsmon/tmp/CE_MM.txt 50
exec .1.3.6.1.4.1.10403.94 /bin/sh /root/wmsmon/bin/CE_MM.sh
exec .1.3.6.1.4.1.10403.60 /bin/sh /root/wmsmon/bin/wms/wms-sensor-wrapper
exec .1.3.6.1.4.1.10403.70 /bin/sh /root/wmsmon/bin/lb/lb-sensor-wrapper
exec .1.3.6.1.4.1.10403.75 /bin/sh /root/wmsmon/bin/lb-refill/lb-refill-sensor-wrapper
The OIDs are those set in the wmsmon_site-info.def file.
The snmp daemon should also be running:
#service snmpd status
#snmpd (pid 21770) is running...
- The sensors installation on the node is now completed.
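The snmpd.conf check described above can be automated. The sketch below is a minimal Python illustration that collects the OIDs of the uncommented exec lines and reports which expected ones are absent. The OID list is copied from the lines shown in this guide; since the OIDs actually come from wmsmon_site-info.def, they may differ at your site. The function names are hypothetical.

```python
BASE_OID = ".1.3.6.1.4.1.10403."
# Suffixes taken from the exec lines listed in this guide; adjust them if
# the OIDs were changed in wmsmon_site-info.def.
EXPECTED_OIDS = [BASE_OID + suffix
                 for suffix in ("98", "97", "96", "95", "94", "60", "70", "75")]


def exec_oids(snmpd_conf_text):
    """Collect the OID of every uncommented 'exec <oid> <command...>' line."""
    found = set()
    for raw_line in snmpd_conf_text.splitlines():
        fields = raw_line.split()
        # A commented line starts with '#', so fields[0] is not 'exec'.
        if len(fields) >= 3 and fields[0] == "exec" and fields[1].startswith("."):
            found.add(fields[1])
    return found


def missing_oids(snmpd_conf_text, expected=EXPECTED_OIDS):
    """Return the expected OIDs that have no exec line, in the given order."""
    found = exec_oids(snmpd_conf_text)
    return [oid for oid in expected if oid not in found]
```

Calling missing_oids on the contents of /etc/snmp/snmpd.conf should return an empty list on a correctly configured node.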
WMSMON data collector installation
(do all of the following as root)
- Install an SL4 machine (Not to be done if this is an upgrade)
- If this is an upgrade, it is safer to create a backup copy of the wmsmon database using a utility such as mysqldump
- cd /root
- Download this install script file in /root and give it executable permission:
wget http://grid-it.cnaf.infn.it/certification/downloads/install_wmsmon_server-v2.0_00.py
chmod +x /root/install_wmsmon_server-v2.0_00.py
- Install mysql (if it's not yet installed) and start the corresponding service:
yum install mysql-server
service mysqld start
- Run the script with the -i option to start the installation process
/root/install_wmsmon_server-v2.0_00.py -i
The script will look for older wmsmon installations. If none is found, a brand new installation will start; otherwise an upgrade will be attempted.
Errors will be reported; please pay attention to them.
You will be asked for the root mysql password, which is needed to check for older installations.
When the script has completed, the following directories should be present:
/root/wmsmon
/var/www/html/wmsmon
If an older installation was found, a wmsmon_old directory and a /tmp/wmsmon_web_bkp directory are created, containing all the old files.
Now you can proceed with the configuration of the server.
WMSMON data collector configuration
- Edit the /root/wmsmon/wmsmon_site-info.def file
- If this is an upgrade, you are advised to edit the new wmsmon_site-info.def file from scratch, reusing the old values for the variables you had already set.
The file is self-documenting, but the most fundamental variables are:
WMSMON_HOST
WMSMON_DB_PWD
SNMPUSER
SNMPPASSWD
- Edit the /root/wmsmon/wmslist.conf file
- If this is an upgrade, you can keep the same file used by the previous release, but note that it is no longer called wmsmonlist.con but wmslist.conf.
This file contains the list of wms present in your cluster (those where the wmsmon sensors are or will be installed).
It must be edited as shown in the template file, in the following way:
wms1.your_domain lb1.your_domain vo1
....
wmsn.your_domain lbn.your_domain von
Each line gives a monitored wms/lb pair and a vo served by that pair. The vo indication is used only to group the wms in the wmsmon web pages, not for the per-vo job counting.
If a pair serves more than one VO, you can choose a word like multi or multiVO. You can also use this tag to group the wms by their role: PROD, DEVEL etc...
THE FILE CANNOT CONTAIN COMMENTS.
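Because wmslist.conf has a fixed column layout and may not contain comments, a quick sanity check is easy to script. The following Python sketch accepts three whitespace-separated columns per line, plus the optional fourth port column described in the high port section of this guide. The function name, the error messages, and the choice to tolerate blank lines are assumptions made for illustration, not behaviour of the WMSMON scripts.

```python
def parse_wmslist(text):
    """Return (entries, errors) for the contents of a wmslist.conf file.

    entries: list of dicts with keys wms, lb, tag, port (port may be None).
    errors:  list of (line_number, message) tuples.
    """
    entries, errors = [], []
    for number, raw_line in enumerate(text.splitlines(), start=1):
        fields = raw_line.split()
        if not fields:
            continue  # blank lines are skipped here (assumption)
        if fields[0].startswith("#"):
            errors.append((number, "comments are not allowed"))
            continue
        if len(fields) not in (3, 4):
            errors.append((number, "expected 3 or 4 columns"))
            continue
        port = None
        if len(fields) == 4:
            if not fields[3].isdecimal() or not 1 <= int(fields[3]) <= 65535:
                errors.append((number, "invalid port number"))
                continue
            port = int(fields[3])
        entries.append({"wms": fields[0], "lb": fields[1],
                        "tag": fields[2], "port": port})
    return entries, errors
```

An empty errors list means every line matched the expected layout; it does not prove that the hosts exist or are reachable.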
- Run the install script with the -c option
/root/install_wmsmon_server-v2.0_00.py -c
As before pay attention to error messages, if any.
When the script completes, the /root/wmsmon/ directory should have been moved to the INSTALL_PATH defined in the wmsmon_site-info.def file, and the /var/www/html/wmsmon directory should now be in /var/www/html/WEBDIR, where WEBDIR is defined in the wmsmon_site-info.def file.
Two cron files should be present in /etc/cron.d: wmsmon.cron and wmsmon_logrotate.conf.
The wmsmon.cron is the cron that launches the main data collector script. By default it is launched every 15 minutes. If you want to change this frequency, you must edit the cron file by hand; in that case, to optimize performance, you are advised to also change the STEPDATE value in the wmsmon_site-info.def file to 2x(cron frequency).
By default the data collector script logs to the /var/log/WMSMONITOR.log file, but you can configure this in the .def file. By default the wmsmon_logrotate cron keeps 10 files of 100MB each; to modify these parameters you must edit the /etc/wmsmon_logrotate.cron file by hand.
The standard output of the data collector script is kept in /var/log/data_collector_main.log for debugging purposes. This file is rotated like the main log file; please keep it.
If no errors were reported by the configuration script, you should find the WMSMON main page at the following url:
http://your_wmsmon_server.your_domain/wmsmon/main/main.php
If you see no data on that page, but only the top banner and an empty table, it means that no data have been collected yet and you probably need to wait for the main cron to run at least once.
If the cron has already run, there are probably communication problems between the collector and the sensors, and the log file should be inspected to find out what is going wrong.
Post installation STEPS
- Increased php available memory
To improve php performance it is advisable to increase the memory that php may allocate.
Modify /etc/php.ini so that it contains the following line:
memory_limit = 56M
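To confirm that the setting is in place, the value can be read back from the file. The Python sketch below is one possible way to do so, assuming the usual php.ini conventions: ; starts a comment, the K/M/G shorthand suffixes are powers of 1024, and -1 means no limit. The function names are hypothetical and only the last active memory_limit line is considered.

```python
import re

UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
PATTERN = re.compile(r"memory_limit\s*=\s*(-1|\d+)\s*([KMGkmg]?)")


def memory_limit_bytes(php_ini_text):
    """Return the last active memory_limit in bytes.

    Returns -1 for 'no limit' and None when no active line is found.
    """
    result = None
    for raw_line in php_ini_text.splitlines():
        line = raw_line.split(";", 1)[0].strip()  # drop trailing comments
        match = PATTERN.fullmatch(line)
        if match:
            number, unit = match.groups()
            result = -1 if number == "-1" else int(number) * UNITS[unit.upper()]
    return result


def meets_guide_value(php_ini_text, wanted=56 * 1024 ** 2):
    """True when the limit is unlimited or at least the 56M from this guide."""
    limit = memory_limit_bytes(php_ini_text)
    return limit is not None and (limit == -1 or limit >= wanted)
```

Remember that httpd has to reload its configuration before a changed php.ini value takes effect.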
- Optional http port change
The WMSMON web configuration does not modify the default port (80) used by httpd. To modify the port number, edit /etc/httpd/conf/httpd.conf and change the line:
Listen <port_number>
Then restart httpd (service httpd restart)
- Optional secure http enabled
The WMSMON server needs a valid host certificate stored in a HOST_CERTIFICATE_DIR (e.g. /etc/grid-security)
Install the mod_ssl package:
- Run:
yum install mod_ssl
Install the accepted CA packages, for example as follows:
- Create the /etc/yum.repos.d/lcg-ca.repo file containing:
[CA]
name=CAs
baseurl=http://linuxsoft.cern.ch/LCG-CAs/current
protect=1
- Run:
yum install lcg_CA
Edit /etc/httpd/conf/httpd.conf and add the following lines inside the <Directory /var/www/html> section:
SSLRequireSSL
SSLVerifyClient require
SSLVerifyDepth 10
Edit /etc/httpd/conf.d/ssl.conf and:
- set the SSLCertificateFile variable to HOST_CERTIFICATE_DIR/hostcert.pem and comment out any other line that sets this variable.
- set the SSLCertificateKeyFile variable to HOST_CERTIFICATE_DIR/hostkey.pem and comment out any other line that sets this variable.
- set the SSLCACertificatePath variable to the name of the directory containing the CA files (e.g. /etc/grid-security/certificates if you installed the lcg_CA metapackage) and comment out any other line that sets this variable.
If you want to change the default https port (443), change the following line in the /etc/httpd/conf.d/ssl.conf file:
Listen <port_number> (e.g. Listen 8443)
Optional redirect - If you want to automatically redirect http requests to https pages, add the following section to the /etc/httpd/conf/httpd.conf file (using the proper values for the variables SERVER_HOST_IP, SERVER_HOST_NAME and YOUR_DOMAIN):
<VirtualHost SERVER_HOST_IP:80>
DocumentRoot /var/www/html
ServerName SERVER_HOST_NAME.YOUR_DOMAIN
RedirectMatch ^/(.*)$ https://SERVER_HOST_NAME.YOUR_DOMAIN/$1
</VirtualHost>
Restart httpd
- Unlock protected pages to specific certificate DNs
WMSMONitor includes a section reporting user activity on each WMS. The name and surname of each user appear on some pages, and for privacy reasons these pages are not exposed to all certificates.
If you want to unlock those pages for some certificate DNs, enable the https protocol as described in the previous paragraph (Optional secure http enabled) and do the following:
- Edit the /var/www/html/wmsmon/common/config.php file and modify the last line, adding the list of DNs:
$config->dnEnabledList=array('DN1','DN2'....'DNn');
If you want to unlock the pages containing sensitive data for everyone (using either http or https), change the value of the $config->protectedPages variable in the /var/www/html/WEBDIR/common/config.php file (WEBDIR is defined in the site-info.def file) and set it to 0:
$config->protectedPages=0;
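When the DN list is long, typing the dnEnabledList line by hand invites quoting mistakes. As an illustration only, the Python sketch below renders that line from a list of DN strings, escaping backslashes and single quotes as PHP single-quoted strings require. The helper names and the sample DN are hypothetical, and the exact DN format that config.php compares against is not specified in this guide.

```python
def php_quote(value):
    """Single-quote a string for PHP.

    Inside PHP single quotes only the backslash and the single quote
    itself need escaping.
    """
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def dn_enabled_line(dns):
    """Render the $config->dnEnabledList assignment for the given DNs."""
    quoted = ",".join(php_quote(dn) for dn in dns)
    return "$config->dnEnabledList=array(" + quoted + ");"


# Hypothetical DN, shown only to illustrate the output shape.
print(dn_enabled_line(["/C=IT/O=INFN/CN=Jane Doe"]))
```

The printed line can then be pasted over the last line of config.php.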
- Enable high port range communication for inter cluster monitoring
WMSMON uses the standard snmp port (normally 161) for sensor-collector communications. If a firewall blocks the snmp port, it is possible to enable the communication on a non-standard port. This can be useful when the data collector and the WMS cluster are not in the same computing centre.
If this does not apply to you, you can skip this section.
NOTE: this feature is not well tested, please report any problem and bug found to wms-support<at>cnaf.infn.it
To enable high port support, modify the wmslist.conf file, adding a fourth column that gives the port number to be used for that particular wms/lb pair:
wms1.your_domain lb1.your_domain vo1 port1
NOTE: it is not possible to specify 2 different port numbers for WMS and LB.
On the WMS/LB sensor side you must configure snmp to listen for requests on the port you chose. This is done by adding the following line to the /etc/snmp/snmpd.conf file:
agentaddress <port_number>
and restarting snmpd (service snmpd restart)