Under Construction
WMS/LB sensors installation:
Sensors MUST be installed on every machine that needs to be monitored, whether it is a WMS, an LB or a coupled WMSLB. The following procedure is identical for every kind of node, since all the sensors are installed on every node. This wastes a little disk space but simplifies the installation procedure.
(do all of the following as root)
- Download the following tar.gz in /root:
cd /root
wget --no-check-certificate https://grid-it.cnaf.infn.it/certification/downloads/wmsmon-v1.3-0.tgz
tar -xvzf wmsmon-v1.3-0.tgz
- cd /root/wmsmon
- Edit the wmsmon_site-info.def (after having read the next few notes)
The wmsmon_site-info.def file contains all the information needed to set up all the monitor components correctly. It is important that it is edited carefully using a key = value notation.
Comments can be made using # at the beginning of the line.
You can leave the defaults if you do not see the need for a change. On the WMS/LB instances you need to change at least the following keys:
- WMSMON_HOST = 'Set here the name of the wmsmon data collector host'
- LEMONFLAG = 1 'If you do not have a lemon tool on the wms/lb instance set to 0'
- LEMONURL = 'Set to the machine lemon url if any'
- SERVER_MYSQL_PASSWORD = 'If the node is an LB set the root mysql passwd here' !! NOTE: it is important to have the right password in the wmsmon_site-info.def of LB nodes; it is not important on WMS nodes. If the node was installed using the gLite yaim tool this password can be taken from the site_info.def file used at configuration time.
- SNMPPASSWD = 'The password you choose for snmp communication between data collector and wms instance' !! NOTE: it must be the same on every wmsmon_site-info.def
When setting a password please read the comments in the wmsmon_site-info.def file to know if the password must be enclosed between ' '
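The key = value notation described above can be illustrated with a small reader. This is only a sketch written for a current Python 3 interpreter, not part of the wmsmon distribution and not the way the wmsmon scripts read the file. The function name parse_site_info, the sample host name and the sample values are hypothetical, and the stripping of surrounding ' ' is an assumption (the comments in the real file say when quotes are needed).

```python
def parse_site_info(text):
    """Read 'key = value' lines, skipping blanks and lines starting with '#'."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if "=" not in line:
            continue  # not a key = value line; ignored in this sketch
        key, value = line.split("=", 1)
        value = value.strip()
        # Assumption: a value fully wrapped in single quotes has them removed.
        if len(value) >= 2 and value[0] == "'" and value[-1] == "'":
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


# Hypothetical sample content, for illustration only.
SAMPLE = """\
# example values
WMSMON_HOST = collector.example.org
LEMONFLAG = 0
SNMPPASSWD = 'example_secret'
"""

settings = parse_site_info(SAMPLE)
```

Such a reader can be handy to double check that the keys listed above were really changed from their defaults before running setup.sh.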
- Run the setup.sh file in /root/wmsmon
At the end you should have the following rpms installed on the machine:
net-snmp-devel-5.1.2-11.el4_6.11.2
net-snmp-5.1.2-11.el4_6.11.2
net-snmp-utils-5.1.2-11.el4_6.11.2
net-snmp-libs-5.1.2-11.el4_6.11.2
net-snmp-perl-5.1.2-11.el4_6.11.2
(version numbers may vary)
and the snmp daemon should be running
#service snmpd status
#snmpd (pid 21770) is running...
Finally the file /etc/rc.local should contain a line about snmp
Please check all the above statements.
- Run in /root/wmsmon "python WMSLB_wmsmon_configuration.py"
If python is not present on the machine, try running the executable WMSLB_wmsmon_configuration in /root/wmsmon/snmpconf/
At the end check that the following lines are present in /etc/snmp/snmp.conf file
exec .1.3.6.1.4.1.10403.60 /bin/sh /root/wmsmon/bin/wms/wms-sensor-wrapper
exec .1.3.6.1.4.1.10403.70 /bin/sh /root/wmsmon/bin/lb/lb-sensor-wrapper
exec .1.3.6.1.4.1.10403.75 /bin/sh /root/wmsmon/bin/lb-refill/lb-refill-sensor-wrapper
OIDs are those set in the wmsmon_site-info.def file.
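The check above can also be done with a few lines of code. The following is a minimal sketch for a current Python 3 interpreter, assuming the default OIDs and paths shown above are still in use; the function name missing_exec_lines is hypothetical and the script is not part of the wmsmon distribution.

```python
# The three lines expected after a default installation (copied from above).
EXPECTED_EXEC_LINES = [
    "exec .1.3.6.1.4.1.10403.60 /bin/sh /root/wmsmon/bin/wms/wms-sensor-wrapper",
    "exec .1.3.6.1.4.1.10403.70 /bin/sh /root/wmsmon/bin/lb/lb-sensor-wrapper",
    "exec .1.3.6.1.4.1.10403.75 /bin/sh /root/wmsmon/bin/lb-refill/lb-refill-sensor-wrapper",
]


def missing_exec_lines(conf_text, expected=EXPECTED_EXEC_LINES):
    """Return the expected lines that do not appear in the configuration text."""
    # Collapse runs of whitespace so that spacing differences do not matter.
    present = set(" ".join(line.split()) for line in conf_text.splitlines())
    return [line for line in expected if line not in present]
```

To use it, read the configuration file into a string and pass it to missing_exec_lines; an empty list means all three lines were found. If the OIDs were changed in wmsmon_site-info.def, the expected lines must be adapted accordingly.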
- The sensor installation on the node is now complete.
WMSMON data collector installation
(do all of the following as root)
- Prepare an SL4 machine
- Install the following packages
yum install httpd
(tested with httpd-2.0.52-38.sl4.2)
yum install php php-mysql
(tested with php-4.3.9-3.22.9, php-mysql-4.3.9-3.22.9)
yum install python
(tested with python-2.3.4-14.4.el4_6.1)
#Python 2.3 is REQUIRED
yum install mysql mysql-server
(tested with mysql-4.1.20-3.RHEL4.1.el4_6.1, mysql-server-4.1.20-3.RHEL4.1.el4_6.1)
yum install MySQL-python python-devel python-sqlite
(tested with python-sqlite-1.1.6-1, MySQL-python-1.2.1_p2-1.el4.1, python-devel-2.3.4-14.4.el4_6.1)
yum install net-snmp net-snmp-devel net-snmp-perl net-snmp-libs net-snmp-utils
Tested with:
net-snmp-utils-5.1.2-11.el4_6.11.2
net-snmp-5.1.2-11.el4_6.11.2
net-snmp-libs-5.1.2-11.el4_6.11.2
net-snmp-perl-5.1.2-11.el4_6.11.2
net-snmp-devel-5.1.2-11.el4_6.11.2
- Download this tar.gz file in /root :
wget --no-check-certificate https://grid-it.cnaf.infn.it/certification/downloads/wmsmon_server-v1.3-0.tgz
- Untar the downloaded file
tar -xvzf wmsmon_server-v1.3-0.tgz
- Edit the wmsmon_site-info.def
Pay attention when editing the wmsmon_site-info.def file, because it contains all the information the executables need to run properly.
It is important that it is edited carefully using a key = value notation.
Comments can be made using # at the beginning of the line.
You can leave the defaults if you do not see the need for a change.
On the collector instance you need to change at least the following keys:
- WMSMON_HOST = 'Set here the name of the wmsmon data collector host'
- WMSMON_DB_PWD = 'set here the collector root mysql passwd'
- SNMPPASSWD = 'The password you choose for snmp communication between data collector and wms instance' !! NOTE: it must be the same on every wmsmon_site-info.def
When setting a password please read the comments in the wmsmon_site-info.def file to know if the password must be enclosed between ' '
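Since SNMPPASSWD must be the same on every wmsmon_site-info.def, a quick consistency check can save some debugging later. The sketch below (current Python 3, not part of the wmsmon distribution) assumes the settings of each host have already been read into a dictionary; the function name hosts_disagreeing and the host names in the usage example are hypothetical.

```python
def hosts_disagreeing(settings_by_host, reference_host, key="SNMPPASSWD"):
    """Return the hosts whose value for key differs from the reference host."""
    reference_value = settings_by_host[reference_host].get(key)
    return sorted(
        host
        for host, settings in settings_by_host.items()
        if settings.get(key) != reference_value
    )


# Hypothetical example: the collector is taken as the reference.
example = {
    "collector.example.org": {"SNMPPASSWD": "secret"},
    "wms01.example.org": {"SNMPPASSWD": "secret"},
    "lb01.example.org": {"SNMPPASSWD": "typo"},
}
mismatched = hosts_disagreeing(example, "collector.example.org")
```

In the example only lb01.example.org is reported, because its value differs from the one on the collector.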
service mysqld start
./createdb.sh pass1 pass2 (read the following comment)
createdb.sh needs 2 arguments. The first one is the current mysql root@localhost password.
If it is not defined, define it using:
mysql -e "SET PASSWORD FOR root@localhost = PASSWORD('pwd')"
or
mysql -e "SET PASSWORD FOR root = PASSWORD('pwd')"
The second one is the password that will be used for the wmsmonitor db connection and that was set in the wmsmon_site-info.def file with WMSMON_DB_PWD.
The script needs the wmsmon.sql file in the directory it is run from; this file should be included in the distribution tar.
Please note if errors are issued during the run of the createdb.sh script
python snmpdconf_script.py
Please note if errors are issued during the run
Add /etc/init.d/snmpd start to the /etc/rc.local file to make sure that snmpd starts at the next reboot
cp /root/wmsmon/cron.d/*.cron /etc/cron.d
cp /root/wmsmon/cron.d/wmsmon_logrotate.conf /etc/
(Pay attention that if you change the log file name in wmsmon_site-info.def you need to change it also in the wmsmon_logrotate.conf file)
The wmsmon.cron and the wmsmon_daily.cron cronjobs trigger the execution of the two python executables that collect the data. The first one is executed every 15 minutes and collects data from the wms/lb instances, while the second one creates some daily aggregated statistics and runs every two hours.
If you want to increase the rate of data collection you need to increase the running frequency of the first cronjob. Note that the tool has not been widely tested at acquisition intervals shorter than 900 seconds.
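The relation between the collection interval in seconds and the cron schedule is simple arithmetic: 900 seconds is 15 minutes, which in standard cron syntax is a minute field of */15. The sketch below (current Python 3) only illustrates that conversion; the function name cron_minute_field is hypothetical and the real content of wmsmon.cron is not shown here, so check the file itself before editing it.

```python
def cron_minute_field(interval_seconds):
    """Translate a collection interval in seconds into a cron minute field."""
    if interval_seconds <= 0 or interval_seconds % 60 != 0:
        raise ValueError("interval must be a positive whole number of minutes")
    minutes = interval_seconds // 60
    # Step values only repeat evenly when they divide the hour.
    if minutes > 60 or 60 % minutes != 0:
        raise ValueError("interval must divide one hour evenly")
    if minutes == 1:
        return "*"       # every minute
    if minutes == 60:
        return "0"       # once per hour, on the hour
    return "*/%d" % minutes


default_field = cron_minute_field(900)
```

With the default interval of 900 seconds the function returns */15; an interval of 300 seconds would give */5, which is in the range the note above describes as not widely tested.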
- Edit the wmsmon.list file
Put here the WMS/LB instances you want to monitor as described in the file.
Every line contains three strings: a WMS (first string), its associated LB (second string) and a single word indicating which VO is served by the WMS (third string).
Please do not use comments in this file
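The three-string line format can be validated before starting the collector. This is a minimal sketch for a current Python 3 interpreter, assuming the three strings are separated by whitespace and that empty lines are harmless (neither is stated above, so check the description inside the file itself). The names MonitoredPair and parse_wms_list and the sample hosts are hypothetical.

```python
from collections import namedtuple

# One monitored WMS with its LB and the VO word.
MonitoredPair = namedtuple("MonitoredPair", ["wms", "lb", "vo"])


def parse_wms_list(text):
    """Return one MonitoredPair per line; raise ValueError on malformed lines."""
    entries = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # assumption: whitespace-separated strings
        if not fields:
            continue  # assumption: empty lines are skipped
        if len(fields) != 3:
            raise ValueError(
                "line %d: expected 3 strings, got %d" % (number, len(fields))
            )
        entries.append(MonitoredPair(*fields))
    return entries


# Hypothetical sample content.
SAMPLE_LIST = """\
wms01.example.org lb01.example.org atlas
wms02.example.org lb01.example.org cms
"""

pairs = parse_wms_list(SAMPLE_LIST)
```

A line with a comment or with more than one VO word would be rejected by this check, in line with the two restrictions given above.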
- Test the snmp communication
Run the following command:
snmpwalk -v 3 -u 'wmsmon' -l authNoPriv -a MD5 -A 'snmp_passwd' <a_wmshost_in_wmsmonlist_file> .1.3.6.1.4.1.10403.60 -t 180
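If several WMS hosts have to be tested, the same command can be assembled and run from a script. The sketch below (current Python 3.7 or later, not part of the wmsmon distribution) reuses exactly the options of the command above, only placing -t before the host name; the function names snmpwalk_command and walk and the example host are hypothetical.

```python
import subprocess


def snmpwalk_command(host, password, oid=".1.3.6.1.4.1.10403.60",
                     user="wmsmon", timeout=180):
    """Build the snmpwalk argument list used to test one WMS host."""
    return [
        "snmpwalk", "-v", "3", "-u", user, "-l", "authNoPriv",
        "-a", "MD5", "-A", password, "-t", str(timeout), host, oid,
    ]


def walk(host, password):
    """Run snmpwalk and return (exit code, standard output)."""
    result = subprocess.run(
        snmpwalk_command(host, password),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output as text
    )
    return result.returncode, result.stdout


# Building the command does not contact any host.
command = snmpwalk_command("wms01.example.org", "snmp_passwd")
```

Passing the arguments as a list avoids shell quoting problems when the password contains special characters. A non-zero exit code from walk indicates that the snmp communication with that host needs to be checked.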
Execute the cron executables:
/root/wmsmon/bin/data_collector_main_autoup.py /root/wmsmon/wmsmonlist.conf 900
Check if errors are reported
WEB INSTALLATION AND CONFIGURATION (not working for now, we are working on it)
- Download and untar the web tar.gz
cd /var/www/html/
wget --no-check-certificate https://grid-it.cnaf.infn.it/certification/downloads/wmsmon_web-v1.3-0.tgz
tar -xvzf wmsmon_web-v1.3-0.tgz
cd /root/wmsmon
yum install php-gd gd gd-devel
python phpconf_script.py /root/wmsmon/wmsmonlist.conf (not working for now)
In /etc/php.ini set : memory_limit = 24M
(by default it should be 8M)
service httpd start