Difference: NotesAboutWN(torque,Mpi,Glexec)-EMI-2SL6 (2 vs. 3)

Revision 3 2012-06-04 - PaoloVeronesi

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Notes about Installation and Configuration of a WN - EMI-2 - SL6 (torque, mpi, glexec)

  • These notes are provided by site admins on a best-effort basis as a contribution to the IGI communities and MUST NOT be considered a substitute for the official IGI documentation.
Line: 56 to 56
 
# yum clean all 
Changed:
<
<
# yum install ca-policy-egi-core emi-wn emi-torque-utils glite-mpi emi-glexec_wn openmpi openmpi-devel mpich2 mpich2-devel
>
>
# yum install ca-policy-egi-core emi-wn emi-torque-utils glite-mpi emi-glexec_wn openmpi openmpi-devel mpich2 mpich2-devel emi-torque-client
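
After the installation it can be useful to confirm that every package actually landed on the node. The following is a minimal sketch, not part of the original notes: `missing_packages` and the `RPM_CMD` override are hypothetical names introduced here, and the only real command relied on is `rpm -q`, which exits non-zero for a package that is not installed.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the given packages that are NOT installed.
# RPM_CMD is an assumption of this sketch; it lets you point at another binary for testing.
missing_packages() {
    rpm_cmd=${RPM_CMD:-rpm}
    missing=""
    for pkg in "$@"; do
        # "rpm -q NAME" exits non-zero when the package is not installed.
        if ! "$rpm_cmd" -q "$pkg" >/dev/null 2>&1; then
            missing="$missing $pkg"
        fi
    done
    # Unquoted on purpose: collapses the leading blank, prints an empty line if nothing is missing.
    echo $missing
}
```

For example, `missing_packages emi-wn emi-torque-utils emi-torque-client glite-mpi emi-glexec_wn` should print an empty line on a correctly installed WN.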
 

Service configuration

Line: 128 to 128
 

users and groups

You can download them from HERE.
Added:
>
>

Munge

Copy the key /etc/munge/munge.key from the Torque server to every host of your cluster, adjust the permissions, and start the service:
# chown munge:munge /etc/munge/munge.key

# ls -ltr /etc/munge/
total 4
-r-------- 1 munge munge 1024 Jan 13 14:32 munge.key

# chkconfig munge on
# /etc/init.d/munge restart
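
A quick permission check on the copied key can catch the most common mistake before the daemon refuses to start. This is a sketch under stated assumptions: `check_munge_key` is a hypothetical helper, it relies on GNU `stat -c`, and it accepts modes 400 (as in the listing above) and 600 on the assumption that the key must not be accessible to group or others.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the key exists and is accessible by its owner only.
check_munge_key() {
    key=${1:-/etc/munge/munge.key}
    [ -f "$key" ] || { echo "missing: $key"; return 1; }
    # GNU stat: %a prints the permission bits in octal.
    mode=$(stat -c '%a' "$key")
    case "$mode" in
        400|600) echo "ok: $key mode $mode"; return 0 ;;
        *)       echo "bad mode $mode on $key"; return 1 ;;
    esac
}
```

Once the service is running, `munge -n | unmunge` on the same host is a simple way to confirm that credentials can be encoded and decoded.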
 

site-info.def

Changed:
<
<
SUGGESTION: use the same site-info.def for CREAM and WNs; for this reason this example file contains yaim variables used by CREAM, TORQUE or emi-WN.
>
>
KISS: Keep it simple, stupid! For your convenience there is an explanation of each yaim variable. For more details look HERE.
 
Changed:
<
<
The settings of some VOs are also included.
>
>
SUGGESTION: use the same site-info.def for CREAM and WNs; for this reason this example file contains yaim variables used by CREAM, TORQUE or emi-WN.
# cat site-info.def 
CE_HOST=cream-01.cnaf.infn.it
SITE_NAME=IGI-BOLOGNA

BATCH_SERVER=batch.cnaf.infn.it
BATCH_LOG_DIR=/var/torque

BDII_HOST=egee-bdii.cnaf.infn.it

CE_BATCH_SYS=torque
JOB_MANAGER=pbs
BATCH_VERSION=torque-2.5.7
#CE_DATADIR=

CE_INBOUNDIP=FALSE
CE_OUTBOUNDIP=TRUE
CE_OS="ScientificSL"
CE_OS_RELEASE=6.2
CE_OS_VERSION="Carbon"

CE_RUNTIMEENV="IGI-BOLOGNA"

CE_PHYSCPU=8
CE_LOGCPU=16
CE_MINPHYSMEM=16000
CE_MINVIRTMEM=32000
CE_SMPSIZE=8
CE_CPU_MODEL=Xeon
CE_CPU_SPEED=2493
CE_CPU_VENDOR=intel
CE_CAPABILITY="CPUScalingReferenceSI00=1039 glexec"
CE_OTHERDESCR="Cores=1,Benchmark=4.156-HEP-SPEC06"
CE_SF00=951
CE_SI00=1039
CE_OS_ARCH=x86_64

CREAM_PEPC_RESOURCEID="http://cnaf.infn.it/cremino"

USERS_CONF=/root/siteinfo/ig-users.conf
GROUPS_CONF=/root/siteinfo/ig-groups.conf

VOS="comput-er.it dteam igi.italiangrid.it infngrid ops gridit"
QUEUES="cert prod"
CERT_GROUP_ENABLE="dteam infngrid ops /dteam/ROLE=lcgadmin /dteam/ROLE=production /ops/ROLE=lcgadmin /ops/ROLE=pilot /infngrid/ROLE=SoftwareManager /infngrid/ROLE=pilot"
PROD_GROUP_ENABLE="comput-er.it gridit igi.italiangrid.it /comput-er.it/ROLE=SoftwareManager /gridit/ROLE=SoftwareManager /igi.italiangrid.it/ROLE=SoftwareManager"
VO_SW_DIR=/opt/exp_soft

WN_LIST="/root/siteinfo/wn-list.conf"
MUNGE_KEY_FILE=/etc/munge/munge.key
CONFIG_MAUI="no"

MYSQL_PASSWORD=*********************************
APEL_DB_PASSWORD=not_used
APEL_MYSQL_HOST=not_used
SE_LIST="darkstorm.cnaf.infn.it"
SE_MOUNT_INFO_LIST="none"
  For your convenience there is an explanation of each yaim variable. For more details look at [6, 7, 8, 9]
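
Because site-info.def is plain shell syntax, a few lines of shell are enough to spot variables that were left unset or empty before running yaim. The sketch below is illustrative only: `unset_vars` is a hypothetical helper, and the list of variables to check is whatever you pass in, not an official list of mandatory yaim variables.

```shell
#!/bin/sh
# Hypothetical helper: print each named variable that FILE leaves unset or empty.
# Usage: unset_vars FILE VAR...
# Note: variables already exported in the calling environment count as set.
unset_vars() {
    file=$1; shift
    (
        # Source in a subshell so the caller's environment is left untouched.
        # Use a path containing a slash (e.g. ./site-info.def) to avoid a PATH lookup.
        . "$file" >/dev/null 2>&1 || exit 1
        for name in "$@"; do
            eval "value=\${$name:-}"
            [ -n "$value" ] || echo "$name"
        done
    )
}
```

For example, `unset_vars ./site-info.def CE_HOST SITE_NAME BATCH_SERVER WN_LIST` prints nothing for the file shown above.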
 
 

glite-mpi

Deleted:
<
<
 In the following example, support for MPICH2 and OPENMPI is enabled; moreover, the WNs are configured to use shared homes.
Line: 149 to 210
 In the following example, support for MPICH2 and OPENMPI is enabled; moreover, the WNs are configured to use shared homes.
Deleted:
<
<
############################################
# Mandatory parameters in services/mpi     #
############################################

# N.B. this file contains common configuration for CE and WN
# As such, it should be included in your site-info.def to ensure
# that the configuration of the CE and WNs remains in sync.

#----------------------------------
# MPI-related configuration:
#----------------------------------
# Several MPI implementations (or "flavours") are available.
# If you do NOT want a flavour to be configured, set its variable
# to "no". Otherwise, set it to "yes". If you want to use an
# already installed version of an implementation, set its "_PATH" and
# "_VERSION" variables to match your setup (examples below).
#
# NOTE 1: the CE_RUNTIMEENV will be automatically updated in the file
# functions/config_mpi_ce, so that the CE advertises the MPI implementations
# you choose here - you do NOT have to change it manually in this file.
# It will become something like this:
#
#   CE_RUNTIMEENV="$CE_RUNTIMEENV
#              MPICH
#              MPICH-1.2.7p4
#              MPICH2
#              MPICH2-1.0.4
#              OPENMPI
#              OPENMPI-1.1
#              LAM"
#
# NOTE 2: it is currently NOT possible to configure multiple concurrent
# versions of the same implementations (e.g. MPICH-1.2.3 and MPICH-1.2.7)
# using YAIM. Customize "/opt/glite/yaim/functions/config_mpi_ce" file
# to do so.

###############
# The following examples are applicable to default SL 5.3 x86_64 (gLite 3.2 WN)
# Support for MPICH 1 is dropped

 MPI_MPICH_ENABLE="no"
 MPI_MPICH2_ENABLE="yes"
 MPI_OPENMPI_ENABLE="yes"
 MPI_LAM_ENABLE="no"
Deleted:
<
<
#---
# Example for using an already installed version of MPI.
# Just fill in the path to its current installation (e.g. "/usr")
# and which version it is (e.g. "6.5.9").
#---

# DEFAULT Parameters
# The following parameters are correct for a default SL 5.X x86_64 WN

 #MPI_MPICH_PATH="/opt/mpich-1.2.7p1/"
 #MPI_MPICH_VERSION="1.2.7p1"
Changed:
<
<
MPI_MPICH2_PATH="/usr/lib64/mpich2/"
MPI_MPICH2_VERSION="1.2.1p1"
MPI_OPENMPI_PATH="/usr/lib64/openmpi/1.4-gcc/"
MPI_OPENMPI_VERSION="1.4"
>
>
MPI_MPICH2_PATH="/usr/lib64/mpich2/bin"
MPI_MPICH2_VERSION="1.2.1"
MPI_OPENMPI_PATH="/usr/lib64/openmpi/bin/"
MPI_OPENMPI_VERSION="1.5.4"
 #MPI_LAM_VERSION="7.1.2"
Deleted:
<
<
# If you provide mpiexec (http://www.osc.edu/~pw/mpiexec/index.php)
# for MPICH or MPICH2, please state the full path to that file here.
# Otherwise do not set this variable. (Default is to set this to
# the location of mpiexec set by the glite-MPI_WN metapackage.)
 # Most versions of MPI now distribute their own versions of mpiexec
 # However, I had some problems with the MPICH2 version - so use standard mpiexec
 MPI_MPICH_MPIEXEC="/usr/bin/mpiexec"
 MPI_MPICH2_MPIEXEC="/usr/bin/mpiexec"
Changed:
<
<
MPI_OPENMPI_MPIEXEC="/usr/lib64/openmpi/1.4-gcc/bin/mpiexec"
>
>
MPI_OPENMPI_MPIEXEC="/usr/lib64/openmpi/bin/mpiexec"
 ######### MPI_SHARED_HOME section
 # Set this variable to one of the following:
Line: 227 to 235
 # MPI_SHARED_HOME="yes" if the HOME directory area is shared
 # MPI_SHARED_HOME="/Path/to/Shared/Location" if a shared area other
 # than the HOME directory is used.
Deleted:
<
<
# If you do NOT provide a shared home, set MPI_SHARED_HOME to "no" (default).
#MPI_SHARED_HOME=${MPI_SHARED_HOME:-"no"}
 # If you do provide a shared home and Grid jobs normally start in that area,
 # set MPI_SHARED_HOME to "yes".
 MPI_SHARED_HOME="yes"
Deleted:
<
<
# If you have a shared area but Grid jobs don't start there, then set
# MPI_SHARED_HOME to the location of this shared area. The permissions
# of this area need to be the same as /tmp (i.e. 1777) so that users
# can create their own subdirectories.
#MPI_SHARED_HOME=/share/cluster/mpi
 ######## Intra WN authentication
Deleted:
<
<
# This variable is normally set to yes when shared homes are not used.
# This allows the wrapper script to copy the job data to the other nodes
#
# If enabling SSH Hostbased Authentication you must ensure that
# the appropriate ssh config files are deployed.
# Affected files are the system ssh_config, sshd_config and ssh_known_hosts.
# The edg-pbs-knownhosts can be used to generate the ssh_known_hosts
#
# If you do NOT have SSH Hostbased Authentication between your WNs,
# set this variable to "no" (default). Otherwise set it to "yes".
#
 MPI_SSH_HOST_BASED_AUTH=${MPI_SSH_HOST_BASED_AUTH:-"no"}
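
The `${VAR:-"no"}` form used above keeps a value that was already set elsewhere (for example in site-info.def) and falls back to "no" only when the variable is unset or empty. A minimal illustration, where `resolve_ssh_auth` is a hypothetical wrapper introduced only for this example:

```shell
#!/bin/sh
# Illustrates the ${VAR:-default} expansion used for MPI_SSH_HOST_BASED_AUTH.
resolve_ssh_auth() {
    # Keep an existing non-empty value; otherwise fall back to "no".
    MPI_SSH_HOST_BASED_AUTH=${MPI_SSH_HOST_BASED_AUTH:-"no"}
    echo "$MPI_SSH_HOST_BASED_AUTH"
}
```

This is why a site that needs host-based authentication can set the variable to "yes" earlier in its configuration without editing this line.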
 
Deleted:
<
<

glite-mpi_ce

# Setup configuration variables that are common to both the CE and WN

if [ -r ${config_dir}/services/glite-mpi ]; then
 source ${config_dir}/services/glite-mpi
fi

# The MPI CE config function can create a submit filter for
# Torque to ensure that CPU allocation is performed correctly.
# Change this variable to "yes" to have YAIM create this filter.
# Warning: if you have an existing torque.cfg it will be modified.
#MPI_SUBMIT_FILTER=${MPI_SUBMIT_FILTER:-"no"}
MPI_SUBMIT_FILTER="yes"
 

glite-mpi_wn

Line: 303 to 254
  source ${config_dir}/services/glite-mpi
 fi
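
Once services/glite-mpi has been sourced, the MPI_*_MPIEXEC variables can be checked against the filesystem, which is useful because the OpenMPI path changed between revisions of these notes. The sketch below is an assumption-laden helper, not part of YAIM: `check_mpiexec_paths` is a hypothetical name, and it only inspects the three variables that appear in this page.

```shell
#!/bin/sh
# Hypothetical helper: report MPI_*_MPIEXEC variables that do not point to an executable.
check_mpiexec_paths() {
    status=0
    for name in MPI_MPICH_MPIEXEC MPI_MPICH2_MPIEXEC MPI_OPENMPI_MPIEXEC; do
        eval "exe=\${$name:-}"
        # Unset variables are skipped: the flavour may simply be disabled.
        [ -n "$exe" ] || continue
        if [ ! -x "$exe" ]; then
            echo "$name=$exe is not executable"
            status=1
        fi
    done
    return $status
}
```

A typical use would be to source the configuration file in a shell on the WN and then call `check_mpiexec_paths`; a non-zero return status means at least one path needs fixing.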
 
Changed:
<
<

munge configuration

IMPORTANT: Unlike previous versions, the updated EPEL5 build of torque-2.5.7-1 enables munge as an inter-node authentication method.

  • verify that munge is correctly installed:
# rpm -qa | grep munge
munge-libs-0.5.8-8.el5
munge-0.5.8-8.el5
  • On one host (for example the batch server) generate a key by launching:
>
>

services/glite-glexec_wn

 
Changed:
<
<
# /usr/sbin/create-munge-key

# ls -ltr /etc/munge/
total 4
-r-------- 1 munge munge 1024 Jan 13 14:32 munge.key

>
>
GLEXEC_WN_SCAS_ENABLED="no"
GLEXEC_WN_ARGUS_ENABLED="yes"
GLEXEC_WN_OPMODE="setuid"
 
Deleted:
<
<
  • Copy the key, /etc/munge/munge.key to every host of your cluster, adjusting the permissions:
# chown munge:munge /etc/munge/munge.key
  • Start the munge daemon on each node:
# service munge start
Starting MUNGE:                                            [  OK  ]

# chkconfig munge on
 
Deleted:
<
<

software area settings

You have to import the software area from the CE (or another host).
  • Edit the file /etc/fstab by adding a line like the following:
cremino.cnaf.infn.it:/opt/exp_soft/ /opt/exp_soft/ nfs rw,defaults 0 0
  • check nfs and portmap status
# service nfs status
rpc.mountd is stopped
nfsd is stopped

# service portmap status
portmap is stopped

# service portmap start
Starting portmap:                                          [  OK  ]

# service nfs start
Starting NFS services:                                     [  OK  ]
Starting NFS daemon:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
Starting RPC idmapd:                                       [  OK  ]

# chkconfig nfs on
# chkconfig portmap on
  • after any modification to /etc/fstab, run:
mount -a
  • verify the mount:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              65G  1.9G   59G   4% /
/dev/sda1              99M   18M   76M  19% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
cremino.cnaf.infn.it:/opt/exp_soft/
                       65G  4.4G   57G   8% /opt/exp_soft
 

yaim check

Changed:
<
<
# /opt/glite/yaim/bin/yaim -v -s site-info_batch.def -n MPI_WN -n WN_torque_noafs
INFO: Using site configuration file: site-info_batch.def
INFO: Sourcing service specific configuration file: ./services/glite-mpi_wn
INFO: ###################################################################

[YAIM ASCII-art banner]

current working directory: /root
site-info.def date: Apr 24 09:22 site-info_batch.def
yaim command: -v -s site-info_batch.def -n MPI_WN -n WN_torque_noafs
log file: /opt/glite/yaim/bin/../log/yaimlog
Tue Apr 24 11:53:02 CEST 2012 : /opt/glite/yaim/bin/yaim

Installed YAIM versions:
glite-yaim-clients 5.0.0-1
glite-yaim-core 5.0.2-1
glite-yaim-mpi 1.1.10-10
glite-yaim-torque-client 5.0.0-1
glite-yaim-torque-utils 5.0.0-1

####################################################################

INFO: The default location of the grid-env.(c)sh files will be: /usr/libexec
INFO: Sourcing the utilities in /opt/glite/yaim/functions/utils
INFO: Detecting environment
INFO: Executing function: config_mpi_wn_check
INFO: Executing function: config_ntp_check
INFO: Executing function: config_sysconfig_lcg_check
INFO: Executing function: config_globus_clients_check
INFO: Executing function: config_lcgenv_check
INFO: Executing function: config_users_check
INFO: Executing function: config_sw_dir_check
INFO: Executing function: config_amga_client_check
INFO: Executing function: config_wn_check
INFO: Executing function: config_vomsdir_check
INFO: Executing function: config_vomses_check
INFO: Executing function: config_glite_saga_check
INFO: Executing function: config_add_pool_env_check
INFO: Executing function: config_wn_info_check
INFO: Executing function: config_torque_client_check
INFO: Checking is done.
INFO: All the necessary variables to configure MPI_WN WN_torque_noafs are defined in your configuration files.
INFO: Please, bear in mind that YAIM only guarantees the definition of variables
INFO: controlled in the _check functions.
INFO: YAIM terminated succesfully.
>
>
# /opt/glite/yaim/bin/yaim -v -s site-info_batch.def -n MPI_WN -n WN_torque_noafs -n GLEXEC_wn
 

yaim config

Changed:
<
<
# /opt/glite/yaim/bin/yaim -c -s site-info_batch.def -n MPI_WN -n WN_torque_noafs
INFO: Using site configuration file: site-info_batch.def
INFO: Sourcing service specific configuration file: ./services/glite-mpi_wn
INFO: ###################################################################

[YAIM ASCII-art banner]

current working directory: /root
site-info.def date: Apr 24 09:22 site-info_batch.def
yaim command: -c -s site-info_batch.def -n MPI_WN -n WN_torque_noafs
log file: /opt/glite/yaim/bin/../log/yaimlog
Tue Apr 24 11:53:15 CEST 2012 : /opt/glite/yaim/bin/yaim

Installed YAIM versions:
glite-yaim-clients 5.0.0-1
glite-yaim-core 5.0.2-1
glite-yaim-mpi 1.1.10-10
glite-yaim-torque-client 5.0.0-1
glite-yaim-torque-utils 5.0.0-1

####################################################################

INFO: The default location of the grid-env.(c)sh files will be: /usr/libexec
INFO: Sourcing the utilities in /opt/glite/yaim/functions/utils
INFO: Detecting environment
INFO: Executing function: config_mpi_wn_check
INFO: Executing function: config_ntp_check
INFO: Executing function: config_sysconfig_lcg_check
INFO: Executing function: config_globus_clients_check
INFO: Executing function: config_lcgenv_check
INFO: Executing function: config_users_check
INFO: Executing function: config_sw_dir_check
INFO: Executing function: config_amga_client_check
INFO: Executing function: config_wn_check
INFO: Executing function: config_vomsdir_check
INFO: Executing function: config_vomses_check
INFO: Executing function: config_glite_saga_check
INFO: Executing function: config_add_pool_env_check
INFO: Executing function: config_wn_info_check
INFO: Executing function: config_torque_client_check
INFO: Executing function: config_mpi_wn_setenv
INFO: Executing function: config_mpi_wn
INFO: Executing function: config_ldconf
INFO: config_ldconf: function not needed anymore, left empy waiting to be removed
INFO: Executing function: config_ntp_setenv
INFO: Executing function: config_ntp
INFO: Storing old ntp settings in /etc/ntp.conf.yaimold.20120424_115316
INFO: Executing function: config_sysconfig_edg
INFO: Executing function: config_sysconfig_globus
INFO: Executing function: config_sysconfig_lcg
INFO: Executing function: config_crl
INFO: Now updating the CRLs - this may take a few minutes...
Enabling periodic fetch-crl: [ OK ]
INFO: Executing function: config_rfio
INFO: Executing function: config_globus_clients_setenv
INFO: Executing function: config_globus_clients
INFO: Configure the globus service - not needed in EMI
INFO: Executing function: config_lcgenv
INFO: Executing function: config_users
INFO: Executing function: config_sw_dir_setenv
INFO: Executing function: config_sw_dir
INFO: Executing function: config_nfs_sw_dir_client
INFO: Variable $BASE_SW_DIR is not set!
INFO: The directory /opt/exp_soft won't be mounted with NFS!
INFO: Executing function: config_fts_client
INFO: Executing function: config_amga_client_setenv
INFO: Executing function: config_amga_client
INFO: Executing function: config_wn_setenv
INFO: Executing function: config_wn
INFO: Executing function: config_vomsdir_setenv
INFO: Executing function: config_vomsdir
INFO: Executing function: config_vomses
INFO: Executing function: config_glite_saga_setenv
INFO: SAGA configuration is not required
INFO: Executing function: config_glite_saga
INFO: SAGA configuration is not required
INFO: Executing function: config_add_pool_env_setenv
INFO: Executing function: config_add_pool_env
INFO: Executing function: config_wn_info
WARNING: No subcluster has been defined for the WN in the WN_LIST file /root/wn-list.conf
WARNING: YAIM will use the default subcluster id: CE_HOST -> cream-01.cnaf.infn.it
INFO: Executing function: config_torque_client
INFO: starting pbs_mom...
Shutting down TORQUE Mom: pbs_mom already stopped [ OK ]
Starting TORQUE Mom: [ OK ]
INFO: Configuration Complete. [ OK ]
INFO: YAIM terminated succesfully.
>
>
# /opt/glite/yaim/bin/yaim -c -s site-info_batch.def -n MPI_WN -n WN_torque_noafs -n GLEXEC_wn
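
The check (-v) and config (-c) runs take the same site-info file and node types, so they can be chained so that the configuration only starts when the check passes. The following is a sketch, not an official YAIM tool: `yaim_check_then_config` and the `YAIM_BIN` override are hypothetical names, and it assumes that yaim returns a non-zero exit status when the check fails.

```shell
#!/bin/sh
# Hypothetical wrapper: run "yaim -v" and, only if it succeeds, "yaim -c".
# Usage: yaim_check_then_config SITE_INFO NODETYPE...
yaim_check_then_config() {
    siteinfo=$1; shift
    # Default path as used in these notes; YAIM_BIN is an assumption of this sketch.
    yaim=${YAIM_BIN:-/opt/glite/yaim/bin/yaim}
    # Turn each remaining argument into a "-n NODETYPE" pair.
    nodes=""
    for n in "$@"; do
        nodes="$nodes -n $n"
    done
    # $nodes is intentionally unquoted so that it splits into separate words.
    "$yaim" -v -s "$siteinfo" $nodes && "$yaim" -c -s "$siteinfo" $nodes
}
```

With the values from this page the call would be `yaim_check_then_config site-info_batch.def MPI_WN WN_torque_noafs GLEXEC_wn`.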
 
Changed:
<
<
<--/twistyPlugin-->
>
>
  -- PaoloVeronesi - 2012-05-30
 