Devel10 Work Log
WMS 3.2 patch #3621 + patch CDF
2009-07-08 (Fabio)
- System reinstalled
- WMS installed
WMS 3.1 patch 1251
2007-11-27 (Ale)
- Update CAs to version 1.18-1
2007-11-12 (Danilo)
- Start testing fix for bug 28235
installing:
- glite-jdl-api-cpp-3.1.12-1.i386.rpm
- glite-wms-helper-3.1.16-2.i386.rpm
- glite-wms-matchmaking-3.1.6-1.i386.rpm
2007-08-29 (Ale)
- Restarting the services
- Use "jobdir" instead of "filelist" for the WM queue setting in the WorkloadManager section of the glite_wms.conf file:
- DispatcherType = "jobdir";
- Input = "${GLITE_LOCATION_VAR}/workload_manager/jobdir";
- and create the directories:
- /var/glite/workload_manager/jobdir/tmp
- /var/glite/workload_manager/jobdir/new
- /var/glite/workload_manager/jobdir/old
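The jobdir layout from this entry can be prepared with a few lines of Python. This is only a minimal sketch, not the procedure actually used on devel10: the function name create_jobdir is hypothetical, the example writes under a scratch directory instead of /var/glite, and ownership or permissions for the glite user are not handled.

```python
import os
import tempfile

# Sub-directories listed in the log entry (tmp, new, old).
JOBDIR_SUBDIRS = ("tmp", "new", "old")


def create_jobdir(base):
    """Create base/tmp, base/new and base/old; return the created paths."""
    paths = []
    for name in JOBDIR_SUBDIRS:
        path = os.path.join(base, name)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical location: a scratch directory standing in for
    # /var/glite/workload_manager/jobdir.
    scratch = os.path.join(tempfile.mkdtemp(), "jobdir")
    for created in create_jobdir(scratch):
        print(created)
```

On a real WMS the base path would presumably be the value configured as Input in glite_wms.conf, and the directories would need to be writable by the service user.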
2007-08-27 (Ale)
- Shut down for maintenance of the electrical distribution.
2007-08-21 (Ale)
- Update rpms using the official repository at CERN
- glite-lb-client (2.3.4-1 => 2.3.5-1)
- glite-yaim-core (3.1.0-2.3 => 3.1.1-8)
- Restarting services
2007-07-23 (Ale)
- Reopen the WMS to test users.
2007-07-17 (Ale)
- Close for update!
- Reinstall the machine from scratch
- Install a new WMS service following these instructions
- The rpms installed are the ones of patch #1251
- Installed glite-wms-ice-3.1.16-1 rpm
- Added this cron job to clean the LB databases periodically (you first need to fix a silly bug in the script $GLITE_LOCATION/sbin/glite-lb-export.sh):
[root@devel10 etc]# cat /etc/cron.d/lb-purger.cron
#! /bin/sh
GLITE_LB_EXPORT_BKSERVER="devel12.cnaf.infn.it"
# run every wednesday and sunday at 01:00
0 1 * * wed,sun glite . /etc/profile.d/grid-env.sh ; $GLITE_LOCATION/sbin/glite-lb-export.sh >> /var/log/glite/lb_purger.log 2>&1
2007-07-12 (Ale)
- Update the following rpms:
- glite-ce-cream-client-api-c (1.7.14-0 => 1.7.15-1)
- glite-ce-monitor-client-api-c (1.7.12-0 => 1.7.13-1)
- glite-wms-configuration (3.1.7-1 => 3.1.9-1)
- glite-wms-ice (3.1.16-1 => 3.1.17-1)
- glite-wms-wmproxy (3.1.26-2 => 3.1.27-1)
- lcg-CA (1.14-1 => 1.15-1)
- Bugs fixed:
- #27856: Multiple subscriptions for the same user can occur
- #27724: Wms configuration files overwritten when updating rpm (glite_wms.conf)
- #27708: WMProxy missing dependency on glite-security-lcmaps-plugins-basic
- Restarted ICE service
2007-07-03 (Ale)
- Set MaxOutputSandboxSize = -1 on glite_wms.conf as workaround for bug #27215
- Restart workload manager
2007-06-29 (Ale)
- Update rpms using patch #1203
- The bugs that have been fixed with this update are:
- #25680: job submission --nodes-resource option does not work for dags/collections
- #26857: The "max-rank" selection algorithm for collection under some particular circumstances does not work properly
- #26913: The MM does not use information about previous matches retrying the same CEs
- #26705: The standalone purger does not work anymore, due to the lack of proxy file.
- #27042: When ICE starts executes a lease update even if start_lease_updater is false
- #26537: ICE Fails to build on SLC4
- #27215: WM to set the maximum output sandbox size
- #27126: generation of unique filenames in jobdir is not reliable
- #26952: WMProxy server does check mandatory attributes for collection after returning jobid to client
- #26968: There's a memory leak in a method that extract the proxy time left
- Create a new cron entry for the purge:
HOME=/
MAILTO=root@localhost
# Execute the 'purger' command every six hours (0 */6) on every day except Sunday
# if and only if the percentage of allocated blocks is greater than 40%
0 */6 * * mon-sat glite . /etc/glite/profile.d/glite_setenv.sh ; $GLITE_LOCATION/sbin/glite-wms-purgeStorage.sh -l $GLITE_LOCATION_LOG/glite-wms-purgeStorage.log -p /var/glite/SandboxDir -t 604800 -a 40 > /dev/null
# Execute the 'purger' command every four hours (0 */4), i.e. at 00:00, 04:00,
# 08:00, 12:00, 16:00 and 20:00, on each Sunday (sun).
0 */4 * * sun glite . /etc/glite/profile.d/glite_setenv.sh ; $GLITE_LOCATION/sbin/glite-wms-purgeStorage.sh -l $GLITE_LOCATION_LOG/glite-wms-purgeStorage.log -p /var/glite/SandboxDir -t 604800 > /dev/null
HOME=/
MAILTO=root@localhost
0 */2 * * * root /usr/sbin/logrotate -v /opt/glite/etc/wmproxy_logrotate.conf > /var/log/glite/logrotate.logs
- In glite_wms_wmproxy_httpd.conf, comment out these lines:
#CustomLog "|/usr/sbin/rotatelogs ${GLITE_LOCATION_LOG}/httpd-wmproxy-access_%Y-%m-%d-%H.log 50M" combined
#ErrorLog "|/usr/sbin/rotatelogs ${GLITE_LOCATION_LOG}/httpd-wmproxy-errors_%Y-%m-%d-%H:%M.log 100M"
- and uncomment these ones:
CustomLog ${GLITE_LOCATION_LOG}/httpd-wmproxy-access.log combined
ErrorLog ${GLITE_LOCATION_LOG}/httpd-wmproxy-errors.log
- Reopen the WMS to test users.
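The purger cron schedules in this entry use step values in the hour field. As a rough illustration of what those fields expand to, here is a small sketch; the helper names expand_hours and seconds_to_days are hypothetical, only the '*' and '*/N' forms are handled, and reading the -t value as a number of seconds is an assumption, since the log does not state its unit.

```python
def expand_hours(field):
    """Expand a cron hour field of the form '*' or '*/N' into a list of hours."""
    if field == "*":
        return list(range(24))
    if field.startswith("*/"):
        step = int(field[2:])
        if step <= 0:
            raise ValueError("step must be positive")
        return list(range(0, 24, step))
    raise ValueError("only '*' and '*/N' are handled in this sketch")


def seconds_to_days(seconds):
    """Convert a duration in seconds to days (assumes -t is in seconds)."""
    return seconds / 86400


if __name__ == "__main__":
    print(expand_hours("*/6"))      # hour field of the mon-sat purger entry
    print(expand_hours("*/4"))      # hour field of the Sunday purger entry
    print(seconds_to_days(604800))  # the -t argument used in both entries
```

With these inputs the mon-sat entry fires at hours 0, 6, 12 and 18, the Sunday entry at hours 0, 4, 8, 12, 16 and 20, and 604800 seconds corresponds to 7 days.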
2007-06-26 (Ale)
- Stop the services to debug a problem with the LBserver.
2007-05-31 (Ale)
- Reopen the WMS to test users.
2007-05-30 (Ale)
- Stop the services to update the rpms.
- Starting from the rpms of patch #1167, I updated these rpms:
- glite-wms-common_R_3_1_14_1 and glite-wms-configuration_R_3_1_6_1 to fix bug #26432
- glite-wms-manager_R_3_1_27_1 to fix a compilation problem
- glite-wms-wmproxy_R_3_1_25_1 to fix bugs #26586, #26737 and #26237
- glite-wms-ism_R_3_1_14_1 to fix bug #26654
- Update also glite-wms-ice_R_3_1_13_1, glite-ce-cream-client-api-c and glite-ce-monitor-client-api-c
- A new BDII which contains a CREAM-CE (prod-ce-02.pd.infn.it:8443/cream-lsf-creamusr2) is now used: egee-bdii.cnaf.infn.it
2007-05-14 (Ale)
- Update glite-wms-jobsubmission rpm using tag: glite-wms_R_3_1_60_1 (see bug #23401)
- Reopen the WMS to test users.
2007-05-11 (Ale)
- It passed the usual test... Success > 99%
2007-05-10 (Ale)
- Update rpms using these tags:
- org.glite.lb.version = glite-lb_R_1_4_5_2
- org.glite.wms.version = glite-wms_R_3_1_59_1
- Bugs fixed:
- #26269: JC locks the filelist without giving to WM possibility to submit new requests
- #23401: Job failure - gethostbyname error in condor submission
- #22795: timer-log file are not removed by LM
- #26208: LM stopped on bad SizeFile object
- #26157: The WM dies while processing a collection with pending nodes
- #26213: Handling of LB errors needs to be improved
- #25767: Purging does not work on collections.
- #26267: The purger does not work: creation of lb context always fails.
- #26250: Wms client timeout approach does not work properly
- #25677: the LB cannot handle decimal numbers in the quantity field used to log resource usage
2007-05-03 (Ale)
- Stop the services to update the rpms.
- Change apt source list: rpm http://goldrake.cnaf.infn.it:8080/ibrido/archives/glite_branch_3_1_0_continuous/repository . i386 noarch
- Update rpms using these tags:
- org.glite.ce.version = glite-ce_R_1_7_13_0
- org.glite.jdl.version = glite-jdl_R_3_1_11_1
- org.glite.lb.version = glite-lb_R_1_4_4_1
- org.glite.security.version = glite-security_R_3_1_38_1
- org.glite.wms-utils.version = glite-wms-utils_R_3_1_8
- org.glite.wms.version = glite-wms_R_3_1_58_1
- The fixes introduced by the new LB tag are for these bugs:
- #25872: glite-lb-bkserverd looping in malloc_consolidate()
- #25677: the LB cannot handle decimal numbers in the quantity field used to log resource usage
- Removed old log files and sandboxes
2007-04-19 (Ale)
- Changed the vomses file (/opt/glite/etc/vomses) to add a new VOMS server and the cms VO for proxy renewal:
"ops" "lcg-voms.cern.ch" "15009" "/C=CH/O=CERN/OU=GRID/CN=host/lcg-voms.cern.ch" "ops"
"dteam" "lcg-voms.cern.ch" "15004" "/C=CH/O=CERN/OU=GRID/CN=host/lcg-voms.cern.ch" "dteam"
"atlas" "lcg-voms.cern.ch" "15001" "/C=CH/O=CERN/OU=GRID/CN=host/lcg-voms.cern.ch" "atlas"
"cms" "lcg-voms.cern.ch" "15002" "/C=CH/O=CERN/OU=GRID/CN=host/lcg-voms.cern.ch" "cms"
"dteam" "voms101.cern.ch" "15004" "/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch" "dteam"
"atlas" "voms101.cern.ch" "15001" "/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch" "atlas"
"ops" "voms101.cern.ch" "15009" "/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch" "ops"
"cms" "voms101.cern.ch" "15002" "/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch" "cms"
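The vomses lines above are five double-quoted fields each, so they can be read back with the standard shlex module. The sketch below is illustrative only: the function name parse_vomses and the field names (alias, host, port, server DN, VO) are assumptions about how the columns are usually interpreted, not something stated in this log.

```python
import shlex
from collections import namedtuple

# Field names are an assumed interpretation of the five quoted columns.
VomsesEntry = namedtuple("VomsesEntry", "alias host port server_dn vo")


def parse_vomses(text):
    """Parse vomses-style lines into VomsesEntry tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = shlex.split(line)  # handles the double-quoted fields
        if len(fields) < 5:
            raise ValueError("expected at least 5 fields: %r" % line)
        alias, host, port, server_dn, vo = fields[:5]
        entries.append(VomsesEntry(alias, host, int(port), server_dn, vo))
    return entries


# Two lines copied from the entry above.
SAMPLE = '''
"ops" "lcg-voms.cern.ch" "15009" "/C=CH/O=CERN/OU=GRID/CN=host/lcg-voms.cern.ch" "ops"
"cms" "voms101.cern.ch" "15002" "/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch" "cms"
'''

if __name__ == "__main__":
    for entry in parse_vomses(SAMPLE):
        print(entry.vo, entry.host, entry.port)
```

A check like this can help spot a missing quote or a wrong port after editing the file by hand.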
2007-04-05 (Ale)
- Update the glite-wms-classad_plugin rpm (glite-wms-classad_plugin-3.1.5-1) to fix bug 25125: problem with FQAN VOViews.
- Add the last fix for the InterLogger problem (glite-lb-logger-1.4.2-1).
- Update the glite-wms-ice rpm (glite-wms-ice-3.1.9-1) to fix bug 25275: fails to build on SLC4.
- Change the frequency at which the purger runs in cron: from every hour to every 6 hours
- Do a preliminary test using only LCG-CE (Requirements = RegExp("\/blah-",other.GlueCEUniqueID)) submitting simple jobs (81 collections of 100 jobs each):
- Success: 7851 (96.9%)
- Aborted: 249 (3.1%) (Proxy expired)
- Reopen the WMS to test users.
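The figures of the preliminary test in this entry can be cross-checked with a short calculation. The function name summarize is hypothetical; the inputs are the numbers reported above (81 collections of 100 jobs, 7851 successes, 249 aborts).

```python
def summarize(collections, jobs_per_collection, succeeded, aborted):
    """Return total job count and success/abort percentages (one decimal)."""
    total = collections * jobs_per_collection
    if succeeded + aborted != total:
        raise ValueError("succeeded + aborted does not match the job total")
    return {
        "total": total,
        "success_pct": round(100.0 * succeeded / total, 1),
        "aborted_pct": round(100.0 * aborted / total, 1),
    }


if __name__ == "__main__":
    print(summarize(81, 100, 7851, 249))
```

The counts add up to 8100 jobs, and the rounded percentages agree with the 96.9% and 3.1% reported in the log.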
2007-04-02 (Ale)
- Stop the services to update the rpms.
- Update rpms using these tags:
- glite-jdl_R_3_1_11_1
- glite-lb_R_1_4_1_1
- glite-security_R_3_1_35_1
- glite-wms-utils_R_3_1_8
- glite-wms_R_3_1_48_1
- Change the parameter in the WorkloadManager section: EnableBulkMM = true; (i.e. the WMS is now dagless, like devel09)
- The new LB is able to recognize the dagless-collection
2007-03-20
2007-03-16 (Ale)
- Stop the services to update the rpms.
- Run the commands apt-get update and apt-get dist-upgrade
- The ca_* rpms are updated from 1.12-1 => 1.13-1
- The most interesting installed tags are now:
- glite-jdl_R_3_1_10_1
- glite-jp_R_1_3_5_1
- glite-lb_R_1_3_7_3
- glite-security_R_3_1_33_1
- glite-wms-utils_R_3_1_8
- glite-wms_R_3_1_43_1
- Restarted the services
- Do a preliminary test using only LCG-CE (Requirements = RegExp("\/blah-",other.GlueCEUniqueID)) submitting 1700 simple jobs (17 collections):
- Reopen the WMS to test users.
2007-03-08 (Ale)
2007-03-06
- Installed normal 3.0 WMS
- Removed c-ares. Installed c-ares from the CERN cert apt repository
- Inserted in apt source lists the goldrake repository: rpm http://goldrake.cnaf.infn.it:8080/ibrido/archives/glite_branch_3_1_0_continuous/repository . i386 noarch
- apt-get remove glite-WMS; apt-get remove glite-wms-manager-ns-daemon; apt-get dist-upgrade
- replaced gacl and grid-mapfile
- cp /opt/c-ares/lib/* /lib
- Installed condor 6.8.4: rpm -Uvh condor-6.8.4-linux-x86-rhel3-dynamic-1.i386.rpm
- ln -s /opt/condor-6.8.4/ condor-c (and modify all conf files of condor and setenv to use this link)
- rpm -Uvh google-perftools-*
- Create file /etc/nospma to avoid SPMA downgrade
- Add the following lines to /opt/condor-c/local.devel10/condor_config.local:
NEGOTIATOR_MATCHLIST_CACHING = False
GRIDMANAGER_TIMEOUT_MULTIPLIER = 3
SCHEDD_TIMEOUT_MULTIPLIER = 3
COLLECTOR_TIMEOUT_MULTIPLIER = 3
C_GAHP_TIMEOUT_MULTIPLIER = 3
C_GAHP_WORKER_THREAD_TIMEOUT_MULTIPLIER = 3
TOOL_TIMEOUT_MULTIPLIER = 3
GLITE_CONDORC_DEBUG_LEVEL = 2
GLITE_CONDORC_LOG_DIR = /var/tmp
- To apply patch #1026 ("WMProxy memory allocation doesn't increase anymore") one needs to:
- Add "export GLITE_WMS_WMPROXY_MAX_SERVED_REQUESTS=50" into /etc/glite/profile.d/glite_setenv.sh
- Add "setenv GLITE_WMS_WMPROXY_MAX_SERVED_REQUESTS 50" into /etc/glite/profile.d/glite_setenv.csh
- In /opt/glite/etc/glite_wms_wmproxy_httpd.conf, replace FastCgiConfig line with: "FastCgiConfig -restart -restart-delay 5 -idle-timeout 3600 -maxProcesses 25 -maxClassProcesses 20 -minProcesses 2 -listen-queue-depth 200 -gainValue 0.75 -killInterval 240 -updateInterval 240 -singleThreshold 15 -initial-env GLITE_WMS_WMPROXY_MAX_SERVED_REQUESTS -initial-env LD_LIBRARY_PATH -initial-env GLITE_LOCATION_VAR -initial-env GLITE_LOCATION_LOG -initial-env GLITE_LOCATION_TMP -initial-env RGMA_HOME -initial-env GLITE_SD_VO -initial-env GLITE_SD_PLUGIN -initial-env LCG_GFAL_INFOSYS -initial-env HOSTNAME -initial-env GLITE_WMS_WMPROXY_WEIGHTS_UPPER_LIMIT"
- At the end of the PassEnv section of the file /opt/glite/etc/glite_wms_wmproxy_httpd.conf, add the following two lines:
PassEnv GLITE_WMS_WMPROXY_MAX_SERVED_REQUESTS
PassEnv GLITE_PR_TIMEOUT
- In the WM section of glite_wms.conf, add
CeForwardParameters = { "GlueHostMainMemoryVirtualSize", "GlueHostMainMemoryRAMSize" };
- cp /opt/glite/etc/lcmaps/lcmaps.db.template /opt/glite/etc/lcmaps/lcmaps.db
- mkdir /var/glite/ice; mkdir /var/glite/icepersist_dir; chown -R glite.glite /var/glite/ice
- in /etc/cron.d/glite-wms-wmproxy-purge-proxycache.cron, change glite_wms_wmproxy_purge_proxycache to glite-wms-wmproxy-purge-proxycache.
- Modified some glite_wms.conf parameters to sync with the CERN services (see attached files)
- Restart all the services
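The patch #1026 steps in this entry depend on GLITE_WMS_WMPROXY_MAX_SERVED_REQUESTS being handed to the FastCGI processes through -initial-env. A quick way to confirm which variables a FastCgiConfig line passes on is to list the names that follow each -initial-env flag. The sketch below is a hypothetical helper (initial_env_vars), not part of gLite or mod_fastcgi, and it uses a shortened version of the directive for the example.

```python
def initial_env_vars(directive):
    """Return the variable names that follow each '-initial-env' flag."""
    tokens = directive.split()
    names = []
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-initial-env":
            # Keep only the name if the token has the form NAME=value.
            names.append(value.split("=", 1)[0])
    return names


# Shortened form of the FastCgiConfig line from the entry above.
SAMPLE_DIRECTIVE = (
    "FastCgiConfig -restart -restart-delay 5 -idle-timeout 3600 "
    "-initial-env GLITE_WMS_WMPROXY_MAX_SERVED_REQUESTS "
    "-initial-env LD_LIBRARY_PATH -initial-env GLITE_LOCATION_VAR"
)

if __name__ == "__main__":
    passed = initial_env_vars(SAMPLE_DIRECTIVE)
    print(passed)
    print("GLITE_WMS_WMPROXY_MAX_SERVED_REQUESTS" in passed)
```

Running the same function over the full line in glite_wms_wmproxy_httpd.conf would show whether the variable exported in glite_setenv.sh is actually forwarded.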
--
AlessioGianelle - 13 Nov 2007