
WMS 3.4 EMI2 precertification report

Testing server on: devel09.cnaf.infn.it (SL5)

28/03/2012 Using repository http://etics-repository.cern.ch/repository/pm/volatile/repomd/id/d0831cfb-9bcf-4588-8a6d-edf8fbd7d3b3/sl5_x86_64_gcc412EPEL

23/02/2012 Using repository http://etics-repository.cern.ch/repository/pm/volatile/repomd/id/bbf07458-a777-4062-94e3-409a2b481cac/sl5_x86_64_gcc412EPEL

21/03/2012 Using repository http://etics-repository.cern.ch/repository/pm/volatile/repomd/id/73c67e19-8837-4ced-ab48-c7cadefdf589/sl5_x86_64_gcc412EPEL

Testing client on: devel15.cnaf.infn.it (SL5)

23/04/2012 Using repository http://etics-repository.cern.ch/repository/pm/volatile/repomd/id/1651ef71-bd32-48bb-9eb7-692297c56ad1/sl5_x86_64_gcc412EPEL

18/04/2012 Using repository http://etics-repository.cern.ch/repository/pm/volatile/repomd/id/9bca1e56-da1c-4b92-8a96-28bbe62a21b9/sl5_x86_64_gcc412EPEL

18/04/2012 UI Installation report (excerpts)

Upgraded from EMI-1 UI:

[root@devel15 mcecchi]# head -30 /etc/yum.repos.d/ui_emi2.repo 
# ETICS Automatic Repomd (Yum) Repository
#
# Submission ID: 9bca1e56-da1c-4b92-8a96-28bbe62a21b9
# Platform: sl5_x86_64_gcc412EPEL
# Date: 18/04/2012 11:28:33
#
# Project Name: emi
# Configuration Name: emi-wms-ui_B_3_4
# Configuration Version: 3.4.99-0
#
# Build Reports: http://etics-repository.cern.ch/repository/reports/id/9bca1e56-da1c-4b92-8a96-28bbe62a21b9/sl5_x86_64_gcc412EPEL/-/reports/index.html
#
# Author: CN=Paolo Andreetto, L=Padova, OU=Personal Certificate, O=INFN, C=IT

[ETICS-volatile-build-9bca1e56-da1c-4b92-8a96-28bbe62a21b9-sl5_x86_64_gcc412EPEL]
name=ETICS build of emi-wms-ui_B_3_4 on sl5_x86_64_gcc412EPEL
baseurl=http://etics-repository.cern.ch/repository/pm/volatile/repomd/id/9bca1e56-da1c-4b92-8a96-28bbe62a21b9/sl5_x86_64_gcc412EPEL
protect=0
enabled=1
gpgcheck=0
priority=40

# 31 packages available in this repository:
#
# emi-delegation-interface (2.0.3-1.sl5)
# http://etics-repository.cern.ch/repository/uuid/volatile/c8fcf5d9-5667-4e34-b574-eaeea26fc503/emi-delegation-interface-2.0.3-1.sl5.noarch.rpm
#
# emi-delegation-java (2.2.0-2.sl5)

[root@devel15 mcecchi]# head -30 /etc/yum.repos.d/emi-2-rc4-sl5.repo 
[core]
name=SL 5 base
baseurl=http://linuxsoft.cern.ch/scientific/5x/$basearch/SL
   http://ftp.scientificlinux.org/linux/scientific/5x/$basearch/SL
        http://ftp1.scientificlinux.org/linux/scientific/5x/$basearch/SL
        http://ftp2.scientificlinux.org/linux/scientific/5x/$basearch/SL
protect=0

[extras]
name=epel
mirrorlist=http://mirrors.fedoraproject.org/mirrorlist?repo=epel-5&arch=$basearch
protect=0

[EGI-trustanchors]
name=EGI-trustanchors
baseurl=http://repository.egi.eu/sw/production/cas/1/current/
gpgkey=http://repository.egi.eu/sw/production/cas/1/GPG-KEY-EUGridPMA-RPM-3
gpgcheck=1
enabled=1

[EMI-2-RC4-base]
name=EMI 2 RC4 Base Repository
baseurl=http://emisoft.web.cern.ch/emisoft/dist/EMI/2/RC4/sl5/$basearch/base
gpgkey=http://emisoft.web.cern.ch/emisoft/dist/EMI/2/RPM-GPG-KEY-emi
priority=45
protect=0
enabled=1
gpgcheck=0

[EMI-2-RC4-third-party]

[root@devel15 mcecchi]# rpm -qa | grep glite-wms
glite-wms-utils-exception-3.3.0-2.sl5
glite-wms-ui-api-python-3.4.99-0.sl5
glite-wms-ui-commands-3.4.99-0.sl5
glite-wms-brokerinfo-access-lib-3.4.99-0.sl5
glite-wms-wmproxy-api-java-3.4.99-0.sl5
glite-wms-wmproxy-api-python-3.4.99-0.sl5
glite-wms-utils-classad-3.3.0-2.sl5
glite-wms-wmproxy-api-cpp-3.4.99-0.sl5
glite-wms-brokerinfo-access-3.4.99-0.sl5
[root@devel15 mcecchi]# 

Deployment/Configuration/Installation notes (for SERVER)

Condor version: WILL BE 7.8.0

- RPM successfully generated from patched source tarball

- Manually installed Condor 7.8.0. Good news: it is binary-compatible with 7.4.2

Siteinfo

site_info.def: had to remove the post-config file <siteinfo-dir>/services/glite-wms, which was injecting 'hidden' configuration values

LCMAPS needs configuration

In /etc/lcmaps/lcmaps.db.gridftp, /etc/lcmaps/lcmaps.db replace:

path = /usr/lib64/modules ---> path = /usr/lib64/lcmaps
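The path replacement above can be scripted with sed; a minimal sketch, run here against a temporary copy (on a real WMS the targets would be /etc/lcmaps/lcmaps.db and /etc/lcmaps/lcmaps.db.gridftp):

```shell
# Sketch: apply the LCMAPS module path fix in place.
# Works on a temp copy here; point it at the real lcmaps.db files on a WMS.
tmpdb=$(mktemp)
printf 'path = /usr/lib64/modules\n' > "$tmpdb"
sed -i 's|path = /usr/lib64/modules|path = /usr/lib64/lcmaps|' "$tmpdb"
cat "$tmpdb"   # now reads: path = /usr/lib64/lcmaps
rm -f "$tmpdb"
```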

There are also several misconfiguration errors in yaim-core (yaim for lcas-lcmaps-gt4-interface) that cause:

Mar 21 12:56:10 devel09 glite_wms_wmproxy_server: lcmaps: (null): LCMAPS
initialization failure
Mar 21 12:56:48 devel09 glite_wms_wmproxy_server: lcmaps:
/etc/lcmaps/lcmaps.db:53: [error] variable 'good' already defined at
line 9;
Mar 21 12:56:48 devel09 glite_wms_wmproxy_server: lcmaps:
/etc/lcmaps/lcmaps.db:53: [error] pervious value: 'lcmaps_dummy_good.mod'.
Mar 21 12:56:48 devel09 glite_wms_wmproxy_server: lcmaps:
/etc/lcmaps/lcmaps.db:56: [error] variable 'localaccount' already
defined at line 11;
Mar 21 12:56:48 devel09 glite_wms_wmproxy_server: lcmaps:
/etc/lcmaps/lcmaps.db:56: [error] pervious value:
'lcmaps_localaccount.mod -gridmapfile /etc/grid-security/grid-mapfile'.
Mar 21 12:56:48 devel09 glite_wms_wmproxy_server: lcmaps:
/etc/lcmaps/lcmaps.db:61: [error] variable 'poolaccount' already defined
at line 14;
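The "variable already defined" errors above can be caught before restarting wmproxy; a hedged sketch that scans an lcmaps.db-style file for variables assigned twice (the sample content below is made up for illustration, not taken from the real file):

```shell
# Sketch: flag duplicate variable definitions in an lcmaps.db-style file.
# Only lines containing '=' are inspected; policy sections are ignored.
tmpdb=$(mktemp)
cat > "$tmpdb" <<'EOF'
good         = "lcmaps_dummy_good.mod"
localaccount = "lcmaps_localaccount.mod"
good         = "lcmaps_dummy_good.mod"
EOF
awk -F= '/=/{gsub(/ /,"",$1); if (seen[$1]++) print "duplicate: " $1}' "$tmpdb"
rm -f "$tmpdb"
```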

LB

mode=both: PASS

mode=server: PASS

mcecchi 17/05/2012. Tried with devel07 as EMI-2 LB server.

[mcecchi@ui ~]$ glite-wms-job-submit -a --endpoint https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server ls.jdl 

Connecting to the service https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://devel07.cnaf.infn.it:9000/YSm5s0kWCCZLjPjF2cp4vw

==========================================================================


[mcecchi@ui ~]$ glite-wms-job-logging-info https://devel07.cnaf.infn.it:9000/YSm5s0kWCCZLjPjF2cp4vw

===================== glite-job-logging-info Success =====================

LOGGING INFORMATION:

Printing info for the Job : https://devel07.cnaf.infn.it:9000/YSm5s0kWCCZLjPjF2cp4vw
 
   ---
Event: RegJob
- Source                     =    NetworkServer
- Timestamp                  =    Thu May 17 17:38:24 2012 CEST
   ---
Event: Accepted
- Source                     =    NetworkServer
- Timestamp                  =    Thu May 17 17:38:24 2012 CEST
   ---
Event: EnQueued
- Result                     =    START
- Source                     =    NetworkServer
- Timestamp                  =    Thu May 17 17:38:24 2012 CEST
   ---
Event: EnQueued
- Result                     =    OK
- Source                     =    NetworkServer
- Timestamp                  =    Thu May 17 17:38:24 2012 CEST
   ---
Event: DeQueued
- Source                     =    WorkloadManager
- Timestamp                  =    Thu May 17 17:38:25 2012 CEST
   ---
Event: Pending
- Source                     =    WorkloadManager
- Timestamp                  =    Thu May 17 17:38:26 2012 CEST
==========================================================================

Major changes (only for SERVER, under discussion)

* GLUE2 purchasing

- no storage attributes as of Feb 2012

- no backwards compatibility

- a single ISM is kept, and purchasing proceeds as follows:

a) EnableIsmIiGlue13Purchasing = true & EnableIsmIiGlue20Purchasing = false: everything stays as it has always been

b) EnableIsmIiGlue13Purchasing = false & EnableIsmIiGlue20Purchasing = true: GLUE 2.0 support is active; only JDLs compliant with GLUE 2.0 can be matched

c) EnableIsmIiGlue13Purchasing = true & EnableIsmIiGlue20Purchasing = true:

- the suggestion is not to enable both the GLUE 1.3 and 2.0 purchasers at the same time. If this is done, the WmsRequirements expression in the configuration must be revised to also include, in OR, the GLUE 2.0 queues.

The G13 purchaser MUST be executed before the G2 one. This way, by the time the G2 purchaser starts, the ISM is already populated with the G13 info; when the i-th G2 Ad is inserted, the following cases can occur:

c1) a G13 Ad with id = G2Ad.id already exists in the ISM: in that case a simple G13Ad.Update(G2Ad) is performed (effectively a merge). The resulting resource will be matchable with either G2 or G13 requirements; moreover, this way the storage part keeps working too, provided that any storage-related attributes in the JDL are expressed in G13.

c2) the ISM does not contain an Ad with the same G2Ad.id: in this case no update is done, just a plain insert of the Ad, which will be matchable only with G2-compliant requirements in the JDL.

EnableIsmIiGlue13Purchasing = true/false;

EnableIsmIiGlue20Purchasing = true/false;
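In the WMS configuration, scenario a) above would look like the following fragment (a sketch only: the WorkloadManager section name and exact layout are assumptions about where these parameters live in glite_wms.conf):

```
WorkloadManager = [
    // scenario a): GLUE 1.3 only, legacy behaviour
    EnableIsmIiGlue13Purchasing = true;
    EnableIsmIiGlue20Purchasing = false;
];
```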

* Argus authZ for access control

For the moment, only the old OpenSSL DN format is accepted.

* dagmanless DAGs

* Condor 7.8.0

It will fully enable submission to GRAM5 CEs. The feature requires testing submission to LCG-CE, ARC CE, and OSG GRAM2 and GRAM5 CEs.

* refactoring of authN/Z in wmp

requires testing authN/Z at large, not only for Argus

* support for RFC proxies (bug #88128)

The LCG-CE is not expected to support them; OSG, ARC and CREAM CEs will.

Conf. Changes

- removed DagmanLogLevel

- removed DagmanLogRotate

- removed DagmanMaxPre

- removed MaxDAGRunningNodes

- removed wmp tools

- removed par. bulkMM

- removed par. filelist

- removed par. locallogger in wmp

- removed asynch purchasing

- added EnableIsmIiGlue20Purchasing and EnableIsmIiGlue13Purchasing

- added extended GLUE 2 filter IsmIiG2LDAPCEFilterExt

- added attributes for the .so libraries loaded via dlopen (helper, purchasers)

- enabled SbRetryDifferentProtocols by default in the WM configuration

- par. MatchRetryPeriod reduced from 600 to 300. It now also indicates the time interval at which DAGs are evaluated by the WM engine


- WmsRequirements now has also the queue requirements, that were hard-coded before
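A sketch of how the changed defaults might appear in the configuration (the WorkloadManager section name and layout are assumptions; check the actual glite_wms.conf on the node):

```
WorkloadManager = [
    // reduced from 600; also the interval at which DAGs are evaluated
    MatchRetryPeriod = 300;
    // now enabled by default
    SbRetryDifferentProtocols = true;
];
```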

LIST OF BUGS

Server

JobController logfile name is misspelled (bug #32611, gLite Middleware, ) Yes / Done
glite-wms-job-submit doesn't always pick up other WMProxy endpoints if load on WMS is high (bug #40370, gLite Middleware, ) Yes / Done
[wms] GlueServiceStatusInfo content is ugly (bug #48068, gLite Middleware, ) Yes / Done
[ yaim-wms ] CeForwardParameters should include several more parameters (bug #61315, gLite Middleware, ) Yes / Done
WMS needs cron job to kill stale GridFTP processes (bug #67489, gLite Middleware, ) Yes / Done
WMProxy code requires FQANs (bug #72169, gLite Middleware, ) HOPEFULLY FIXED
WMProxy limiter should log more at info level (bug #72280, gLite Middleware, ) Yes / Done
There's an un-catched out_of_range exception in the ICE component (bug #75099, gLite Middleware, ) Yes / Done
WMS information system usage is case-sensitive (bug #77282, gLite Middleware, ) No / fix not yet implemented
ICE jobdir issue - 1 bad CE can block all jobs (bug #80751, gLite Middleware, ) Yes / Done
Cancellation of a dag's node doesn't work (bug #81651, gLite Middleware, )
Deregistration of a proxy (2) (bug #83453, gLite Middleware, )
Last LB event logged by ICE when job aborted for proxy expired should be ABORTED (bug #84839, gLite Middleware, ) Yes / Done
queryDb has 2 bugs handling user's options (see ggus ticket for more info) (bug #86267, gLite Middleware, ) Yes / Done
WMproxy GACLs do not support wildcards (as they used to do) (bug #87261, gLite Middleware, ) Yes / Done
Submission with rfc proxy doesn't work (bug #88128, gLite Middleware, ) Yes / Done
Semi-automated service backends configuration for WMS (task #23845, EMI Development Tracker, Done) Yes / Done
GlueServiceStatusInfo: ?? (bug #89435, gLite Middleware, ) Yes / Done
EMI WMS wmproxy rpm doesn't set execution permissions as it used to do in gLite (bug #89506, gLite Middleware, ) Yes / Done
EMI WMS WM might abort resubmitted jobs (bug #89508, gLite Middleware, )
EMI WMS wmproxy init.d script stop/start problems (bug #89577, gLite Middleware, ) Yes / Done
glite-wms-check-daemons.sh should not restart daemons under the admin's nose (bug #89674, gLite Middleware, ) Yes / Done
Wrong location for PID file (bug #89857, gLite Middleware, ) Yes / Done
WMS logs should keep track of the last 90 days (bug #89871, gLite Middleware, ) Yes / Done
LB failover mechanism in WMproxy needs to be reviewed (bug #90034, gLite Middleware, )
yaim-wms creates wms.proxy in wrong path (bug #90129, gLite Middleware, ) Yes / Done
cron job deletes /var/proxycache (bug #90640, gLite Middleware, ) Yes / Done
yaim-wms changes for Argus based authZ (bug #90760, gLite Middleware, ) Yes / Done
ICE should use env vars in its configuration (bug #90830, gLite Middleware, ) Yes / Done
ICE log verbosity should be reduced to 300 (bug #91078, gLite Middleware, ) Yes / Done
Make some WMS init scripts System V compatible (bug #91115, gLite Middleware, ) Yes / Done
move lcmaps.log from /var/log/glite to WMS_LOCATION_LOG (bug #91484, gLite Middleware, ) Yes / Done
WMS: use logrotate uniformly in ice, lm, jc, wm, wmp (bug #91486, gLite Middleware, ) Yes / Done
remove several dismissed parameters from the WMS configuration (bug #91488, gLite Middleware, ) Yes / Done
Pid file of ICE and WM has glite ownership (bug #91834, gLite Middleware, ) Yes / Done
The job replanner should be configurable (bug #91941, gLite Middleware, None) Yes / Done
some sensible information should be logged on syslog (bug #92657, gLite Middleware, ) Yes / Done
EMI-1 WMS does not propagate user job exit code (bug #92922, gLite Middleware, None) No / tested with new job submission, but it fails

UI

glite-wms-job-status needs a better handing of purged-related error code. (bug #85063, gLite Middleware, ) Yes / Done
WMS UI depends on a buggy libtar (on SL5 at least) (bug #89443, gLite Middleware, ) Yes / Done mcecchi 18/04/12 WARNING: [mcecchi@devel15 ~]$ ldd /usr/bin/glite-wms-job-submit | grep libtar gives: libtar.so.1 => /usr/lib64/libtar.so.1 (0x000000350c000000); the dependency still exists
getaddrinfo() sorts results according to RFC3484, but random ordering is lost (bug #82779, gLite Middleware, ) Yes / Done
glite-wms-job-status needs a json-compliant format (bug #82995, gLite Middleware, ) Yes / Done
Files specified with absolute paths shouldn't be used with inputsandboxbaseuri (bug #74832, gLite Middleware, ) Yes / Done
Too much flexibility in JDL syntax (bug #75802, gLite Middleware, ) Yes / Done
glite-wms-job-list-match --help show an un-implemented (and useless) option "--default-jdl" (bug #87444, gLite Middleware, ) Yes / Done
WMS-UI: update "KNOWN PROBLEMS AND CAVEATS" section of WMPROXY guide (bug #90003, gLite Middleware, ) Yes / Done
WMS UI emi-wmproxy-api-cpp and emi-wms-ui-api-python still use use gethostbyaddr/gethostbyname (bug #89668, gLite Middleware, ) Yes / Done
pkg-config info for wmproxy-api-cpp should be enriched (bug #85799, gLite Middleware, ) Yes / Done

UI BASIC FUNCTIONALITY TESTS

23/04/2012:

[mcecchi@devel15 ~]$ glite-wms-job-logging-info --debug --logfile log.txt --output output.txt https://emitb1.ics.muni.cz:9000/q6tmTtMPfuXAN1xY8YPXGA 

VirtualOrganisation value :dteam
####
Configuration file loaded: //etc/glite_wmsui_cmd_var.conf 
 [
 ]
#### Mon Apr 23 12:33:07 2012 Debug Message ####
Selected Virtual Organisation name (from proxy certificate extension): dteam
VOMS configuration file successfully loaded:

 [
 ]
#### End Debug ####

**** Error: UI_GENERIC_ERROR_ON_JOB_ID ****  
Error retrieving information on JobID "https://emitb1.ics.muni.cz:9000/q6tmTtMPfuXAN1xY8YPXGA". 
Error description: Unable to retrieve the Job Events for: https://emitb1.ics.muni.cz:9000/q6tmTtMPfuXAN1xY8YPXGA
glite.lb.Exception: edg_wll_JobLog: No such file or directory: no matching events found
   at glite::lb::Job::log[./src/Job.cpp:123]



                           *** Log file created ***
Possible Errors and Debug messages have been printed in the following file:
/home/mcecchi/log.txt

[mcecchi@devel15 ~]$ glite-wms-job-status --debug https://emitb1.ics.muni.cz:9000/0GK5XFRiKmSEs8jBY_2Alg 

VirtualOrganisation value :dteam
####
Configuration file loaded: //etc/glite_wmsui_cmd_var.conf 
 [
 ]
#### Mon Apr 23 12:33:18 2012 Debug Message ####
Selected Virtual Organisation name (from proxy certificate extension): dteam
VOMS configuration file successfully loaded:

 [
 ]
#### End Debug ####

#### Mon Apr 23 12:33:18 2012 Debug API ####
The function 'Job::getStatus' has been called with the following parameter(s):
>> https://emitb1.ics.muni.cz:9000/0GK5XFRiKmSEs8jBY_2Alg
>> 1
#### End Debug ####

**** Error: API_NATIVE_ERROR ****  
Error while calling the "Job:getStatus" native api 
Unable to retrieve the status for: https://emitb1.ics.muni.cz:9000/0GK5XFRiKmSEs8jBY_2Alg
glite.lb.Exception: edg_wll_JobStatus: Operation not permitted: matching jobs found but authorization failed
   at glite::lb::Job::status[./src/Job.cpp:87]




                           *** Log file created ***
Possible Errors and Debug messages have been printed in the following file:
/tmp/glite-wms-job-status_505_7423_1335177198.log

18/04/2012:


[mcecchi@devel15 ~]$ voms-proxy-info --all
subject   : /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi/CN=proxy
issuer    : /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi
identity  : /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi
type      : proxy
strength  : 1024 bits
path      : /tmp/x509up_u505
timeleft  : 46:41:06
key usage : Digital Signature, Key Encipherment, Data Encipherment
=== VO dteam extension information ===
VO        : dteam
subject   : /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi
issuer    : /C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms.hellasgrid.gr
attribute : /dteam/Role=NULL/Capability=NULL
attribute : /dteam/NGI_IT/Role=NULL/Capability=NULL
timeleft  : 22:41:06
uri       : voms.hellasgrid.gr:15004
[mcecchi@devel15 ~]$ 
[mcecchi@devel15 ~]$ glite-wms-job-delegate-proxy -d mcecchi --endpoint https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server

Connecting to the service https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server


================== glite-wms-job-delegate-proxy Success ==================

Your proxy has been successfully delegated to the WMProxy(s):
https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server
with the delegation identifier: mcecchi

==========================================================================

[mcecchi@devel15 ~]$ glite-wms-job-info -d mcecchi --endpoint https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server

Connecting to the service https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server


======================= glite-wms-job-info Success =======================

Your proxy delegated to the endpoint https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server
with delegationID mcecchi: 

Subject     : /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi/CN=proxy/CN=proxy
Issuer      : /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi/CN=proxy
Identity    : /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi/CN=proxy
Type        : proxy
Strength    : 512
StartDate   : 18 Apr 2012 - 14:38:44
Expiration  : 20 Apr 2012 - 13:03:44
Timeleft    : 1 days 22 hours 19 min 12 sec 
=== VO dteam extension information ===
VO          : dteam
Subject     : /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi
Issuer      : /C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms.hellasgrid.gr
URI         : voms.hellasgrid.gr:15004
Attribute   : /dteam/Role=NULL/Capability=NULL
Attribute   : /dteam/NGI_IT/Role=NULL/Capability=NULL
StartTime   : 18 Apr 2012 - 13:04:36
Expiration  : 19 Apr 2012 - 13:04:36
Timeleft    : 22 hours 20 min 04 sec 

==========================================================================


[mcecchi@devel15 ~]$ cat zipped_isb.jdl 
[
Executable = "/bin/echo";
EnableZIppedISB=true;
a=[b=23];
Arguments = "Hello";
StdOutput = "out.log";
StdError = "err.log";
InputSandbox = {"Test.sh"};
OutputSandbox = {"out.log", "err.log"};
requirements = !RegExp("cream.*", other.GlueCEUniqueID);;
AllowZippedISB = true;
rank=a.b*3;
myproxyserver="";
#myproxyserver="myproxy.cnaf.infn.it";
RetryCount = 0;
ShallowRetryCount = -1;
]

[mcecchi@devel15 ~]$ glite-wms-job-submit --version

WMS User Interface version  3.3.3
Copyright (C) 2008 by ElsagDatamat SpA


Connecting to the service https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server


WMProxy Version: 3.3.1

[mcecchi@devel15 ~]$ glite-wms-job-submit -d mcecchi --endpoint https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server zipped_isb.jdl 

Connecting to the service https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA

==========================================================================

[mcecchi@devel15 ~]$ glite-wms-job-status --json https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA

{ "result": "success" , "https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA": { "Current Status": "Done(Success)", "Exit code": "0", "Status Reason": "Job terminated successfully", "Destination": "ceprod03.grid.hep.ph.ic.ac.uk:2119/jobmanager-sge-long", "Submitted": "Wed Apr 18 14:26:51 2012 CEST", "Done": "1334752083"}   }
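The --json output lends itself to scripting; a minimal sketch that extracts the job state from a saved sample of the output (the embedded JSON is abridged from the transcript above):

```shell
# Parse glite-wms-job-status --json output and print "Current Status".
# The sample is a saved, abridged copy of the output shown above.
status_json='{ "result": "success", "https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA": { "Current Status": "Done(Success)", "Exit code": "0" } }'
echo "$status_json" | python3 -c '
import json, sys
doc = json.load(sys.stdin)
# every key except "result" is a job identifier
jobid = next(k for k in doc if k != "result")
print(doc[jobid]["Current Status"])
'
```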
[mcecchi@devel15 ~]$ glite-wms-job-status https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA


======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA
Current Status:     Done(Success)
Exit code:          0
Status Reason:      Job terminated successfully
Destination:        ceprod03.grid.hep.ph.ic.ac.uk:2119/jobmanager-sge-long
Submitted:          Wed Apr 18 14:26:51 2012 CEST
==========================================================================
[mcecchi@devel15 ~]$ glite-wms-job-logging-info https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA

===================== glite-wms-job-logging-info Success =====================

LOGGING INFORMATION:

Printing info for the Job : https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA
 
   ---
Event: RegJob
- Source                     =    NetworkServer
- Timestamp                  =    Wed Apr 18 14:26:51 2012 CEST
   ---
Event: Accepted
- Source                     =    NetworkServer
- Timestamp                  =    Wed Apr 18 14:26:52 2012 CEST
   ---
Event: EnQueued
- Result                     =    START
- Source                     =    NetworkServer
- Timestamp                  =    Wed Apr 18 14:26:52 2012 CEST
   ---
Event: EnQueued
- Result                     =    OK
- Source                     =    NetworkServer
- Timestamp                  =    Wed Apr 18 14:26:52 2012 CEST
   ---
Event: DeQueued
- Source                     =    WorkloadManager
- Timestamp                  =    Wed Apr 18 14:26:53 2012 CEST
   ---
Event: Match
- Dest id                    =    ceprod03.grid.hep.ph.ic.ac.uk:2119/jobmanager-sge-long
- Source                     =    WorkloadManager
- Timestamp                  =    Wed Apr 18 14:26:53 2012 CEST
   ---
Event: UserTag
- Source                     =    WorkloadManager
- Timestamp                  =    Wed Apr 18 14:26:53 2012 CEST
   ---
Event: EnQueued
- Result                     =    START
- Source                     =    WorkloadManager
- Timestamp                  =    Wed Apr 18 14:26:53 2012 CEST
   ---
Event: EnQueued
- Result                     =    OK
- Source                     =    WorkloadManager
- Timestamp                  =    Wed Apr 18 14:26:53 2012 CEST
   ---
Event: DeQueued
- Source                     =    JobController
- Timestamp                  =    Wed Apr 18 14:26:54 2012 CEST
   ---
Event: Transfer
- Destination                =    LogMonitor
- Result                     =    START
- Source                     =    JobController
- Timestamp                  =    Wed Apr 18 14:26:54 2012 CEST
   ---
Event: Transfer
- Destination                =    LogMonitor
- Result                     =    OK
- Source                     =    JobController
- Timestamp                  =    Wed Apr 18 14:26:54 2012 CEST
   ---
Event: Accepted
- Source                     =    LogMonitor
- Timestamp                  =    Wed Apr 18 14:26:57 2012 CEST
   ---
Event: Transfer
- Destination                =    LRMS
- Result                     =    OK
- Source                     =    LogMonitor
- Timestamp                  =    Wed Apr 18 14:27:15 2012 CEST
   ---
Event: Running
- Source                     =    LogMonitor
- Timestamp                  =    Wed Apr 18 14:27:39 2012 CEST
   ---
Event: ReallyRunning
- Source                     =    LogMonitor
- Timestamp                  =    Wed Apr 18 14:28:03 2012 CEST
   ---
Event: Done
- Source                     =    LogMonitor
- Timestamp                  =    Wed Apr 18 14:28:03 2012 CEST
==========================================================================

[mcecchi@devel15 ~]$ glite-wms-job-output https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA

Connecting to the service https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server


================================================================================

         JOB GET OUTPUT OUTCOME

Output sandbox files for the job:
https://wms013.cnaf.infn.it:9000/hEQOXd73TkRGYlkCQDxOmA
have been successfully retrieved and stored in the directory:
/tmp/jobOutput/mcecchi_hEQOXd73TkRGYlkCQDxOmA

================================================================================


[mcecchi@devel15 ~]$ cat /tmp/jobOutput/mcecchi_hEQOXd73TkRGYlkCQDxOmA/
err.log  out.log  
[mcecchi@devel15 ~]$ cat /tmp/jobOutput/mcecchi_hEQOXd73TkRGYlkCQDxOmA/out.log 
Hello

[mcecchi@devel15 ~]$ glite-wms-job-list-match -a --endpoint https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server ls.jdl |wc -l
179

[mcecchi@devel15 ~]$ glite-wms-job-submit -a --endpoint https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server coll_1.jdl 

Connecting to the service https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms013.cnaf.infn.it:9000/ZPQkLewFTjI0eaJTwsFwJw

==========================================================================


[mcecchi@devel15 ~]$ glite-wms-job-submit -a --endpoint https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server coll_1.jdl 

[mcecchi@devel15 ~]$ glite-wms-job-status https://wms013.cnaf.infn.it:9000/ZPQkLewFTjI0eaJTwsFwJw


======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms013.cnaf.infn.it:9000/ZPQkLewFTjI0eaJTwsFwJw
Current Status:     Waiting
Submitted:          Wed Apr 18 17:01:56 2012 CEST
==========================================================================

- Nodes information for: 
    Status info for the Job : https://wms013.cnaf.infn.it:9000/CS3gAQD4M3clf3DFdMpOew
    Current Status:     Scheduled
    Status Reason:      unavailable
    Destination:        cccreamceli08.in2p3.fr:8443/cream-sge-long
    Submitted:          Wed Apr 18 17:01:56 2012 CEST
==========================================================================
    

[mcecchi@devel15 ~]$ cat perusal.jdl 
Executable = "testperusal.sh";
StdOutput = "stdout";
StdError = "stderr";
InputSandbox = {"testperusal.sh"};
OutputSandbox = {"stdout", "stderr", "test"};
PerusalTimeInterval = 15;
PerusalFileEnable = true;
Requirements = true;


[mcecchi@devel15 ~]$ cat testperusal.sh 
#!/bin/sh
for i in `seq 1 100`; do
   sleep 10
   echo prova >> test
   err
done

[mcecchi@devel15 ~]$ glite-wms-job-perusal --get -f test --dir . https://wms013.cnaf.infn.it:9000/orrIQnq8aek3-jHFRxn5kA

Connecting to the service https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-perusal Success ======================

No files to be retrieved for the job:
https://wms013.cnaf.infn.it:9000/orrIQnq8aek3-jHFRxn5kA

==========================================================================


[mcecchi@devel15 ~]$ date
Thu Apr 19 10:36:48 CEST 2012
[mcecchi@devel15 ~]$ date
Thu Apr 19 10:53:39 CEST 2012
[mcecchi@devel15 ~]$ glite-wms-job-perusal --get -f test --dir . https://wms013.cnaf.infn.it:9000/orrIQnq8aek3-jHFRxn5kA

Connecting to the service https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-perusal Success ======================

The retrieved files have been successfully stored in:
/home/mcecchi

==========================================================================


--------------------------------------------------------------------------
file 1/1: test-20120419095258_1-20120419095258_1
--------------------------------------------------------------------------

SERVER BASIC FUNCTIONALITY TESTS

- Submit single jobs PASS

Both to CREAM and LCG_CE:

[mcecchi@ui ~]$ glite-wms-job-submit -a --endpoint https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server ls_cream.jdl 

Connecting to the service https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://devel09.cnaf.infn.it:9000/U54BPTWXBRXQew-LT178cA

==========================================================================

[mcecchi@ui ~]$ glite-wms-job-status https://devel09.cnaf.infn.it:9000/U54BPTWXBRXQew-LT178cA 


======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel09.cnaf.infn.it:9000/U54BPTWXBRXQew-LT178cA
Current Status:     Done (Success)
Logged Reason(s):
    - Transfer to CREAM failed due to exception: Failed to create a delegation id for job https://devel09.cnaf.infn.it:9000/U54BPTWXBRXQew-LT178cA: reason is Received NULL fault; the error is due to another cause: FaultString=[] - FaultCode=[SOAP-ENV:Server.generalException] - FaultSubCode=[SOAP-ENV:Server.generalException] - FaultDetail=[<faultData><ns1:MethodName xmlns:ns1="http://glite.org/2007/11/ce/cream/types">invoke</ns1:MethodName><ns2:Timestamp xmlns:ns2="http://glite.org/2007/11/ce/cream/types">2012-03-23T12:22:47.135Z</ns2:Timestamp><ns3:ErrorCode xmlns:ns3="http://glite.org/2007/11/ce/cream/types">0</ns3:ErrorCode><ns4:Description xmlns:ns4="http://glite.org/2007/11/ce/cream/types">User CN=Marco Cecchi,L=CNAF,OU=Personal Certificate,O=INFN,C=IT not authorized for operation {http://www.gridsite.org/namespaces/delegation-2}getProxyReq</ns4:Description><ns5:FaultCause xmlns:ns5="http://glite.org/2007/11/ce/cream/types">User CN=Marco Cecchi,L=CNAF,OU=Personal Certificate,O=INFN,C=IT not authorized for operation {http://www.gridsite.org/namespaces/delegation-2}getProxyReq</ns5:FaultCause></faultData>]
    - job completed
    - Job Terminated Successfully
Exit code:          0
Status Reason:      Job Terminated Successfully
Destination:        ce206.cern.ch:8443/cream-lsf-grid_2nh_dteam
Submitted:          Fri Mar 23 13:22:34 2012 CET
==========================================================================


[mcecchi@ui ~]$ glite-wms-job-submit -a --endpoint https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server ls_jc.jdl 

Connecting to the service https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://devel09.cnaf.infn.it:9000/9YCOglBRZ2OH9QyFGjIrwQ

==========================================================================


[mcecchi@ui ~]$ glite-wms-job-status https://devel09.cnaf.infn.it:9000/9YCOglBRZ2OH9QyFGjIrwQ
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel09.cnaf.infn.it:9000/9YCOglBRZ2OH9QyFGjIrwQ
Current Status:     Done (Success)
Exit code:          0
Status Reason:      Job terminated successfully
Destination:        egee.irb.hr:2119/jobmanager-lcgpbs-mon
Submitted:          Fri Mar 23 13:13:58 2012 CET
==========================================================================

[mcecchi@devel15 ~]$ glite-wms-job-cancel https://wms013.cnaf.infn.it:9000/3ZIiK2Blh5T428odOzes6A

Are you sure you want to remove specified job(s) [y/n]y : y

Connecting to the service https://wms013.cnaf.infn.it:7443/glite_wms_wmproxy_server


============================= glite-wms-job-cancel Success =============================

The cancellation request has been successfully submitted for the following job(s):

- https://wms013.cnaf.infn.it:9000/3ZIiK2Blh5T428odOzes6A

========================================================================================

[mcecchi@devel15 ~]$ glite-wms-job-status https://wms013.cnaf.infn.it:9000/3ZIiK2Blh5T428odOzes6A


======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms013.cnaf.infn.it:9000/3ZIiK2Blh5T428odOzes6A
Current Status:     Cancelled
Logged Reason(s):
    - Aborted by user
Destination:        spacina-ce.scope.unina.it:2119/jobmanager-lcgpbs-cert
Submitted:          Wed Apr 18 14:38:59 2012 CEST
==========================================================================

- Resubmission PASS

18 May, 16:35:47 -I: [Info] operator()(/home/mcecchi/wms34/emi.wms.wms-manager/src/dispatcher_utils.cpp:227): new jobresubmit for https://devel07.cnaf.infn.it:9000/53qnhHmZ4TxQdrwtPUBgeA
18 May, 16:35:47 -D: [Debug] schedule_at(/home/mcecchi/wms34/emi.wms.wms-manager/src/events.cpp:156): timed event scheduled at 1337351748 with priority 20
18 May, 16:35:47 -D: [Debug] operator()(/home/mcecchi/wms34/emi.wms.wms-manager/src/submit_request.cpp:280): considering (re)submit of https://devel07.cnaf.infn.it:9000/53qnhHmZ4TxQdrwtPUBgeA
18 May, 16:35:47 -D: [Debug] operator()(/home/mcecchi/wms34/emi.wms.wms-manager/src/submit_request.cpp:672): found token number 0 for job https://devel07.cnaf.infn.it:9000/53qnhHmZ4TxQdrwtPUBgeA
18 May, 16:35:47 -I: [Info] checkRequirement(/home/mcecchi/wms34/emi.wms.wms-matchmaking/src/matchmakerISMImpl.cpp:105): MM for job: https://devel07.cnaf.infn.it:9000/53qnhHmZ4TxQdrwtPUBgeA (845/1145 [0] )
18 May, 16:35:47 -I: [Info] operator()(/home/mcecchi/wms34/emi.wms.wms-manager/src/submit_request.cpp:773): https://devel07.cnaf.infn.it:9000/53qnhHmZ4TxQdrwtPUBgeA delivered

- Submit a collection PASS

[mcecchi@ui ~]$ cat coll.jdl 
[
   type = "collection";
   VirtualOrganisation = "dteam";
   nodes = {
      [file ="ls.jdl";],
    [file ="ls.jdl";],
    [file ="ls.jdl";],
    [file ="ls.jdl";],
    [file ="ls.jdl";],
    [file ="ls.jdl";]
  };
] 
[mcecchi@ui ~]$ glite-wms-job-submit -a --endpoint https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server coll.jdl 

Connecting to the service https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://devel09.cnaf.infn.it:9000/X7Es4Uwvw2NfKWVrHYpnmA

==========================================================================

======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel09.cnaf.infn.it:9000/X7Es4Uwvw2NfKWVrHYpnmA
Current Status:     Running 
Submitted:          Fri Mar 23 13:32:12 2012 CET
==========================================================================

- Nodes information for: 
    Status info for the Job : https://devel09.cnaf.infn.it:9000/CbMUUPjOSvy6HEv5O5Ef6A
    Current Status:     Running 
    Status Reason:      Job successfully submitted to Globus
    Destination:        atlasce01.na.infn.it:2119/jobmanager-lcgpbs-cert
    Submitted:          Fri Mar 23 13:32:12 2012 CET
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/F2YEUJYYQKUF0kMY5htIAQ
    Current Status:     Running 
    Status Reason:      Job successfully submitted to Globus
    Destination:        ce-atlas.ipb.ac.rs:2119/jobmanager-pbs-dteam
    Submitted:          Fri Mar 23 13:32:12 2012 CET
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/FNvTOawm0gZVI68_PzlayQ
    Current Status:     Running 
    Status Reason:      Job successfully submitted to Globus
    Destination:        ce01.dur.scotgrid.ac.uk:2119/jobmanager-lcgpbs-q2d
    Submitted:          Fri Mar 23 13:32:12 2012 CET
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/Hka_ksPeEcS6zMvydbqMbw
    Current Status:     Running 
    Status Reason:      Job successfully submitted to Globus
    Destination:        ce-grid.obspm.fr:2119/jobmanager-pbs-dteam
    Submitted:          Fri Mar 23 13:32:12 2012 CET
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/M8t3s7nJCFJkztWElfFozA
    Current Status:     Running 
    Status Reason:      Job successfully submitted to Globus
    Destination:        egee.irb.hr:2119/jobmanager-lcgpbs-mon
    Submitted:          Fri Mar 23 13:32:12 2012 CET
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/yf13CQ8UvIUJzgIkI5t-kQ
    Current Status:     Running 
    Status Reason:      Job successfully submitted to Globus
    Destination:        ce-enmr.chemie.uni-frankfurt.de:2119/jobmanager-lcgpbs-long
    Submitted:          Fri Mar 23 13:32:12 2012 CET
==========================================================================

- Submit DAG jobs: IMPORTANT PASS

[
  type = "dag";
  VirtualOrganisation = "dteam";
  nodes = [
    nodeA = [file = "ls.jdl";];
    nodeB = [file = "ls.jdl";];
    Dependencies = { {nodeA, nodeB} }
  ];
]

Dependency fulfilled:

======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel07.cnaf.infn.it:9000/mAf1L_q5stx4YlYnzt4Dtw
Current Status:     Running 
Destination:        dagman
Submitted:          Fri May 18 23:54:54 2012 CEST
==========================================================================

- Nodes information for: 
    Status info for the Job : https://devel07.cnaf.infn.it:9000/q2dBjsinbryF_H9R7glxEw
    Current Status:     Scheduled 
    Status Reason:      unavailable
    Destination:        phoebe.htc.biggrid.nl:8443/cream-pbs-medium
    Submitted:          Fri May 18 23:54:54 2012 CEST
==========================================================================
    Status info for the Job : https://devel07.cnaf.infn.it:9000/tGmm0h0EA0chivy52RSINg
    Current Status:     Submitted 
    Submitted:          Fri May 18 23:54:54 2012 CEST
==========================================================================

- Status of a DAG job PASS. Also, DAG jobs now correctly end up Aborted when one node aborts!

======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel07.cnaf.infn.it:9000/ilhOCkYZ3mXIlMf4pSTjHg
Current Status:     Aborted 
Status Reason:      DAG completed with failed jobs
Destination:        dagman
Submitted:          Sat May 19 00:07:53 2012 CEST
==========================================================================

- Nodes information for: 
    Status info for the Job : https://devel07.cnaf.infn.it:9000/5-m_QV3ELj6VI6_5gSzl2A
    Current Status:     Aborted 
    Status Reason:      parents have aborted
    Submitted:          Sat May 19 00:07:53 2012 CEST
==========================================================================
    
    Status info for the Job : https://devel07.cnaf.infn.it:9000/b_0fxArwL63pzeoqxLn98Q
    Current Status:     Aborted 
    Logged Reason(s):
        - Cannot move ISB (retry_copy ${globus_transfer_cmd} gsiftp://devel09.cnaf.infn.it:2811/var/SandboxDir/b_/https_3a_2f_2fdevel07.cnaf.infn.it_3a9000_2fb_5f0fxArwL63pzeoqxLn98Q/input/Test.sh file:///pool/4644914.lcgbatch02.gridpp.rl.ac.uk/CREAM092148144/Test.sh): 
error: globus_ftp_client: the server responded with an error
500 500-Command failed. : globus_l_gfs_file_open failed.
500-globus_xio: Unable to open file /var/SandboxDir/b_/https_3a_2f_2fdevel07.cnaf.infn.it_3a9000_2fb_5f0fxArwL63pzeoqxLn98Q/input/Test.sh
500-globus_xio: System error in open: Permission denied
500-globus_xio: A system call failed: Permission denied
500 End.
        - pbs_reason=1
    Status Reason:       failed (LB query failed)
    Destination:        lcgce09.gridpp.rl.ac.uk:8443/cream-pbs-grid6000M
    Submitted:          Sat May 19 00:07:53 2012 CEST
==========================================================================

- Cancel PASS (overall status is Aborted because some nodes were aborted)

======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel09.cnaf.infn.it:9000/2ELy83Xoo99hGI-CHwRwMw
Current Status:     Aborted 
Logged Reason(s):
    - Aborted by user
Status Reason:      X509 proxy not found or I/O error (/var/SandboxDir/2E/https_3a_2f_2fdevel09.cnaf.infn.it_3a9000_2f2ELy83Xoo99hGI-CHwRwMw/user.proxy)
Destination:        dagman
Submitted:          Tue May 22 10:52:58 2012 CEST
==========================================================================

- Nodes information for: 
    Status info for the Job : https://devel09.cnaf.infn.it:9000/AGrC529w-NEfkr3A1P5XgA
    Current Status:     Cancelled 
    Logged Reason(s):
        - Cancelled by user
    Destination:        creamce01.ge.infn.it:8443/cream-lsf-cert
    Submitted:          Tue May 22 10:52:58 2012 CEST
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/fEyWrMB0N8nT55iHH2zpDg
    Current Status:     Cancelled 
    Submitted:          Tue May 22 10:52:58 2012 CEST
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/k19fxZAabsvBWnP_P1RUsA
    Current Status:     Aborted 
    Logged Reason(s):
        - Job got an error while in the CondorG queue.
    Status Reason:       failed (LB query failed)
    Destination:        ce3.itep.ru:2119/jobmanager-lcgpbs-dteam
    Submitted:          Tue May 22 10:52:58 2012 CEST
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/kShzxpqr91qHCXCmHIeKgg
    Current Status:     Aborted 
    Logged Reason(s):
        - Transfer to CREAM failed due to exception: CREAM Register raised std::exception N5glite2ce16cream_client_api16cream_exceptions30JobSubmissionDisabledExceptionE
    Status Reason:       failed (LB query failed)
    Destination:        cccreamceli07.in2p3.fr:8443/cream-sge-medium
    Submitted:          Tue May 22 10:52:58 2012 CEST
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/spKsWwVHtshg3IMYN5Y5Wg
    Current Status:     Cancelled 
    Logged Reason(s):
        - Cancelled by user
    Destination:        egice.polito.it:8443/cream-pbs-cert
    Submitted:          Tue May 22 10:52:58 2012 CEST
==========================================================================
    
    Status info for the Job : https://devel09.cnaf.infn.it:9000/tRcMUgad4ww1d-YjoFXqdQ
    Current Status:     Cancelled 
    Submitted:          Tue May 22 10:52:58 2012 CEST
==========================================================================

- Job with access to catalogues (mcecchi 25/5/12)

[mcecchi@ui ~]$ cat catalogue_access.jdl 
[
DataAccessProtocol = "gsiftp";
RetryCount = 1;
ShallowRetryCount = 2; 
Executable = "/bin/echo";
Arguments = "1000";
StdOutput = "std.out";
StdError = "std.err";
FuzzyRank = true;
InputSandbox = {"calc-pi.sh", "fileA", "prologue.sh"};
OutputSandbox = {"std.out", "std.err","out-PI.txt","out-e.txt"};
requirements = true;
DataRequirements = {
[
DataCatalogType = "DLI";
DataCatalog ="http://lfc.gridpp.rl.ac.uk:8085/"; 
InputData = { "lfn:/grid/t2k.org/nd280/raw/ND280/ND280/00005000_00005999/nd280_00005000_0002.daq.mid.gz" };
]
};
];

[mcecchi@ui ~]$ glite-wms-job-status https://devel09.cnaf.infn.it:9000/mB4YVNRzrseG_cRc-V8Ncw


======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel09.cnaf.infn.it:9000/mB4YVNRzrseG_cRc-V8Ncw
Current Status:     Running 
Status Reason:      unavailable
Destination:        ceprod07.grid.hep.ph.ic.ac.uk:8443/cream-sge-grid.q
Submitted:          Fri May 25 16:48:50 2012 CEST
==========================================================================

- Enable GLUE2 purchasers and check an ISM dump and MM

- Enable Argus authZ and check results FAIL

In site_info.def USE_ARGUS=yes ARGUS_PEPD_ENDPOINTS="https://argus01.lcg.cscs.ch:8154/authz https://argus02.lcg.cscs.ch:8154/authz https://argus03.lcg.cscs.ch:8154/authz"

22/03/2012 First problem: FIXED

argus-gsi-pep-callout was missing from the MP, causing an error while using gridftp

22/05/2012 New test:


22 May 2012, 16:38:56 -I- PID: 30454 (Debug) - Calling the WMProxy jobRegister service

Warning - Unable to register the job to the service: https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server Argus denied authorization on jobRegister by ,C=IT,O=INFN,OU=Personal Certificate,L=CNAF,CN=Marco Cecchi Error code: SOAP-ENV:Server

PASS

TODO: try with a policy that lets us pass

BUGS:

some sensible information should be logged on syslog (bug #92657, gLite Middleware, ) PRE-CERTIFIED (mcecchi 29/03/12)

May 18 17:19:07 devel09 glite_wms_wmproxy_server: submission from ui.cnaf.infn.it, DN=/C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Marco Cecchi, FQAN=/dteam/Role=NULL/Capability=NULL, userid=18264 for jobid=https://devel07.cnaf.infn.it:9000/_K-LYpekDA1xk9sisW5hBA

May 18 17:19:36 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/D6jb8a1zk8kfwLhm2PxAIw, destination=cmsrm-cream01.roma1.infn.it:8443/cream-lsf-cmsgcert
May 18 17:19:36 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/mlMh39hZcZSHzLozxmSUXQ, destination=ce203.cern.ch:8443/cream-lsf-grid_2nh_dteam
May 18 17:19:36 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/ToPUYBYlHPX8ucct9GMxJA, destination=ce207.cern.ch:8443/cream-lsf-grid_2nh_dteam
May 18 17:19:36 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/eqK70rofnD_ssOJVVGzZFg, destination=ce04.esc.qmul.ac.uk:8443/cream-sge-lcg_long
May 18 17:19:36 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/RWbT_HnCv_xhVIgxOz6Zew, destination=atlasce02.scope.unina.it:8443/cream-pbs-egeecert
May 18 17:19:37 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/Uen9xJpMD_muv4DgUa7XHg, destination=ce01.eela.if.ufrj.br:8443/cream-pbs-dteam
May 18 17:19:37 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/surSO68FUvkoMMJ5vwnslw, destination=ce-cr-02.ts.infn.it:8443/cream-lsf-cert
May 18 17:19:37 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/K6OlQy5IR0a_bs3jyIcASQ, destination=ce-grisbi.cbib.u-bordeaux2.fr:8443/cream-pbs-dteam
May 18 17:19:37 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/rQzNsXNGtVRtKjt0hW-n7w, destination=ce0.m3pec.u-bordeaux1.fr:8443/cream-pbs-dteam
May 18 17:19:37 devel09 glite-wms-workload_manager: jobid=https://devel07.cnaf.infn.it:9000/PVm7kNQu_h7CzrsFhGbIrQ, destination=cccreamceli06.in2p3.fr:8443/cream-sge-long

WMS UI emi-wmproxy-api-cpp and emi-wms-ui-api-python still use gethostbyaddr/gethostbyname (bug #89668, gLite Middleware, ) PRE-CERTIFIED (alvise 28/03/12)

grep on source code

Submission with rfc proxy doesn't work (bug #88128, gLite Middleware, ) PRE-CERTIFIED (mcecchi 27/03/12)

authentication fails at the LCG-CE

27 Mar, 16:01:17 -I- EventGlobusSubmitFailed::process_event(): Got globus submit failed event.
27 Mar, 16:01:17 -I- EventGlobusSubmitFailed::process_event(): For cluster: 1363, reason: 7 authentication failed: GSS Major Status: Authentication Failed GSS Minor Status Error Chain: init.c:499: globus_gss_assist_init_sec_context_async: Error during context initialization init_sec_contex
27 Mar, 16:01:17 -I- EventGlobusSubmitFailed::process_event(): Job id = https://devel09.cnaf.infn.it:9000/kDLx5tj_Nxpt2ewI35AARQ
27 Mar, 16:01:17 -I- SubmitReader::internalRead(): Reading condor submit file of job https://devel09.cnaf.infn.it:9000/kDLx5tj_Nxpt2ewI35AARQ

EMI WMS wmproxy init.d script stop/start problems (bug #89577, gLite Middleware, ) PRE-CERTIFIED (mcecchi 23/03/12)

1. The restart command does not restart httpd, whereas stop + start does:

[root@devel09 ~]# /etc/init.d/glite-wms-wmproxy status
/usr/bin/glite_wms_wmproxy_server is running...
[root@devel09 ~]# ps aux | grep httpd
glite    11440  0.0  0.0  96440  2196 ?        S    11:31   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    11441  0.0  0.0  96440  2480 ?        S    11:31   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    11442  0.0  0.0  96440  2480 ?        S    11:31   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    11443  0.0  0.0  96440  2480 ?        S    11:31   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    11444  0.0  0.0  96440  2480 ?        S    11:31   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    11445  0.0  0.0  96440  2480 ?        S    11:31   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
root     15789  0.0  0.0  61192   740 pts/0    R+   11:52   0:00 grep httpd
root     24818  0.0  0.1  96440  4716 ?        Ss   Mar19   0:04 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
[root@devel09 ~]# /etc/init.d/glite-wms-wmproxy restart
Restarting /usr/bin/glite_wms_wmproxy_server... ok
[root@devel09 ~]# ps aux | grep httpd
glite    15889  0.0  0.0  96440  2196 ?        S    11:52   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    15890  0.0  0.0  96440  2488 ?        S    11:52   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    15891  0.0  0.0  96440  2480 ?        S    11:52   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    15892  0.0  0.0  96440  2480 ?        S    11:52   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    15893  0.0  0.0  96440  2480 ?        S    11:52   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
glite    15894  0.0  0.0  96440  2480 ?        S    11:52   0:00 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf
root     15897  0.0  0.0  61196   772 pts/0    S+   11:53   0:00 grep httpd
root     24818  0.0  0.1  96440  4716 ?        Ss   Mar19   0:04 /usr/sbin/httpd -k start -f /etc/glite-wms/glite_wms_wmproxy_httpd.conf

2. A start immediately following a stop often fails and has to be repeated to get the service working again:

[root@devel09 ~]# /etc/init.d/glite-wms-wmproxy stop; /etc/init.d/glite-wms-wmproxy start
Stopping /usr/bin/glite_wms_wmproxy_server... ok
Starting /usr/bin/glite_wms_wmproxy_server... ok

3. The stop and start commands fail when invoked via ssh ([Sun Oct 09 17:08:01 2011] [warn] PassEnv variable HOSTNAME was undefined):

[mcecchi@cnaf ~]$ ssh root@devel09 '/etc/init.d/glite-wms-wmproxy stop;/etc/init.d/glite-wms-wmproxy start'
root@devel09's password: 
Stopping /usr/bin/glite_wms_wmproxy_server... ok
Starting /usr/bin/glite_wms_wmproxy_server... ok

Make some WMS init scripts System V compatible (bug #91115, gLite Middleware, ) PRE-CERTIFIED (mcecchi 23/03/12)

[root@devel09 ~]# grep -1 chkconfig /etc/init.d/glite-wms-ice 

# chkconfig: 345 95 06
# description: startup script for the ICE process
[root@devel09 ~]# grep -1 chkconfig /etc/init.d/glite-wms-wm

# chkconfig: 345 94 06 
# description: WMS processing engine
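Headers like the ones above are what chkconfig reads when registering a service. A minimal sketch of extracting the runlevels and start/stop priorities from such a header (the sample script below is synthetic, mirroring the glite-wms-ice header):

```shell
# Sketch: parse runlevels and start/stop priorities from a SysV
# chkconfig header; the sample file is created here, not read from /etc.
script=$(mktemp)
printf '#!/bin/sh\n# chkconfig: 345 95 06\n# description: startup script for the ICE process\n' > "$script"
awk -F': ' '/^# chkconfig:/ { print $2 }' "$script"
rm -f "$script"
```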

Semi-automated service backends configuration for WMS (task #23845, EMI Development Tracker, Done) PRE-CERTIFIED (mcecchi 23/03/12)

[root@devel09 ~]# cat /etc/my.cnf
[mysqld]
innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size=500M
!includedir /etc/mysql/conf.d/

innodb_flush_log_at_trx_commit=2 and innodb_buffer_pool_size=500M are the settings expected to be present.
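A quick self-contained check for both expected settings (run here against a temporary copy of the file; on the WMS node one would grep /etc/my.cnf directly):

```shell
# Sketch: count how many of the two expected InnoDB settings are present.
# A temp copy of my.cnf is used so the check is self-contained.
cnf=$(mktemp)
printf '[mysqld]\ninnodb_flush_log_at_trx_commit=2\ninnodb_buffer_pool_size=500M\n' > "$cnf"
grep -c -E 'innodb_(flush_log_at_trx_commit=2|buffer_pool_size=500M)' "$cnf"
rm -f "$cnf"
```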

WMproxy GACLs do not support wildcards (as they used to do) (bug #87261, gLite Middleware, ) PRE-CERTIFIED (mcecchi 23/03/12)

with:

<gacl version="0.0.1">
  <entry> <voms> <fqan>/dtea*</fqan></voms> <allow> <exec/> </allow> </entry>
</gacl>

GRANT

<gacl version="0.0.1">
  <entry> <voms> <fqan>/dteaM*</fqan></voms> <allow> <exec/> </allow> </entry>
</gacl>

DENY

<gacl version="0.0.1">
  <entry> <voms> <fqan>/dteam</fqan></voms> <allow> <exec/> </allow> </entry>
</gacl>

DENY

<gacl version="0.0.1">
  <entry> <voms> <fqan>/dteam</fqan></voms> <allow> <exec/> </allow> </entry>
  <entry> <voms> <fqan>/dteam/*</fqan></voms> <allow> <exec/> </allow> </entry>
</gacl>

GRANT
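One plausible reading of these results, sketched with shell glob matching (this is NOT the real GACL code; the FQAN value below is an assumption, namely that the credential carries Role/Capability components after /dteam, which would explain why the bare /dteam entry alone denies):

```shell
# Illustrative sketch only: glob-match an assumed full FQAN against the
# wildcard patterns exercised in the GACL tests above.
fqan="/dteam/Role=NULL/Capability=NULL"
for pattern in '/dtea*' '/dteaM*' '/dteam' '/dteam/*'; do
    case "$fqan" in
        $pattern) echo "GRANT $pattern" ;;
        *)        echo "DENY  $pattern" ;;
    esac
done
```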

WMS logs should keep track of the last 90 days (bug #89871, gLite Middleware, ) PRE-CERTIFIED (mcecchi 22/03/12)

[root@devel09 ~]# grep -r rotate\ 90 /etc/logrotate.d/
/etc/logrotate.d/wm:       rotate 90
/etc/logrotate.d/globus-gridftp:    rotate 90
/etc/logrotate.d/globus-gridftp:    rotate 90
/etc/logrotate.d/lcmaps:       rotate 90
/etc/logrotate.d/lm:       rotate 90
/etc/logrotate.d/jc:       rotate 90
/etc/logrotate.d/glite-wms-purger:       rotate 90
/etc/logrotate.d/wmproxy:       rotate 90
/etc/logrotate.d/argus:       rotate 90
/etc/logrotate.d/ice:       rotate 90
[root@devel09 ~]# grep -r daily /etc/logrotate.d/
/etc/logrotate.d/wm:       daily
/etc/logrotate.d/kill-stale-ftp:    daily
/etc/logrotate.d/globus-gridftp:    daily
/etc/logrotate.d/globus-gridftp:    daily
/etc/logrotate.d/lcmaps:       daily
/etc/logrotate.d/lm:       daily
/etc/logrotate.d/jc:       daily
/etc/logrotate.d/glite-wms-purger:       daily
/etc/logrotate.d/wmproxy:       daily
/etc/logrotate.d/argus:       daily
/etc/logrotate.d/ice:       daily
/etc/logrotate.d/glite-lb-server:   daily
/etc/logrotate.d/bdii:    daily

/etc/logrotate.d/kill-stale-ftp has "rotate 30" but it should be "rotate 90".
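The fix is a one-line edit, sketched here on a temporary copy so the commands are self-contained (the sample stanza content is invented; the real file is /etc/logrotate.d/kill-stale-ftp):

```shell
# Sketch: bump the kill-stale-ftp retention from 30 to 90 rotations.
# A temp copy stands in for /etc/logrotate.d/kill-stale-ftp here.
conf=$(mktemp)
printf 'daily\nrotate 30\n' > "$conf"
sed -i 's/rotate 30/rotate 90/' "$conf"
grep 'rotate' "$conf"
rm -f "$conf"
```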

yaim-wms changes for Argus based authZ (bug #90760, gLite Middleware, ) NOT PRE-CERTIFIED (mcecchi 22/03/12)

Modified siteinfo.def with:

USE_ARGUS=yes

ARGUS_PEPD_ENDPOINTS="https://argus01.lcg.cscs.ch:8154/authz https://argus02.lcg.cscs.ch:8154/authz https://argus03.lcg.cscs.ch:8154/authz"

ran:

/opt/glite/yaim/bin/yaim -c -s siteinfo/site-info.def -n WMS

in glite_wms.conf:

ArgusAuthz = true;

ArgusPepdEndpoints = {"https://argus01.lcg.cscs.ch:8154/authz", "https://argus02.lcg.cscs.ch:8154/authz", "https://argus03.lcg.cscs.ch:8154/authz"};

glite-wms-check-daemons.sh should not restart daemons under the admin's nose (bug #89674, gLite Middleware, ) PRE-CERTIFIED (mcecchi 22/03/12)

[root@devel09 ~]# /etc/init.d/glite-wms-wm start
starting workload manager... ok
[root@devel09 ~]# /etc/init.d/glite-wms-wm status
/usr/bin/glite-wms-workload_manager (pid 486) is running...
[root@devel09 ~]# ll /var/run/glite-wms-w*
-rw-r--r-- 1 root root 4 Mar 22 10:02 /var/run/glite-wms-workload_manager.pid
[root@devel09 ~]# /etc/init.d/glite-wms-wm stop
stopping workload manager... ok
[root@devel09 ~]# date
Thu Mar 22 10:02:37 CET 2012
[root@devel09 ~]# ll /var/run/glite-wms-w*
ls: /var/run/glite-wms-w*: No such file or directory
[root@devel09 ~]# cat /etc/cron.d/glite-wms-check-daemons.cron 
HOME=/
MAILTO=SA3-italia

*/5 * * * * root . /usr/libexec/grid-env.sh ; sh /usr/libexec/glite-wms-check-daemons.sh > /dev/null 2>&1
[root@devel09 ~]#  sh /usr/libexec/glite-wms-check-daemons.sh
[root@devel09 ~]# /etc/init.d/glite-wms-wm status
/usr/bin/glite-wms-workload_manager is not running
[root@devel09 ~]# ps aux | grep workl
root       970  0.0  0.0  61192   756 pts/2    S+   10:03   0:00 grep workl
[root@devel09 ~]# /etc/init.d/glite-wms-wm start
starting workload manager... ok
[root@devel09 ~]# ps aux | grep workl
glite     1009 11.0  0.6 255060 26444 ?        Ss   10:04   0:00 /usr/bin/glite-wms-workload_manager --conf glite_wms.conf --daemon
root      1013  0.0  0.0  61196   764 pts/2    S+   10:04   0:00 grep workl
[root@devel09 ~]# kill -9 1009
[root@devel09 ~]#  sh /usr/libexec/glite-wms-check-daemons.sh
stopping workload manager... ok
starting workload manager... ok
[root@devel09 ~]# /etc/init.d/glite-wms-wm status
/usr/bin/glite-wms-workload_manager (pid 1196) is running...
[root@devel09 ~]# 
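The behaviour verified above is consistent with a check of this shape (a sketch of the assumed logic, not the actual glite-wms-check-daemons.sh code): a daemon is restarted only when its pid file still exists but the process behind it is gone, e.g. after a kill -9; a clean stop removes the pid file, so the cron job leaves an intentionally stopped service alone.

```shell
# Sketch (assumed logic): restart only daemons whose pid file survives
# while the recorded process is dead.
piddir=$(mktemp -d)
echo 99999999 > "$piddir/glite-wms-workload_manager.pid"   # stale pid
for pidfile in "$piddir"/*.pid; do
    pid=$(cat "$pidfile")
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "would restart $(basename "$pidfile" .pid)"
    fi
done
rm -rf "$piddir"
```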

Wrong location for PID file (bug #89857, gLite Middleware, ) PRE-CERTIFIED (mcecchi 21/03/12)

[root@devel09 ~]# ls /var/run/*pid
/var/run/atd.pid            /var/run/crond.pid               /var/run/glite-wms-job_controller.pid    /var/run/gpm.pid        /var/run/klogd.pid       /var/run/ntpd.pid
/var/run/brcm_iscsiuio.pid  /var/run/exim.pid                /var/run/glite-wms-log_monitor.pid       /var/run/haldaemon.pid  /var/run/messagebus.pid  /var/run/sshd.pid
/var/run/condor_master.pid  /var/run/glite-wms-ice-safe.pid  /var/run/glite-wms-workload_manager.pid  /var/run/iscsid.pid     /var/run/nrpe.pid        /var/run/syslogd.pid

For ICE, the pid is now correctly handled:

[root@devel09 ~]# /etc/init.d/glite-wms-ice status
/usr/bin/glite-wms-ice-safe (pid 16820) is running...
[root@devel09 ~]# ll /var/run/glite-wms-ice-safe.pid 
-rw-r--r-- 1 root root 6 Mar 27 14:41 /var/run/glite-wms-ice-safe.pid
[root@devel09 ~]# cat /var/run/glite-wms-ice-safe.pid
16820
[root@devel09 ~]# /etc/init.d/glite-wms-ice stop
stopping ICE... ok
[root@devel09 ~]# /etc/init.d/glite-wms-ice start
starting ICE... ok
[root@devel09 ~]# ll /var/run/glite-wms-ice-safe.pid 
-rw-r--r-- 1 root root 6 Mar 27 14:42 /var/run/glite-wms-ice-safe.pid
[root@devel09 ~]# cat /var/run/glite-wms-ice-safe.pid
16969
[root@devel09 ~]# /etc/init.d/glite-wms-ice status
/usr/bin/glite-wms-ice-safe (pid 16969) is running...
[root@devel09 ~]# /etc/init.d/glite-wms-ice restart
stopping ICE... ok
starting ICE... ok
[root@devel09 ~]# ps -ef|grep ice
root      2942     1  0 Mar26 ?        00:00:00 gpm -m /dev/input/mice -t exps2
glite    17074     1  0 14:43 ?        00:00:00 /usr/bin/glite-wms-ice-safe --conf glite_wms.conf --daemon /tmp/icepid
glite    17080 17074  0 14:43 ?        00:00:00 sh -c /usr/bin/glite-wms-ice --conf glite_wms.conf /var/log/wms/ice_console.log 2>&1
glite    17081 17080  0 14:43 ?        00:00:00 /usr/bin/glite-wms-ice --conf glite_wms.conf /var/log/wms/ice_console.log
root     17114 23643  0 14:43 pts/1    00:00:00 grep ice

The job replanner should be configurable (bug #91941, gLite Middleware, None) PRE-CERTIFIED (mcecchi 21/03/12)

EnableReplanner=true in the WM conf

21 Mar, 16:51:54 -I: [Info] main(/home/condor/execute/dir_2479/userdir/emi.wms.wms-manager/src/main.cpp:468): WM startup completed...
21 Mar, 16:51:54 -I: [Info] operator()(/home/condor/execute/dir_2479/userdir/emi.wms.wms-manager/src/replanner.cpp:288): replanner in action
21 Mar, 16:51:57 -W: [Warning] get_site_name(/home/condor/execute/dir_2479/userdir/emi.wms.wms-ism/src/purchaser/ldap-utils.cpp:162): Cannot find GlueSiteUniqueID assignment.

EnableReplanner=false in the WM conf

21 Mar, 16:54:02 -I: [Info] main(/home/condor/execute/dir_2479/userdir/emi.wms.wms-manager/src/main.cpp:468): WM startup completed...
21 Mar, 16:54:05 -W: [Warning] get_site_name(/home/condor/execute/dir_2479/userdir/emi.wms.wms-ism/src/purchaser/ldap-utils.cpp:162): Cannot find GlueSiteUniqueID assignment.

GlueServiceStatusInfo: ?? (bug #89435, gLite Middleware, ) PRE-CERTIFIED (mcecchi 21/03/12)

[root@devel09 ~]# /var/lib/bdii/gip/provider/glite-info-provider-service-wmproxy-wrapper|grep -i servicestatusinfo 
GlueServiceStatusInfo: /usr/bin/glite_wms_wmproxy_server is running...

WMProxy limiter should log more at info level (bug #72280, gLite Middleware, ) PRE-CERTIFIED (mcecchi 21/03/12).

In wmp conf:

jobRegister  =  "${WMS_LOCATION_SBIN}/glite_wms_wmproxy_load_monitor --oper jobRegister --load1 0 --load5 20 --load15 18 --memusage 99 --diskusage 95 --fdnum 1000 --jdnum 1500 --ftpconn 300";
LogLevel  =  5;

restart wmp, in wmproxy log:

21 Mar, 16:28:52 -S- PID: 875 - "wmputils::doExecv": Child failure, exit code: 256
21 Mar, 16:28:52 -I- PID: 875 - "wmpgsoapoperations::ns1__jobRegister": ------------------------------- Fault description --------------------------------
21 Mar, 16:28:52 -I- PID: 875 - "wmpgsoapoperations::ns1__jobRegister": Method: jobRegister
21 Mar, 16:28:52 -I- PID: 875 - "wmpgsoapoperations::ns1__jobRegister": Code: 1228
21 Mar, 16:28:52 -I- PID: 875 - "wmpgsoapoperations::ns1__jobRegister": Description: System load is too high:
Threshold for Load Average(1 min): 0 => Detected value for Load Average(1 min):  0.19
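The rejection follows from the configured thresholds: with --load1 0, any non-zero 1-minute load average trips the limiter. The comparison can be sketched as follows (assumed semantics, values taken from the log excerpt above):

```shell
# Sketch of the limiter's load check (assumed semantics): compare the
# detected 1-min load average against the configured --load1 threshold.
threshold=0          # from "--load1 0" in the jobRegister command line
load1=0.19           # detected value reported in the log above
if awk -v l="$load1" -v t="$threshold" 'BEGIN { exit !(l > t) }'; then
    echo "System load is too high: Load Average(1 min) $load1 > threshold $threshold"
fi
```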

* EMI WMS wmproxy rpm doesn't set execution permissions as it used to do in gLite (bug #89506) PRE-CERTIFIED (mcecchi 15/5/2012)

after installing the wmproxy rpm and without running yaim:

[root@devel09 ~]# ll /usr/libexec/glite_wms_wmproxy_dirmanager 
-rwsr-xr-x 1 root root 18989 May 15 10:26 /usr/libexec/glite_wms_wmproxy_dirmanager
[root@devel09 ~]# ll /usr/sbin/glite_wms_wmproxy_load_monitor
-rwsr-xr-x 1 root root 22915 May 15 10:26 /usr/sbin/glite_wms_wmproxy_load_monitor

JobController logfile name is misspelled (bug #32611, gLite Middleware, https://savannah.cern.ch/bugs/?32611) PRE-CERTIFIED (alvise)

Verified that in the glite_wms.conf file the log file name was correct:


    [root@devel09 glite-wms]# grep jobcontroller glite_wms.conf
    LogFile  =  "${WMS_LOCATION_LOG}/jobcontroller_events.log";

glite-wms-job-submit doesn't always pick up other WMProxy endpoints if load on WMS is high (bug #40370, gLite Middleware, https://savannah.cern.ch/bugs/?40370) HOPEFULLY FIXED (alvise)

[wms] GlueServiceStatusInfo content is ugly (bug #48068, gLite Middleware, https://savannah.cern.ch/bugs/?48068) PRE-CERTIFIED (alvise).

Verified that the output of the command "/etc/init.d/glite-wms-wmproxy status" is as requested.

[ yaim-wms ] CeForwardParameters should include several more parameters (bug #61315, gLite Middleware, https://savannah.cern.ch/bugs/?61315) PRE-CERTIFIED (alvise):


[root@devel09 ~]# grep CeF /etc/glite-wms/glite_wms.conf 
    CeForwardParameters  =  {"GlueHostMainMemoryVirtualSize","GlueHostMainMemoryRAMSize",
                                               "GlueCEPolicyMaxCPUTime", "GlueCEPolicyMaxObtainableCPUTime", "GlueCEPolicyMaxObtainableWallClockTime", "GlueCEPolicyMaxWallClockTime" };

Files specified with absolute paths shouldn't be used with inputsandboxbaseuri (bug #74832, gLite Middleware, https://savannah.cern.ch/bugs/?74832) PRE-CERTIFIED (alvise).

Verified that the JDL described in the comment is correctly handled by activating debug (--debug); the debug log showed that the file /etc/fstab is correctly staged from the UI node via gsiftp.

There's an un-catched out_of_range exception in the ICE component (bug #75099, gLite Middleware, https://savannah.cern.ch/bugs/?75099 ) PRE-CERTIFIED (alvise)

Tried on my build machine (able to run ICE without the WM) submitting a JDL with an empty "ReallyRunningToken" attribute. ICE didn't crash as before. It is not yet possible to test all-in-one (WMProxy/WM/ICE) because of a problem with LCMAPS.

Too much flexibility in JDL syntax (bug #75802, gLite Middleware, https://savannah.cern.ch/bugs/?75802) PRE-CERTIFIED (alvise)

Verified with --debug how glite-wms-job-submit handles the Environment attribute:

dorigoa@cream-01 11:00:47 ~/emi/wmsui_emi2>grep -i environment jdl2 
environment = "FOO=bar";

dorigoa@cream-01 11:50:02 ~/emi/wmsui_emi2>stage/usr/bin/glite-wms-job-submit --debug -a -c ~/JDLs/WMS/wmp_gridit.conf jdl2
[...]
-----------------------------------------
07 March 2012, 11:50:45 -I- PID: 3397 (Debug) - Registering JDL [ stdoutput = "out3.out"; SignificantAttributes = { "Requirements","Rank" }; DefaultNodeRetryCount = 5; executable = "ssh1.sh"; Type = "job"; Environment = { "FOO=bar" }; AllowZippedISB = false; VirtualOrganisation = "dteam"; JobType = "normal"; DefaultRank =  -other.GlueCEStateEstimatedResponseTime; outputsandbox = { "out3.out","err2.err","fstab","grid-mapfile","groupmapfile","  passwd" }; InputSandbox = { "file:///etc/fstab","grid-mapfile","groupmapfile","gsiftp://cream-38.pd.infn.it/etc/passwd","file:///home/dorigoa/ssh1.sh" }; stderror = "err2.err"; inputsandboxbaseuri = "gsiftp://cream-38.pd.infn.it/etc/grid-security"; rank =  -other.GlueCEStateEstimatedResponseTime; MyProxyServer = "myproxy.cern.ch"; requirements = other.GlueCEStateStatus == "Production" || other.GlueCEStateStatus == "testbedb" ]
[...]

So the mangling  Environment = "FOO=bar"; -> Environment = { "FOO=bar" };  occurs correctly.
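The normalization observed above (a scalar Environment value wrapped into a one-element list) can be sketched as follows (a hypothetical helper, not the UI's actual code):

```python
def normalize_environment(value):
    """Wrap a scalar JDL Environment value into a one-element list,
    mimicking the mangling performed by glite-wms-job-submit."""
    if isinstance(value, str):
        return [value]
    return list(value)

# A scalar attribute gets wrapped, a list is left untouched
print(normalize_environment("FOO=bar"))       # ['FOO=bar']
print(normalize_environment(["A=1", "B=2"]))  # ['A=1', 'B=2']
```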

getaddrinfo() sorts results according to RFC3484, but random ordering is lost (bug #82779, gLite Middleware, https://savannah.cern.ch/bugs/?82779) PRE-CERTIFIED (alvise).

This was not a deep test: a deep test would need an alias pointing to at least 3 or 4 different WM nodes, while the alias provided on the bug's Savannah page resolves to only 2 physical hosts. I observed that both hosts are chosen by the UI when submitting several jobs. I ran this test from my EMI2 WMSUI work area, as I do not have any WMS UI EMI2 machine to try it on.
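The substance of the fix on the UI side can be sketched as follows: since glibc's getaddrinfo() returns addresses in the deterministic RFC 3484 order, the client re-shuffles the candidate list before choosing an endpoint. A minimal, hypothetical sketch (not the actual WMS UI code; host names invented):

```python
import random

def pick_endpoints(resolved_hosts):
    """Shuffle the hosts an alias resolves to, restoring the random
    ordering that RFC 3484 sorting removes from getaddrinfo() results."""
    hosts = list(resolved_hosts)
    random.shuffle(hosts)
    return hosts

# With a round-robin alias over two hosts, repeated submissions
# should hit both over time
print(pick_endpoints(["wms-a.cnaf.infn.it", "wms-b.cnaf.infn.it"]))
```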

glite-wms-job-status needs a json-compliant format (bug #82995, gLite Middleware, https://savannah.cern.ch/bugs/?82995) PRE-CERTIFIED (alvise) from my WMSUI EMI2 workarea:


dorigoa@lxgrid05 14:20:13 ~/emi/wmsui_emi2>stage/usr/bin/glite-wms-job-status --json https://wms014.cnaf.infn.it:9000/pVQojatZbyoj_Pyab66_dw

{ "result": "success" , "https://wms014.cnaf.infn.it:9000/pVQojatZbyoj_Pyab66_dw": { "Current Status": "Done(Success)", "Logged Reason": {"0": "job completed","1": "Job Terminated Successfully"}, "Exit code": "0", "Status Reason": "Job Terminated Successfully", "Destination": "grive02.ibcp.fr:8443/cream-pbs-dteam", "Submitted": "Mon Mar 12 14:11:55 2012 CET", "Done": "1331558020"}   }
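Since the --json output shown above is plain JSON, it can be consumed directly by any JSON parser; a minimal sketch (values copied from the output above, shortened):

```python
import json

raw = '''{ "result": "success",
  "https://wms014.cnaf.infn.it:9000/pVQojatZbyoj_Pyab66_dw": {
     "Current Status": "Done(Success)", "Exit code": "0",
     "Destination": "grive02.ibcp.fr:8443/cream-pbs-dteam" } }'''

status = json.loads(raw)
assert status["result"] == "success"
# Every key except "result" is a job identifier
for job_id, info in status.items():
    if job_id == "result":
        continue
    print(job_id, "->", info["Current Status"])   # ... -> Done(Success)
```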

Last LB event logged by ICE when job aborted for proxy expired should be ABORTED (bug #84839, gLite Middleware, ) PRE-CERTIFIED (alvise)

Submitted to ICE (running from my EMI2 work area) a job sleeping for 5 minutes with a proxy valid for 3 minutes (MyProxyServer not set, so no proxy renewal). The last event logged by ICE (as shown in ICE's log) is:


2012-03-13 10:49:33,616 INFO - iceLBLogger::logEvent() - Job Aborted Event, reason=[Proxy is expired; Job has been terminated (got SIGTERM)] - [GRIDJobID="https://grid005.pd.infn.it:9000/0001331632035.314183" CREAMJobID="https://cream-23.pd.infn.it:8443/CREAM017935418"]

glite-wms-job-status needs a better handling of the purge-related error code (bug #85063, gLite Middleware, https://savannah.cern.ch/bugs/?85063) HOPEFULLY-FIXED (alvise). Reproducing the scenario that triggered the problem is extremely difficult.

pkg-config info for wmproxy-api-cpp should be enriched (bug #85799, gLite Middleware, ) PRE-CERTIFIED (alvise, 30/03/2012):

[root@devel09 ~]# rpm -ql glite-wms-wmproxy-api-cpp-devel
/usr/include/glite
/usr/include/glite/wms
/usr/include/glite/wms/wmproxyapi
/usr/include/glite/wms/wmproxyapi/wmproxy_api.h
/usr/include/glite/wms/wmproxyapi/wmproxy_api_utilities.h
/usr/lib64/libglite_wms_wmproxy_api_cpp.so
/usr/lib64/pkgconfig/wmproxy-api-cpp.pc
[root@devel09 ~]# cat /usr/lib64/pkgconfig/wmproxy-api-cpp.pc
prefix=/usr
exec_prefix=${prefix}
libdir=${exec_prefix}/lib64
includedir=${prefix}/include

Name: wmproxy api cpp
Description: WMProxy C/C++ APIs
Version: 3.3.3
Requires: emi-gridsite-openssl
Libs: -L${libdir} -lglite_wms_wmproxy_api_cpp
Cflags: -I${includedir}

queryDb has 2 handling user's options (see ggus ticket for more info) (bug #86267, gLite Middleware, https://savannah.cern.ch/bugs/?86267) PRE-CERTIFIED (alvise). The verification procedure was the same one I described in the related ticket: https://ggus.eu/tech/ticket_show.php?ticket=73658.

glite-wms-job-list-match --help shows an unimplemented (and useless) option "--default-jdl" (bug #87444, gLite Middleware, https://savannah.cern.ch/bugs/?87444) PRE-CERTIFIED (alvise)

The command glite-wms-job-list-match --help doesn't show that option anymore.

EMI WMS wmproxy rpm doesn't set execution permissions as it used to do in gLite (bug #89506, gLite Middleware, https://savannah.cern.ch/bugs/?89506) PRE-CERTIFIED (alvise):

[root@devel09 ~]# ll /usr/sbin/glite_wms_wmproxy_load_monitor /usr/bin/glite_wms_wmproxy_server /usr/bin/glite-wms-wmproxy-purge-proxycache /usr/libexec/glite_wms_wmproxy_dirmanager
-rwxr-xr-x 1 nobody nobody    1876 Mar  2 15:14 /usr/bin/glite-wms-wmproxy-purge-proxycache
-rwxr-xr-x 1 nobody nobody 3059020 Mar  2 15:14 /usr/bin/glite_wms_wmproxy_server
-rwsr-xr-x 1 nobody nobody   22637 Mar  2 15:14 /usr/libexec/glite_wms_wmproxy_dirmanager
-rwsr-xr-x 1 nobody nobody   22915 Mar  2 15:14 /usr/sbin/glite_wms_wmproxy_load_monitor

No root:root ownership, and no suid bit issues either.
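For reference, the suid bit visible in the listing above (the 's' in -rwsr-xr-x) can be checked programmatically; a small sketch working on raw mode bits (not tied to the actual files):

```python
import stat

def describe(mode):
    """Return (setuid?, symbolic permissions) for a raw st_mode value,
    as ls -l would render it."""
    return bool(mode & stat.S_ISUID), stat.filemode(mode)

# -rwsr-xr-x, as shown for glite_wms_wmproxy_dirmanager
suid, text = describe(stat.S_IFREG | stat.S_ISUID | 0o755)
print(suid, text)   # True -rwsr-xr-x
```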

yaim-wms creates wms.proxy in wrong path (bug #90129, gLite Middleware, https://savannah.cern.ch/bugs/?90129) PRE-CERTIFIED (alvise)

The path of wms.proxy now appears to be correct:

[root@devel09 ~]# ll ${WMS_LOCATION_VAR}/glite/wms.proxy 
-r-------- 1 glite glite 2824 Mar 14 12:00 /var/glite/wms.proxy
[root@devel09 ~]# ll ${WMS_LOCATION_VAR}/wms.proxy 
ls: /var/wms.proxy: No such file or directory

ICE log verbosity should be reduced to 300 (bug #91078, gLite Middleware, https://savannah.cern.ch/bugs/?91078) PRE-CERTIFIED (alvise):

[root@devel09 etc]# grep ice_log_level /etc/glite-wms/glite_wms.conf
    ice_log_level   =   300;

move lcmaps.log from /var/log/glite to WMS_LOCATION_LOG (bug #91484, gLite Middleware, https://savannah.cern.ch/bugs/?91484) PRE-CERTIFIED (alvise):


[root@devel09 etc]# ll $WMS_LOCATION_LOG/lcmaps.log ; ll /var/log/glite/lcmaps.log
-rw-r--r-- 1 glite glite 588 Mar 19 09:41 /var/log/wms/lcmaps.log
ls: /var/log/glite/lcmaps.log: No such file or directory

WMS: use logrotate uniformly in ice, lm, jc, wm, wmp (bug #91486, gLite Middleware, ) PRE-CERTIFIED (22/03/12 mcecchi)

The logrotate entries have disappeared from the cron jobs

[root@devel09 ~]# grep -r rotate /etc/cron.d
[root@devel09 ~]# 

because it is consistently managed here:

[root@devel09 ~]# ll /etc/logrotate.d/
total 96
-rw-r--r-- 1 root root 111 Mar 22 15:40 argus
-rw-r--r-- 1 root root 106 Mar 10 00:13 bdii
-rw-r--r-- 1 root root 109 Mar 22 15:39 fetch-crl
-rw-r--r-- 1 root root 194 Mar 10 11:23 glite-lb-server
-rw-r--r-- 1 root root 128 Mar 22 15:40 glite-wms-purger
-rw-r--r-- 1 root root 240 Nov 22 05:06 globus-gridftp
-rw-r--r-- 1 root root 167 Feb 27 20:09 httpd
-rw-r--r-- 1 root root 109 Mar 22 15:40 ice
-rw-r--r-- 1 root root 126 Mar 22 15:40 jc
-rw-r--r-- 1 root root  83 Mar 10 06:34 kill-stale-ftp
-rw-r--r-- 1 root root 112 Mar 22 15:40 lcmaps
-rw-r--r-- 1 root root 123 Mar 22 15:40 lm
-rw-r--r-- 1 root root 129 Mar 22 15:40 wm
-rw-r--r-- 1 root root 192 Mar 22 15:40 wmproxy

remove several dismissed parameters from the WMS configuration (bug #91488, gLite Middleware, https://savannah.cern.ch/bugs/?91488) PRE-CERTIFIED (alvise)

Verified that the parameters cited in the Savannah bug are missing from glite_wms.conf: the command

grep -E 'log_file_max_size|log_rotation_base_file|log_rotation_max_file_number|ice.input_type|wmp.input_type|wmp.locallogger|wm.dispatcher_type|wm.enable_bulk_mm|wm.ism_ii_ldapsearch_async' /etc/glite-wms/glite_wms.conf

didn't produce any output.
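The same check can be scripted; a sketch that scans a configuration text for the dismissed parameter names (a hypothetical helper mirroring the grep above):

```python
import re

DISMISSED = [
    "log_file_max_size", "log_rotation_base_file",
    "log_rotation_max_file_number", "ice.input_type", "wmp.input_type",
    "wmp.locallogger", "wm.dispatcher_type", "wm.enable_bulk_mm",
    "wm.ism_ii_ldapsearch_async",
]

def leftover_params(conf_text):
    """Return the dismissed parameters still present in the config text."""
    pattern = re.compile("|".join(re.escape(p) for p in DISMISSED))
    return sorted({m.group(0) for m in pattern.finditer(conf_text)})

# A clean EMI-2 glite_wms.conf should yield an empty list
sample = 'ice_log_level = 300;\nLogFile = "...";\n'
print(leftover_params(sample))   # []
```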

WMS needs cron job to kill stale GridFTP processes (bug #67489, gLite Middleware, DONE) PRE-CERTIFIED (alvise, 27/03/2012)

Rebuilt the kill-stale-ftp RPM from the branch and installed it on devel09.

[root@devel09 ~]# cat /etc/cron.d/kill-stale-ftp.cron 
PATH=/sbin:/bin:/usr/sbin:/usr/bin
5,15,25,35,45,55 * * * * root /sbin/kill-stale-ftp.sh >> /var/log/kill-stale-ftp.log 2>&1
[root@devel09 ~]# ll /sbin/kill-stale-ftp.sh
-rwxr-xr-x 1 root root 841 Mar 27 12:29 /sbin/kill-stale-ftp.sh

The path is now correct with my commit of today (27/03/2012). Moreover, the script now appears to work when invoked by cron:

[root@devel09 ~]# tail -2 /var/log/kill-stale-ftp.log
=== START Tue Mar 27 14:05:01 CEST 2012 PID 6617
=== READY Tue Mar 27 14:05:01 CEST 2012 PID 6617

WMS UI depends on a buggy libtar (on SL5 at least) (bug #89443, gLite Middleware, ) PRE-CERTIFIED (alvise, 28/03/2012). Tried this JDL:

dorigoa@lxgrid05 16:01:39 ~/emi/wmsui_emi2>cat ~/JDLs/WMS/wms_test_tar_bug.jdl
[ 
AllowZippedISB = true;
Executable = "/bin/ls" ; 
Arguments = "-lha " ; 
Stdoutput = "ls.out" ; 
InputSandbox = {"isb1", "isb2","isb3", "temp/isb4"}; 
OutputSandbox = { ".BrokerInfo", "ls.out"} ; 
Retrycount = 2; 
ShallowRetryCount = -1; 
usertags = [ bug = "#82687" ]; 
VirtualOrganisation="dteam"; 
]
dorigoa@lxgrid05 16:01:41 ~/emi/wmsui_emi2>stage/usr/bin/glite-wms-job-submit --debug -a -e https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server ~/JDLs/WMS/wms_test_tar_bug.jdl
[ ... ]
28 March 2012, 16:02:21 -I- PID: 14236 (Debug) - File Transfer (gsiftp) 
 Command: /usr/bin/globus-url-copy
Source: file:///tmp/ISBfiles_YaMkV2gJJbddD38QUNR5DA_0.tar.gz
Destination: gsiftp://devel09.cnaf.infn.it:2811/var/SandboxDir/p_/https_3a_2f_2fdevel09.cnaf.infn.it_3a9000_2fp_5fNucDphCF_5fynIE-0XKnxg/input/ISBfiles_YaMkV2gJJbddD38QUNR5DA_0.tar.gz
-----------------------------------------
-----------------------------------------
28 March 2012, 16:02:22 -I- PID: 14236 (Debug) - File Transfer (gsiftp) Transfer successfully done
[ ... ]

So the .tar.gz file has been correctly created, transferred, and removed. Verified that the source code no longer uses libtar's functions:

dorigoa@lxgrid05 16:11:05 ~/emi/wmsui_emi2>grep -r libtar emi.wms-ui.wms-ui-commands/src/
emi.wms-ui.wms-ui-commands/src/utilities/options_utils.cpp:* of the archiving tool (libtar; if zipped feature is allowed).
emi.wms-ui.wms-ui-commands/src/utilities/options_utils.h:               * of the archiving tool (libtar; if zipped feature is allowed).
emi.wms-ui.wms-ui-commands/src/services/jobsubmit.cpp~://#include "libtar.h"
emi.wms-ui.wms-ui-commands/src/services/jobsubmit.cpp://#include "libtar.h"

Complete procedure to verify the bug and the fix (as reported in the Savannah bug):

- Have root or sudo access to a UI with EMI1 installation 
- create the path /home/alex/J0 
- create the NON empty files: 
-bash-3.2# cd /home/alex/J0 
-bash-3.2# ls -l 
total 12 
-rw-r--r-- 1 root root 413 May 11 14:38 hoco_ltsh.e 
-rw-r--r-- 1 root root 413 May 11 14:38 ltsh.sh 
-rw-r--r-- 1 root root 413 May 11 14:38 plantilla_venus.dat 
(make sure they are world-readable) 
- Create this JDL file: 
[dorigoa@cream-12 ~]$ cat JDLs/WMS/JDL_bug_89443.jdl 
[ 
StdOutput = "myjob.out"; 
ShallowRetryCount = 10; 
SignificantAttributes = { "Requirements","Rank","FuzzyRank" }; 
RetryCount = 3; 
Executable = "ltsh.sh"; 
Type = "job"; 
Arguments = "hoco_ltsh.e 0 1 200 114611111"; 
AllowZippedISB = true; 
VirtualOrganisation = "gridit"; 
JobType = "normal"; 
DefaultRank = -other.GlueCEStateEstimatedResponseTime; 
ZippedISB = { "ISBfiles_rjKoznMzsjvH6Nuvp0AhMQ_0.tar.gz" }; 
OutputSandbox = { "myjob.out","myjob.err","out.tar.gz" }; 
InputSandbox = { "file:///home/alex/J0/plantilla_venu...","file:///home/alex/J0/ltsh.sh","file:///home/alex/J0/hoco_ltsh.e" }; 
StdError = "myjob.err"; 
rank = -other.GlueCEStateEstimatedResponseTime; 
MyProxyServer = "myproxy.cnaf.infn.it"; 
requirements = ( regexp("ng-ce.grid.unipg.it:8443/cream-pbs-grid",other.GlueCEUniqueID) )&& ( other.GlueCEStateStatus == "Production" ) 
] 
Do not change anything in it; it must be submitted "as is". 
Submit this JDL with this command: 
$ glite-wms-job-submit --register-only -a --debug -e https://prod-wms-01.ct.infn.it:7443... <YOUR_JDL_CREATED_IN_THE_PREVIOUS_STEP> >&! log 
>&! redirects stdout/stderr to a file under the tcsh shell; adapt the redirection to your shell. 

Then grep ZIP in the log just created: 
11 May 2012, 15:26:14 -I- PID: 11561 (Debug) - ISB ZIPPED file successfully created: /tmp/ISBfiles_ZT4DysizXpjHOT-hmzQf2A_0.tar.gz 
ISB ZIP file : /tmp/ISBfiles_ZT4DysizXpjHOT-hmzQf2A_0.tar.gz 
Decompress it: 
dorigoa@cream-12 15:29:34 ~/JDLs/WMS>gunzip /tmp/ISBfiles_QPGfonkfQyOTbXa6uDpnZQ_0.tar.gz 
dorigoa@cream-12 15:29:38 ~/JDLs/WMS>tar tvf /tmp/ISBfiles_QPGfonkfQyOTbXa6uDpnZQ_0.tar 
-rw-r--r-- root/root 413 2012-05-11 14:38:09 SandboxDir/nJ/https_3a_2f_2fprod-wms-01.ct.infn.it_3a9000_2fnJmsSZ3XIaff3gbwkF0TVQ/input/plantilla_venus.dat 
-rw-r--r-- root/root 413 2012-05-11 14:38:06 SandboxDir/nJ/https_3a_2f_2fprod-wms-01.ct.infn.it_3a9000_2fnJmsSZ3XIaff3gbwkF0TVQ/input/ltsh.sh 
-rw-r--r-- root/root 413 2012-05-11 14:38:12 SandboxDir/nJ/https_3a_2f_2fprod-wms-01.ct.infn.it_3a9000_2fnJmsSZ3XIaff3gbwkF0TVQ/input/hoco_ltsh. 
You can see that the filename hoco_ltsh.e has been truncated (to hoco_ltsh.) in the archive. 
Repeat the same procedure on an EMI2 UI; the output differs slightly as far as the location of the ISB.....tar.gz file is concerned, but again unzip it and inspect it with "tar tvf": the last file should no longer be truncated.
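The "tar tvf" verification can also be automated; a sketch using Python's tarfile module (which, unlike the buggy libtar, preserves such names), with the same file names as in the recipe above:

```python
import io, tarfile

def archive_names(tar_bytes):
    """List member names of an in-memory ISB tarball, as 'tar tf' would."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        return tar.getnames()

# Build a tiny archive with the file names used in the recipe
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name in ("plantilla_venus.dat", "ltsh.sh", "hoco_ltsh.e"):
        data = b"x" * 413
        info = tarfile.TarInfo(name="input/" + name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

names = archive_names(buf.getvalue())
assert "input/hoco_ltsh.e" in names   # last entry not truncated
```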

ICE should use env vars in its configuration (bug #90830, gLite Middleware, ) PRE-CERTIFIED (alvise, 29/03/2012). Installed the latest RPM build of wms.yaim and checked glite_wms.conf:

[root@devel09 siteinfo]# grep -E 'persist_dir|Input|ice_host_cert|ice_host_key' /etc/glite-wms/glite_wms.conf
    ice_host_cert   =   "${GLITE_HOST_CERT}";
    Input   =   "${WMS_LOCATION_VAR}/ice/jobdir";
    persist_dir   =   "${WMS_LOCATION_VAR}/ice/persist_dir";
    ice_host_key   =   "${GLITE_HOST_KEY}";
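The ${VAR} placeholders above are expanded from the environment at configuration time; the mechanism can be illustrated with string.Template (a sketch, not ICE's actual parser):

```python
import os
from string import Template

def expand(value, env=None):
    """Expand ${VAR} placeholders the way a shell would; unknown
    variables raise KeyError instead of being silently dropped."""
    return Template(value).substitute(env or os.environ)

env = {"WMS_LOCATION_VAR": "/var",
       "GLITE_HOST_CERT": "/etc/grid-security/hostcert.pem"}
print(expand("${WMS_LOCATION_VAR}/ice/persist_dir", env))  # /var/ice/persist_dir
print(expand("${GLITE_HOST_CERT}", env))
```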

cron job deletes /var/proxycache (bug #90640, gLite Middleware, ) PRE-CERTIFIED (alvise, 29/03/2012). Verified the usage of "-mindepth 1" as explained in the bug's comment on Savannah:

[root@devel09 cron.d]# grep proxycache *
glite-wms-wmproxy-purge-proxycache.cron:0 */6 * * * root . /usr/libexec/grid-env.sh ; /usr/bin/glite-wms-wmproxy-purge-proxycache /var/proxycache > /var/log/wms/glite-wms-wmproxy-purge-proxycache.log 2>&1

[root@devel09 cron.d]# grep find /usr/bin/glite-wms-wmproxy-purge-proxycache
find $1 -mindepth 1 -cmin +60 > $tmp_file
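Why -mindepth 1 matters: without it, once /var/proxycache itself becomes older than 60 minutes, find lists the directory too and the purger removes it. A sketch of the selection logic (illustrative only; it uses mtime where find -cmin uses ctime):

```python
import os, time

def stale_entries(root, max_age_min=60):
    """Entries under root older than max_age_min minutes.
    root itself is never returned, mirroring 'find -mindepth 1'."""
    cutoff = time.time() - max_age_min * 60
    stale = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return stale
```

The directory passed as root never appears in the result, so the proxycache directory survives even when it is itself old.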

ICE jobdir issue - 1 bad CE can block all jobs (bug #80751, gLite Middleware, ) PRE-CERTIFIED (alvise, 29/03/2012). This is HOPEFULLY FIXED: I verified that the source code fixing the problem is there, but it is very difficult to test, because one would need to simulate a CE that continuously times out on connections.

Various issues found while testing

18/04/2012

1) [dorigoa@cream-12 ~]$ glite-wms-job-submit -c ~/JDLs/WMS/wmp_devel09.conf -a ~/JDLs/WMS/wms.jdl

Connecting to the service https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server

Warning - Unable to submit the job to the service: https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server

Proxy file doesn't exist or has bad permissions

Error code: SOAP-ENV:Server

FIXED 17/04/12, commit in wmproxy. The problem was due to the recent authN/Z restructuring and occurred on a jobSubmit operation (i.e. a job submitted without an ISB).

May 4 2012:

2) load_monitor gives: Can't do setuid (cannot exec sperl)

FIXED: added a dependency on perl-suidperl in MP

TOCHECK

3) with 'AsyncJobStart = false;' wmproxy crashes on every second submission. The problem shows up wherever there is a:

if (conf.getAsyncJobStart()) {
    // \/ Copy environment and restore it right after FCGI_Finish
    char** backupenv = copyEnvironment(environ);
    FCGI_Finish(); // returns control to client
    environ = backupenv;
    // /\ From here on, execution is asynchronous
}
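The intent of that snippet (take a real copy of the environment before FCGI_Finish() tears it down, then restore it) can be illustrated in Python, where the copy is explicit (a sketch, not the wmproxy code):

```python
import os

def with_preserved_environ(work):
    """Run work() and guarantee os.environ is restored afterwards.
    The backup is a *copy*, not a live reference, which is the point
    of the copyEnvironment fix in wmproxy."""
    backup = dict(os.environ)          # snapshot of the environment
    try:
        return work()
    finally:
        os.environ.clear()
        os.environ.update(backup)      # restore, as after FCGI_Finish

os.environ["WMS_TEST"] = "before"
with_preserved_environ(lambda: os.environ.update(WMS_TEST="mutated"))
print(os.environ["WMS_TEST"])   # before
```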

/usr/bin/glite_wms_wmproxy_server
/lib64/libpthread.so.0
/lib64/libc.so.6(gsignal+0x35)
/lib64/libc.so.6(abort+0x110)
/lib64/libc.so.6
/lib64/libc.so.6
/lib64/libc.so.6(cfree+0x4b)
classad::ClassAd::~ClassAd()
glite::jdl::Ad::~Ad()
jobStart(jobStartResponse&, std::string const&, soap*)
ns1__jobStart(soap*, std::string, ns1__jobStartResponse&)
soap_serve_ns1__jobStart(soap*)
soap_serve_request(soap*)
glite::wms::wmproxy::server::WMProxyServe::wmproxy_soap_serve(soap*)
glite::wms::wmproxy::server::WMProxyServe::serve()
/usr/bin/glite_wms_wmproxy_server(main+0x667)
/lib64/libc.so.6(__libc_start_main+0xf4)
glite::wmsutils::exception::Exception::getStackTrace()

FIXED 14/05/12, commit in wmproxy (copyEnvironment)

4) submitting dag1.jdl produces this stack trace:

getSandboxBulkDestURI(getSandboxBulkDestURIResponse&, std::string const&, std::string const&)
ns1__getSandboxBulkDestURI(soap*, std::string, std::string, ns1__getSandboxBulkDestURIResponse&)
soap_serve_ns1__getSandboxBulkDestURI(soap*)
soap_serve_request(soap*)
glite::wms::wmproxy::server::WMProxyServe::wmproxy_soap_serve(soap*)
glite::wms::wmproxy::server::WMProxyServe::serve()
/usr/bin/glite_wms_wmproxy_server(main+0x667)
/lib64/libc.so.6(__libc_start_main+0xf4)
glite::wmsutils::exception::Exception::getStackTrace()

FIXED 14/05/12 by an update in LB bkserver from the latest RC

[mcecchi@ui ~]$ glite-wms-job-submit -a --endpoint https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server dag1.jdl 

Connecting to the service https://devel09.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://devel09.cnaf.infn.it:9000/RDzM29cERl8imVCMnFoRyA

==========================================================================

5) WM DOES NOT MATCH ANYTHING

FIXED

fixed by moving the requirements onto WmsRequirements only in the JDL. No requirements on ce_ad and no need for symmetric_match anymore. The expression is now:

WmsRequirements =
  ((ShortDeadlineJob =?= TRUE ? RegExp(".*sdj$", other.GlueCEUniqueID) : !RegExp(".*sdj$", other.GlueCEUniqueID))
   && (other.GlueCEPolicyMaxTotalJobs == 0 || other.GlueCEStateTotalJobs < other.GlueCEPolicyMaxTotalJobs)
   && (EnableWmsFeedback =?= TRUE ? RegExp("cream", other.GlueCEImplementationName, "i") : true)
   && (member(CertificateSubject, other.GlueCEAccessControlBaseRule)
       || member(strcat("VO:", VirtualOrganisation), other.GlueCEAccessControlBaseRule)
       || FQANmember(strcat("VOMS:", VOMS_FQAN), other.GlueCEAccessControlBaseRule))
   && !FQANmember(strcat("DENY:", VOMS_FQAN), other.GlueCEAccessControlBaseRule)
   && (IsUndefined(other.OutputSE) || member(other.OutputSE, GlueCESEBindGroupSEUniqueID)));
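The effect of the max-total-jobs clause in that expression can be sketched as a plain predicate (a hypothetical function mirroring the Glue attributes named above):

```python
def ce_accepts_jobs(policy_max_total_jobs, state_total_jobs):
    """Mirror of the WmsRequirements clause
    (GlueCEPolicyMaxTotalJobs == 0 || GlueCEStateTotalJobs < GlueCEPolicyMaxTotalJobs):
    a published limit of 0 means 'no limit'."""
    return policy_max_total_jobs == 0 or state_total_jobs < policy_max_total_jobs

print(ce_accepts_jobs(0, 5000))    # True  (no limit published)
print(ce_accepts_jobs(100, 99))    # True  (below the limit)
print(ce_accepts_jobs(100, 100))   # False (CE full)
```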

6) Load script on wmproxy has security issues:

Insecure $ENV{PATH} while running setuid at /usr/sbin/glite_wms_wmproxy_load_monitor line 26.

FIXED, commit in wmproxy: the load script is now interpreted with perl -U

TESTED on 16/5/2012

16 May, 16:13:55 -D- PID: 6886 - "wmpcommon::callLoadScriptFile": Executing command:  /usr/sbin/glite_wms_wmproxy_load_monitor --oper jobRegister --load1 22 --load5 20 --load15 18 --memusage 99 --diskusage 95 --fdnum 1000 --jdnum 1500 --ftpconn 300
16 May, 16:13:55 -D- PID: 6886 - "wmpcommon::callLoadScriptFile": Executing load script file: /usr/sbin/glite_wms_wmproxy_load_monitor

22/05/12. INSTALLING FROM EMI2 RC4

7) [root@devel09 ~]# /usr/bin/glite-wms-workload_manager
22 May, 11:51:06 -I: [Info] main(main.cpp:289): This is the gLite Workload Manager, running with pid 2454
22 May, 11:51:06 -I: [Info] main(main.cpp:297): loading broker dll libglite_wms_helper_broker_ism.so
cannot load dynamic library libglite_wms_helper_broker_ism.so: /usr/lib64/libgsoap++.so.0: undefined symbol: soap_faultstring

25/5/12 FIXED, commit in broker-info Makefile.am

22/05/12

8) slower MM after authZcheck by conf? (0/4310 [1] )

FIXED

for now 1 second looks ok.

9) Submission to CREAM DOES NOT WORK with collections and dags

        - Cannot move ISB (retry_copy ${globus_transfer_cmd}
gsiftp://devel09.cnaf.infn.it:2811/var/SandboxDir/tu/https_3a_2f_2fdevel09.cnaf.infn.it_3a9000_2ftutvLp_5fPTFrLUqQH4OSS-A/input/Test.sh
file:///scratch/9462489.1.medium/home_crm07_232749015/CREAM232749015/Test.sh):

error: globus_ftp_client: the server responded with an error
500 500-Command failed. : globus_l_gfs_file_open failed.
500-globus_xio: Unable to open file
/var/SandboxDir/tu/https_3a_2f_2fdevel09.cnaf.infn.it_3a9000_2ftutvLp_5fPTFrLUqQH4OSS-A/input/Test.sh
500-globus_xio: System error in open: Permission denied
500-globus_xio: A system call failed: Permission denied
500 End.
    Status Reason:       failed (LB query failed)

10) Sometimes, on submission:

Proxy exception: Unable to get Not Before date from Proxy

11) wms-wm stop doesn't delete pid file (and check-daemons kicks in)

FIXED by commit in wm init script

12) Configuration of Condor 7.8.0

mkdir -p /usr/man
condor_configure --owner=glite --install-dir=/usr
CONDOR_CONFIG=/etc/condor/condor_config.local
rm -rf /usr/man
mkdir -p /var/log/condor
chown glite:glite  /var/log/condor

FIXED by commit in yaim

13) fix NEEDED in condorg.pc to build from FHS condor

14) Restarting /usr/bin/glite_wms_wmproxy_server... -bash: condorc-initialize: command not found

in /opt/glite/yaim/functions/config_gliteservices_wms

Topic revision: r73 - 2012-06-05 - MarcoCecchi