
Pre-certification of WMS 3.3.6

https://savannah.cern.ch/task/?27731

Tests:

BUG https://savannah.cern.ch/bugs/?92657

The pre-certification consists of submitting a job to the WMS and scanning the syslog file /var/log/messages to check that the WMProxy and the Workload Manager logged the information required by this bug. Log in as root on the WMS machine and run:

tail -f /var/log/messages|egrep "wmproxy|manager"

Then log into a UI and submit a job (any JDL will do) to the WMS. Within a few seconds, two log lines should appear in the console running the tail command:

May 18 14:23:07 devel11 glite_wms_wmproxy_server[32565]: submission from lxgrid05.pd.infn.it, DN=/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alvise Dorigo, FQAN=/dteam/Role=NULL/Capability=NULL, userid=18702 for jobid=https://devel07.cnaf.infn.it:9000/rkYSfEe5IqsDc17_UpPu3Q

May 18 14:23:10 devel11 glite-wms-workload_manager: jobid https://devel07.cnaf.infn.it:9000/rkYSfEe5IqsDc17_UpPu3Q was matched to destination creamce.gina.sara.nl:8443/cream-pbs-infra

Note in particular the DN, the FQAN, and the job ID, as well as the UI's hostname.
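The visual check can also be scripted. The following is a minimal sketch, not part of the original procedure: the function names `check_wmproxy_line` and `extract_jobid` are hypothetical, and the field patterns are inferred only from the two sample lines shown above, so they may need adjusting if the log format differs.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a WMProxy syslog line carries every
# field of interest (UI hostname, DN, FQAN, userid and jobid).
# Patterns are assumptions derived from the sample log line.
check_wmproxy_line() {
    line=$1
    for pattern in \
        'submission from [^ ,]+' \
        'DN=/[^,]+' \
        'FQAN=/[^,]+' \
        'userid=[0-9]+' \
        'jobid=https://[^ ]+'
    do
        printf '%s\n' "$line" | grep -Eq "$pattern" || return 1
    done
    return 0
}

# Hypothetical helper: print the job ID found in a WMProxy or
# Workload Manager log line (port 9000 is taken from the samples).
extract_jobid() {
    printf '%s\n' "$1" | grep -Eo 'https://[^ ]+:9000/[A-Za-z0-9_-]+' | head -n 1
}

# Possible usage on the WMS machine, as root:
#   last=$(grep wmproxy /var/log/messages | tail -n 1)
#   check_wmproxy_line "$last" && echo "all fields logged for $(extract_jobid "$last")"
```

Extracting the job ID from both lines makes it possible to confirm that the WMProxy entry and the Workload Manager entry refer to the same job.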

BUG https://savannah.cern.ch/bugs/?92922

The pre-certification was done in two phases: first, verifying that the bug is present on an EMI1 installation; then, installing the new RPM (which will be included in the next EMI1 update, 17) and verifying that the bug has disappeared.

In order to reproduce the bug it is sufficient to use this JDL:

[ 
Executable = "/bin/touch" ; 
Arguments = "/foo" ; 
Retrycount = 2; 
usertags = [ exe = "touch" ]; 
VirtualOrganisation="dteam"; 
requirements = ! RegExp("cream.*", other.GlueCEUniqueID); 
] 
Submit it to an EMI1 WMS that does not have the fix. Note that this bug occurs when the job lands on a non-CREAM CE, which is why the JDL specifies the requirements attribute. This should be the result:

glite-wms-job-status https://devel09.cnaf.infn.it:9000/U... 
======================= glite-wms-job-status Success ===================== 
BOOKKEEPING INFORMATION: 
Status info for the Job : https://devel09.cnaf.infn.it:9000/U... 
Current Status: Done(Success) 
Exit code: 0 
Status Reason: Job terminated successfully 
Destination: ce01.dur.scotgrid.ac.uk:2119/jobmanager-lcgpbs-q7d 
Submitted: Wed May 16 16:19:38 2012 CEST 
========================================================================== 
Although the job should return exit code 1 (cannot create a file in /... permission denied), the exit code reported by the status is 0, as shown above. After applying the fix to this EMI1 WMS, the same JDL should produce the expected exit code (1):

glite-wms-job-status https://devel07.cnaf.infn.it:9000/f... 
======================= glite-wms-job-status Success ===================== 
BOOKKEEPING INFORMATION: 
Status info for the Job : https://devel07.cnaf.infn.it:9000/f... 
Current Status: Done(Exit Code !=0) 
Exit code: 1 
Status Reason: Warning: job exit code != 0 
Submitted: Wed May 16 16:54:25 2012 CEST 
========================================================================== 
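The comparison of the two status outputs can be automated along these lines. This is only a sketch: `status_exit_code` and `expect_exit_code` are hypothetical helper names, and the parsing assumes the "Exit code:" field is printed exactly as in the outputs above.

```shell
#!/bin/sh
# Hypothetical helper: read glite-wms-job-status output on stdin and print
# the numeric value of its "Exit code:" field (first occurrence only).
status_exit_code() {
    sed -n 's/^[[:space:]]*Exit code:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p' | head -n 1
}

# Hypothetical helper: succeed when the exit code reported on stdin equals
# the expected value given as the first argument.
expect_exit_code() {
    expected=$1
    actual=$(status_exit_code)
    [ "$actual" = "$expected" ]
}

# Possible usage from the UI, with the job ID returned at submission time:
#   glite-wms-job-status <jobid> | expect_exit_code 1 && echo "fix verified"
```

On a WMS without the fix the check for exit code 1 fails, because the status reports 0; with the fix applied it succeeds.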

-- MarcoCecchi - 2012-04-26

Topic revision: r8 - 2012-05-21 - AlviseDorigo
 