
TESTS

  • Normal jobs work: OK

  • Dag jobs work: OK

  • Perusal jobs work: OK

  • MPICH jobs work: OK
Modified mpirun: Executing command: /home/dteam029/globus-tmp.griditwn03.7486.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fOsslm3cw4T7lgR09qJTR4g/cpi
Process 0 of 1 on griditwn03.na.infn.it
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 10.001266

  • Submission of 270 collections of 100 jobs each (10 collections every 30 minutes), using one user and a fuzzy rank, across 90 LCG CEs:
    • Success rate > 99.99%: OK
    • About 1800 jobs were cancelled because of a problem with the CEs at in2p3.fr
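
As a rough sanity check on these figures, the totals can be worked out in the shell. This is only a sketch: it assumes the success rate is computed over the jobs that were not cancelled, which the report does not state, and it treats "about 1800" as an exact number.

```shell
collections=270
jobs_per_collection=100
cancelled=1800   # "about 1800" in the report; treated as exact here (assumption)

# 10 collections are submitted per round, one round every 30 minutes.
rounds=$((collections / 10))
total=$((collections * jobs_per_collection))
remaining=$((total - cancelled))

# Largest failure count still compatible with a success rate above 99.99%,
# i.e. the largest integer with failures * 10000 < remaining.
max_failures=$(( (remaining - 1) / 10000 ))

echo "rounds=$rounds total=$total remaining=$remaining max_failures=$max_failures"
```

Under these assumptions, a success rate above 99.99% leaves room for only a handful of failed jobs in the whole run.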

Check bugs:

  • BUG #21909: FIXED in the wmproxy startup script
         if ( /sbin/pidof $httpd ) >/dev/null 2>&1 ; then
          echo $httpd \(pid `/sbin/pidof $httpd`\) is running ....
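
The excerpt above is quoted without its closing `fi`. A minimal, self-contained sketch of the same kind of check is given below. The function name `report_status`, the `PIDOF` override variable and the "is stopped" branch are assumptions made for illustration; they are not taken from the wmproxy startup script.

```shell
# Hypothetical helper: report whether a process with the given name is running.
# PIDOF may be set to an alternative lookup command; it defaults to pidof.
report_status() {
    name=$1
    if pids=$("${PIDOF:-pidof}" "$name" 2>/dev/null); then
        echo "$name (pid $pids) is running ...."
    else
        echo "$name is stopped"
        return 1
    fi
}
```

Calling `report_status httpd` prints the running message and returns 0 when `pidof` finds at least one matching process, and returns 1 otherwise.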

  • BUG #23443: FIXED
    • Required documents are not put into the glite doc template in edms

    for edg_rm_command in $GLITE_LOCATION/bin/edg-rm \
                          $EDG_LOCATION/bin/edg-rm \
                          `which edg-rm 2>/dev/null`; do
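
The loop above is quoted without its body. One plausible way such a loop could pick the first usable candidate is sketched below. The function name `find_first_executable` and the selection logic are assumptions, not the actual fix.

```shell
# Hypothetical helper: print the first argument that is an executable file.
find_first_executable() {
    for candidate in "$@"; do
        if [ -n "$candidate" ] && [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage mirroring the candidates listed in the loop above; empty or missing
# locations are skipped, and the variable stays empty if nothing is found.
edg_rm_command=$(find_first_executable \
    "${GLITE_LOCATION:-}/bin/edg-rm" \
    "${EDG_LOCATION:-}/bin/edg-rm" \
    "$(command -v edg-rm 2>/dev/null)") || edg_rm_command=""
```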

  • BUG #24690: NOT COMPLETELY FIXED
    • The error message found in the wmproxy log (even at log level 5) is: edg_wll_JobStat GSSAPI Error
    • In any case, there is now a dedicated cron script to renew the host proxy (i.e. it is not included in the cron-purger script)

  • BUG #26885: FIXED
    • Job wrongly kept in ICE cache with status UNKNOWN: checked with two consecutive submissions of 5 collections of 50 nodes each. ICE does not leave any job with status UNKNOWN behind in the cache

  • BUG #27215: NOT COMPLETELY FIXED
[ale@cream-15 regression]$ ls -l /tmp/ale_StdrEDNZljNnxCLx45ILIw
total 8
-rw-rw-r--  1 ale ale 30 Jul  8 16:02 out1.tail
-rw-rw-r--  1 ale ale 70 Jul  8 16:02 out2
-rw-rw-r--  1 ale ale  0 Jul  8 16:02 test.err
-rw-rw-r--  1 ale ale  0 Jul  8 16:02 test.out
It is not fixed, however, when a CREAM CE is used

  • BUG #27797: Not Checked
    • This bug could not be checked because I am not able to submit parametric jobs

[ale@cream-15 UI]$ glite-wms-job-logging-info -v 2 https://devel17.cnaf.infn.it:9000/Hr_TRdWT9XZrBux4DyWQsw | grep -A 2 Match | grep Dest
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
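
When many match events are logged, it can help to count how often each destination appears. The sketch below does that for the three lines shown above. The helper name `count_destinations` is hypothetical, and the sample input is pasted in rather than taken from a live `glite-wms-job-logging-info` call.

```shell
# Sample input copied from the output above.
dest_lines='- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam'

# Hypothetical helper: read "Dest id" lines on stdin and print
# "<count> <CE id>" for each distinct CE.
count_destinations() {
    awk -F'= *' '/Dest id/ { print $2 }' | sort | uniq -c | awk '{ print $1, $2 }'
}

summary=$(printf '%s\n' "$dest_lines" | count_destinations)
echo "$summary"
```

For the sample above this prints a single line with a count of 3 for `ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam`.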

[ale@cream-15 UI]$ cat /tmp/ale_zngnB9uVCWKT7B7MkSlBtA/env.out  | grep LD_LIBRARY
LD_LIBRARY_PATH=.

Master node is: node72.grid.pg.infn.it
Is should run on the following nodes:
node72.grid.pg.infn.it
node72.grid.pg.infn.it
node71.grid.pg.infn.it
node71.grid.pg.infn.it
*************************************
Current working directory is: /home/dteamsgm003/globus-tmp.node72.24167.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg
List files on the working directory:
/home/dteamsgm003/globus-tmp.node72.24167.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg:
total 352
drwxr-xr-x  2 dteamsgm003 dteamsgm   4096 Jun 30 11:03 .
drwx------  5 dteamsgm003 dteamsgm   4096 Jun 30 11:03 ..
-rwxr-xr-x  1 dteamsgm003 dteamsgm    822 Jun 30 11:03 30308_exe.sh
-rw-r--r--  1 dteamsgm003 dteamsgm   3687 Jun 30 11:03 .BrokerInfo
-rw-r--r--  1 dteamsgm003 dteamsgm    218 Jun 30 11:03 https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg.output
-rw-r--r--  1 dteamsgm003 dteamsgm 330910 Jun 30 11:03 mpitest
-rw-r--r--  1 dteamsgm003 dteamsgm      0 Jun 30 11:03 test.err
-rw-r--r--  1 dteamsgm003 dteamsgm    385 Jun 30 11:03 test.out
-rw-------  1 dteamsgm003 dteamsgm      0 Jun 30 11:03 tmp.rdgPL24747
*********************************

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/LEzR7tTwyh3P-iYZrKlwxg
Current Status:     Aborted
Status Reason:      The maximum number of output sandbox files is reached
Submitted:          Tue Jul  8 16:12:10 2008 CEST
*************************************************************

Error - WMProxy Server Error
HTTP Error
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
 [no address given] and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server error log.</p>
</body></html>

Error code: SOAP-ENV:Server

The actual reason can be found only in the wmproxy log file:

08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": ------------------------------- Fault Description --------------------------------
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Method: enableFilePerusal
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Code: 1226
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Description: The Operation is not allowed: The maximum number of perusal files is reached
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Stack:
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": JobOperationException: The Operation is not allowed: The maximum number of perusal files is reached
        at wmpoperations::enableFilePerusal()[wmpoperations.cpp:1287]
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal":         at enableFilePerusal()[wmpoperations.cpp:1231]
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": ----------------------------------------------------------------------------------

  # customization point
  if [ -n "${GLITE_LOCAL_CUSTOMIZATION_DIR}" ]; then
    if [ -f "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh" ]; then
      . "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh"
    fi
  fi
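
To illustrate how this customization point behaves, the sketch below sources a hook through the same guard. The temporary directory and the hook content (a marker variable named `CP_1_5_RAN`) are hypothetical and exist only for the demonstration.

```shell
# Hypothetical setup: a temporary directory stands in for the real
# customization directory, and the hook only sets a marker variable.
GLITE_LOCAL_CUSTOMIZATION_DIR=$(mktemp -d)
echo 'CP_1_5_RAN=yes' > "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh"

# Same guard as in the excerpt above: the hook is sourced only if the
# directory variable is set and the hook file exists.
if [ -n "${GLITE_LOCAL_CUSTOMIZATION_DIR}" ]; then
  if [ -f "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh" ]; then
    . "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh"
  fi
fi

echo "hook ran: ${CP_1_5_RAN:-no}"
rm -rf "${GLITE_LOCAL_CUSTOMIZATION_DIR}"
```

Because the hook is sourced rather than executed, any variables or functions it defines remain visible to the calling script.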

[ale@cream-15 UI]$ ls -l /tmp/ale_eRWc528nX8QpEcs7im-R7g
total 8
-rw-rw-r--  1 ale ale 50 Jul  8 12:06 out1
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 out2.tail
-rw-rw-r--  1 ale ale 50 Jul  8 12:06 out3
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 out4.tail
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 test.err
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 test.out

-- AlessioGianelle - 27 Jun 2008

Topic revision: r21 - 2008-07-28 - ElisabettaMolinari
 