Difference: WmsTests3dot1dot100 (1 vs. 61)

Revision 61 (2011-02-24) - AlessioGianelle

Line: 1 to 1
Changed:
<
<
META TOPICPARENT name="TestWokPlan"
>
>
META TOPICPARENT name="TestPage"
 

TESTS

  • Normal jobs work: OK

Revision 60 (2008-09-09) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 26 to 26
 
    • checked by Laurence Field and ARC developers

Added:
>
>
    • Set the two parameters, subscription_update_threshold_time and subscription_duration in the ICE section of the glite_wms.conf file to low values, such as the following:
       subscription_duration  =  300;
       subscription_update_threshold_time =  150;
      so that a subscription expires after 5 minutes
    • Re-start ICE with the script '/opt/glite/etc/init.d/glite-wms-ice'
    • Submit a job and check the status of the subscription with the following CREAM client command (see also the sketch below):
       CEMonitorSubscriberMgr <cert_proxy> <cert_path> <service_URL_address> 
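A minimal shell sketch of this check; the proxy location, CA path and service URL are assumptions to be replaced with site-specific values:

      # Restart ICE so the lowered subscription values are picked up
      /opt/glite/etc/init.d/glite-wms-ice restart
      # Submit a test job through the WMS
      glite-wms-job-submit -a myjob.jdl
      # Inspect the subscription before and after the 5-minute lifetime,
      # to verify that ICE renews it at the threshold
      CERT_PROXY=/tmp/x509up_u$(id -u)            # user proxy (assumed location)
      CERT_PATH=/etc/grid-security/certificates   # CA certificates (assumed location)
      SERVICE_URL=https://cream-ce.example.org:8443/ce-monitor/services/CEMonitor   # hypothetical endpoint
      CEMonitorSubscriberMgr "$CERT_PROXY" "$CERT_PATH" "$SERVICE_URL"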
 
Changed:
<
<
  • BUG #21909: FIXED in the wmproxy startup script
>
>
  • BUG #21909: FIXED
    • In the wmproxy startup script check if there are these lines:
  if ( /sbin/pidof $httpd ) >/dev/null 2>&1 ; then
Changed:
<
<
echo $httpd \(pid `/sbin/pidof $httpd`\) is running ....
>
>
echo $httpd \(pid `/sbin/pidof $httpd`\) is running ....
 
  • BUG #23443: FIXED
    • Required documents are not put into the glite doc template in edms

Changed:
<
<
>
>
    • Check if in the JobWrapper there are these lines:
  for edg_rm_command in $GLITE_LOCATION/bin/edg-rm $EDG_LOCATION/bin/edg-rm `which edg-rm 2>/dev/null`; do
Changed:
<
<
>
>
[...]
 
  • BUG #24690: NOT COMPLETELY FIXED
    • The error message that you could find in the wmproxy log (also with level 5) is: edg_wll_JobStat GSSAPI Error
    • In any case there is now a dedicated cron script to renew the host proxy (i.e. it is no longer included in the cron-purger script)

Changed:
<
<
    • Job wrongly kept in ICE cache with status UNKNOWN: checked with two subsequent submissions of 5 collections made of 50 nodes each. ICE does not leave any job with status UNKNOWN behind in the cache
>
>
    • Checked with two subsequent submissions of 5 collections made of 50 nodes each. ICE does not leave any job with status UNKNOWN behind in the cache
 
Changed:
<
<
  • BUG #27215: NOT COMPLETELY FIXED
[ale@cream-15 regression]$ ls -l /tmp/ale_StdrEDNZljNnxCLx45ILIw
 total 8
>
>
  • BUG #27215: FIXED (for an LCG-CE); NOT fixed for a CREAM-CE
    • Set the parameter MaxOutputSandboxSize in the WorkloadManager section of the configuration file /opt/glite/etc/glite_wms.conf on the WMS to 100 and restart the workload manager.
    • Submit a jdl like this:
       
      [
      Type = "Job";
      Executable = "27215_exe.sh";
      Arguments = "70";
      StdOutput = "test.out";
      StdError = "test.err";
      Environment = {"GLITE_LOCAL_MAX_OSB_SIZE=35"};
      InputSandbox = {"27215_exe.sh"};
      OutputSandbox = {"test.err","test.out","out2", "out1"};
      usertags = [ bug = "27215" ];
      ]
      where 27215_exe.sh contains
      #!/bin/sh
      MAX=$1
      i=0
      while [ $i -lt $MAX ]; do
          echo -n "1" >> out1
          echo -n "2" >> out2
          i=$[$i + 1]
      done
    • When the job is Done, retrieve the output files; this should be the result of an ls -l on the output dir:
 -rw-rw-r-- 1 ale ale 30 Jul 8 16:02 out1.tail
 -rw-rw-r-- 1 ale ale 70 Jul 8 16:02 out2
 -rw-rw-r-- 1 ale ale  0 Jul 8 16:02 test.err
Changed:
<
<
-rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.out It is not fixed instead using a CREAM -CE
>
>
-rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.out
 
Added:
>
>
    • Submit a jdl like this one:
       
      [
        JobType = "parametric";
        Executable = "/usr/bin/env";
        Environment = {"MYPATH_PARAM_=$PATH:/bin:/usr/bin:$HOME"};
        StdOutput = "echo_PARAM_.out";
        StdError = "echo_PARAM_.err";
        OutputSandbox = {"echo_PARAM_.out","echo_PARAM_.err"};
        Parameters =  {test, 2};
       ]
    • The generated jdl should contain:
      [
      requirements = other.GlueCEStateStatus == "Production";
      nodes = [ dependencies = { };
      Node_test = [ ... ];
      Node_2 = [ ... ];
      [...]
      ]
 
Added:
>
>
    • Edit the configuration file /opt/glite/etc/glite_wmsclient.conf by changing the virtualorganisation attribute in the JdlDefaultAttributes section to a value different from the one used to generate the user proxy, as in the sketch below
    • Submit a job and check that the generated .jdl has the right virtualorganisation defined, i.e. the same as the one used to generate the user proxy
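A sketch of this check, assuming for illustration a dteam user proxy and "cms" as the deliberately different VO written into JdlDefaultAttributes:

      # UI: confirm the edited default VO
      grep -i virtualorganisation /opt/glite/etc/glite_wmsclient.conf   # e.g. virtualorganisation = "cms";
      # Submit with the dteam proxy and inspect the generated JDL
      JOBID=$(glite-wms-job-submit -a myjob.jdl | grep -o 'https://[^ ]*' | head -1)
      glite-wms-job-info --jdl "$JOBID" | grep -i virtualorganisation   # must show dteam, not cms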
 
Changed:
<
<
[ale@cream-15 UI]$ glite-wms-job-logging-info -v 2 https://devel17.cnaf.infn.it:9000/Hr_TRdWT9XZrBux4DyWQsw | grep -A 2 Match | grep Dest
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
>
>
    • Change the jdl, setting the name of an existing CE in the requirements
    • When the jobs have finished, with this command: glite-wms-job-logging-info -v 2 "jobid" | grep -A 2 Match | grep Dest you should see the name of the previously chosen CE 3 times. (The job must be Aborted with reason: hit job shallow retry count (2))
 
  • BUG #28249: Hopefully fixed
    • bug posted by the developer
Line: 78 to 116
 
    • compilation error with gcc-4.x

Added:
>
>
    • Create a delegated proxy with glite-wms-job-delegate-proxy -d pippo on the wmproxy server of the wms machine
    • Submit a job to a cream CE via the wms using the previously created delegated proxy: glite-wms-job-submit myjob.jdl -d pippo
    • Delete records for the user DN you are submitting with from the delegationdb on the CREAM CE, like the following:
       delete from t_credential where dn like '%Elisabetta%';
       delete from t_credential_cache where dn like '%Elisabetta%';
    • Submit a new normal job using the same delegated proxy as above (the whole sequence is sketched below)
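The whole sequence, sketched; the delegationdb name is taken from the deletes quoted above, the mysql invocation itself is an assumption about how that DB is accessed, and 'Elisabetta' stands for any substring of the submitting DN:

      # UI: create the named delegation and submit through it
      glite-wms-job-delegate-proxy -d pippo
      glite-wms-job-submit -d pippo myjob.jdl
      # CREAM CE (as root): wipe the delegation records for that DN
      mysql delegationdb -e "delete from t_credential where dn like '%Elisabetta%';"
      mysql delegationdb -e "delete from t_credential_cache where dn like '%Elisabetta%';"
      # UI: submit a new normal job with the same delegation
      glite-wms-job-submit -d pippo myjob.jdl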
 
Changed:
<
<
[ale@cream-15 UI]$ cat /tmp/ale_zngnB9uVCWKT7B7MkSlBtA/env.out  | grep LD_LIBRARY
 LD_LIBRARY_PATH=.
>
>
    • Submit this jdl:
      [
      Executable = "/usr/bin/env" ;
      Stdoutput = "env.out" ;
      StdError =  "env.err" ;
      shallowretrycount = 2;
      InputSandbox = { "data/input.txt" };
      OutputSandbox = { "env.out" ,"env.err", "input.txt" } ;
      Environment={"LD_LIBRARY_PATH=."};
      usertags = [ bug = "28642" ];
      ]
    • Get the output of the job. In the output directory you should find the file input.txt, and LD_LIBRARY_PATH should be set to "." in the file env.out.
 
Added:
>
>
    • Stop ICE: /opt/glite/etc/glite-wms-ice stop
    • Corrupt the ICE database, e.g. by overwriting every file (all but *proxy*) in /var/glite/ice/persist_dir with garbage:
      for f in /var/glite/ice/persist_dir/*; do
          case "$f" in *proxy*) ;; *) echo "pippo" > "$f" ;; esac
      done
    • Start ICE: /opt/glite/etc/glite-wms-ice start
    • In the ICE log file you should see something like:
      2008-07-29 12:44:00,537 FATAL - jobCache::jobCache() - Failed to
      initialize the jobDbManager object. Reason is: Db::open: Invalid argument
 
  • BUG #29182: Hopefully fixed
    • not easy to reproduce
Line: 97 to 158
 
    • Fixed by not using 'clog'

Changed:
<
<
Master node is: node72.grid.pg.infn.it
 Is should run on the following nodes:
node72.grid.pg.infn.it
 node72.grid.pg.infn.it
 node71.grid.pg.infn.it
 node71.grid.pg.infn.it
*************************************
Current working directory is: /home/dteamsgm003/globus-tmp.node72.24167.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg
 List files on the working directory:
/home/dteamsgm003/globus-tmp.node72.24167.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg:
total 352
 drwxr-xr-x  2 dteamsgm003 dteamsgm   4096 Jun 30 11:03 .
drwx------  5 dteamsgm003 dteamsgm   4096 Jun 30 11:03 ..
-rwxr-xr-x  1 dteamsgm003 dteamsgm    822 Jun 30 11:03 30308_exe.sh
-rw-r--r--  1 dteamsgm003 dteamsgm   3687 Jun 30 11:03 .BrokerInfo
-rw-r--r--  1 dteamsgm003 dteamsgm    218 Jun 30 11:03 https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg.output
-rw-r--r--  1 dteamsgm003 dteamsgm 330910 Jun 30 11:03 mpitest
-rw-r--r--  1 dteamsgm003 dteamsgm      0 Jun 30 11:03 test.err
-rw-r--r--  1 dteamsgm003 dteamsgm    385 Jun 30 11:03 test.out
-rw-------  1 dteamsgm003 dteamsgm      0 Jun 30 11:03 tmp.rdgPL24747
*********************************
>
>
    • Submit this jdl:
      [
      requirements = ( other.GlueCEStateStatus == "Production" ) && Member("MPICH",other.GlueHostApplicationSoftwareRunTimeEnvironment) && ( other.GlueCEInfoTotalCPUs >= 4 ) && ( other.GlueCEInfoLRMSType == "torque" ||   RegExp("pbs",other.GlueCEInfoLRMSType) );
      Type = "Job";
      NodeNumber = 4;
      Executable = "30308_exe.sh";
      Arguments = "cpi 4";
      StdOutput = "test.out";
      StdError = "test.err";
      InputSandbox = {"30308_exe.sh", "exe/cpi"};
      OutputSandbox = {"test.err","test.out","executable.out"};
      usertags = [ bug = "30308" ];
      ]
      Where the 30308_exe.sh should be:
      #!/bin/sh
      # The first parameter is the binary to be executed
      EXE=$1
      # The second parameter is the number of CPU's to be reserved for parallel execution
      CPU_NEEDED=$2
      chmod 777 $EXE
      # prints the list of files in the working directory
      echo "List files on the working directory:"
      ls -alR `pwd`
      # execute the user job
      mpirun -np $CPU_NEEDED -machinefile $PBS_NODEFILE `pwd`/$EXE >& executable.out
    • When DONE, retrieve the output and check that the .mpi directory is not listed in the test.out output file.
 
  • BUG #30518: Hopefully fixed
    • not easy to reproduce

Changed:
<
<
>
>
    • Already fixed and working on the production wms using patch #1491
 
Changed:
<
<
*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/LEzR7tTwyh3P-iYZrKlwxg
 Current Status:     Aborted
 Status Reason:      The maximum number of output sandbox files is reached
 Submitted:          Tue Jul  8 16:12:10 2008 CEST
*************************************************************
>
>
    • Set maxInputSandboxFiles = 2; in the WorkloadManagerProxy section of the configuration file on the WMS, and restart the wmproxy.
      • Submit a job with more than 2 files listed in the InputSandbox parameter
      • Check that the job is immediately set to Aborted and that the status reason is: The Operation is not allowed: The maximum number of input sandbox files is reached
    • Set maxOutputSandboxFiles = 2; in the WorkloadManagerProxy section of the configuration file on the WMS and restart the wmproxy.
      • Submit a job with more than 2 files listed in the OutputSandbox parameter
      • Check that the job is immediately set to Aborted and that the status reason is: The Operation is not allowed: The maximum number of output sandbox files is reached (both checks are sketched below)
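Both limit checks, sketched; config and init script paths are the ones used elsewhere on this page, and many_files.jdl is a hypothetical JDL listing more than 2 sandbox files:

      # WMS: verify the lowered limits, then restart the wmproxy
      grep -E 'max(Input|Output)SandboxFiles' /opt/glite/etc/glite_wms.conf
      /opt/glite/etc/init.d/glite-wms-wmproxy restart
      # UI: submit the JDL, then read back the abort reason
      JOBID=$(glite-wms-job-submit -a many_files.jdl | grep -o 'https://[^ ]*' | head -1)
      glite-wms-job-status "$JOBID" | grep -E 'Current Status|Status Reason'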
 
Changed:
<
<
>
>
    • The default value for MinPerusalTimeInterval should be checked in the configuration file of the WMS.
    • Set MaxPerusalFiles = 1; in the WorkloadManagerProxy section of the configuration file on the WMS and restart the wmproxy.
    • After submitting the jdl, issue this command: glite-wms-job-perusal --set -f perusal.out -f perusal.err "jobid" The answer should be:
 Error - WMProxy Server Error
 The Operation is not allowed: The maximum number of perusal files is reached
Changed:
<
<
Method: enableFilePerusal

>
>
Method: enableFilePerusal
 
  • BUG #31006: Hopefully FIXED
    • Not easy to reproduce

Added:
>
>
    • Simply check the /opt/glite/etc/templates/template.sh file on a WMS
 
Added:
>
>
    • Using the command glite-wms-job-info --jdl "jobid" | grep -i requirements check if the expression RegExp(".*sdj$",other.GlueCEUniqueID) is present (the exact expression should be found in the configuration file on the WMS, section: WorkloadManagerProxy, parameter: SDJRequirements)
    • Setting ShortDeadlineJob=false; in the jdl, the output of the previous command should contain the expression !RegExp(".*sdj$",other.GlueCEUniqueID)
 
Added:
>
>
    • Set on the WMS conf file: II_Contact  =  "lcg-bdii.cern.ch";
    • Do a list-match using this jdl:
      [
        Requirements = RegExp(".manchester.ac.uk:2119.*",other.GlueCEUniqueID) && anyMatch(other.storage.CloseSEs,target.GlueSEStatus == "unset");
        Executable = "/bin/ls";
        prologue = "/bin/false";
      ]
      the output should be:
        - ce01.tier2.hep.manchester.ac.uk:2119/jobmanager-lcgpbs-dteam
 
Changed:
<
<
    • reproduced the problem by inserting a 500 sec sleep in the dirmanager and killing it by hand while unzipping the ISB. The job stays in status 'waiting' and is not forwarded to the WM.
>
>
    • Reproduced the problem by inserting a 500 sec sleep in the dirmanager and killing it by hand while unzipping the ISB. The job stays in status 'waiting' and is not forwarded to the WM.
 
Changed:
<
<
>
>
    • Check if in the file /opt/glite/etc/templates/template.sh on a WMS there are these lines:
  # customization point
  if [ -n "${GLITE_LOCAL_CUSTOMIZATION_DIR}" ]; then
    if [ -f "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh" ]; then
      . "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh"
    fi
Changed:
<
<
fi
>
>
fi
 
Added:
>
>
    • Set a very low timeout for the BDII on the WMS conf file: II_Timeout  =  3;
    • Now set in the WMS conf file: IsmIILDAPSearchAsync = false;
    • You should see in the log file of the workload_manager (if you use a populated BDII):
          [Warning] fetch_bdii_ce_info(ldap-utils.cpp:640): Timed out
          [Warning] fetch_bdii_se_info(ldap-utils.cpp:308): Timed out
          [Debug] do_purchase(ism-ii-purchaser.cpp:176): BDII fetching completed in 4 seconds
          [Info] do_purchase(ism-ii-purchaser.cpp:193): Total VO_Views entries in ISM : 0
          [Info] do_purchase(ism-ii-purchaser.cpp:194): Total SE entries in ISM : 0
    • Setting IsmIILDAPSearchAsync = true you should instead obtain more (>0) VO_Views entries (see also the sketch after this list), e.g.:
          [Debug] fetch_bdii_ce_info(ldap-utils-asynch.cpp:628): #1652 LDAP entries received in 5 seconds
          [Debug] fetch_bdii_ce_info(ldap-utils-asynch.cpp:781): ClassAd reppresentation built in 0 seconds
          [Debug] fetch_bdii_se_info(ldap-utils-asynch.cpp:444): #2381 LDAP entries received in 5 seconds
          [Debug] fetch_bdii_se_info(ldap-utils-asynch.cpp:504): ClassAd reppresentation built in 0 seconds
          [Debug] do_purchase(ism-ii-purchaser.cpp:176): BDII fetching completed in 10 seconds
          [Info] do_purchase(ism-ii-purchaser.cpp:193): Total VO_Views entries in ISM : 53
          [Info] do_purchase(ism-ii-purchaser.cpp:194): Total SE entries in ISM : 61
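A sketch of the comparison; the WM init script name and the workload manager log path are assumptions to adapt to the local layout:

      # WMS: check the relevant parameters, restart the WM, inspect the purchaser log
      grep -E 'II_Timeout|IsmIILDAPSearchAsync' /opt/glite/etc/glite_wms.conf
      /opt/glite/etc/init.d/glite-wms-wm restart
      grep -E 'fetch_bdii|Total (VO_Views|SE) entries' $GLITE_WMS_LOCATION_VAR/log/workload_manager_events.log

Flip IsmIILDAPSearchAsync, restart again, and compare the VO_Views and SE entry counts.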
 
Changed:
<
<
>
>
 
Added:
>
>
    • Submit a jdl
    • Look into the SandBox dir of the job (on the WMS) until you see the Maradona file
    • Put the condor job (equivalent to your job previously submitted) on hold; this should trigger a resubmission
    • When the job has been resubmitted, check that the old Maradona file has been removed (see the sketch below)
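A sketch; the SandboxDir location is an assumption, and the Condor id must be read off condor_q by hand:

      # WMS: locate the Maradona file for the submitted job
      find /var/glite/SandboxDir -name '*Maradona*' 2>/dev/null
      # Put the matching Condor job on hold to force the resubmission
      condor_q                   # note the cluster id of the job
      condor_hold <cluster_id>   # placeholder for the id noted above
      # After the resubmission the old Maradona file must be gone
      find /var/glite/SandboxDir -name '*Maradona*' 2>/dev/null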
 
Added:
>
>
    • Set the II_Timeout parameter in the NetworkServer section of the glite_wms.conf file on the WMS to a very low value, for example: II_Timeout  =  2;
    • Re-start the WM and check that the $GLITE_WMS_LOCATION_VAR/workload_manager/ismdump.fl file does not get emptied
    • Perform some job-list-match operations, checking that they return match results (see the sketch below)
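A sketch of the two checks:

      # WMS: the ISM dump must keep its content across the restart
      ls -l $GLITE_WMS_LOCATION_VAR/workload_manager/ismdump.fl
      # UI: matches must still be returned
      glite-wms-job-list-match -a myjob.jdl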
 
Added:
>
>
    • Add this parameter to section WorkloadManager of the glite_wms.conf configuration file (using for example vo "cms" as filter): IsmIILDAPCEFilterExt = "(|(GlueCEAccessControlBaseRule=VO:cms)(GlueCEAccessControlBaseRule=VOMS:/cms/*))"
    • Restart the WM
    • Doing a list-match using a voms proxy of a different VO (e.g. dteam) you should obtain "no resource available".
 
  • BUG #33140: Hopefully FIXED
    • Not easy to reproduce

Added:
>
>
    • Remove, if present, the directory $GLITE_WMS_LOCATION_VAR/workload_manager/jobdir on the WMS
    • Restart the WM and check if the previous directory is recreated.
 
Added:
>
>
    • Stop the WM on the WMS.
    • Submit a collection
    • Restart the WM
    • Check if the status of the collection changes to Running
 
Added:
>
>
    • Set the "ExpiryPeriod" parameter in the glite_wms.conf configuration file to a very low value, such as the following: ExpiryPeriod  =  2;
    • Overload the WMS by submitting several collections sequentially, for example 10 collections of 100 nodes each (see the sketch below)
    • Check the job status of the last submitted collections and keep submitting until the status of the parent node is Aborted because of the following:
      *************************************************************
       BOOKKEEPING INFORMATION:
      
       Status info for the Job : https://devel17.cnaf.infn.it:9000/qQe68ESYiRNDNXZPNsG-AA
       Current Status:     Aborted
       Status Reason:      request expired
       Submitted:          Wed Jul 30 11:23:49 2008 CEST
      
      *************************************************************
    • Stop submitting and check if the status of all the child nodes is Aborted as well
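A sketch of the overload loop from the UI; collection.jdl is assumed to describe a 100-node collection:

      for i in $(seq 1 10); do
          PARENT=$(glite-wms-job-submit -a collection.jdl | grep -o 'https://[^ ]*' | head -1)
      done
      # Poll the last parent until it aborts with reason "request expired"
      glite-wms-job-status "$PARENT" | grep -E 'Current Status|Status Reason'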
 
Added:
>
>
    • Check if the proxy file name is hardcoded in $GLITE_WMS_LOCATION/sbin/glite-wms-purgeStorage.sh
 
Added:
>
>
    • Test it for the filelist input method by setting the following two parameters in the glite_wms.conf WMS configuration file, workload manager section:
                   DispatcherType  =  "filelist";
                   Input  =  "${GLITE_LOCATION_VAR}/workload_manager/input.fl";
      • Re-start the WM and submit a DAG
      • Check if it is successful
    • Test it for the jobdir input method by setting the following two parameters in the glite_wms.conf WMS configuration file, workload manager section:
                    DispatcherType  =  "jobdir";
                    Input = "${GLITE_LOCATION_VAR}/workload_manager/jobdir";
      • Re-start the WM and submit a DAG
      • Check if it is successful (one round is sketched below)
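One round of the test, sketched; the same flow applies to both input methods, the WM init script name is an assumption and dag.jdl stands for any DAG description:

      # WMS: verify which input method is configured, then restart the WM
      grep -E 'DispatcherType|Input' /opt/glite/etc/glite_wms.conf
      /opt/glite/etc/init.d/glite-wms-wm restart
      # UI: submit a DAG and follow it to completion
      JOBID=$(glite-wms-job-submit -a dag.jdl | grep -o 'https://[^ ]*' | head -1)
      glite-wms-job-status "$JOBID"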
 
  • BUG #35878: FIXED
    • compilation error with gcc-4.x

Added:
>
>
    • Set the following two parameters in the glite_wms.conf WMS configuration file, workload manager section:
                    DispatcherType  =  "jobdir";
                    Input = "${GLITE_LOCATION_VAR}/workload_manager/jobdir";
      • Re-start the WM and submit a DAG
      • Check if it is successful
 
  • BUG #36341: Hopefully fixed
    • bug posted by the developer
Line: 210 to 336
 
    • bug posted by the developer

Added:
>
>
    • Consider this /opt/glite/etc/glite_wms_wmproxy.gacl file:
      <?xml version="1.0"?>
      <gacl version="0.0.1">
      <entry>
      <any-user/>
      <allow><exec/></allow>
      </entry>
      </gacl>
  • Restart wmproxy: /opt/glite/etc/init.d/glite-wms-wmproxy restart
  • Try to issue some commands (e.g. glite-wms-job-list-match, glite-wms-job-submit, glite-wms-job-delegate-proxy, etc...) towards that WMS. They should succeed with any proxy
 
Changed:
<
<
    • submitted a normal job
    • waited until finished successfully
    • checked the job record is in the LBProxy mysql DB
    • retrieved the output via 'glite-wms-job-output'
    • checked the job record is no more in the LBProxy mysql DB
>
>
    • Submitted a normal job
    • Waited until it finished successfully
    • Checked that the job record is in the LBProxy mysql DB (e.g.: mysql# select * from jobs where jobid like '%hLrG4YYebvYB0xsrPO4q8A%'; where https://devel17.cnaf.infn.it:9000/hLrG4YYebvYB0xsrPO4q8A is the jobid)
    • Retrieved the output via 'glite-wms-job-output'
    • Checked that the job record is no longer in the LBProxy mysql DB (e.g.: the previous query should return: Empty set; see the sketch below)
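The two DB checks, sketched; the lbproxy database name and the mysql invocation are assumptions, while the query is the one quoted above:

      # WMS: before glite-wms-job-output the record is present ...
      mysql lbproxy -e "select * from jobs where jobid like '%hLrG4YYebvYB0xsrPO4q8A%';"
      # ... afterwards the same query returns "Empty set"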
 

Added:
>
>
    • Check if in the syslog there are lines like:
      May 14 12:37:12 trinity glite_wms_wmproxy_server[3633]: ts=2008-05-14T12:37:12 : event=wms.wmpserver_setJobFileSystem() : userid=502 jobid=https://devel15.cnaf.infn.it:9000/J...
      i.e. the userid is specified
 
  • BUG #36870: FIXED
    • Fixed by removing the spec file
Line: 229 to 367
 
    • bug posted by the developer

Added:
>
>
    • Check if on the WMS there is this file: /etc/cron.d/glite-wms-create-host-proxy.cron, containing:
      HOME=/
      MAILTO=SA3-italia
      
      0 */6 * * * glite . /opt/glite/etc/profile.d/grid-env.sh ; /opt/glite/sbin/glite-wms-create-proxy.sh /opt/glite/var/wms.proxy /opt/glite/var/log/create_proxy.log
 
  • BUG #36907: Hopefully fixed
    • Not easy to reproduce
Line: 246 to 389
 
    • Tested using a short proxy to submit a longer job and ICE does not resubmit it, but afterwards the status is not updated to Done by ICE, due to another bug #39807

Changed:
<
<
[root@wms008 init.d]# grep GLITE_LOCATION glite-wms-ice
GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
>
>
    • Do this check:
      [root@wms008 init.d]# grep GLITE_LOCATION glite-wms-ice
      GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
 
  • BUG #37916: Hopefully fixed
    • bug posted by the developer

Changed:
<
<
[ale@cream-15 UI]$ ls -l /tmp/ale_eRWc528nX8QpEcs7im-R7g
 total 8
>
>
    • Set the parameter MaxOutputSandboxSize in the WorkloadManager section of the configuration file /opt/glite/etc/glite_wms.conf on the WMS to 100 and restart the workload manager.
    • Submit this jdl:
      [
      Type = "Job";
      Executable = "38359_exe.sh";
      Arguments = "50";
      StdOutput = "test.out";
      StdError = "test.err";
      InputSandbox = {"38359_exe.sh"};
      OutputSandbox = {"test.err","test.out","out3", "out1", "out4", "out2"};
      usertags = [ bug = "38359" ];
      ]
      where 38359_exe.sh is:
      #!/bin/sh
      MAX=$1
      i=0
      while [ $i -lt $MAX ]; do
          echo -n "1" >> out1
          echo -n "2" >> out2
          echo -n "3" >> out3
          echo -n "4" >> out4
          i=$[$i + 1]
      done
      i=200
      while [ $i -lt 100 ]; do
          echo -n "1" >> out1
          echo -n "2" >> out2
          echo -n "3" >> out3
          echo -n "4" >> out4
          i=$[$i + 1]
      done
    • When the job is Done, retrieve the output files; this should be the result of an ls -l on the output dir:
 -rw-rw-r-- 1 ale ale 50 Jul 8 12:06 out1
 -rw-rw-r-- 1 ale ale  0 Jul 8 12:06 out2.tail
 -rw-rw-r-- 1 ale ale 50 Jul 8 12:06 out3
 -rw-rw-r-- 1 ale ale  0 Jul 8 12:06 out4.tail
 -rw-rw-r-- 1 ale ale  0 Jul 8 12:06 test.err
Changed:
<
<
-rw-rw-r-- 1 ale ale 0 Jul 8 12:06 test.out
>
>
-rw-rw-r-- 1 ale ale 0 Jul 8 12:06 test.out
 
Added:
>
>
    • Log on the WMS. Stop the workload manager. Put in the directory $GLITE_WMS_LOCATION_VAR/workload_manager/jobdir/new/ this list-match request:
      [root@devel19 glite]# cat /var/glite/workload_manager/jobdir/tmp/20080625T133135.906497_3085874880
      [ arguments = [ ad = [ requirements = ( other.GlueCEStateStatus =="Production" || other.GlueCEStateStatus == "CREAMPreCertTests" ) &&
      !RegExp(".*sdj$",other.GlueCEUniqueID); RetryCount = 3; Arguments = "/tmp"; MyProxyServer = "myproxy.cnaf.infn.it"; AllowZippedISB = true; JobType =
      "normal"; InputSandboxDestFileName = { "pippo","pluto" }; SignificantAttributes = { "Requirements","Rank" }; FuzzyRank = true;
      Executable = "/bin/ls"; CertificateSubject = "/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle"; X509UserProxy =
      "/tmp/user.proxy.6056.20080625153135905"; Stdoutput = "ls.out"; VOMS_FQAN = "/dteam/Role=NULL/Capability=NULL"; OutputSandbox = { "ls.out" };
      VirtualOrganisation = "dteam"; usertags = [ exe = "ls" ]; rank =-other.GlueCEStateEstimatedResponseTime; Type = "job"; ShallowRetryCount = 3;
      InputSandbox = {"protocol://address/input/pippo","protocol://address/input/pluto" }; Fuzzyparameter = 1.000000000000000E-01 ]; include_brokerinfo = false; file =
      "/tmp/6056.20080625153135905"; number_of_results = -1 ]; command = "match"; version = "1.0.0" ]
    • Start the workload manager and check that the request is processed (see the sketch below).
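The restart sequence, sketched; the init script name and log path are assumptions, and match_request is a file with the content shown above:

      /opt/glite/etc/init.d/glite-wms-wm stop
      cp match_request $GLITE_WMS_LOCATION_VAR/workload_manager/jobdir/new/
      /opt/glite/etc/init.d/glite-wms-wm start
      tail -f $GLITE_WMS_LOCATION_VAR/log/workload_manager_events.log   # watch the match being served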
 

Added:
>
>
    • Consider this /opt/glite/etc/glite_wms_wmproxy.gacl file:
      <?xml version="1.0"?>
      <gacl version="0.0.1">
      <entry>
      <any-user>
      </any-user>
      <deny><exec/></deny>
      </entry>
      <entry>
      <voms>
      <fqan>dteam</fqan>
      </voms>
      <deny><exec/></deny>
      </entry>
      <entry>
      <person>
      <dn>/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto</dn>
      </person>
      <allow><exec/></allow>
      </entry>
      </gacl>
      replacing "/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto" with your user DN
    • Try to issue some commands (e.g. glite-wms-job-list-match, glite-wms-job-submit, glite-wms-job-delegate-proxy, etc...) towards that WMS with your dteam VO proxy. They should succeed
 
Added:
>
>
    • Premises:
      • Current memory usage by ICE is logged in the ice log file ($GLITE_WMS_LOCATION_VAR/log/ice.log) in rows such as this one:
        2008-07-28 16:13:23,068 DEBUG - glite-wms-ice::main() - Used RSS Memory: 9780
      • The memory threshold is defined in the ICE section of the WMS conf file by the attribute max_ice_mem
      • When 'current memory' > 'memory threshold' the suicidal patch is triggered
    • Try to trigger the suicidal patch several times by editing the WMS conf file and setting a low enough value for max_ice_mem
    • Restart ice: /opt/glite/etc/init.d/glite-wms-ice restart
    • When the suicidal patch is triggered, verify that ICE is properly shut down. You will see something like this in the log file:
      2008-07-28 16:45:27,591 FATAL - glite-wms-ice::main() - glite-wms-ice::main -
      Max memory reached [10532 kB] ! EXIT!
    • Then verify that after a while (5 min) ICE restarts (one round is sketched below)
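One round of the test, sketched:

      # WMS: pick a threshold below the RSS currently reported in ice.log
      grep 'Used RSS Memory' $GLITE_WMS_LOCATION_VAR/log/ice.log | tail -1
      grep max_ice_mem /opt/glite/etc/glite_wms.conf   # e.g. max_ice_mem = 5000;
      /opt/glite/etc/init.d/glite-wms-ice restart
      # Watch for the FATAL shutdown, then check that ICE is back after ~5 minutes
      tail -f $GLITE_WMS_LOCATION_VAR/log/ice.log | grep -E 'Used RSS|Max memory reached'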
 
Added:
>
>
    • Proceed as in the previous bug: #38816
 
Added:
>
>
    • You need to check the code of $GLITE_WMS_LOCATION/sbin/glite-wms-purgeStorage.sh as specified in the bug
 

Added:
>
>
    • Submit a job through ICE (use this requirement: Requirements = RegExp("cream",other.GlueCEUniqueID);)
    • Remove the job directory from the WMS
    • Check in the log whether ICE figures out that the proxy has disappeared (see the sketch below).
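A sketch; the SandboxDir layout is an assumption, so locate the job directory before removing it, and cream_job.jdl is a hypothetical JDL carrying the requirement above:

      # UI: force the job through ICE to a CREAM CE
      JOBID=$(glite-wms-job-submit -a cream_job.jdl | grep -o 'https://[^ ]*' | head -1)
      # WMS: find and remove the job's sandbox directory, then watch ice.log
      DIR=$(find /var/glite/SandboxDir -maxdepth 2 -type d -name "*${JOBID##*/}*" | head -1)
      rm -rf "$DIR"
      tail -f $GLITE_WMS_LOCATION_VAR/log/ice.log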
 
Changed:
<
<
>
>
-- AlessioGianelle - 27 Jun 2008

Revision 59 (2008-09-08) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 218 to 218
 
    • retrieved the output via 'glite-wms-job-output'
    • checked the job record is no more in the LBProxy mysql DB
Added:
>
>
 

Revision 58 (2008-09-04) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 281 to 281
 
Added:
>
>
-- AlessioGianelle - 27 Jun 2008

Revision 57 (2008-09-03) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 277 to 277
 
Added:
>
>
 

-- AlessioGianelle - 27 Jun 2008

Revision 56 (2008-09-01) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 140 to 140
 ***********************************************************
Changed:
<
<
>
>
 
Line: 194 to 194
 
Changed:
<
<
  • BUG #35156: Not fully fixed. The scenario when HostProxyFile is not in the conf file is not managed
>
>
 

Revision 55 (2008-08-25) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 71 to 71
 - Dest id = ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
Changed:
<
<
>
>
  • BUG #28249: Hopefully fixed
    • bug posted by the developer
 
  • BUG #28498: FIXED
    • compilation error with gcc-4.x
Line: 89 to 90
 
  • BUG #29182: Hopefully fixed
    • not easy to reproduce
Changed:
<
<
>
>
  • BUG #29538: Hopefully fixed
    • bug posted by the developer
 
  • BUG #30289: FIXED
    • Fixed by not using 'clog'
Line: 201 to 203
 
Changed:
<
<
>
>
  • BUG #36341: Hopefully fixed
    • bug posted by the developer
 
Changed:
<
<
>
>
  • BUG #36466: Hopefully fixed
    • bug posted by the developer
 
Line: 219 to 223
 
  • BUG #36870: FIXED
    • Fixed by removing the spec file
Changed:
<
<
>
>
  • BUG #36876: Hopefully fixed
    • bug posted by the developer
 
Changed:
<
<
>
>
  • BUG #36907: Hopefully fixed
    • Not easy to reproduce
 

Changed:
<
<
>
>
 
  • BUG #37674: Hopefully FIXED
    • Not easy to reproduce

  • BUG #37756: NOT COMPLETELY FIXED
Changed:
<
<
>
>
    • Tested using a short proxy to submit a longer job and ICE does not resubmit it, but afterwards the status is not updated to Done by ICE, due to another bug #39807
 
Line: 243 to 249
 GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
Changed:
<
<
>
>
  • BUG #37916: Hopefully fixed
    • bug posted by the developer
 

Revision 54 (2008-08-20) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 270 to 270
 
Changed:
<
<
* BUG #39501: TO BE CHECKED
>
>
-- AlessioGianelle - 27 Jun 2008

Revision 53 (2008-08-20) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 223 to 223
 
Added:
>
>
 

Line: 268 to 270
 
Added:
>
>
* BUG #39501: TO BE CHECKED
 -- AlessioGianelle - 27 Jun 2008

Revision 52 (2008-08-20) - FrancescoGiacomini

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 151 to 151
 
Changed:
<
<
>
>
  • BUG #31006: Hopefully FIXED
    • Not easy to reproduce
 
Line: 222 to 223
 
Changed:
<
<
>
>
 

Revision 51 (2008-08-08) - FrancescoGiacomini

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 22 to 22
 

Check bugs:

Changed:
<
<
>
>
  • BUG #13494: FIXED
    • checked by Laurence Field and ARC developers
 

Revision 50 (2008-08-08) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 230 to 230
 
  • BUG #37674: Hopefully FIXED
    • Not easy to reproduce
Changed:
<
<
>
>
 

Revision 49 (2008-08-08) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 137 to 137
 ***********************************************************
Changed:
<
<
>
>
 
Added:
>
>
 Error - WMProxy Server Error
Changed:
<
<
HTTP Error 500 Internal Server Error

Internal Server Error

The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator, [no address given] and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.

<-- GOOGLEANALYTICSPLUGIN -->

Error code: SOAP-ENV:Server

You should find the correct reason only on the log file of the wmproxy:

08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": ------------------------------- Fault Description --------------------------------
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Method: enableFilePerusal
 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Code: 1226
 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Description: The Operation is not allowed: The maximum number of perusal files is reached
 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Stack:
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": JobOperationException: The Operation is not allowed: The maximum number of perusal files is reached
        at wmpoperations::enableFilePerusal()[wmpoperations.cpp:1287]
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal":         at enableFilePerusal()[wmpoperations.cpp:1231]
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": ----------------------------------------------------------------------------------
>
>
The Operation is not allowed: The maximum number of perusal files is reached

Method: enableFilePerusal

 

Revision 48 (2008-08-08) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 58 to 58
  It is not fixed instead using a CREAM -CE
Changed:
<
<
  • BUG #27797: Not Checked
    • This bug could not be checked because I'm not able to submit parametrics jobs
>
>
 

Revision 47 (2008-08-06) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 239 to 239
 
    • retrieved the output via 'glite-wms-job-output'
    • checked the job record is no more in the LBProxy mysql DB
Changed:
<
<
  • BUG #36558: CHECKING IN PROGRESS
>
>
 
  • BUG #36870: FIXED
    • Fixed by removing the spec file

Revision 46 (2008-08-06) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 254 to 254
 
Changed:
<
<
>
>
  • BUG #37674: Hopefully FIXED
    • Not easy to reproduce
 

Revision 45 (2008-08-06) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 186 to 186
 

Added:
>
>
    • reproduced the problem by inserting a 500 sec sleep in the dirmanager and killing it by hand while unzipping the ISB. The job stays in status 'waiting' and is not forwarded to the WM.
 

Revision 44 (2008-08-05) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 24 to 24
 
Changed:
<
<
>
>
 
  • BUG #21909: FIXED in the wmproxy startup script
Line: 185 to 185
 
Changed:
<
<
>
>
 

Revision 43 (2008-08-05) - FrancescoGiacomini

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 73 to 73
 
Added:
>
>
  • BUG #28498: FIXED
    • compilation error with gcc-4.x
 

Line: 217 to 220
 
Added:
>
>
  • BUG #35878: FIXED
    • compilation error with gcc-4.x
 

Revision 42 (2008-08-05) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 83 to 83
 
Changed:
<
<
>
>
  • BUG #29182: Hopefully fixed
    • not easy to reproduce
 
Line: 115 to 116
 *******************************
Changed:
<
<
>
>
  • BUG #30518: Hopefully fixed
    • not easy to reproduce
 

Revision 41 (2008-08-05) - FrancescoGiacomini

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 178 to 178
 
Changed:
<
<
>
>
 

Revision 40 (2008-08-05) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 237 to 237
 
Changed:
<
<
>
>
 
Changed:
<
<
>
>
 
Line: 280 to 280
 
Changed:
<
<
  • BUG #39215: CHECKING IN PROGRESS
>
>
  -- AlessioGianelle - 27 Jun 2008

Revision 39 (2008-08-04) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 22 to 22
 

Check bugs:

Added:
>
>

 
  • BUG #21909: FIXED in the wmproxy startup script
         if ( /sbin/pidof $httpd ) >/dev/null 2>&1 ; then
Line: 67 to 71
 - Dest id = ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
Added:
>
>
 

Line: 79 to 85
 
Added:
>
>
 
  • BUG #30289: FIXED
    • Fixed by not using 'clog'
Line: 194 to 202
 
Changed:
<
<
>
>
  • BUG #33140: Hopefully FIXED
    • Not easy to reproduce
 
Line: 206 to 215
 
Added:
>
>

 

Line: 217 to 232
 
  • BUG #36558: CHECKING IN PROGRESS
Added:
>
>
  • BUG #36870: FIXED
    • Fixed by removing the spec file

 

[root@wms008 init.d]# grep GLITE_LOCATION glite-wms-ice
GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
Added:
>
>
 
Line: 246 to 280
 
Added:
>
>
  • BUG #39215: CHECKING IN PROGRESS
-- AlessioGianelle - 27 Jun 2008

Revision 38 (2008-08-04) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 190 to 190
 
Changed:
<
<
  • BUG #33026: Checking in progress
>
>
 

Revision 37 (2008-08-01) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 184 to 184
  fi
Changed:
<
<
>
>
 

Revision 36 (2008-07-31) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 190 to 190
 
Changed:
<
<
>
>
  • BUG #33026: Checking in progress
 
Line: 238 to 238
 
Added:
>
>
 

Revision 35 (2008-07-31) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 75 to 75
  LD_LIBRARY_PATH=.
Added:
>
>
 

Line: 107 to 109
 
Changed:
<
<
>
>
 
Line: 161 to 164
 
Deleted:
<
<
 

Revision 34 (2008-07-31) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 67 to 67
 - Dest id = ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
Changed:
<
<
  • BUG #28637: Checking in progress
>
>
 

Revision 33 (2008-07-31) - MassimoSgaravatto

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 237 to 237
 
Changed:
<
<
  • BUG #38739: To be rechecked when wmproxy 3.1.36-1 is in the repo
>
>
 

Revision 32 (2008-07-31) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 197 to 197
 
Changed:
<
<
  • BUG #34508: FIXED, checking in progress...
>
>
 

Revision 31 (2008-07-30) - MassimoSgaravatto

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 201 to 201
 
Added:
>
>
  • BUG #35156: Not fully fixed. The scenario when HostProxyFile is not in the conf file is not managed
 

Revision 30 (2008-07-30) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 197 to 197
 
Changed:
<
<
>
>
  • BUG #34508: FIXED, checking in progress...
 

Revision 29 (2008-07-30) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 199 to 199
 
Added:
>
>
 

Revision 28 (2008-07-30) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 57 to 57
 
  • BUG #27797: Not Checked
    • This bug could not be checked because I'm not able to submit parametrics jobs
Changed:
<
<
>
>
 
Line: 67 to 67
 - Dest id = ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
Added:
>
>
  • BUG #28637: Checking in progress
 
[ale@cream-15 UI]$ cat /tmp/ale_zngnB9uVCWKT7B7MkSlBtA/env.out  | grep LD_LIBRARY

Revision 27 (2008-07-29) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 201 to 201
 
Added:
>
>
  • BUG #36536: FIXED
    • submitted a normal job
    • waited until finished successfully
    • checked the job record is in the LBProxy mysql DB
    • retrieved the output via 'glite-wms-job-output'
    • checked the job record is no more in the LBProxy mysql DB
 
  • BUG #36558: CHECKING IN PROGRESS

Revision 26 (2008-07-29) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 75 to 75
 
Added:
>
>
  • BUG #30289: FIXED
    • Fixed by not using 'clog'
 
Master node is: node72.grid.pg.infn.it

Revision 25 (2008-07-29) - MassimoSgaravatto

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 156 to 156
 
Added:
>
>
 

Revision 24 (2008-07-29) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 198 to 198
 
  • BUG #36558: CHECKING IN PROGRESS
Added:
>
>

[root@wms008 init.d]# grep GLITE_LOCATION glite-wms-ice
GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
 

Revision 23 (2008-07-29) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 196 to 196
 
Added:
>
>
  • BUG #36558: CHECKING IN PROGRESS
 

Revision 22 (2008-07-28) - MassimoSgaravatto

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 17 to 17
 

  • Submission of 270 collections of 100 jobs each (10 collections every 30 minutes), using 1 user and a fuzzy rank (used 90 lcg CEs):
Changed:
<
<
    • Success > 99.99% OK
>
>
    • Success > 99.99% OK
 
    • Cancelled about 1800 jobs due to a problem with the CEs at in2p3.fr

Check bugs:

Line: 33 to 33
 
Changed:
<
<
for edg_rm_command in $GLITE_LOCATION/bin/edg-rm $EDG_LOCATION/bin/edg-rm `which edg-rm 2>/dev/null`; do
>
>
for edg_rm_command in $GLITE_LOCATION/bin/edg-rm $EDG_LOCATION/bin/edg-rm `which edg-rm 2>/dev/null`; do
 

  • BUG #24690: NOT COMPLETELY FIXED
Line: 198 to 194
 
Added:
>
>
 
Added:
>
>
 
[ale@cream-15 UI]$ ls -l /tmp/ale_eRWc528nX8QpEcs7im-R7g
total 8
Line: 210 to 209
 -rw-rw-r-- 1 ale ale 0 Jul 8 12:06 test.out
Deleted:
<
<
 
Added:
>
>
  • BUG #38739: To be rechecked when wmproxy 3.1.36-1 is in the repo

 
Added:
>
>
-- AlessioGianelle - 27 Jun 2008

Revision 21 (2008-07-28) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 196 to 196
 
Changed:
<
<
  • BUG #35250: CHECKING IN PROGRESS
>
>
 

Revision 20 (2008-07-25) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Added:
>
>
  • Normal jobs work: OK

  • Dag jobs work: OK
 
  • Perusal jobs work: OK

  • MPICH jobs work: OK
Line: 176 to 180
  fi
Changed:
<
<
>
>
 
Line: 184 to 188
 
Changed:
<
<
>
>
 

Revision 19 (2008-07-25) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 192 to 192
 
Added:
>
>
  • BUG #35250: CHECKING IN PROGRESS
 
[ale@cream-15 UI]$ ls -l /tmp/ale_eRWc528nX8QpEcs7im-R7g

Revision 18 (2008-07-24) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 24 to 24
  echo $httpd \(pid `/sbin/pidof $httpd`\) is running ....
Changed:
<
<
>
>
 
    • Required documents are not put into the glite doc template in edms
Deleted:
<
<
    • For the R6 (JDL howto) document a broken link is given
 
Line: 42 to 41
 
  • BUG #26885: FIXED
    • Job wrongly kept in ICE cache with status UNKNOWN: checked with two subsequent submissions of 5 collections made of 50 nodes each. ICE does not leave any job with status UNKNOWN behind in the cache
Changed:
<
<
>
>
  • BUG #27215: NOT COMPLETELY FIXED
 
[ale@cream-15 regression]$ ls -l /tmp/ale_StdrEDNZljNnxCLx45ILIw
total 8
Line: 51 to 50
 -rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.err -rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.out
Added:
>
>
It is not fixed instead using a CREAM -CE
 

Revision 17 (2008-07-24) - ElisabettaMolinari

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 39 to 39
 
    • The message error that you could find in the wmproxy log (also with level 5) is: edg_wll_JobStat GSSAPI Error
    • In any case now there is a dedicated cron script to renew host-proxy (e.g. it is not included in the cron-purger script)
Added:
>
>
  • BUG #26885: FIXED
    • Job wrongly kept in ICE cache with status UNKNOWN: checked with two subsequent submissions of 5 collections made of 50 nodes each. ICE does not leave any job with status UNKNOWN behind in the cache
 
[ale@cream-15 regression]$ ls -l /tmp/ale_StdrEDNZljNnxCLx45ILIw

Revision 16 (2008-07-23) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 173 to 173
  fi
Changed:
<
<
>
>
 

Revision 15 (2008-07-23) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 181 to 181
 
Changed:
<
<
>
>
 

Revision 14 (2008-07-23) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 187 to 187
 
Changed:
<
<
>
>
 

Revision 13 (2008-07-22) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 183 to 183
 
Added:
>
>

 

Revision 12 (2008-07-09) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 175 to 175
 
Added:
>
>

 

Revision 11 (2008-07-08) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Changed:
<
<
  • MPICH jobs work: OK
>
>
  • Perusal jobs work: OK

  • MPICH jobs work: OK
 
Modified mpirun: Executing command: /home/dteam029/globus-tmp.griditwn03.7486.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fOsslm3cw4T7lgR09qJTR4g/cpi
Process 0 of 1 on griditwn03.na.infn.it
Line: 37 to 39
 
    • The message error that you could find in the wmproxy log (also with level 5) is: edg_wll_JobStat GSSAPI Error
    • In any case now there is a dedicated cron script to renew host-proxy (e.g. it is not included in the cron-purger script)
Changed:
<
<
  • BUG #27215: NOT FIXED
    • If the sum of the dimensions of the first two files is exactly equal to the limit, the job stays in state RUNNING for ever and ever.
    • When the dimension of the tailed file is under a threshold it is not considered significant so the file is not transfered. But the user is not advised of this fact.
In this example in the JDL we have specified: OutputSandbox = {"test.err","test.out","out2","out1","out3","out4"}; (limit is 100), but the output directory contains:
>
>
 
Changed:
<
<
[ale@cream-15 UI]$ ls -l /tmp/ale_sWaZC4TQEoPFty124JU_dg
>
>
[ale@cream-15 regression]$ ls -l /tmp/ale_StdrEDNZljNnxCLx45ILIw
 total 8
Changed:
<
<
-rw-rw-r-- 1 ale ale 49 Jun 27 15:39 out1 -rw-rw-r-- 1 ale ale 49 Jun 27 15:39 out2 -rw-rw-r-- 1 ale ale 0 Jun 27 15:39 test.err -rw-rw-r-- 1 ale ale 0 Jun 27 15:39 test.out Only from Maradona file you can see:
OSB quota exceeded for out3, truncating needed
Not enough room for a significant truncation on file out3, not sending
OSB quota exceeded for out4, truncating needed
Not enough room for a significant truncation on file out4, not sending
>
>
-rw-rw-r-- 1 ale ale 30 Jul 8 16:02 out1.tail -rw-rw-r-- 1 ale ale 70 Jul 8 16:02 out2 -rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.err -rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.out
 
Line: 109 to 101
 
Changed:
<
<
  • BUG #30896: FIXED
    • The bug is fixed but the user obtains only this message:
>
>

 
Changed:
<
<
Error - Operation failed Unable to start the job: HTTP Error
>
>
*********************************************************** BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/LEzR7tTwyh3P-iYZrKlwxg Current Status: Aborted Status Reason: The maximum number of output sandbox files is reached Submitted: Tue Jul 8 16:12:10 2008 CEST ***********************************************************

Error - WMProxy Server Error
HTTP Error
  500 Internal Server Error
Line: 132 to 138
 Error code: SOAP-ENV:Server

Changed:
<
<
The only meaning error message should be found on the wmproxy.log file:
>
>
You should find the correct reason only on the log file of the wmproxy:
 
Changed:
<
<
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": ------------------------------- Fault Description -------------------------------- 30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": Method: jobStart 30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": Code: 1226 30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": Description: The Operation is not allowed: The maximum number of input sandbox files is reached 30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": Stack: 30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": JobOperationException: The Operation is not allowed: The maximum number of input sandbox files is reached at wmpcoreoperations::submit()[wmpcoreoperations.cpp:1344] 30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": at submit()[wmpcoreoperations.cpp:1951] 30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": at jobStart()[wmpcoreoperations.cpp:1105] 30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": ---------------------------------------------------------------------------------- 30 Jun, 17:09:17 -I- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": jobStart operation completed
>
>
08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": ------------------------------- Fault Description -------------------------------- 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Method: enableFilePerusal 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Code: 1226 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Description: The Operation is not allowed: The maximum number of perusal files is reached 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": Stack: 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": JobOperationException: The Operation is not allowed: The maximum number of perusal files is reached at wmpoperations::enableFilePerusal()[wmpoperations.cpp:1287] 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": at enableFilePerusal()[wmpoperations.cpp:1231] 08 Jul, 16:37:55 -D- PID: 24293 - "wmpgsoapoperations::ns1__enableFilePerusal": ----------------------------------------------------------------------------------
 
Added:
>
>

  # customization point
  if [ -n "${GLITE_LOCAL_CUSTOMIZATION_DIR}" ]; then
    if [ -f "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh" ]; then
      . "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh"
    fi
  fi

 

Revision 10 (2008-07-08) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 148 to 148
 
Added:
>
>
[ale@cream-15 UI]$ ls -l /tmp/ale_eRWc528nX8QpEcs7im-R7g
total 8
-rw-rw-r--  1 ale ale 50 Jul  8 12:06 out1
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 out2.tail
-rw-rw-r--  1 ale ale 50 Jul  8 12:06 out3
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 out4.tail
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 test.err
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 test.out
 

Revision 9 (2008-07-08) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 149 to 149
 
Changed:
<
<
>
>
 

-- AlessioGianelle - 27 Jun 2008

Revision 8 (2008-06-30) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 61 to 61
 
  • BUG #27797: Not Checked
    • This bug could not be checked because I'm not able to submit parametrics jobs
Changed:
<
<
>
>
 
Line: 77 to 77
 LD_LIBRARY_PATH=.
Changed:
<
<
>
>
 
Line: 104 to 104
 *******************************
Added:
>
>

  • BUG #30896: FIXED
    • The bug is fixed but the user obtains only this message:
Error - Operation failed
Unable to start the job: HTTP Error
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
 [no address given] and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server error log.</p>
</body></html>

Error code: SOAP-ENV:Server

The only meaning error message should be found on the wmproxy.log file:

30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": ------------------------------- Fault Description --------------------------------
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": Method: jobStart
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": Code: 1226
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": Description: The Operation is not allowed: The maximum number of input sandbox files is reached
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": Stack:
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": JobOperationException: The Operation is not allowed: The maximum number of input sandbox files is reached
        at wmpcoreoperations::submit()[wmpcoreoperations.cpp:1344]
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart":   at submit()[wmpcoreoperations.cpp:1951]
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart":   at jobStart()[wmpcoreoperations.cpp:1105]
30 Jun, 17:09:17 -D- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": ----------------------------------------------------------------------------------
30 Jun, 17:09:17 -I- PID: 9842 - "wmpgsoapoperations::ns1__jobStart": jobStart operation completed

 

Revision 7 (2008-06-30) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Changed:
<
<
  • MPICH jobs work:
>
>
  • MPICH jobs work: OK
 
Modified mpirun: Executing command: /home/dteam029/globus-tmp.griditwn03.7486.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fOsslm3cw4T7lgR09qJTR4g/cpi
Process 0 of 1 on griditwn03.na.infn.it

Revision 6 (2008-06-30) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Added:
>
>
  • MPICH jobs work:
Modified mpirun: Executing command: /home/dteam029/globus-tmp.griditwn03.7486.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fOsslm3cw4T7lgR09qJTR4g/cpi
Process 0 of 1 on griditwn03.na.infn.it
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 10.001266
 
  • Submission of 270 collections of 100 jobs each (10 collections every 30 minutes), using 1 user and a fuzzy rank (used 90 lcg CEs):
    • Success > 99.99% OK
    • Cancelled about 1800 jobs due to a problem with the CEs at in2p3.fr

Revision 5 (2008-06-30) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 64 to 64
 

Added:
>
>
[ale@cream-15 UI]$ cat /tmp/ale_zngnB9uVCWKT7B7MkSlBtA/env.out  | grep LD_LIBRARY
LD_LIBRARY_PATH=.

Master node is: node72.grid.pg.infn.it
Is should run on the following nodes:
node72.grid.pg.infn.it
node72.grid.pg.infn.it
node71.grid.pg.infn.it
node71.grid.pg.infn.it
*************************************
Current working directory is: /home/dteamsgm003/globus-tmp.node72.24167.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg
List files on the working directory:
/home/dteamsgm003/globus-tmp.node72.24167.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg:
total 352
drwxr-xr-x  2 dteamsgm003 dteamsgm   4096 Jun 30 11:03 .
drwx------  5 dteamsgm003 dteamsgm   4096 Jun 30 11:03 ..
-rwxr-xr-x  1 dteamsgm003 dteamsgm    822 Jun 30 11:03 30308_exe.sh
-rw-r--r--  1 dteamsgm003 dteamsgm   3687 Jun 30 11:03 .BrokerInfo
-rw-r--r--  1 dteamsgm003 dteamsgm    218 Jun 30 11:03 https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg.output
-rw-r--r--  1 dteamsgm003 dteamsgm 330910 Jun 30 11:03 mpitest
-rw-r--r--  1 dteamsgm003 dteamsgm      0 Jun 30 11:03 test.err
-rw-r--r--  1 dteamsgm003 dteamsgm    385 Jun 30 11:03 test.out
-rw-------  1 dteamsgm003 dteamsgm      0 Jun 30 11:03 tmp.rdgPL24747
*********************************
 

-- AlessioGianelle - 27 Jun 2008

Revision 4 (2008-06-27) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 29 to 29
 
    • The message error that you could find in the wmproxy log (also with level 5) is: edg_wll_JobStat GSSAPI Error
    • In any case now there is a dedicated cron script to renew host-proxy (e.g. it is not included in the cron-purger script)
Changed:
<
<
>
>
  • BUG #27215: NOT FIXED
    • If the sum of the dimensions of the first two files is exactly equal to the limit, the job stays in state RUNNING for ever and ever.
    • When the dimension of the tailed file is under a threshold it is not considered significant so the file is not transfered. But the user is not advised of this fact.
In this example in the JDL we have specified: OutputSandbox = {"test.err","test.out","out2","out1","out3","out4"}; (limit is 100), but the output directory contains:
 
Changed:
<
<
>
>
[ale@cream-15 UI]$ ls -l /tmp/ale_sWaZC4TQEoPFty124JU_dg total 8 -rw-rw-r-- 1 ale ale 49 Jun 27 15:39 out1 -rw-rw-r-- 1 ale ale 49 Jun 27 15:39 out2 -rw-rw-r-- 1 ale ale 0 Jun 27 15:39 test.err -rw-rw-r-- 1 ale ale 0 Jun 27 15:39 test.out Only from Maradona file you can see:
OSB quota exceeded for out3, truncating needed
Not enough room for a significant truncation on file out3, not sending
OSB quota exceeded for out4, truncating needed
Not enough room for a significant truncation on file out4, not sending
 
Added:
>
>
 
  • BUG #27797: Not Checked
    • This bug could not be checked because I'm not able to submit parametrics jobs

Line: 37 to 53
 
  • BUG #27797: Not Checked
    • This bug could not be checked because I'm not able to submit parametrics jobs

Added:
>
>
 
[ale@cream-15 UI]$ glite-wms-job-logging-info -v 2 https://devel17.cnaf.infn.it:9000/Hr_TRdWT9XZrBux4DyWQsw | grep -A 2 Match | grep Dest
Line: 45 to 63
 - Dest id = ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
Added:
>
>
 -- AlessioGianelle - 27 Jun 2008

Revision 3 (2008-06-27) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 16 to 17
 
  • BUG #23443: NOT FIXED
    • Required documents are not put into the glite doc template in edms
    • For the R6 (JDL howto) document a broken link is given
Added:
>
>
    for edg_rm_command in $GLITE_LOCATION/bin/edg-rm \
                          $EDG_LOCATION/bin/edg-rm \
                          `which edg-rm 2>/dev/null`; do

  • BUG #24690: NOT COMPLETELY FIXED
    • The message error that you could find in the wmproxy log (also with level 5) is: edg_wll_JobStat GSSAPI Error
    • In any case now there is a dedicated cron script to renew host-proxy (e.g. it is not included in the cron-purger script)
 

Added:
>
>
  • BUG #27797: Not Checked
    • This bug could not be checked because I'm not able to submit parametrics jobs

 
[ale@cream-15 UI]$ glite-wms-job-logging-info -v 2 https://devel17.cnaf.infn.it:9000/Hr_TRdWT9XZrBux4DyWQsw | grep -A 2 Match | grep Dest

Revision 2 (2008-06-27) - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"
Deleted:
<
<
 

TESTS

Added:
>
>
  • Submission of 270 collections of 100 jobs each (10 collections every 30 minutes), using 1 user and a fuzzy rank (used 90 lcg CEs):
    • Success > 99.99% OK
    • Cancelled about 1800 jobs due to a problem with the CEs at in2p3.fr

Check bugs:

  • BUG #21909: FIXED in the wmproxy startup script
         if ( /sbin/pidof $httpd ) >/dev/null 2>&1 ; then
          echo $httpd \(pid `/sbin/pidof $httpd`\) is running ....
  • BUG #23443: NOT FIXED
    • Required documents are not put into the glite doc template in edms
    • For the R6 (JDL howto) document a broken link is given
  • BUG #27215: FIXED

[ale@cream-15 UI]$ glite-wms-job-logging-info -v 2 https://devel17.cnaf.infn.it:9000/Hr_TRdWT9XZrBux4DyWQsw | grep -A 2 Match | grep Dest
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
-- AlessioGianelle - 27 Jun 2008

Revision 1 (2008-06-27) - AlessioGianelle

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="TestWokPlan"

TESTS

-- AlessioGianelle - 27 Jun 2008

 