Difference: WmsTests3dot1dot100 (59 vs. 60)

Revision 60 - 2008-09-09 - AlessioGianelle

Line: 1 to 1
 
META TOPICPARENT name="TestWokPlan"

TESTS

Line: 26 to 26
 
    • checked by Laurence Field and ARC developers

Added:
>
>
    • Set the two parameters, subscription_update_threshold_time and subscription_duration in the ICE section of the glite_wms.conf file to low values, such as the following:
       subscription_duration  =  300;
       subscription_update_threshold_time =  150;
      so that a subscription expires after 5 minutes
    • Re-start ICE by the script '/opt/glite/etc/init.d/glite-wms-ice'
    • Submit a job and check the status of the subscription using the following command of the cream client (see the sketch below):
       CEMonitorSubscriberMgr <cert_proxy> <cert_path> <service_URL_address> 
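      A minimal sketch of the whole check; the proxy path, the certificates directory and the CEMon URL below are placeholders to adapt to your setup:
       /opt/glite/etc/init.d/glite-wms-ice restart      # restart ICE after lowering the subscription lifetime
       # submit a job through the WMS, then inspect the subscription on the CEMon
       # (arguments: <cert_proxy> <cert_path> <service_URL_address>)
       CEMonitorSubscriberMgr /tmp/x509up_u500 /etc/grid-security/certificates \
           https://cream-ce.example.org:8443/ce-monitor/services/CEMonitor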
 
Changed:
<
<
  • BUG #21909: FIXED in the wmproxy startup script
>
>
  • BUG #21909: FIXED
    • In the wmproxy startup script check if there are these lines:
  if ( /sbin/pidof $httpd ) >/dev/null 2>&1 ; then
Changed:
<
<
echo $httpd \(pid `/sbin/pidof $httpd`\) is running ....
>
>
echo $httpd \(pid `/sbin/pidof $httpd`\) is running ....
 
  • BUG #23443: FIXED
    • Required documents are not put into the glite doc template in edms

Changed:
<
<
>
>
    • Check if in the JobWrapper there are these lines:
  for edg_rm_command in $GLITE_LOCATION/bin/edg-rm $EDG_LOCATION/bin/edg-rm `which edg-rm 2>/dev/null`; do
Changed:
<
<
>
>
[...]
 
  • BUG #24690: NOT COMPLETELY FIXED
    • The error message that you could find in the wmproxy log (also with level 5) is: edg_wll_JobStat GSSAPI Error
    • In any case there is now a dedicated cron script to renew the host proxy (i.e. it is not included in the cron-purger script)

Changed:
<
<
    • Job wrongly kept in ICE cache with status UNKNOWN: checked with two subsequent submissions of 5 collections made of 50 nodes each. ICE does not leave any job with status UNKNOWN behind in the cache
>
>
    • Checked with two subsequent submissions of 5 collections made of 50 nodes each. ICE does not leave any job with status UNKNOWN behind in the cache
 
Changed:
<
<
  • BUG #27215: NOT COMPLETELY FIXED
[ale@cream-15 regression]$ ls -l /tmp/ale_StdrEDNZljNnxCLx45ILIw
 total 8
>
>
  • BUG #27215: FIXED (for a LCG-CE); NOT fixed for a CREAM-CE
    • Set the parameter MaxOutputSandboxSize in the WorkloadManager section of the configuration file /opt/glite/etc/glite_wms.conf on the WMS to 100 and restart the workload manager.
    • Submit a jdl like this:
       
      [
      Type = "Job";
      Executable = "27215_exe.sh";
      Arguments = "70";
      StdOutput = "test.out";
      StdError = "test.err";
      Environment = {"GLITE_LOCAL_MAX_OSB_SIZE=35"};
      InputSandbox = {"27215_exe.sh"};
      OutputSandbox = {"test.err","test.out","out2", "out1"};
      usertags = [ bug = "27215" ];
      ]
      where 27215_exe.sh contains
      #!/bin/sh
      MAX=$1
      i=0
      while [ $i -lt $MAX ]; do
                      echo -n "1" >> out1
                      echo -n "2" >> out2
          i=$[$i + 1]
      done
    • When the job is Done, retrieve the output files; this should be the result of an ls -l of the output dir:
      -rw-rw-r-- 1 ale ale 30 Jul 8 16:02 out1.tail
      -rw-rw-r-- 1 ale ale 70 Jul 8 16:02 out2
      -rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.err
Changed:
<
<
-rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.out It is not fixed instead using a CREAM -CE
>
>
-rw-rw-r-- 1 ale ale 0 Jul 8 16:02 test.out
 
Added:
>
>
    • Submit a jdl like this one:
       
      [
        JobType = "parametric";
        Executable = "/usr/bin/env";
        Environment = {"MYPATH_PARAM_=$PATH:/bin:/usr/bin:$HOME"};
        StdOutput = "echo_PARAM_.out";
        StdError = "echo_PARAM_.err";
        OutputSandbox = {"echo_PARAM_.out","echo_PARAM_.err"};
        Parameters =  {test, 2};
       ]
    • The generated jdl should contain:
      [
      requirements = other.GlueCEStateStatus == "Production";
      nodes = [ dependencies = { };
      Node_test = [ ... ];
      Node_2 = [ ... ];
      [...]
      ]
 
Added:
>
>
    • Edit the configuration file /opt/glite/etc/glite_wmsclient.conf by changing the virtualorganisation attribute in the JdlDefaultAttributes section to one different from the VO used to generate the user proxy, as in the sketch below
    • Submit a job and check that the generated .jdl has the right virtualorganisation defined, that is the same as the one used to generate the user proxy
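      A sketch of the edited section, assuming the usual ClassAd-style layout of the client configuration file; "atlas" is just a hypothetical VO different from the proxy one:
       JdlDefaultAttributes = [
           // any VO different from the one in the user proxy (hypothetical value)
           virtualorganisation = "atlas";
       ];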
 
Changed:
<
<
[ale@cream-15 UI]$ glite-wms-job-logging-info -v 2 https://devel17.cnaf.infn.it:9000/Hr_TRdWT9XZrBux4DyWQsw | grep -A 2 Match | grep Dest
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
- Dest id                    =    ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam
>
>
    • Change the jdl, setting the name of an existing CE in the requirements (see the sketch below)
    • When the job has finished, with the command glite-wms-job-logging-info -v 2 "jobid" | grep -A 2 Match | grep Dest you should see the name of the previously chosen CE 3 times. (The job must be Aborted with reason: hit job shallow retry count (2))
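      A sketch of such a jdl, reusing the CE name shown above as the pinned destination (replace it with an existing CE of your infrastructure):
       [
       Executable = "/bin/hostname";
       StdOutput = "std.out";
       StdError = "std.err";
       OutputSandbox = {"std.out","std.err"};
       ShallowRetryCount = 2;
       Requirements = other.GlueCEUniqueID == "ce103.cern.ch:2119/jobmanager-lcglsf-grid_dteam";
       ]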
 
  • BUG #28249: Hopefully fixed
    • bug posted by the developer
Line: 78 to 116
 
    • compilation error with gcc-4.x

Added:
>
>
    • Create a delegated proxy with glite-wms-job-delegate-proxy -d pippo on the wmproxy server of the wms machine
    • Submit a job to a cream CE via the wms using the previously created delegated proxy: glite-wms-job-submit myjob.jdl -d pippo
    • Delete records for the user DN you are submitting with from the delegationdb on the CREAM CE, like the following:
       delete from t_credential where dn like '%Elisabetta%';
       delete from t_credential_cache where dn like '%Elisabetta%';
    • Submit a new normal job using the same delegated proxy as above (see the consolidated sketch below)
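      A consolidated sketch of the steps above; myjob.jdl, the MySQL account and the DN pattern are placeholders:
       glite-wms-job-delegate-proxy -d pippo
       glite-wms-job-submit -d pippo myjob.jdl
       # on the CREAM CE, wipe the delegation records for your DN
       mysql -u root delegationdb -e "delete from t_credential where dn like '%Elisabetta%'; delete from t_credential_cache where dn like '%Elisabetta%';"
       # back on the UI, submit again with the same delegation id
       glite-wms-job-submit -d pippo myjob.jdl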
 
Changed:
<
<
[ale@cream-15 UI]$ cat /tmp/ale_zngnB9uVCWKT7B7MkSlBtA/env.out  | grep LD_LIBRARY
 LD_LIBRARY_PATH=.
>
>
    • Submit this jdl:
      [
      Executable = "/usr/bin/env" ;
      Stdoutput = "env.out" ;
      StdError =  "env.err" ;
      shallowretrycount = 2;
      InputSandbox = { "data/input.txt" };
      OutputSandbox = { "env.out" ,"env.err", "input.txt" } ;
      Environment={"LD_LIBRARY_PATH=."};
      usertags = [ bug = "28642" ];
      ]
    • Get the output of the job. In the output directory you should find the file input.txt, and LD_LIBRARY_PATH should be set to "." in the file env.out.
 
Added:
>
>
    • Stop ICE: /opt/glite/etc/glite-wms-ice stop
    • Corrupt the ICE database, e.g. by overwriting every file (all but *proxy*) in /var/glite/ice/persist_dir with garbage:
      for f in /var/glite/ice/persist_dir/*; do
          case "$f" in *proxy*) ;; *) echo "pippo" > "$f" ;; esac
      done
    • Start ICE: /opt/glite/etc/glite-wms-ice start
    • In the ICE log file you should see something like:
      2008-07-29 12:44:00,537 FATAL - jobCache::jobCache() - Failed to
      initialize the jobDbManager object. Reason is: Db::open: Invalid argument
 
  • BUG #29182: Hopefully fixed
    • not easy to reproduce
Line: 97 to 158
 
    • Fixed by not using 'clog'

Changed:
<
<
Master node is: node72.grid.pg.infn.it
 Is should run on the following nodes:
node72.grid.pg.infn.it
 node72.grid.pg.infn.it
 node71.grid.pg.infn.it
 node71.grid.pg.infn.it
*************************************
Current working directory is: /home/dteamsgm003/globus-tmp.node72.24167.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg
 List files on the working directory:
/home/dteamsgm003/globus-tmp.node72.24167.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg:
total 352
 drwxr-xr-x  2 dteamsgm003 dteamsgm   4096 Jun 30 11:03 .
drwx------  5 dteamsgm003 dteamsgm   4096 Jun 30 11:03 ..
-rwxr-xr-x  1 dteamsgm003 dteamsgm    822 Jun 30 11:03 30308_exe.sh
-rw-r--r--  1 dteamsgm003 dteamsgm   3687 Jun 30 11:03 .BrokerInfo
-rw-r--r--  1 dteamsgm003 dteamsgm    218 Jun 30 11:03 https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2f-6An9ZDwkvot3aOLSzScdg.output
-rw-r--r--  1 dteamsgm003 dteamsgm 330910 Jun 30 11:03 mpitest
-rw-r--r--  1 dteamsgm003 dteamsgm      0 Jun 30 11:03 test.err
-rw-r--r--  1 dteamsgm003 dteamsgm    385 Jun 30 11:03 test.out
-rw-------  1 dteamsgm003 dteamsgm      0 Jun 30 11:03 tmp.rdgPL24747
*********************************
>
>
    • Submit this jdl:
      [
      requirements = ( other.GlueCEStateStatus == "Production" ) && Member("MPICH",other.GlueHostApplicationSoftwareRunTimeEnvironment) && ( other.GlueCEInfoTotalCPUs >= 4 ) && ( other.GlueCEInfoLRMSType == "torque" ||   RegExp("pbs",other.GlueCEInfoLRMSType) );
      Type = "Job";
      NodeNumber = 4;
      Executable = "30308_exe.sh";
      Arguments = "cpi 4";
      StdOutput = "test.out";
      StdError = "test.err";
      InputSandbox = {"30308_exe.sh", "exe/cpi"};
      OutputSandbox = {"test.err","test.out","executable.out"};
      usertags = [ bug = "30308" ];
      ]
      Where the 30308_exe.sh should be:
      #!/bin/sh
      # The first parameter is the binary to be executed
      EXE=$1
      # The second parameter is the number of CPU's to be reserved for parallel execution
      CPU_NEEDED=$2
      chmod 777 $EXE
      # prints the list of files in the working directory
      echo "List files on the working directory:"
      ls -alR `pwd`
      # execute the user job
      mpirun -np $CPU_NEEDED -machinefile $PBS_NODEFILE `pwd`/$EXE >& executable.out
    • When DONE retrieve the output and check that the directory .mpi is not listed in the test.out output file.
 
  • BUG #30518: Hopefully fixed
    • not easy to reproduce

Changed:
<
<
>
>
    • Already fixed and working on the production wms using patch #1491
 
Changed:
<
<
*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/LEzR7tTwyh3P-iYZrKlwxg
 Current Status:     Aborted
 Status Reason:      The maximum number of output sandbox files is reached
 Submitted:          Tue Jul  8 16:12:10 2008 CEST
*************************************************************
>
>
    • Set maxInputSandboxFiles = 2; in the WorkloadManagerProxy section of the configuration file on the WMS, and restart the wmproxy.
      • Submit a job with more than 2 files listed in the InputSandbox parameter
      • Check if the job is immediately set as Aborted and if the reason of the status is: The Operation is not allowed: The maximum number of input sandbox files is reached
    • Set maxOutputSandboxFiles = 2; in the WorkloadManagerProxy section of the configuration file on the WMS and restart the wmproxy.
      • Submit a job with more than 2 files listed in the OutputSandbox parameter
      • Check if the job is immediately set as Aborted and if the reason of the status is: The Operation is not allowed: The maximum number of output sandbox files is reached
 
Changed:
<
<
>
>
    • The default value of MinPerusalTimeInterval should be checked in the configuration file of the WMS.
    • Set MaxPerusalFiles = 1; in the WorkloadManagerProxy section of the configuration file on the WMS and restart the wmproxy.
    • After the submission of the jdl issue this command: glite-wms-job-perusal --set -f perusal.out -f perusal.err "jobid" The answer should be:
 Error - WMProxy Server Error The Operation is not allowed: The maximum number of perusal files is reached
Changed:
<
<
Method: enableFilePerusal

>
>
Method: enableFilePerusal
 
  • BUG #31006: Hopefully FIXED
    • Not easy to reproduce

Added:
>
>
    • Simply check the /opt/glite/etc/templates/template.sh file on a WMS
 
Added:
>
>
    • Using the command glite-wms-job-info --jdl "jobid" | grep -i requirements check if the expression RegExp(".*sdj$",other.GlueCEUniqueID); is present (the exact expression can be found in the configuration file on the WMS, section: WorkloadManagerProxy, parameter: SDJRequirements)
    • Setting ShortDeadlineJob=false; in the jdl, the output of the previous command should contain the expression !RegExp(".*sdj$",other.GlueCEUniqueID)
 
Added:
>
>
    • Set on the WMS conf file: II_Contact  =  "lcg-bdii.cern.ch";
    • Do a list-match using this jdl:
      [
        Requirements = RegExp(".manchester.ac.uk:2119.*",other.GlueCEUniqueID) && anyMatch(other.storage.CloseSEs,target.GlueSEStatus == "unset");
        Executable = "/bin/ls";
        prologue = "/bin/false";
      ]
      the output should be:
        - ce01.tier2.hep.manchester.ac.uk:2119/jobmanager-lcgpbs-dteam
 
Changed:
<
<
    • reproduced the problem by inserting a 500 sec sleep in the dirmanager and killing it by hand while unzipping the ISB. The job stays in status 'waiting' and is not forwarded to the WM.
>
>
    • Reproduced the problem by inserting a 500 sec sleep in the dirmanager and killing it by hand while unzipping the ISB. The job stays in status 'waiting' and is not forwarded to the WM.
 
Changed:
<
<
>
>
    • Check if in the file /opt/glite/etc/templates/template.sh on a WMS there are these lines:
  # customization point
  if [ -n "${GLITE_LOCAL_CUSTOMIZATION_DIR}" ]; then
      if [ -f "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh" ]; then
          . "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh"
      fi
Changed:
<
<
fi
>
>
fi
 
Added:
>
>
    • Set a very low timeout for the BDII in the WMS conf file: II_Timeout  =  3;
    • Then set in the WMS conf file: IsmIILDAPSearchAsync = false;
    • You should see in the log file of the workload_manager (if you use a populated BDII):
          [Warning] fetch_bdii_ce_info(ldap-utils.cpp:640): Timed out
          [Warning] fetch_bdii_se_info(ldap-utils.cpp:308): Timed out
          [Debug] do_purchase(ism-ii-purchaser.cpp:176): BDII fetching completed in 4 seconds
          [Info] do_purchase(ism-ii-purchaser.cpp:193): Total VO_Views entries in ISM : 0
          [Info] do_purchase(ism-ii-purchaser.cpp:194): Total SE entries in ISM : 0
    • Setting IsmIILDAPSearchAsync = true, you should obtain more (>0) VO_Views entries, e.g.:
          [Debug] fetch_bdii_ce_info(ldap-utils-asynch.cpp:628): #1652 LDAP entries received in 5 seconds
          [Debug] fetch_bdii_ce_info(ldap-utils-asynch.cpp:781): ClassAd reppresentation built in 0 seconds
          [Debug] fetch_bdii_se_info(ldap-utils-asynch.cpp:444): #2381 LDAP entries received in 5 seconds
          [Debug] fetch_bdii_se_info(ldap-utils-asynch.cpp:504): ClassAd reppresentation built in 0 seconds
          [Debug] do_purchase(ism-ii-purchaser.cpp:176): BDII fetching completed in 10 seconds
          [Info] do_purchase(ism-ii-purchaser.cpp:193): Total VO_Views entries in ISM : 53
          [Info] do_purchase(ism-ii-purchaser.cpp:194): Total SE entries in ISM : 61
 
Changed:
<
<
>
>
 
Added:
>
>
    • Submit a jdl
    • Look into the SandBox dir of the job (on the WMS) until you see the Maradona file
    • Put the condor job (equivalent to your previously submitted job) on hold; this should trigger a resubmission (see the sketch below)
    • When the job has been resubmitted check if the old Maradona file has been removed
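      A possible way to do this on the WMS, assuming Condor's standard tools are available there and that 1234.0 is the cluster.proc of the condor job corresponding to your submission:
       condor_q              # locate the cluster.proc of the condor job
       condor_hold 1234.0    # put it on hold: this should trigger the resubmission
       condor_q -hold        # verify that the job is actually held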
 
Added:
>
>
    • Set the II_Timeout parameter in the NetworkServer section of the glite_wms.conf file on the WMS to a very low value, e.g.: II_Timeout  =  2;
    • Re-start the WM and check that $GLITE_WMS_LOCATION_VAR/workload_manager/ismdump.fl does not get emptied
    • Perform some job-list-match operations, checking that they get some match results
 
Added:
>
>
    • Add this parameter to section WorkloadManager of the glite_wms.conf configuration file (using for example vo "cms" as filter): IsmIILDAPCEFilterExt = "(|(GlueCEAccessControlBaseRule=VO:cms)(GlueCEAccessControlBaseRule=VOMS:/cms/*))"
    • Restart the WM
    • Doing a list-match using a voms proxy of a different VO (e.g. dteam) you should obtain "no resource available" (see the sketch below).
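      For example (test.jdl is any valid jdl for that WMS):
       voms-proxy-init --voms dteam
       glite-wms-job-list-match -a test.jdl
       # expected answer: no resource available (the filter only admits cms CEs)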
 
  • BUG #33140: Hopefully FIXED
    • Not easy to reproduce

Added:
>
>
    • Remove, if present, the directory $GLITE_WMS_LOCATION_VAR/workload_manager/jobdir on the WMS
    • Restart the WM and check if the previous directory is recreated.
 
Added:
>
>
    • Stop the WM on the WMS.
    • Submit a collection
    • Restart the WM
    • Check if the status of the collection changes to Running
 
Added:
>
>
    • Set the "ExpiryPeriod" parameter in the glite_wms.conf configuration file to a very low value, such as the following: ExpiryPeriod  =  2;
    • Overload the WMS by submitting several collections sequentially, for example 10 collections of 100 nodes each (see the sketch after this list)
    • Check the job status of the last submitted collections and keep submitting until the status of the parent node is Aborted because of the following:
      *************************************************************
       BOOKKEEPING INFORMATION:
      
       Status info for the Job : https://devel17.cnaf.infn.it:9000/qQe68ESYiRNDNXZPNsG-AA
       Current Status:     Aborted
       Status Reason:      request expired
       Submitted:          Wed Jul 30 11:23:49 2008 CEST
      
      *************************************************************
    • Stop submitting and check if the status of all the children nodes is Aborted as well
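      A sketch of the overload loop, assuming collection.jdl is a pre-built collection of 100 nodes:
       for i in $(seq 1 10); do
           glite-wms-job-submit -a -o jobids.txt collection.jdl
       done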
 
Added:
>
>
    • Check if the proxy file name is hardcoded in $GLITE_WMS_LOCATION/sbin/glite-wms-purgeStorage.sh (e.g. with the grep sketch below)
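      For example (the pattern is only a hint; the file name to look for is the one reported in the bug):
       grep -n "proxy" ${GLITE_WMS_LOCATION:-/opt/glite}/sbin/glite-wms-purgeStorage.sh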
 
Added:
>
>
    • Test it for the filelist input method by setting the following two parameters in the glite_wms.conf WMS configuration file, workload manager section:
                   DispatcherType  =  "filelist";
                   Input  =  "${GLITE_LOCATION_VAR}/workload_manager/input.fl";
      • Re-start the WM and submit a DAG
      • Check if it is successful
    • Test it for the jobdir input method by setting the following two parameters in the glite_wms.conf WMS configuration file, workload manager section:
                    DispatcherType  =  "jobdir";
                    Input = "${GLITE_LOCATION_VAR}/workload_manager/jobdir";
      • Re-start the WM and submit a DAG
      • Check if it is successful (a minimal DAG jdl sketch is shown below)
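      A minimal DAG jdl that can be used for both checks (node contents are just placeholders):
       [
       Type = "dag";
       nodes = [
           nodeA = [ description = [ Executable = "/bin/hostname"; StdOutput = "a.out"; OutputSandbox = {"a.out"}; ] ];
           nodeB = [ description = [ Executable = "/bin/hostname"; StdOutput = "b.out"; OutputSandbox = {"b.out"}; ] ];
           dependencies = { { nodeA, nodeB } };
       ];
       ]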
 
  • BUG #35878: FIXED
    • compilation error with gcc-4.x

Added:
>
>
    • Set the following two parameters in the glite_wms.conf WMS configuration file, workload manager section:
                    DispatcherType  =  "jobdir";
                    Input = "${GLITE_LOCATION_VAR}/workload_manager/jobdir";
      • Re-start the WM and submit a DAG
      • Check if it is successful
 
  • BUG #36341: Hopefully fixed
    • bug posted by the developer
Line: 210 to 336
 
    • bug posted by the developer

Added:
>
>
    • Consider this /opt/glite/etc/glite_wms_wmproxy.gacl file:
      <?xml version="1.0"?>
      <gacl version="0.0.1">
      <entry>
      <any-user/>
      <allow><exec/></allow>
      </entry>
      </gacl>
    • Restart wmproxy: /opt/glite/etc/init.d/glite-wms-wmproxy restart
    • Try to issue some commands (e.g. glite-wms-job-list-match, glite-wms-job-submit, glite-wms-job-delegate-proxy, etc.) towards that WMS. They should succeed with any proxy
 
Changed:
<
<
    • submitted a normal job
    • waited until finished successfully
    • checked the job record is in the LBProxy mysql DB
    • retrieved the output via 'glite-wms-job-output'
    • checked the job record is no more in the LBProxy mysql DB
>
>
    • Submitted a normal job
    • Waited until finished successfully
    • Checked the job record is in the LBProxy mysql DB (e.g.: mysql# select * from jobs where jobid like '%hLrG4YYebvYB0xsrPO4q8A%'; where https://devel17.cnaf.infn.it:9000/hLrG4YYebvYB0xsrPO4q8A is the jobid)
    • Retrieved the output via 'glite-wms-job-output'
    • Checked the job record is no longer in the LBProxy mysql DB (e.g.: the previous query should return: Empty set)
 

Added:
>
>
    • Check if in the syslog there are lines like:
      May 14 12:37:12 trinity glite_wms_wmproxy_server[3633]: ts=2008-05-14T12:37:12 : event=wms.wmpserver_setJobFileSystem() : userid=502 jobid=https://devel15.cnaf.infn.it:9000/J...
      i.e. the userid is specified (see the grep sketch below)
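      For example (assuming the standard syslog file location):
       grep wmpserver_setJobFileSystem /var/log/messages | tail -n 5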
 
  • BUG #36870: FIXED
    • Fixed by removing the spec file
Line: 229 to 367
 
    • bug posted by the developer

Added:
>
>
    • Check if on the WMS there is this file: /etc/cron.d/glite-wms-create-host-proxy.cron, containing:
      HOME=/
      MAILTO=SA3-italia
      
      0 */6 * * * glite . /opt/glite/etc/profile.d/grid-env.sh ; /opt/glite/sbin/glite-wms-create-proxy.sh /opt/glite/var/wms.proxy /opt/glite/var/log/create_proxy.log
 
  • BUG #36907: Hopefully fixed
    • Not easy to reproduce
Line: 246 to 389
 
    • Tested using a short proxy to submit a longer job and ICE does not resubmit it, but afterwards the status is not updated to Done by ICE, due to another bug #39807

Changed:
<
<
[root@wms008 init.d]# grep GLITE_LOCATION glite-wms-ice
GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
>
>
    • Do this check:
      [root@wms008 init.d]# grep GLITE_LOCATION glite-wms-ice
      GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
 
  • BUG #37916: Hopefully fixed
    • bug posted by the developer

Changed:
<
<
[ale@cream-15 UI]$ ls -l /tmp/ale_eRWc528nX8QpEcs7im-R7g
 total 8
>
>
    • Set the parameter MaxOutputSandboxSize in the WorkloadManager section of the configuration file /opt/glite/etc/glite_wms.conf on the WMS to 100 and restart the workload manager.
    • Submit this jdl:
      [
      Type = "Job";
      Executable = "38359_exe.sh";
      Arguments = "50";
      StdOutput = "test.out";
      StdError = "test.err";
      InputSandbox = {"38359_exe.sh"};
      OutputSandbox = {"test.err","test.out","out3", "out1", "out4", "out2"};
      usertags = [ bug = "38359" ];
      ]
      where 38359_exe.sh is:
      #!/bin/sh
      MAX=$1
      i=0
      while [ $i -lt $MAX ]; do
          echo -n "1" >> out1
                      echo -n "2" >> out2
                      echo -n "3" >> out3
                      echo -n "4" >> out4
          i=$[$i + 1]
      done
      i=200
      while [ $i -lt 100 ]; do
          echo -n "1" >> out1
          echo -n "2" >> out2
          echo -n "3" >> out3
          echo -n "4" >> out4
          i=$[$i + 1]
      done
    • When the job is Done, retrieve the output files; this should be the result of an ls -l of the output dir:
      -rw-rw-r-- 1 ale ale 50 Jul 8 12:06 out1
      -rw-rw-r-- 1 ale ale 0 Jul 8 12:06 out2.tail
      -rw-rw-r-- 1 ale ale 50 Jul 8 12:06 out3
      -rw-rw-r-- 1 ale ale 0 Jul 8 12:06 out4.tail
      -rw-rw-r-- 1 ale ale 0 Jul 8 12:06 test.err
Changed:
<
<
-rw-rw-r-- 1 ale ale 0 Jul 8 12:06 test.out
>
>
-rw-rw-r-- 1 ale ale 0 Jul 8 12:06 test.out
 
Added:
>
>
    • Log on the WMS. Stop the workload manager. Put in the directory $GLITE_WMS_LOCATION_VAR/workload_manager/jobdir/new/ this list-match request:
      [root@devel19 glite]# cat /var/glite/workload_manager/jobdir/tmp/20080625T133135.906497_3085874880
      [ arguments = [ ad = [ requirements = ( other.GlueCEStateStatus =="Production" || other.GlueCEStateStatus == "CREAMPreCertTests" ) &&
      !RegExp(".*sdj$",other.GlueCEUniqueID); RetryCount = 3; Arguments = "/tmp"; MyProxyServer = "myproxy.cnaf.infn.it"; AllowZippedISB = true; JobType =
      "normal"; InputSandboxDestFileName = { "pippo","pluto" }; SignificantAttributes = { "Requirements","Rank" }; FuzzyRank = true;
      Executable = "/bin/ls"; CertificateSubject = "/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle"; X509UserProxy =
      "/tmp/user.proxy.6056.20080625153135905"; Stdoutput = "ls.out"; VOMS_FQAN = "/dteam/Role=NULL/Capability=NULL"; OutputSandbox = { "ls.out" };
      VirtualOrganisation = "dteam"; usertags = [ exe = "ls" ]; rank =-other.GlueCEStateEstimatedResponseTime; Type = "job"; ShallowRetryCount = 3;
      InputSandbox = {"protocol://address/input/pippo","protocol://address/input/pluto" }; Fuzzyparameter = 1.000000000000000E-01 ]; include_brokerinfo = false; file =
      "/tmp/6056.20080625153135905"; number_of_results = -1 ]; command = "match"; version = "1.0.0" ]
    • Start the workload manager and check that it handles the request correctly.
 

Added:
>
>
    • Consider this /opt/glite/etc/glite_wms_wmproxy.gacl file:
      <?xml version="1.0"?>
      <gacl version="0.0.1">
      <entry>
      <any-user>
      </any-user>
      <deny><exec/></deny>
      </entry>
      <entry>
      <voms>
      <fqan>dteam</fqan>
      </voms>
      <deny><exec/></deny>
      </entry>
      <entry>
      <person>
      <dn>/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto</dn>
      </person>
      <allow><exec/></allow>
      </entry>
      </gacl>
      replacing "/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto" with your user DN
    • Try to issue some commands (e.g. glite-wms-job-list-match, glite-wms-job-submit, glite-wms-job-delegate-proxy, etc...) towards that WMS with your dteam VO proxy. They should succeed
 
Added:
>
>
    • Premises:
      • Current memory usage by CREAM is logged in the ice log file ($GLITE_WMS_LOCATION_VAR/log/ice.log) in rows such as this one:
        2008-07-28 16:13:23,068 DEBUG - glite-wms-ice::main() - Used RSS Memory: 9780
      • The memory threshold is defined in the ICE section of the WMS conf file by the attribute max_ice_mem
      • When 'current memory' > 'memory threshold' the suicidal patch is triggered
    • Try to trigger the suicidal patch several times, editing the WMS conf file and setting a low enough value for max_ice_mem (see the sketch below)
    • Restart ice: /opt/glite/etc/init.d/glite-wms-ice restart
    • When the suicidal patch is triggered, verify that ICE is properly shut down. You will see something like this in the log file:
      2008-07-28 16:45:27,591 FATAL - glite-wms-ice::main() - glite-wms-ice::main -
      Max memory reached [10532 kB] ! EXIT!
    • Then verify that after a while (5 min) ICE restarts
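      A sketch of the relevant part of the ICE section; the value is hypothetical, just low enough to be reached quickly (the unit is kB, as suggested by the log messages above):
       ICE = [
           // ...other attributes unchanged...
           max_ice_mem = 5000;
       ];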
 
Added:
>
>
    • Proceed as in the previous bug: #38816
 
Added:
>
>
    • You need to check the code of $GLITE_WMS_LOCATION/sbin/glite-wms-purgeStorage.sh as specified in the bug
 

Added:
>
>
    • Submit a job through ICE (use this requirement: Requirements =  RegExp("cream",other.GlueCEUniqueID);)
    • Remove the job directory from the WMS
    • Check in the log if ICE figures out that the proxy has disappeared.
 
Changed:
<
<
>
>
  -- AlessioGianelle - 27 Jun 2008
 