---+ TESTS

   * =Normal= jobs work: %GREEN%OK%ENDCOLOR%
   * =Dag= jobs work: %GREEN%OK%ENDCOLOR%
   * =Perusal= jobs work: %GREEN%OK%ENDCOLOR%
   * =MPICH= jobs work: %GREEN%OK%ENDCOLOR% <verbatim>
Modified mpirun: Executing command: /home/dteam029/globus-tmp.griditwn03.7486.0/https_3a_2f_2fdevel17.cnaf.infn.it_3a9000_2fOsslm3cw4T7lgR09qJTR4g/cpi
Process 0 of 1 on griditwn03.na.infn.it
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 10.001266
</verbatim>
   * Submission of 270 collections of 100 jobs each (10 collections every 30 minutes), using 1 user and a fuzzy rank (90 LCG CEs used):
      * Success > 99.99% %GREEN%OK%ENDCOLOR%
      * About 1800 jobs were cancelled due to a problem with the CEs at in2p3.fr

---++ Check bugs:

   * BUG [[https://savannah.cern.ch/bugs/?13494][#13494]]: %GREEN%FIXED%ENDCOLOR%
      * Checked by Laurence Field and the ARC developers
   * BUG [[https://savannah.cern.ch/bugs/?16308][#16308]]: %GREEN%FIXED%ENDCOLOR%
      * Set the two parameters =subscription_update_threshold_time= and =subscription_duration= in the ICE section of the glite_wms.conf file to low values, such as the following, so that a subscription expires after 5 minutes: <verbatim>
subscription_duration = 300;
subscription_update_threshold_time = 150;
</verbatim>
      * Re-start ICE with the script =/opt/glite/etc/init.d/glite-wms-ice=
      * Submit a job and check the status of the subscription with the following command of the CREAM client: <verbatim>
CEMonitorSubscriberMgr <cert_proxy> <cert_path> <service_URL_address>
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?21909][#21909]]: %GREEN%FIXED%ENDCOLOR%
      * Check whether the wmproxy startup script contains these lines: <verbatim>
if ( /sbin/pidof $httpd ) >/dev/null 2>&1 ; then
    echo $httpd \(pid `/sbin/pidof $httpd`\) is running ....
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?23443][#23443]]: %GREEN%FIXED%ENDCOLOR%
      * The required documents had not been put into the gLite doc template in EDMS
   * BUG [[https://savannah.cern.ch/bugs/?24173][#24173]]: %GREEN%FIXED%ENDCOLOR%
      * Check whether the JobWrapper contains these lines: <verbatim>
for edg_rm_command in $GLITE_LOCATION/bin/edg-rm $EDG_LOCATION/bin/edg-rm `which edg-rm 2>/dev/null`; do
[...]
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?24690][#24690]]: %ORANGE%NOT COMPLETELY FIXED%ENDCOLOR%
      * The error message that you may find in the wmproxy log (also with level 5) is: *edg_wll_JobStat GSSAPI Error*
      * In any case there is now a dedicated cron script to renew the host proxy (i.e. it is no longer included in the cron-purger script)
   * BUG [[https://savannah.cern.ch/bugs/?26885][#26885]]: %GREEN%FIXED%ENDCOLOR%
      * Checked with two subsequent submissions of 5 collections of 50 nodes each. ICE does not leave any job with status UNKNOWN behind in the cache
   * BUG [[https://savannah.cern.ch/bugs/?27215][#27215]]: %GREEN%FIXED (for an LCG-CE)%ENDCOLOR%; %RED%NOT fixed for a CREAM-CE%ENDCOLOR%
      * Set the parameter =MaxOutputSandboxSize= in the WorkloadManager section of the configuration file /opt/glite/etc/glite_wms.conf on the WMS to 100 and restart the workload manager.
      * Submit a jdl like this: <verbatim>
[
  Type = "Job";
  Executable = "27215_exe.sh";
  Arguments = "70";
  StdOutput = "test.out";
  StdError = "test.err";
  Environment = {"GLITE_LOCAL_MAX_OSB_SIZE=35"};
  InputSandbox = {"27215_exe.sh"};
  OutputSandbox = {"test.err","test.out","out2","out1"};
  usertags = [ bug = "27215" ];
]
</verbatim> where 27215_exe.sh contains: <verbatim>
#!/bin/sh
MAX=$1
i=0
while [ $i -lt $MAX ]; do
    echo -n "1" >> out1
    echo -n "2" >> out2
    i=$[$i + 1]
done
</verbatim>
      * When Done, retrieve the output files; this should be the result of an =ls -l= of the output dir: <verbatim>
-rw-rw-r--  1 ale ale 30 Jul  8 16:02 out1.tail
-rw-rw-r--  1 ale ale 70 Jul  8 16:02 out2
-rw-rw-r--  1 ale ale  0 Jul  8 16:02 test.err
-rw-rw-r--  1 ale ale  0 Jul  8 16:02 test.out
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?27797][#27797]]: %GREEN%FIXED%ENDCOLOR%
      * Submit a jdl like this one: <verbatim>
[
  JobType = "parametric";
  Executable = "/usr/bin/env";
  Environment = {"MYPATH_PARAM_=$PATH:/bin:/usr/bin:$HOME"};
  StdOutput = "echo_PARAM_.out";
  StdError = "echo_PARAM_.err";
  OutputSandbox = {"echo_PARAM_.out","echo_PARAM_.err"};
  Parameters = {test, 2};
]
</verbatim>
      * The generated jdl should contain: <verbatim>
[
  requirements = other.GlueCEStateStatus == "Production";
  nodes = [
    dependencies = { };
    Node_test = [ ... ];
    Node_2 = [ ... ];
  [...]
]
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?27899][#27899]]: %GREEN%FIXED%ENDCOLOR%
      * Edit the configuration file in =/opt/glite/etc/<vo_name>/glite_wmsclient.conf= by changing the =virtualorganisation= attribute in the JdlDefaultAttributes section to a <vo_name> different from the one used to generate the user proxy, as in the following:
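A minimal sketch of that section (the layout is assumed; =someothervo= is a placeholder for any VO different from the proxy's, and the other attributes of the section are elided): <verbatim>
JdlDefaultAttributes = [
    virtualorganisation = "someothervo";  // placeholder: any VO different from the proxy's VO
];
</verbatim>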
      * Submit a job and check that the generated .jdl has the right =virtualorganisation= defined, i.e. the same <vo_name> as the one used to generate the user proxy
   * BUG [[https://savannah.cern.ch/bugs/?28235][#28235]]: %GREEN%FIXED%ENDCOLOR%
      * Change the jdl by setting the name of an existing CE in the requirements (see the sketch below this entry)
      * At the end of the job, with this command: ==glite-wms-job-logging-info -v 2 "jobid" | grep -A 2 Match | grep Dest== you should see the name of the previously chosen CE 3 times. (The job must be Aborted with reason: hit job shallow retry count (2))
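A possible requirements expression for this test (a sketch: the CE name is a placeholder to be replaced with a real, existing CE): <verbatim>
// placeholder CE: use any CE currently published in the information system
Requirements = other.GlueCEUniqueID == "some.ce.example.org:2119/jobmanager-lcgpbs-dteam";
ShallowRetryCount = 2;
</verbatim>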
   * BUG [[https://savannah.cern.ch/bugs/?28249][#28249]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Bug posted by the developer
   * BUG [[https://savannah.cern.ch/bugs/?28498][#28498]]: %GREEN%FIXED%ENDCOLOR%
      * Compilation error with gcc-4.x
   * BUG [[https://savannah.cern.ch/bugs/?28637][#28637]]: %GREEN%FIXED%ENDCOLOR%
      * Create a delegated proxy with ==glite-wms-job-delegate-proxy -d pippo== on the wmproxy server of the WMS machine
      * Submit a job to a CREAM CE via the WMS using the previously created delegated proxy: ==glite-wms-job-submit myjob.jdl -d pippo==
      * Delete the records for the user DN you are submitting with from the delegation DB on the CREAM CE, like the following: <verbatim>
delete from t_credential where dn like '%Elisabetta%';
delete from t_credential_cache where dn like '%Elisabetta%';
</verbatim>
      * Submit a new normal job using the same delegated proxy as above
   * BUG [[https://savannah.cern.ch/bugs/?28642][#28642]]: %GREEN%FIXED%ENDCOLOR%
      * Submit this jdl: <verbatim>
[
  Executable = "/usr/bin/env";
  StdOutput = "env.out";
  StdError = "env.err";
  ShallowRetryCount = 2;
  InputSandbox = { "data/input.txt" };
  OutputSandbox = { "env.out","env.err","input.txt" };
  Environment = {"LD_LIBRARY_PATH=."};
  usertags = [ bug = "28642" ];
]
</verbatim>
      * Get the output of the job. In the output directory you should find the file input.txt, and LD_LIBRARY_PATH should be set to "." in the file env.out
   * BUG [[https://savannah.cern.ch/bugs/?28657][#28657]]: %GREEN%FIXED%ENDCOLOR%
      * Stop ICE: =/opt/glite/etc/glite-wms-ice stop=
      * Corrupt the ICE database, e.g. by doing the following: <verbatim>
for each file (all but *proxy*) in /var/glite/ice/persist_dir do:
    cat "pippo" > "file"
done
</verbatim>
      * Start ICE: ==/opt/glite/etc/glite-wms-ice start==
      * In the ICE log file you should see something like: <verbatim>
2008-07-29 12:44:00,537 FATAL - jobCache::jobCache() - Failed to initialize the jobDbManager object. Reason is: Db::open: Invalid argument
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?29182][#29182]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Not easy to reproduce
   * BUG [[https://savannah.cern.ch/bugs/?29538][#29538]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Bug posted by the developer
   * BUG [[https://savannah.cern.ch/bugs/?30289][#30289]]: %GREEN%FIXED%ENDCOLOR%
      * Fixed by not using 'clog'
   * BUG [[https://savannah.cern.ch/bugs/?30308][#30308]]: %GREEN%FIXED%ENDCOLOR%
      * Submit this jdl: <verbatim>
[
  requirements = ( other.GlueCEStateStatus == "Production" ) && Member("MPICH",other.GlueHostApplicationSoftwareRunTimeEnvironment) && ( other.GlueCEInfoTotalCPUs >= 4 ) && ( other.GlueCEInfoLRMSType == "torque" || RegExp("pbs",other.GlueCEInfoLRMSType) );
  Type = "Job";
  NodeNumber = 4;
  Executable = "30308_exe.sh";
  Arguments = "cpi 4";
  StdOutput = "test.out";
  StdError = "test.err";
  InputSandbox = {"30308_exe.sh", "exe/cpi"};
  OutputSandbox = {"test.err","test.out","executable.out"};
  usertags = [ bug = "30308" ];
]
</verbatim> where 30308_exe.sh should be: <verbatim>
#!/bin/sh
# The first parameter is the binary to be executed
EXE=$1
# The second parameter is the number of CPUs to be reserved for parallel execution
CPU_NEEDED=$2
chmod 777 $EXE
# print the list of files in the working directory
echo "List files on the working directory:"
ls -alR `pwd`
# execute the user job
mpirun -np $CPU_NEEDED -machinefile $PBS_NODEFILE `pwd`/$EXE >& executable.out
</verbatim>
      * When DONE, retrieve the output and check that the directory ==.mpi== is not listed in the test.out output file
   * BUG [[https://savannah.cern.ch/bugs/?30518][#30518]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Not easy to reproduce
   * BUG [[https://savannah.cern.ch/bugs/?30816][#30816]]: %GREEN%FIXED%ENDCOLOR%
      * Already fixed and working on the production WMS using patch [[https://savannah.cern.ch/patch/?1491][#1491]]
   * BUG [[https://savannah.cern.ch/bugs/?30896][#30896]]: %GREEN%FIXED%ENDCOLOR%
      * Set =maxInputSandboxFiles = 2;= in the WorkloadManagerProxy section of the configuration file on the WMS, and restart the wmproxy
      * Submit a job with more than 2 files listed in the =InputSandbox= parameter (see the sketch after this entry)
      * Check that the job is immediately set as Aborted and that the reason of the status is: ==The Operation is not allowed: The maximum number of input sandbox files is reached==
      * Set =maxOutputSandboxFiles = 2;= in the WorkloadManagerProxy section of the configuration file on the WMS and restart the wmproxy
      * Submit a job with more than 2 files listed in the =OutputSandbox= parameter
      * Check that the job is immediately set as Aborted and that the reason of the status is: ==The Operation is not allowed: The maximum number of output sandbox files is reached==
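A possible jdl for the two checks above (a sketch: the file names are placeholders; the three-entry sandboxes exceed the limit of 2 set in the configuration): <verbatim>
[
  Executable = "/bin/hostname";
  StdOutput = "std.out";
  StdError = "std.err";
  // three input files: one more than maxInputSandboxFiles = 2
  InputSandbox = { "file1.txt", "file2.txt", "file3.txt" };
  // three output files: one more than maxOutputSandboxFiles = 2
  OutputSandbox = { "std.out", "std.err", "file1.txt" };
]
</verbatim>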
   * BUG [[https://savannah.cern.ch/bugs/?30900][#30900]]: %GREEN%FIXED%ENDCOLOR%
      * The default value in the WMS conf file is again: =MinPerusalTimeInterval = 5;= -> this should be set by yaim (see [[https://savannah.cern.ch/bugs/?30900][#30900]])
      * The default value of =MinPerusalTimeInterval= should be checked in the configuration file of the WMS
      * Set =MaxPerusalFiles = 1;= in the WorkloadManagerProxy section of the configuration file on the WMS and restart the wmproxy
      * After the submission of the jdl, give this command: ==glite-wms-job-perusal --set -f perusal.out -f perusal.err "jobid"==. The answer should be: <verbatim>
Error - WMProxy Server Error
The Operation is not allowed: The maximum number of perusal files is reached
Method: enableFilePerusal
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?31006][#31006]]: %GREEN%Hopefully FIXED%ENDCOLOR%
      * Not easy to reproduce
   * BUG [[https://savannah.cern.ch/bugs/?31026][#31026]]: %GREEN%FIXED%ENDCOLOR%
      * Simply check the =/opt/glite/etc/templates/template.sh= file on a WMS
   * BUG [[https://savannah.cern.ch/bugs/?31278][#31278]]: %GREEN%FIXED%ENDCOLOR%
      * Using the command ==glite-wms-job-info --jdl "jobid" | grep -i requirements== check whether the expression =RegExp(".*sdj$",other.GlueCEUniqueID)= is present (the exact expression can be found in the configuration file on the WMS, section WorkloadManagerProxy, parameter SDJRequirements)
      * Setting =ShortDeadlineJob=false;= in the jdl, the output of the previous command should contain the expression =!RegExp(".*sdj$",other.GlueCEUniqueID)=
   * BUG [[https://savannah.cern.ch/bugs/?32078][#32078]]: %GREEN%FIXED%ENDCOLOR%
      * Set in the WMS conf file: =II_Contact = "lcg-bdii.cern.ch";=
      * Do a list-match using this jdl: <verbatim>
[
  Requirements = RegExp(".manchester.ac.uk:2119.*",other.GlueCEUniqueID) && anyMatch(other.storage.CloseSEs,target.GlueSEStatus == "unset");
  Executable = "/bin/ls";
  prologue = "/bin/false";
]
</verbatim> the output should be: <verbatim>
 - ce01.tier2.hep.manchester.ac.uk:2119/jobmanager-lcgpbs-dteam
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?32345][#32345]]: %GREEN%FIXED%ENDCOLOR%
      * Reproduced the problem by inserting a 500-second sleep in the dirmanager and killing it by hand while unzipping the ISB. The job stays in status 'Waiting' and is not forwarded to the WM
   * BUG [[https://savannah.cern.ch/bugs/?32366][#32366]]: %GREEN%FIXED%ENDCOLOR%
      * Check whether the file =/opt/glite/etc/templates/template.sh= on a WMS contains these lines: <verbatim>
# customization point
if [ -n "${GLITE_LOCAL_CUSTOMIZATION_DIR}" ]; then
    if [ -f "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh" ]; then
        . "${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh"
    fi
fi
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?32528][#32528]]: %GREEN%FIXED%ENDCOLOR%
      * Set a very low timeout for the BDII in the WMS conf file: =II_Timeout = 3;=
      * Setting =IsmIILDAPSearchAsync = false;= in the WMS conf file, you should see in the log file of the workload_manager (if you use a populated BDII): <verbatim>
[Warning] fetch_bdii_ce_info(ldap-utils.cpp:640): Timed out
[Warning] fetch_bdii_se_info(ldap-utils.cpp:308): Timed out
[Debug] do_purchase(ism-ii-purchaser.cpp:176): BDII fetching completed in 4 seconds
[Info] do_purchase(ism-ii-purchaser.cpp:193): Total VO_Views entries in ISM : 0
[Info] do_purchase(ism-ii-purchaser.cpp:194): Total SE entries in ISM : 0
</verbatim>
      * Setting =IsmIILDAPSearchAsync = true;= you should obtain more (>0) VO_Views entries, e.g.: <verbatim>
[Debug] fetch_bdii_ce_info(ldap-utils-asynch.cpp:628): #1652 LDAP entries received in 5 seconds
[Debug] fetch_bdii_ce_info(ldap-utils-asynch.cpp:781): ClassAd reppresentation built in 0 seconds
[Debug] fetch_bdii_se_info(ldap-utils-asynch.cpp:444): #2381 LDAP entries received in 5 seconds
[Debug] fetch_bdii_se_info(ldap-utils-asynch.cpp:504): ClassAd reppresentation built in 0 seconds
[Debug] do_purchase(ism-ii-purchaser.cpp:176): BDII fetching completed in 10 seconds
[Info] do_purchase(ism-ii-purchaser.cpp:193): Total VO_Views entries in ISM : 53
[Info] do_purchase(ism-ii-purchaser.cpp:194): Total SE entries in ISM : 61
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?32962][#32962]]: %GREEN%FIXED%ENDCOLOR%
   * BUG [[https://savannah.cern.ch/bugs/?32980][#32980]]: %GREEN%FIXED%ENDCOLOR%
      * Submit a jdl
      * Look into the SandBox dir of the job (on the WMS) until you see the =Maradona= file
      * Put the Condor job (equivalent to your previously submitted job) on hold; this should trigger a resubmission (see the sketch after this entry)
      * When the job has been resubmitted, check whether the old =Maradona= file has been removed
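One way to put the Condor job on hold on the WMS (a sketch, assuming the Condor commands are in the path and the cluster id of the job can be identified from the queue): <verbatim>
# list the jobs currently managed by Condor and find the one
# corresponding to your submission (1234 is a placeholder cluster id)
condor_q
# put it on hold: the WMS should then resubmit the job
condor_hold 1234
</verbatim>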
"${GLITE_LOCAL_CUSTOMIZATION_DIR}/cp_1_5.sh" fi fi</verbatim> * BUG [[https://savannah.cern.ch/bugs/?32528][#32528]]: %GREEN%FIXED%ENDCOLOR% * Set a very low timeout for the BDII on the WMS conf file: =II_Timeout = 3;= * Now setting on the WMS conf file: =IsmIILDAPSearchAsync = false;= * You should see in the log file of the workload_manager (if yuo use a populate BDII):<verbatim> [Warning] fetch_bdii_ce_info(ldap-utils.cpp:640): Timed out [Warning] fetch_bdii_se_info(ldap-utils.cpp:308): Timed out [Debug] do_purchase(ism-ii-purchaser.cpp:176): BDII fetching completed in 4 seconds [Info] do_purchase(ism-ii-purchaser.cpp:193): Total VO_Views entries in ISM : 0 [Info] do_purchase(ism-ii-purchaser.cpp:194): Total SE entries in ISM : 0</verbatim> * Setting: =IsmIILDAPSearchAsync = true:= you should obtain more (>0) VO_Views entries (e.g.):<verbatim> [Debug] fetch_bdii_ce_info(ldap-utils-asynch.cpp:628): #1652 LDAP entries received in 5 seconds [Debug] fetch_bdii_ce_info(ldap-utils-asynch.cpp:781): ClassAd reppresentation built in 0 seconds [Debug] fetch_bdii_se_info(ldap-utils-asynch.cpp:444): #2381 LDAP entries received in 5 seconds [Debug] fetch_bdii_se_info(ldap-utils-asynch.cpp:504): ClassAd reppresentation built in 0 seconds [Debug] do_purchase(ism-ii-purchaser.cpp:176): BDII fetching completed in 10 seconds [Info] do_purchase(ism-ii-purchaser.cpp:193): Total VO_Views entries in ISM : 53 [Info] do_purchase(ism-ii-purchaser.cpp:194): Total SE entries in ISM : 61</verbatim> * BUG [[https://savannah.cern.ch/bugs/?32962][#32962]]: %GREEN%FIXED%ENDCOLOR% * BUG [[https://savannah.cern.ch/bugs/?32980][#32980]]: %GREEN%FIXED%ENDCOLOR% * Submit a jdl * Look into the SandBox dir of the job (on the WMS) until you see the =Maradona= file * Put the condor job (equivalent to your job previously submitted) on hold, this should trigger a resubmission * When the job has been resubmitted check if the old =Maradona= file has been removed * BUG [[https://savannah.cern.ch/bugs/?33026][#33026]]: %GREEN%FIXED%ENDCOLOR% * Set the II_Timeout parameter in the NetworkServr section of the glite_wms.conf file on the WMS to a very low value, as for ex.: =II_Timeout = 2;= * Rre-start the WM and check the =$GLITE_WMS_LOCATION_VAR/workload_manager/ismdump.fl= does not get emptied * Perform some job-list-match operation checking that it gets some match results * BUG [[https://savannah.cern.ch/bugs/?33103][#33103]]: %GREEN%FIXED%ENDCOLOR%: * Add this parameter to section WorkloadManager of the glite_wms.conf configuration file (using for example vo "cms" as filter): ==IsmIILDAPCEFilterExt = "(|(GlueCEAccessControlBaseRule=VO:cms)(GlueCEAccessControlBaseRule=VOMS:/cms/*))"== * Restart the WM * Doing a list-match using a voms proxy of a different VO (e.g. dteam) you should obtain "no resource available". * BUG [[https://savannah.cern.ch/bugs/?33140][#33140]]: %GREEN%Hopefully FIXED%ENDCOLOR% * Not easy to reproduce * BUG [[https://savannah.cern.ch/bugs/?33378][#33378]]: %GREEN%FIXED%ENDCOLOR% * Removed if present the directory =$GLITE_WMS_LOCATION_VAR/workload_manager/jobdir= on the WMS * Restart the wm and check id the previous directory is recreated. * BUG [[https://savannah.cern.ch/bugs/?34508][#34508]]: %GREEN%FIXED%ENDCOLOR% * Stop the WM on the WMS. 
   * BUG [[https://savannah.cern.ch/bugs/?33140][#33140]]: %GREEN%Hopefully FIXED%ENDCOLOR%
      * Not easy to reproduce
   * BUG [[https://savannah.cern.ch/bugs/?33378][#33378]]: %GREEN%FIXED%ENDCOLOR%
      * Remove, if present, the directory =$GLITE_WMS_LOCATION_VAR/workload_manager/jobdir= on the WMS
      * Restart the WM and check whether the previous directory is recreated
   * BUG [[https://savannah.cern.ch/bugs/?34508][#34508]]: %GREEN%FIXED%ENDCOLOR%
      * Stop the WM on the WMS
      * Submit a collection
      * Restart the WM
      * Check whether the status of the collection changes to Running
   * BUG [[https://savannah.cern.ch/bugs/?34510][#34510]]: %GREEN%FIXED%ENDCOLOR%
      * Set the "ExpiryPeriod" parameter in the glite_wms.conf configuration file to a very low value, such as: =ExpiryPeriod = 2;=
      * Overload the WMS by submitting several collections sequentially, for example 10 collections of 100 nodes each (see the sketch after this entry)
      * Check the job status of the last submitted collections and keep submitting until the status of the parent node is Aborted because of the following: <verbatim>
*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://devel17.cnaf.infn.it:9000/qQe68ESYiRNDNXZPNsG-AA
Current Status:     Aborted
Status Reason:      request expired
Submitted:          Wed Jul 30 11:23:49 2008 CEST
*************************************************************
</verbatim>
      * Stop submitting and check whether the status of all the children nodes is Aborted as well
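A collection jdl for generating the load could look like this (a sketch: the node definition is a placeholder to be repeated up to 100 times; alternatively, if your UI version supports it, a directory of per-node jdl files can be submitted in bulk with ==glite-wms-job-submit -a --collection <dir_with_jdls>==): <verbatim>
[
  Type = "collection";
  // placeholder nodes: repeat the same node entry up to 100 times
  nodes = {
    [ Executable = "/bin/hostname"; StdOutput = "node.out"; StdError = "node.err"; OutputSandbox = {"node.out","node.err"}; ],
    [ Executable = "/bin/hostname"; StdOutput = "node.out"; StdError = "node.err"; OutputSandbox = {"node.out","node.err"}; ]
  };
]
</verbatim>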
   * BUG [[https://savannah.cern.ch/bugs/?35156][#35156]]: %GREEN%FIXED%ENDCOLOR%
      * Check whether the proxy file name is hardcoded in =$GLITE_WMS_LOCATION/sbin/glite-wms-purgeStorage.sh=
   * BUG [[https://savannah.cern.ch/bugs/?35250][#35250]]: %GREEN%FIXED%ENDCOLOR%
      * Test it for the filelist input method by setting the following two parameters in the glite_wms.conf WMS configuration file, WorkloadManager section: <verbatim>
DispatcherType = "filelist";
Input = "${GLITE_LOCATION_VAR}/workload_manager/input.fl";
</verbatim>
      * Re-start the WM and submit a DAG
      * Check whether it is successful
      * Test it for the jobdir input method by setting the following two parameters in the glite_wms.conf WMS configuration file, WorkloadManager section: <verbatim>
DispatcherType = "jobdir";
Input = "${GLITE_LOCATION_VAR}/workload_manager/jobdir";
</verbatim>
      * Re-start the WM and submit a DAG
      * Check whether it is successful
   * BUG [[https://savannah.cern.ch/bugs/?35878][#35878]]: %GREEN%FIXED%ENDCOLOR%
      * Compilation error with gcc-4.x
   * BUG [[https://savannah.cern.ch/bugs/?36145][#36145]]: %GREEN%FIXED%ENDCOLOR%
      * Set the following two parameters in the glite_wms.conf WMS configuration file, WorkloadManager section: <verbatim>
DispatcherType = "jobdir";
Input = "${GLITE_LOCATION_VAR}/workload_manager/jobdir";
</verbatim>
      * Re-start the WM and submit a DAG
      * Check whether it is successful
   * BUG [[https://savannah.cern.ch/bugs/?36341][#36341]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Bug posted by the developer
   * BUG [[https://savannah.cern.ch/bugs/?36466][#36466]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Bug posted by the developer
   * BUG [[https://savannah.cern.ch/bugs/?36496][#36496]]: %GREEN%FIXED%ENDCOLOR%
      * Consider this =/opt/glite/etc/glite_wms_wmproxy.gacl= file: <verbatim>
<?xml version="1.0"?>
<gacl version="0.0.1">
  <entry>
    <any-user/>
    <allow><exec/></allow>
  </entry>
</gacl>
</verbatim>
      * Restart the wmproxy: =/opt/glite/etc/init.d/glite-wms-wmproxy restart=
      * Try to issue some commands (e.g. glite-wms-job-list-match, glite-wms-job-submit, glite-wms-job-delegate-proxy, etc.) towards that WMS. They should succeed with any proxy
   * BUG [[https://savannah.cern.ch/bugs/?36536][#36536]]: %GREEN%FIXED%ENDCOLOR%
      * Submitted a normal job
      * Waited until it finished successfully
      * Checked that the job record is in the LBProxy mysql DB (e.g.: =mysql# select * from jobs where jobid like '%hLrG4YYebvYB0xsrPO4q8A%';= where https://devel17.cnaf.infn.it:9000/hLrG4YYebvYB0xsrPO4q8A is the jobid)
      * Retrieved the output via 'glite-wms-job-output'
      * Checked that the job record is no longer in the LBProxy mysql DB (e.g.: the previous query should return =Empty set=)
   * BUG [[https://savannah.cern.ch/bugs/?36551][#36551]]: %GREEN%FIXED%ENDCOLOR%
   * BUG [[https://savannah.cern.ch/bugs/?36558][#36558]]: %GREEN%FIXED%ENDCOLOR%
      * Check whether the syslog contains lines like: <verbatim>
May 14 12:37:12 trinity glite_wms_wmproxy_server[3633]: ts=2008-05-14T12:37:12 : event=wms.wmpserver_setJobFileSystem() : userid=502 jobid=https://devel15.cnaf.infn.it:9000/J...
</verbatim> i.e. =userid= is specified
   * BUG [[https://savannah.cern.ch/bugs/?36870][#36870]]: %GREEN%FIXED%ENDCOLOR%
      * Fixed by removing the spec file
   * BUG [[https://savannah.cern.ch/bugs/?36876][#36876]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Bug posted by the developer
   * BUG [[https://savannah.cern.ch/bugs/?36902][#36902]]: %GREEN%FIXED%ENDCOLOR%
      * Check whether this file is on the WMS: =/etc/cron.d/glite-wms-create-host-proxy.cron= <verbatim>
HOME=/
MAILTO=SA3-italia

0 */6 * * * glite . /opt/glite/etc/profile.d/grid-env.sh ; /opt/glite/sbin/glite-wms-create-proxy.sh /opt/glite/var/wms.proxy /opt/glite/var/log/create_proxy.log
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?36907][#36907]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Not easy to reproduce
   * BUG [[https://savannah.cern.ch/bugs/?36913][#36913]]: %GREEN%FIXED%ENDCOLOR%
   * BUG [[https://savannah.cern.ch/bugs/?36962][#36962]]: %GREEN%FIXED%ENDCOLOR%
   * BUG [[https://savannah.cern.ch/bugs/?37659][#37659]]: %GREEN%Hopefully FIXED%ENDCOLOR%
   * BUG [[https://savannah.cern.ch/bugs/?37674][#37674]]: %GREEN%Hopefully FIXED%ENDCOLOR%
      * Not easy to reproduce
   * BUG [[https://savannah.cern.ch/bugs/?37756][#37756]]: %ORANGE%NOT COMPLETELY FIXED%ENDCOLOR%
      * Tested using a short proxy to submit a longer job: ICE does not resubmit it, but afterwards the status is not updated to Done by ICE, due to another bug [[https://savannah.cern.ch/bugs/?39807][#39807]]
   * BUG [[https://savannah.cern.ch/bugs/?37862][#37862]]: %GREEN%FIXED%ENDCOLOR%
      * Do this check: <verbatim>
[root@wms008 init.d]# grep GLITE_LOCATION glite-wms-ice
GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?37916][#37916]]: %GREEN%Hopefully fixed%ENDCOLOR%
      * Bug posted by the developer
   * BUG [[https://savannah.cern.ch/bugs/?38359][#38359]]: %GREEN%FIXED%ENDCOLOR%
      * Set the parameter =MaxOutputSandboxSize= in the WorkloadManager section of the configuration file /opt/glite/etc/glite_wms.conf on the WMS to 100 and restart the workload manager
      * Submit this jdl: <verbatim>
[
  Type = "Job";
  Executable = "38359_exe.sh";
  Arguments = "50";
  StdOutput = "test.out";
  StdError = "test.err";
  InputSandbox = {"38359_exe.sh"};
  OutputSandbox = {"test.err","test.out","out3","out1","out4","out2"};
  usertags = [ bug = "38359" ];
]
</verbatim> where 38359_exe.sh is: <verbatim>
#!/bin/sh
MAX=$1
i=0
while [ $i -lt $MAX ]; do
    echo -n "1" >> out1
    echo -n "2" >> out2
    echo -n "3" >> out3
    echo -n "4" >> out4
    i=$[$i + 1]
done
i=200
while [ $i -lt 100 ]; do
    echo -n "1" >> out1
    echo -n "2" >> out2
    echo -n "3" >> out3
    echo -n "4" >> out4
    i=$[$i + 1]
done
</verbatim>
      * When Done, retrieve the output files; this should be the result of an =ls -l= of the output dir: <verbatim>
-rw-rw-r--  1 ale ale 50 Jul  8 12:06 out1
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 out2.tail
-rw-rw-r--  1 ale ale 50 Jul  8 12:06 out3
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 out4.tail
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 test.err
-rw-rw-r--  1 ale ale  0 Jul  8 12:06 test.out
</verbatim>
   * BUG [[https://savannah.cern.ch/bugs/?38366][#38366]]: %GREEN%FIXED%ENDCOLOR%
      * Log on the WMS. Stop the workload manager. Put this list-match request in the directory =$GLITE_WMS_LOCATION_VAR/workload_manager/jobdir/new/=: <verbatim>
[root@devel19 glite]# cat /var/glite/workload_manager/jobdir/tmp/20080625T133135.906497_3085874880
[
  arguments = [
    ad = [
      requirements = ( other.GlueCEStateStatus == "Production" || other.GlueCEStateStatus == "CREAMPreCertTests" ) && !RegExp(".*sdj$",other.GlueCEUniqueID);
      RetryCount = 3;
      Arguments = "/tmp";
      MyProxyServer = "myproxy.cnaf.infn.it";
      AllowZippedISB = true;
      JobType = "normal";
      InputSandboxDestFileName = { "pippo","pluto" };
      SignificantAttributes = { "Requirements","Rank" };
      FuzzyRank = true;
      Executable = "/bin/ls";
      CertificateSubject = "/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle";
      X509UserProxy = "/tmp/user.proxy.6056.20080625153135905";
      Stdoutput = "ls.out";
      VOMS_FQAN = "/dteam/Role=NULL/Capability=NULL";
      OutputSandbox = { "ls.out" };
      VirtualOrganisation = "dteam";
      usertags = [ exe = "ls" ];
      rank = -other.GlueCEStateEstimatedResponseTime;
      Type = "job";
      ShallowRetryCount = 3;
      InputSandbox = { "protocol://address/input/pippo","protocol://address/input/pluto" };
      Fuzzyparameter = 1.000000000000000E-01
    ];
    include_brokerinfo = false;
    file = "/tmp/6056.20080625153135905";
    number_of_results = -1
  ];
  command = "match";
  version = "1.0.0"
]
</verbatim>
      * Start the workload manager and check whether it works
   * BUG [[https://savannah.cern.ch/bugs/?38509][#38509]]: %GREEN%FIXED%ENDCOLOR%
   * BUG [[https://savannah.cern.ch/bugs/?38739][#38739]]: %GREEN%FIXED%ENDCOLOR%
      * Consider this =/opt/glite/etc/glite_wms_wmproxy.gacl= file, replacing "/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto" with your user DN: <verbatim>
<?xml version="1.0"?>
<gacl version="0.0.1">
  <entry>
    <any-user>
    </any-user>
    <deny><exec/></deny>
  </entry>
  <entry>
    <voms>
      <fqan>dteam</fqan>
    </voms>
    <deny><exec/></deny>
  </entry>
  <entry>
    <person>
      <dn>/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Massimo Sgaravatto</dn>
    </person>
    <allow><exec/></allow>
  </entry>
</gacl>
</verbatim>
      * Try to issue some commands (e.g. glite-wms-job-list-match, glite-wms-job-submit, glite-wms-job-delegate-proxy, etc.) towards that WMS with your dteam VO proxy. They should succeed
   * BUG [[https://savannah.cern.ch/bugs/?38816][#38816]]: %GREEN%FIXED%ENDCOLOR%
      * Premises:
         * The current memory usage of ICE is logged in the ICE log file ($GLITE_WMS_LOCATION_VAR/log/ice.log) in rows such as this one: <verbatim>
2008-07-28 16:13:23,068 DEBUG - glite-wms-ice::main() - Used RSS Memory: 9780
</verbatim>
         * The memory threshold is defined in the ICE section of the WMS conf file by the attribute =max_ice_mem=
         * When 'current memory' > 'memory threshold' the suicidal patch is triggered
      * Try to trigger the suicidal patch several times, editing the WMS conf file and setting a low enough value for =max_ice_mem= (see the sketch after this entry)
      * Restart ICE: =/opt/glite/etc/init.d/glite-wms-ice restart=
      * When the suicidal patch is triggered, verify that ICE is properly shut down. You should see something like this in the log file: <verbatim>
2008-07-28 16:45:27,591 FATAL - glite-wms-ice::main() - glite-wms-ice::main - Max memory reached [10532 kB] ! EXIT!
</verbatim>
      * Then verify that after a while (5 min) ICE restarts
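For example, in the ICE section of glite_wms.conf (a sketch: 5000 is only an illustrative value, to be chosen below the "Used RSS Memory" figure currently reported in ice.log): <verbatim>
max_ice_mem = 5000;  // placeholder threshold in kB, below current ICE usage
</verbatim>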
   * BUG [[https://savannah.cern.ch/bugs/?38828][#38828]]: %GREEN%FIXED%ENDCOLOR%
      * Proceed as in the previous bug: #38816
   * BUG [[https://savannah.cern.ch/bugs/?39215][#39215]]: %GREEN%FIXED%ENDCOLOR%
      * You need to check the code of $GLITE_WMS_LOCATION/sbin/glite-wms-purgeStorage.sh as specified in the bug
   * BUG [[https://savannah.cern.ch/bugs/?39217][#39217]]: %GREEN%FIXED%ENDCOLOR%
   * BUG [[https://savannah.cern.ch/bugs/?39501][#39501]]: %GREEN%FIXED%ENDCOLOR%
      * Submit a job through ICE, using this requirement: =Requirements = RegExp("cream",other.GlueCEUniqueID);= (see the sketch after this entry)
      * Remove the job directory from the WMS
      * Check in the log whether ICE figures out that the proxy has disappeared
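A minimal jdl that is routed through ICE via the requirement above (a sketch: the executable and sandbox names are placeholders): <verbatim>
[
  // matches only CREAM CEs, so the job is handled by ICE
  Requirements = RegExp("cream",other.GlueCEUniqueID);
  Executable = "/bin/hostname";
  StdOutput = "job.out";
  StdError = "job.err";
  OutputSandbox = {"job.out","job.err"};
]
</verbatim>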
   * BUG [[https://savannah.cern.ch/bugs/?40967][#40967]]: %GREEN%FIXED%ENDCOLOR%

-- Main.AlessioGianelle - 27 Jun 2008