Testing report: IGIRTC-18

Summary

  • Product: BLAH 1.16.4
  • Release Task: Task #25000
  • ETICS Subsystem Configuration Name: emi-cream-ce_R_1_13_7_1
  • VCS Tag: emi-blahp_R_1_16_4_1
  • EMI Major Release: EMI 1 (Kebnekaise)
  • Platform: SL5 epel
  • Author: Sara Bertocco, Alessio Gianelle and Roberto Rosende Dopazo
  • Testing report: See here
  • Certification report: See here
  • Outcome: CERTIFIED

Deployment tests

Clean Installation

Upgrade Installation

LSF CE

PBS CE

SGE CE

Unit Tests

Not Available. The plan is to provide some unit tests starting with EMI-2.

System tests

Functionality tests

Test submission

  • Test result for LSF is available here: PASSED
  • Test result for PBS is available here: PASSED
  • Test result for SGE is available here: PASSED

BLParser test

Old BLParser
  • Job which finishes normally
    • LSF PASSED
      [ale@cream-12 UI]$ glite-ce-job-submit -a -r cream-22.pd.infn.it:8443/cream-lsf-cert cream.jdl
      https://cream-22.pd.infn.it:8443/CREAM053260410
      
      [ale@cream-12 UI]$ glite-ce-job-status https://cream-22.pd.infn.it:8443/CREAM053260410
      
      ******  JobID=[https://cream-22.pd.infn.it:8443/CREAM053260410]
        Status        = [DONE-OK]
        ExitCode      = [0]
      
      
      [root@cream-22 ~]# grep 053260410 /var/log/cream/glite-lsfparser.log
      2011-12-20 13:47:29 Adding: ID:551886 Type:BLAHPNAME Value:"cre22_053260410"
      2011-12-20 13:47:29 Sent for Cream:[BatchJobId="551886"; JobStatus=1; BlahJobName="cre22_053260410"; ClientJobId="053260410";  ChangeTime="2011-12-20 13:47:28";]
      2011-12-20 13:47:33 Sent for Cream:[BatchJobId="551886"; JobStatus=2; BlahJobName="cre22_053260410"; ClientJobId="053260410"; WorkerNode="prod-wn-002"; ChangeTime="2011-12-20 13:47:33";]
      2011-12-20 13:49:14 Sent for Cream:[BatchJobId="551886"; JobStatus=4; BlahJobName="cre22_053260410"; ClientJobId="053260410"; WorkerNode="prod-wn-002"; Reason="lsf_reason=0"; ChangeTime="2011-12-20 13:49:14";]
    • PBS/Torque PASSED
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-submit -a -r cream-41.pd.infn.it:8443/cream-pbs-cert test1.jdl
      
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM228600831
      
      ******  JobID=[https://cream-41.pd.infn.it:8443/CREAM228600831]
         Status        = [DONE-OK]
         ExitCode      = [0]
      [root@cream-41 ~]# grep 228600831 /var/log/cream/glite-pbsparser.log
      2011-12-20 15:45:34 Adding: ID:41080 Type:BLAHPNAME Value:cre41_228600831
      2011-12-20 15:45:34 Sent for Cream:[BatchJobId="41080"; JobStatus=1; BlahJobName="cre41_228600831"; ClientJobId="228600831";  ChangeTime="2011-12-20 15:45:34";]
      2011-12-20 15:45:35 Sent for Cream:[BatchJobId="41080"; JobStatus=2; BlahJobName="cre41_228600831"; ClientJobId="228600831";  ChangeTime="2011-12-20 15:45:35";]
      2011-12-20 15:47:16 Sent for Cream:[BatchJobId="41080"; JobStatus=4; BlahJobName="cre41_228600831"; ClientJobId="228600831";  Reason="pbs_reason=0"; ChangeTime="2011-12-20 15:47:15";]

  • Job which is cancelled
    • LSF PASSED
      [ale@cream-12 UI]$ glite-ce-job-submit -a -r cream-22.pd.infn.it:8443/cream-lsf-cert cream.jdl
      https://cream-22.pd.infn.it:8443/CREAM142681575
      
      [ale@cream-12 UI]$ glite-ce-job-cancel https://cream-22.pd.infn.it:8443/CREAM142681575
      
      Are you sure you want to cancel specified job(s) [y/n]: y
      
      [ale@cream-12 UI]$ glite-ce-job-status https://cream-22.pd.infn.it:8443/CREAM142681575
      
      ******  JobID=[https://cream-22.pd.infn.it:8443/CREAM142681575]
        Status        = [CANCELLED]
        ExitCode      = []
        Description   = [Cancelled by user]
      
      [root@cream-22 ~]# grep 142681575 /var/log/cream/glite-lsfparser.log
      2011-12-20 13:51:28 Adding: ID:551887 Type:BLAHPNAME Value:"cre22_142681575"
      2011-12-20 13:51:28 Sent for Cream:[BatchJobId="551887"; JobStatus=1; BlahJobName="cre22_142681575"; ClientJobId="142681575";  ChangeTime="2011-12-20 13:51:27";]
      2011-12-20 13:51:29 Sent for Cream:[BatchJobId="551887"; JobStatus=2; BlahJobName="cre22_142681575"; ClientJobId="142681575"; WorkerNode="prod-wn-003"; ChangeTime="2011-12-20 13:51:29";]
      2011-12-20 13:51:55 Sent for Cream:[BatchJobId="551887"; JobStatus=3; BlahJobName="cre22_142681575"; ClientJobId="142681575"; WorkerNode="prod-wn-003"; ChangeTime="2011-12-20 13:51:54";]
    • PBS/Torque PASSED
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-submit -a -r cream-41.pd.infn.it:8443/cream-pbs-cert test1.jdl
      https://cream-41.pd.infn.it:8443/CREAM097109154
      
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-status https://cream-41.pd.infn.it:8443/CREAM097109154
      
      ******  JobID=[https://cream-41.pd.infn.it:8443/CREAM097109154]
         Status        = [CANCELLED]
         ExitCode      = []
         Description   = [Cancelled by user]
      
      [root@cream-41 ~]# grep 097109154 /var/log/cream/glite-pbsparser.log
      2011-12-20 16:04:01 Adding: ID:41082 Type:BLAHPNAME Value:cre41_097109154
      2011-12-20 16:04:01 Sent for Cream:[BatchJobId="41082"; JobStatus=1; BlahJobName="cre41_097109154"; ClientJobId="097109154";  ChangeTime="2011-12-20 16:04:00";]
      2011-12-20 16:04:01 Sent for Cream:[BatchJobId="41082"; JobStatus=2; BlahJobName="cre41_097109154"; ClientJobId="097109154";  ChangeTime="2011-12-20 16:04:01";]
      2011-12-20 16:04:29 Sent for Cream:[BatchJobId="41082"; JobStatus=3; BlahJobName="cre41_097109154"; ClientJobId="097109154";  ChangeTime="2011-12-20 16:04:28";]
      2011-12-20 16:04:29 Sent for Cream:[BatchJobId="41082"; JobStatus=4; BlahJobName="cre41_097109154"; ClientJobId="097109154";  Reason="pbs_reason=271"; ExitReason="Killed by Resource Management System"; ChangeTime="2011-12-20 16:04:28";]

  • Job which is suspended and then resumed

New BLParser
  • Job which finishes normally
    • LSF PASSED
      [ale@cream-12 UI]$ glite-ce-job-submit -a -r cream-20.pd.infn.it:8443/cream-lsf-cert cream.jdl
      https://cream-20.pd.infn.it:8443/CREAM587966276
      
      [ale@cream-12 UI]$ glite-ce-job-status https://cream-20.pd.infn.it:8443/CREAM587966276
      
      ******  JobID=[https://cream-20.pd.infn.it:8443/CREAM587966276]
        Status        = [DONE-OK]
        ExitCode      = [0]
      
      [root@cream-20 ~]# grep 587966276 /var/log/cream/glite-ce-bnotifier.log
      2011-12-20 15:17:00 Sent for Cream:[BatchJobId="551901"; JobStatus=2; ChangeTime="2011-12-20 15:16:28"; WorkerNode="prod-wn-003"; ClientJobId="587966276"; BlahJobName="cre20_587966276";]
      2011-12-20 15:18:41 Sent for Cream:[BatchJobId="551901"; JobStatus=4; ChangeTime="2011-12-20 15:18:09"; WorkerNode="prod-wn-003"; JwExitCode=0; Reason="reason=0"; ClientJobId="587966276"; BlahJobName="cre20_587966276";]
    • PBS/Torque PASSED
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-submit -a -r cream-40.pd.infn.it:8443/cream-pbs-cert test1.jdl
      https://cream-40.pd.infn.it:8443/CREAM509451203
      
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-status https://cream-40.pd.infn.it:8443/CREAM509451203
      
      ******  JobID=[https://cream-40.pd.infn.it:8443/CREAM509451203]
         Status        = [DONE-OK]
         ExitCode      = [0]
      
      [root@cream-40 ~]# grep 509451203  /var/log/cream/glite-ce-bnotifier.log
      2011-12-20 16:42:32 Sent for Cream:[BatchJobId="41349.cream-40.pd.infn.it"; JobStatus=2; ChangeTime="2011-12-20 16:41:52"; WorkerNode="cream-wn-040.pn.pd.infn.it"; ClientJobId="509451203"; BlahJobName="cre40_509451203";]
      2011-12-20 16:44:13 Sent for Cream:[BatchJobId="41349.cream-40.pd.infn.it"; JobStatus=4; ChangeTime="2011-12-20 16:43:32"; WorkerNode="cream-wn-040.pn.pd.infn.it"; JwExitCode=0; Reason="reason=0"; ClientJobId="509451203"; BlahJobName="cre40_509451203";]

  • Job which is cancelled
    • LSF PASSED
      [ale@cream-12 UI]$ glite-ce-job-submit -a -r cream-20.pd.infn.it:8443/cream-lsf-cert cream.jdl
      https://cream-20.pd.infn.it:8443/CREAM260205441
      [ale@cream-12 UI]$ glite-ce-job-cancel https://cream-20.pd.infn.it:8443/CREAM260205441
      
      Are you sure you want to cancel specified job(s) [y/n]: y
      
      [ale@cream-12 UI]$ glite-ce-job-status https://cream-20.pd.infn.it:8443/CREAM260205441
      
      ******  JobID=[https://cream-20.pd.infn.it:8443/CREAM260205441]
        Status        = [CANCELLED]
        ExitCode      = []
        Description   = [Cancelled by user]
      
      [root@cream-20 ~]# grep 260205441 /var/log/cream/glite-ce-bnotifier.log
      2011-12-20 15:20:26 Sent for Cream:[BatchJobId="551902"; JobStatus=3; ChangeTime="2011-12-20 15:20:02"; JwExitCode=-999; Reason="reason=-999"; ClientJobId="260205441"; BlahJobName="cre20_260205441";]
    • PBS/Torque PASSED
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-submit -a -r cream-40.pd.infn.it:8443/cream-pbs-cert test1.jdl
      https://cream-40.pd.infn.it:8443/CREAM887070151
      
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-cancel https://cream-40.pd.infn.it:8443/CREAM887070151
      
      Are you sure you want to cancel specified job(s) [y/n]: y
      
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-status https://cream-40.pd.infn.it:8443/CREAM887070151
      
      ******  JobID=[https://cream-40.pd.infn.it:8443/CREAM887070151]
         Status        = [CANCELLED]
         ExitCode      = []
         Description   = [Cancelled by user]
      
      [root@cream-40 ~]# grep 887070151  /var/log/cream/glite-ce-bnotifier.log
      2011-12-20 16:58:33 Sent for Cream:[BatchJobId="41350.cream-40.pd.infn.it"; JobStatus=3; ChangeTime="2011-12-20 16:58:20"; JwExitCode=-999; Reason="reason=-999"; ClientJobId="887070151"; BlahJobName="cre40_887070151";]

  • Job which is suspended and then resumed
    • LSF PASSED
      [ale@cream-12 UI]$ glite-ce-job-submit -a -r cream-20.pd.infn.it:8443/cream-lsf-cert cream.jdl
      https://cream-20.pd.infn.it:8443/CREAM908720454
      [ale@cream-12 UI]$ glite-ce-job-suspend https://cream-20.pd.infn.it:8443/CREAM908720454
      
      Are you sure you want to suspend specified job(s) [y/n]: y
      
      [ale@cream-12 UI]$ glite-ce-job-status https://cream-20.pd.infn.it:8443/CREAM908720454
      
      ******  JobID=[https://cream-20.pd.infn.it:8443/CREAM908720454]
        Status        = [HELD]
      
      [ale@cream-12 UI]$ glite-ce-job-resume https://cream-20.pd.infn.it:8443/CREAM908720454
      
      Are you sure you want to resume specified job(s) [y/n]: y
      
      [ale@cream-12 UI]$ glite-ce-job-status https://cream-20.pd.infn.it:8443/CREAM908720454
      
      ******  JobID=[https://cream-20.pd.infn.it:8443/CREAM908720454]
        Status        = [DONE-OK]
        ExitCode      = [0]
      
      [root@cream-20 ~]# grep 908720454 /var/log/cream/glite-ce-bnotifier.log
      2011-12-20 15:23:27 Sent for Cream:[BatchJobId="551904"; JobStatus=5; ChangeTime="2011-12-20 15:23:05"; ClientJobId="908720454"; BlahJobName="cre20_908720454";]
      2011-12-20 15:31:12 Sent for Cream:[BatchJobId="551904"; JobStatus=4; ChangeTime="2011-12-20 15:30:33"; JwExitCode=0; Reason="reason=0"; ClientJobId="908720454"; BlahJobName="cre20_908720454";]
    • PBS/Torque PASSED
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-submit -a -r cream-40.pd.infn.it:8443/cream-pbs-cert test1.jdl
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-status https://cream-40.pd.infn.it:8443/CREAM089011081
      
      ******  JobID=[https://cream-40.pd.infn.it:8443/CREAM089011081]
         Status        = [IDLE]
      
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-suspend https://cream-40.pd.infn.it:8443/CREAM089011081
      
      Are you sure you want to suspend specified job(s) [y/n]: y
      
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-status https://cream-40.pd.infn.it:8443/CREAM089011081
      
      ******  JobID=[https://cream-40.pd.infn.it:8443/CREAM089011081]
         Status        = [HELD]
      
      [bertocco@cream-12 mie_certificazioni]$ glite-ce-job-resume https://cream-40.pd.infn.it:8443/CREAM089011081
      
      Are you sure you want to resume specified job(s) [y/n]: y
      
      [root@cream-40 ~]# grep 089011081 /var/log/cream/glite-ce-bnotifier.log
      [BatchJobId="41400.cream-40.pd.infn.it"; JobStatus=1; ChangeTime="2011-12-20 17:13:10"; ClientJobId="089011081"; BlahJobName="cre40_089011081";]
      2011-12-20 17:14:35 Sent for Cream:[BatchJobId="41400.cream-40.pd.infn.it"; JobStatus=5; ChangeTime="2011-12-20 17:14:27"; ClientJobId="089011081"; BlahJobName="cre40_089011081";]
      2011-12-20 17:21:17 Sent for Cream:[BatchJobId="41400.cream-40.pd.infn.it"; JobStatus=2; ChangeTime="2011-12-20 17:21:14"; WorkerNode="cream-wn-040.pn.pd.infn.it"; ClientJobId="089011081"; BlahJobName="cre40_089011081";]
      2011-12-20 17:23:17 Sent for Cream:[BatchJobId="41400.cream-40.pd.infn.it"; JobStatus=4; ChangeTime="2011-12-20 17:22:55"; WorkerNode="cream-wn-040.pn.pd.infn.it"; JwExitCode=0; Reason="reason=0"; ClientJobId="089011081"; BlahJobName="cre40_089011081";]
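
The notifier log excerpts above can also be checked mechanically. The following is a minimal sketch, not part of the original test procedure, that extracts the fields of each "Sent for Cream:[...]" record and returns the JobStatus sequence for one job. The function names are hypothetical, and the status names are only inferred from how the codes line up with the glite-ce-job-status output in the transcripts.

```python
import re

# Names inferred from the transcripts in this report (for example JobStatus=5
# next to [HELD]); an assumption, not an official BLAH table.
STATUS_NAMES = {1: "IDLE", 2: "RUNNING", 3: "REMOVED", 4: "COMPLETED", 5: "HELD"}

# One "Key=value;" pair; the value is either double-quoted or a bare token.
FIELD_RE = re.compile(r'(\w+)=("([^"]*)"|[^;\]]+);')


def parse_notification(line):
    """Return the fields of one 'Sent for Cream:[...]' line as a dict.

    Lines without a bracketed record (e.g. 'Adding: ...') give an empty dict.
    """
    start = line.find("[")
    if start < 0:
        return {}
    fields = {}
    for key, raw, quoted in FIELD_RE.findall(line[start + 1:]):
        fields[key] = quoted if raw.startswith('"') else raw.strip()
    return fields


def status_sequence(lines, client_job_id):
    """List the JobStatus codes logged for the given ClientJobId, in order."""
    sequence = []
    for line in lines:
        fields = parse_notification(line)
        if fields.get("ClientJobId") == client_job_id and "JobStatus" in fields:
            sequence.append(int(fields["JobStatus"]))
    return sequence
```

For the LSF suspend/resume job with the new BLParser, such a helper would be expected to report a held state followed by a completed one, which is what the log lines for ClientJobId 908720454 show.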

Regression tests

Verification attached bugs

Bug #88974: BUpdaterSGE and BNotifier don't start if sge_helperpath var is not fixed FIXED

Comment out sge_helperpath in /etc/blah.config, start the services, and run the test suite:

[root@sa3-ce ~]# cat /etc/blah.config | grep sge_helper
#sge_helperpath=/opt/glite/bin/sge_helper
[root@sa3-ce ~]# /etc/init.d/gLite start
*** tomcat5:
Starting tomcat5:                                          [  OK  ]

*** glite-lb-locallogger:
Starting glite-lb-logd ...This is LocalLogger, part of Workload Management System in EU DataGrid & EGEE.
 done
Starting glite-lb-interlogd ... done

*** glite-ce-blahparser:
Starting BNotifier:                                        [  OK  ]

Starting BUpdaterSGE:                                      [  OK  ]



[rrosende@ui test_cert_cream]$ ./runtest.sh 
 --> ++++++++++++++++++++++++++++++++++++++++++++
 --> + Test of CREAM-CE command line interface  +
 --> ++++++++++++++++++++++++++++++++++++++++++++
 --> Test starts at: 11:58:28
 --> CE used: sa3-ce.egee.cesga.es:8443/cream-sge-cesga
 --> 
 --> [ 11:58:28 ] run:  ./CREAM-cli-delegation.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 --> 
 --> [ 11:58:46 ] run:  ./CREAM-cli-delegation-renew.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 --> 
 --> [ 11:58:57 ] run:  ./CREAM-cli-job-submit.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 --> 
 --> [ 11:59:16 ] run:  ./CREAM-cli-job-status-simple.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 --> 
 --> [ 11:59:40 ] run:  ./CREAM-cli-job-status-filtered.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 --> 
 --> [ 12:00:50 ] run:  ./CREAM-cli-job-cancel.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 --> 
 --> [ 12:11:12 ] run:  ./CREAM-cli-job-suspend.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 --> 
 --> [ 12:12:19 ] run:  ./CREAM-cli-job-list.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 --> 
 --> [ 12:12:41 ] run:  ./CREAM-cli-submission-management.sh
 --> 
 -->  === TEST PASSED === 
 --> 
 -->  <<< All tests PASSED >>>
 --> cleaning up /tmp/cream-cli-test-rrosende-19724 ...

Bug #89859: There is a memory leak in the updater for LSF, PBS and Condor FIXED

Submit 1000 jobs, one every 3 seconds, while monitoring the RSS memory used by the /usr/bin/BUpdaterLSF process:

LSFmem.png

Submit 1000 jobs, one every 3 seconds, while monitoring the RSS memory used by the /usr/bin/BUpdaterPBS process:

PBSmem.png
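
The report does not say how the RSS values behind these plots were collected. As a rough sketch under that caveat, the memory of an updater process could be sampled from /proc on Linux as follows; the function names and the use of /proc/<pid>/status are assumptions, and the default 3-second interval simply mirrors the submission rate.

```python
import re
import time


def parse_vm_rss_kb(status_text):
    """Extract the VmRSS value in kB from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid, samples, interval_s=3.0):
    """Read VmRSS for `pid` `samples` times, `interval_s` seconds apart."""
    readings = []
    for i in range(samples):
        with open(f"/proc/{pid}/status") as handle:
            readings.append(parse_vm_rss_kb(handle.read()))
        if i + 1 < samples:
            time.sleep(interval_s)
    return readings


def growth_kb(readings):
    """Last valid reading minus the first one.

    A value that keeps increasing over a long run hints at a leak;
    a flat one is consistent with the fix.
    """
    valid = [r for r in readings if r is not None]
    return valid[-1] - valid[0] if len(valid) >= 2 else 0
```

Plotting the readings returned by sample_rss over the whole submission run would give a curve comparable to the attached images.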

Verification old bugs

Submitted 5000 jobs to a CREAM CE configured using the new blparser, and with job_registry_use_mmap=yes.

Monitored the RSS used by the blahpd processes. At the end, the maximum value across all processes was 14040.

Test PASSED

Configure /etc/blah.config:

[root@cream-40 ~]# tail -4 /etc/blah.config
# Verify fix for bug #77776
pbs_batch_caching_enabled=yes
batch_command_caching_filter=/usr/bin/runcmd.pl

Where runcmd.pl is:

#!/usr/bin/perl
#---------------------#
#  PROGRAM:  argv.pl  #
#---------------------#

$numArgs = $#ARGV + 1;
open (MYFILE, '>>/tmp/xyz');
foreach $argnum (0 .. $#ARGV) {
    print MYFILE "$ARGV[$argnum] ";
}
print MYFILE "\n";
close (MYFILE); 

Restart the services and submit 10 jobs to the CE.

[root@cream-40 ~]# cat /tmp/xyz 
/usr/bin/qstat -f 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43450.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43451.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43452.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43453.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43454.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43455.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43456.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43457.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43458.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43459.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43460.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43461.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43462.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43463.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43464.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43465.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43466.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43467.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43468.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43469.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43470.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43471.cream-40.pd.infn.it 
/usr/bin/tracejob -p /var/torque -m -l -a -n 2 43472.cream-40.pd.infn.it 
/usr/bin/qstat -f 

Test PASSED
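
One way to summarise a listing like /tmp/xyz is to count how often each batch command was passed to the filter. This is a small illustrative sketch with a hypothetical function name; it says nothing about which counts the original test treated as a pass.

```python
import os
from collections import Counter


def tally_commands(log_text):
    """Count logged command lines by the base name of their executable."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if parts:  # skip blank lines
            counts[os.path.basename(parts[0])] += 1
    return counts
```

Applied to the listing above, it would report 2 qstat calls and 23 tracejob calls.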

[root@cream-40 ~]# ls -l /var/blah
total 12
-rw-r--r-- 1 tomcat tomcat    4 Dec 22 13:41 blah_bnotifier.pid
-rw-r--r-- 1 tomcat tomcat    4 Dec 22 13:41 blah_bupdater.pid
drwxrwx--t 4 tomcat tomcat 4096 Dec 22 14:37 user_blah_job_registry.bjr
[root@cream-40 ~]# ls -l /var/blah/user_blah_job_registry.bjr/
total 32548
-rw-rw-r-- 1 tomcat tomcat 25467712 Dec 22 14:38 registry
-rw-r--r-- 1 tomcat tomcat  7735260 Dec 22 14:37 registry.by_blah_index
-rw-rw-rw- 1 tomcat tomcat        0 Dec 22 14:38 registry.locktest
drwxrwx-wt 2 tomcat tomcat     4096 Dec 22 14:38 registry.npudir
drwxrwx-wt 2 tomcat tomcat    65536 Dec 22 14:38 registry.proxydir
-rw-r--r-- 1 tomcat tomcat      200 Dec 22 13:40 registry.subjectlist
[root@cream-40 ~]# ls -l /var/blah/user_blah_job_registry.bjr/registry.npudir
total 0
[root@cream-40 ~]# ls -l /var/blah/user_blah_job_registry.bjr/registry.proxydir/
total 108
lrwxrwxrwx 1 dteam009 dteam 192 Dec 22 13:39 proxy_43450.cream-40.pd.in_5YGF6T -> /var/cream_sandbox/dteam/_C_IT_O_INFN_OU_Personal_Certificate_L_Padova_CN_Alessio_Gianelle_dteam_Role_NULL_Capability_NULL_dteam009/proxy/e65639ca3adc874ae7aa7504652d83d2f5a2ad7c19364155241627
lrwxrwxrwx 1 dteam009 dteam 192 Dec 22 13:39 proxy_43451.cream-40.pd.in_tlAfpa -> /var/cream_sandbox/dteam/_C_IT_O_INFN_OU_Personal_Certificate_L_Padova_CN_Alessio_Gianelle_dteam_Role_NULL_Capability_NULL_dteam009/proxy/55e80b2a9b68a2cb7019ac69cda5390dd6489c4719364155241627

Test PASSED

[root@cream-40 ~]# su - tomcat
-sh-3.2$ /usr/bin/blahpd
$GahpVersion: 1.16.4 Mar 31 2008 INFN\ blahpd\ (poly,new_esc_format) $
BLAH_SET_SUDO_ID dteam001
S Sudo\ mode\ on
blah_job_submit 1 [cmd="/bin/cp";Args="fstab\ fstab.out";TransferInput="/home/dteam001/dir1/fstab";TransferOutput="fstab.out";TransferOutputRemaps="fstab.out=/home/dteam001/dir1/fstab.out";gridtype="pbs";queue="creamtest2";x509userproxy="/tmp/proxy"]
S
results
S 1
1 0 No\ error pbs/20111222/43477.cream-40.pd.infn.it
Connection closed by remote host


[root@cream-40 ~]# ls -l /home/dteam001/dir1/
total 8
-rw-r--r-- 1 dteam001 dteam 527 Dec 22 15:10 fstab
-rw-r--r-- 1 dteam001 dteam 527 Dec 22 15:11 fstab.out

Test PASSED
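
The result line printed by blahpd in this session ("1 0 No\ error pbs/...") uses backslash-escaped spaces inside a field. The sketch below splits such a line into its parts. The field names (request id, error code, message, job id) are this editor's reading of the transcript rather than a documented format, and the function names are hypothetical.

```python
import re


def split_escaped(line):
    """Split on spaces not preceded by a backslash, then unescape '\\ '."""
    parts = re.split(r"(?<!\\) ", line.strip())
    return [part.replace("\\ ", " ") for part in parts]


def parse_result(line):
    """Parse a four-field result line; raises ValueError if fields are missing."""
    request_id, error_code, message, job_id = split_escaped(line)[:4]
    return {
        "request_id": int(request_id),
        "error_code": int(error_code),
        "message": message,
        "job_id": job_id,
    }
```

With the line from the session above, the error code comes out as 0 and the job id as pbs/20111222/43477.cream-40.pd.infn.it.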

[root@cream-17 cream-40]# grep BUPDATER_LOOP_INTERVAL services/glite-creamce
BUPDATER_LOOP_INTERVAL=42

After YAIM reconfiguration:

[root@cream-40 ~]# grep bupdater_loop_interval /etc/blah.config
bupdater_loop_interval=42

Test PASSED
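
The comparison done by eye with the two grep commands can be sketched as a small check that a value set in the YAIM service file ends up in /etc/blah.config. The helper names are hypothetical, and the parser only assumes plain KEY=VALUE lines with '#' comments, as seen in the excerpts in this report.

```python
def read_settings(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def propagated(yaim_text, blah_text, yaim_key, blah_key):
    """True if yaim_key is set and blah_key carries the same value."""
    yaim = read_settings(yaim_text)
    blah = read_settings(blah_text)
    return yaim_key in yaim and yaim[yaim_key] == blah.get(blah_key)
```

Because commented lines are skipped, the same parser would also treat a commented-out setting such as "#sge_helperpath=..." as unset.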

Try first with direct submission:

[ale@cream-12 UI]$ glite-ce-job-submit -a -r cream-40.pd.infn.it:8443/cream-pbs-creamtest1 cream.jdl
https://cream-40.pd.infn.it:8443/CREAM974280852

[root@cream-40 ~]# grep CREAM974280852 /var/log/cream/accounting/blahp.log-20111222
"timestamp=2011-12-22 14:25:08" "userDN=/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle" "userFQAN=/dteam/Role=NULL/Capability=NULL" "userFQAN=/dteam/NGI_IT/Role=NULL/Capability=NULL" "ceID=cream-40.pd.infn.it:8443/cream-pbs-creamtest1" "jobID=CREAM974280852" "lrmsID=43478.cream-40.pd.infn.it" "localUser=18181" "clientID=cre40_974280852"

Try submission through a WMS:

[ale@cream-12 UI]$ glite-wms-job-submit -a -c etc/wmp_devel11.conf -r cream-40.pd.infn.it:8443/cream-pbs-creamtest1 cream.jdl

Connecting to the service https://devel11.cnaf.infn.it:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://devel11.cnaf.infn.it:9000/B0wtklaUDmXbO1xDDqQ23Q

==========================================================================

[ale@cream-12 UI]$ glite-wms-job-logging-info -v 2 --event Transfer  https://devel11.cnaf.infn.it:9000/B0wtklaUDmXbO1xDDqQ23Q  | grep "Dest jobid"
- Dest jobid                 =    unavailable
- Dest jobid                 =    https://cream-40.pd.infn.it:8443/CREAM794974631

[root@cream-40 ~]# grep B0wtklaUDmXbO1xDDqQ23Q /var/log/cream/accounting/blahp.log-20111222
"timestamp=2011-12-22 14:28:10" "userDN=/C=IT/O=INFN/OU=Personal Certificate/L=Padova/CN=Alessio Gianelle" "userFQAN=/dteam/Role=NULL/Capability=NULL" "userFQAN=/dteam/NGI_IT/Role=NULL/Capability=NULL" "ceID=cream-40.pd.infn.it:8443/cream-pbs-creamtest1" "jobID=https://devel11.cnaf.infn.it:9000/B0wtklaUDmXbO1xDDqQ23Q" "lrmsID=43479.cream-40.pd.infn.it" "localUser=18181" "clientID=cre40_794974631"

Test PASSED
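
The accounting records shown above are sequences of double-quoted key=value tokens, with userFQAN appearing more than once. A minimal sketch for reading one record into a dictionary follows; the function name is hypothetical and the format is assumed only from the two log lines in this report.

```python
import shlex
from collections import defaultdict


def parse_accounting_record(line):
    """Return {key: [values...]} for one blahp accounting log line.

    Values are kept in lists because a key such as userFQAN can repeat.
    """
    fields = defaultdict(list)
    for token in shlex.split(line):  # honours the double quotes
        key, sep, value = token.partition("=")
        if sep:
            fields[key].append(value)
    return dict(fields)
```

For the WMS submission, the jobID field of the parsed record holds the WMS job identifier, while clientID still carries the CREAM job number.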

[root@cream-40 ~]# cat /etc/logrotate.d/blahp-logrotate | grep rotate
        rotate 365

Test PASSED

[ale@cream-12 UI]$ glite-ce-job-submit -a -r cream-40.pd.infn.it:8443/cream-pbs-creamtest1 creamshort.jdl
https://cream-40.pd.infn.it:8443/CREAM137590769

[root@cream-40 ~]# grep 137590769 /var/log/cream/glite-ce-bnotifier.log
2011-12-22 15:44:59 Sent for Cream:[BatchJobId="43482.cream-40.pd.infn.it"; JobStatus=4; ChangeTime="2011-12-22 15:44:22"; JwExitCode=0; Reason="reason=0"; ClientJobId="137590769"; BlahJobName="cre40_137590769";]

Test PASSED

[root@cream-40 ~]# su - tomcat
-sh-3.2$ /usr/bin/blahpd
$GahpVersion: 1.16.4 Mar 31 2008 INFN\ blahpd\ (poly,new_esc_format) $
BLAH_SET_SUDO_ID dteam001
S Sudo\ mode\ on
BLAH_JOB_SUBMIT 1 [Cmd="/bin/echo";Args="$HOSTNAME";Out="/tmp/stdout_l15367";In="/dev/null";GridType="pbs";Queue="creamtest1";x509userproxy="/tmp/proxy";Iwd="/tmp";TransferOutput="output_file";TransferOutputRemaps="output_file=/tmp/stdout_l15367";GridResource="blah"]
S
results
S 1
1 0 No\ error pbs/20111222/43483.cream-40.pd.infn.it
Connection closed by remote host

-sh-3.2$ cat /tmp/stdout_l15367 
cream-wn-040.pn.pd.infn.it

[root@cream-40 ~]# tracejob 43483 | grep exec_host
/var/torque/mom_logs/20111222: No such file or directory
/var/torque/sched_logs/20111222: No such file or directory
12/22/2011 15:47:59  A    user=dteam001 group=dteam jobname=bl_f355dbb0979b queue=creamtest1 ctime=1324565278 qtime=1324565278 etime=1324565278 start=1324565279 owner=dteam001@cream-40.pd.infn.it exec_host=cream-wn-040.pn.pd.infn.it/0 Resource_List.neednodes=cream-wn-040.pn.pd.infn.it 

Test PASSED
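
This check appears to compare the host name written by the job with the exec_host recorded by Torque. Under that assumption, the comparison could be sketched as below; the function names are hypothetical and the pattern is based only on the tracejob line shown above.

```python
import re


def exec_host(tracejob_output):
    """Return the host part of the first exec_host=<host>/<slot> entry, if any."""
    match = re.search(r"\bexec_host=([^/\s]+)", tracejob_output)
    return match.group(1) if match else None


def ran_on_reported_host(tracejob_output, job_stdout):
    """True if the job's own output names the host that Torque recorded."""
    host = exec_host(tracejob_output)
    return host is not None and host == job_stdout.strip()
```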

[root@cream-41 ~]# ps ax | grep BLParserPBS
  754 pts/1    S+     0:00 grep BLParserPBS
31468 ?        Sl     0:00 /usr/bin/BLParserPBS -d 1 -l /var/log/cream/glite-pbsparser.log -s /var/torque -p 33333 -m 56565

Test PASSED

Topic attachments
| Attachment | Size | Date | Who | Comment |
| EMI_Certification_Report_Task25000.txt | 2.8 K | 2012-01-02 - 10:28 | AlessioGianelle | Certification report |
| EMI_Test_Report_Task25000.txt | 4.7 K | 2012-01-02 - 10:28 | AlessioGianelle | Test report |
| LSFmem.png | 4.4 K | 2011-12-22 - 10:51 | AlessioGianelle | Test for bug 89859 (lsf) |
| PBSmem.png | 4.9 K | 2011-12-22 - 11:48 | AlessioGianelle | Test for bug 89859 (pbs) |
| configure.txt | 67.2 K | 2011-12-23 - 14:20 | AlessioGianelle | Clean configuration |
| install.txt | 169.3 K | 2011-12-23 - 14:20 | AlessioGianelle | Clean installation |
| lsfsubmission.log | 1118.6 K | 2011-12-20 - 12:43 | AlessioGianelle | Test submission for LSF |
| lsfupdate.txt | 15.9 K | 2011-12-20 - 10:50 | AlessioGianelle | LSF update log file |
| lsfupdate_conf.txt | 44.2 K | 2011-12-20 - 10:51 | AlessioGianelle | LSF configuration update log file |
| lsfupdate_conf_old.txt | 48.1 K | 2011-12-20 - 12:28 | AlessioGianelle | LSF configuration with old BLParser update log file |
| pbssubmission.log | 27.8 K | 2011-12-20 - 14:10 | SaraBertocco | PBS submission test |
| pbsupdate.txt | 18.1 K | 2011-12-20 - 13:21 | SaraBertocco | pbs update log file |
| pbsupdate_conf.txt | 14.3 K | 2011-12-20 - 13:32 | SaraBertocco | Configuration log file (using new BLParser model) |
| pbsupdate_conf_old.txt | 16.4 K | 2011-12-20 - 13:51 | SaraBertocco | |
| sgesubmission.log | 8.5 K | 2012-01-02 - 10:18 | AlessioGianelle | Test submission for SGE |
| sgeupdate.txt | 6.5 K | 2012-01-02 - 10:15 | AlessioGianelle | SGE update log file |
| sgeupdate_conf.txt | 53.4 K | 2012-01-02 - 10:16 | AlessioGianelle | SGE configuration update log file |
Topic revision: r16 - 2012-01-02 - AlessioGianelle
 