
Known issues

Known problems in the CREAM software, or in other software modules, that affect a CREAM-based CE. The list refers to known problems affecting the latest release of the software in production.

Condor and SGE support

Condor and SGE are not yet fully supported as batch systems for CREAM.

Execution of DAG jobs

Execution of DAG jobs on a CREAM-based CE through the gLite WMS is not implemented yet.

qsub crashes

With some Torque versions, qsub has been observed crashing, with glibc detecting a double free or corruption. Although this is a problem to be addressed in Torque, adding:

export MALLOC_CHECK_=0

to /etc/blah.config should help.
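The change above can be applied idempotently, so that repeated runs do not duplicate the line. The following is only a sketch: the function name ensure_malloc_check is hypothetical, and it assumes the file ends with a newline.

```shell
# Append the MALLOC_CHECK_ setting to a BLAH config file unless it is already there.
# ensure_malloc_check is an illustrative name, not part of CREAM or BLAH.
ensure_malloc_check() {
    conf="${1:-/etc/blah.config}"
    line='export MALLOC_CHECK_=0'
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}
```

Calling ensure_malloc_check with no argument targets /etc/blah.config; passing a path is useful for trying it on a copy first.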

CREAM CE not Torque master: communication errors when the maui server and client are not of the same builds.

* Bug #61698: when the CREAM CE is not a Torque server, communication errors can occur if the maui (and probably torque) server and client are not of the same build.

A common scenario in which this can happen:

  • The maui server is a 32bit binary deployed on a 32bit LCG-CE
  • The 64bit maui client is deployed on a 64bit CREAM-CE

On the CREAM-CE node, run:

[root@cream-ce]# diagnose -g

If you see:

ERROR:    lost connection to server
ERROR:    cannot request service (status)

you are affected by the problem.
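This check can be scripted by searching the command output for the two error lines quoted above. The sketch below is an assumption-laden helper: the name diagnose_is_broken is hypothetical, and it only matches the exact messages shown here.

```shell
# Return 0 (affected) if the text on stdin contains either error line quoted above.
# diagnose_is_broken is an illustrative name, not a maui or CREAM tool.
diagnose_is_broken() {
    grep -qE 'ERROR:[[:space:]]+(lost connection to server|cannot request service)'
}

# Possible usage on the CREAM-CE node; stderr is merged in case the
# errors are not written to stdout:
#   diagnose -g 2>&1 | diagnose_is_broken && echo "affected by bug #61698"
```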

A possible workaround is the following:

On the LCG-CE create a cron file to dump the diagnose -g output to a file:

[root@lcg-ce]# cat <<EOF>> /etc/cron.d/diagnose-for-cream
*/5 * * * * root  /usr/bin/diagnose -g > /export/dir/to/cream-ce/diagnose.out
EOF

The interval set in the /etc/cron.d/diagnose-for-cream file must be chosen by the site experts; the value shown here is only an example.

Then export over NFS the directory where the file is located:

[root@lcg-ce]# cat /etc/exports
/export/dir/to/cream-ce            cream-ce(rw,map_identity,no_root_squash,sync)

On the CREAM-CE, mount the remote directory on a local one:

[root@cream-ce]# cat /etc/fstab | grep diagnose
lcg-ce:/export/dir/to/cream-ce                /import/dir/to/cream-ce         nfs    defaults,bg        0 0

Then feed the lcg-info-dynamic-scheduler with the diagnose output file:

 
[root@cream-ce]# cat /opt/glite/etc/lcg-info-dynamic-scheduler.conf|grep vomaxjobs-maui
vo_max_jobs_cmd: /opt/lcg/libexec/vomaxjobs-maui -h lcg-ce --infile /import/dir/to/cream-ce/diagnose.out
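Because this workaround relies on a cron job and an NFS mount, the scheduler could silently read an old dump if either one stops working. A minimal freshness check, assuming a find that supports -mmin (GNU and BSD find do), might look like the sketch below. The function name diagnose_dump_is_fresh and the 10-minute default are illustrative choices, not part of the original recipe.

```shell
# Succeed only if the file exists, is non-empty, and was modified
# within the last N minutes (default 10, an arbitrary example value).
diagnose_dump_is_fresh() {
    file="$1"
    max_age_min="${2:-10}"
    [ -s "$file" ] || return 1
    # find prints the path only when the mtime is newer than max_age_min minutes
    [ -n "$(find "$file" -mmin "-$max_age_min" 2>/dev/null)" ]
}
```

On the CREAM-CE it would be called with the path of the imported dump file and a threshold somewhat larger than the cron interval.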

Reconfiguration after update

After an update of the CREAM RPM, it is mandatory to reconfigure via yaim.

Special characters in CREAM_DB_USER and CREAM_DB_PASSWORD

Do not use special characters in the CREAM_DB_USER and CREAM_DB_PASSWORD yaim variables.
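The page does not say which characters count as special. As a conservative sketch, the check below assumes that only ASCII letters, digits and underscore are safe; both that rule and the function name is_plain_value are assumptions made for illustration.

```shell
# Succeed only if the value is non-empty and contains nothing but
# ASCII letters, digits and underscore (an assumed, conservative rule).
is_plain_value() {
    case "$1" in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Possible usage before running yaim:
#   is_plain_value "$CREAM_DB_USER" || echo "CREAM_DB_USER contains special characters"
```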

Problems with OS language different than US English

Problems have been reported if jobs are submitted through the WMS to a CREAM CE deployed on a machine installed using a non-English language. This is because of different representations of decimal numbers. The workaround in this case is to uncomment the line:

LANG=en_US

in $CATALINA_HOME/conf/tomcat5.conf and then restart Tomcat.
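The uncommenting step can be done with sed. The sketch below assumes the line is present in the file as "#LANG=en_US" (possibly with blanks around the "#"); the function name uncomment_lang is hypothetical, and a backup copy with the .bak suffix is left next to the file.

```shell
# Strip the leading "#" from the LANG=en_US line only, keeping a .bak backup.
# uncomment_lang is an illustrative name, not part of Tomcat or CREAM.
uncomment_lang() {
    conf="$1"
    sed -i.bak 's/^[[:space:]]*#[[:space:]]*\(LANG=en_US\)[[:space:]]*$/\1/' "$conf"
}

# Possible usage:
#   uncomment_lang "$CATALINA_HOME/conf/tomcat5.conf"
```

Tomcat still has to be restarted afterwards for the setting to take effect.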

-- MassimoSgaravatto - 2011-05-05

Topic revision: r2 - 2011-05-17 - MassimoSgaravatto
 
