---+!! Known issues

%TOC%

---++ Open known issues

Known problems in the CREAM software, or in other software modules, that affect a CREAM based CE. The list refers to known problems affecting the latest release of the software in EMI.

---+++ No dynamic info published for one VOview

For one VOView the =lcg-info-dynamic-scheduler= doesn't publish information, and therefore the values defined in the static ldif file are used.

As found by Jan Astalos (thanks!), this is caused by a missing newline at the end of =/var/lib/bdii/gip/ldif/static-file-CE.ldif=, the file created by YAIM.

Until the fix is released, the workaround is simply to run:

<verbatim>
echo >> /var/lib/bdii/gip/ldif/static-file-CE.ldif
</verbatim>

after having configured via yaim.

---+++ Problems with Torque 2.5.7-1

There is a problem with the latest Torque version available in the EPEL repository (2.5.7-1). At start the following error is reported:

<verbatim>
[root@cream-38 ~]# /etc/init.d/pbs_server start
/var/torque/server_priv/serverdb
Starting TORQUE Server: PBS_Server: LOG_ERROR::No such file or directory (2) in pbs_init, unable to stat checkpoint directory /var/torque/checkpoint/, errno 2 (No such file or directory)
PBS_Server: LOG_ERROR::PBS_Server, pbsd_init failed
[FAILED]
</verbatim>

The workaround is to create that directory manually, or to install the =torque-mom= package even if it is not otherwise needed on the node.

---+++ Problems if Torque is not configured to suppress mails

Torque should be configured to suppress all mails (mail_domain=never). Otherwise the bupdater process of the blparser will keep dying.

Relevant bug: https://savannah.cern.ch/bugs/index.php?86238

---+++ Condor and SGE support

Condor and SGE are not yet fully supported as batch systems for CREAM.

---+++ Execution of DAG jobs

Execution of DAG jobs on the CREAM based CE through the gLite WMS is not implemented yet.
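
For the missing-newline issue described under "No dynamic info published for one VOview", the =echo= workaround appends a newline unconditionally. The following is a minimal sketch of a variant that appends one only when the last byte of the file is not already a newline, so that it can be run repeatedly. The function name =ensure_trailing_newline= is hypothetical and is not part of YAIM or CREAM.

```shell
# Append a newline to a file only if its last byte is not already a newline.
# ensure_trailing_newline is a hypothetical helper, not part of YAIM or CREAM.
ensure_trailing_newline() {
    file="$1"
    # Command substitution strips a trailing newline, so the result is
    # non-empty only when the file ends with some other character.
    if [ -n "$(tail -c 1 "$file")" ]; then
        echo >> "$file"
    fi
}

# Possible usage after running yaim (path taken from the issue description):
# ensure_trailing_newline /var/lib/bdii/gip/ldif/static-file-CE.ldif
```

An empty file is left untouched, since there is no unterminated last line to fix.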
---+++ Memory issues with new BLAH Blparser

If the new Blparser is used (click [[http://wiki.italiangrid.org/twiki/bin/view/CREAM/SystemAdministratorGuideForEMI1#2_15_How_to_check_if_you_are_usi][here]] to check this), there can be issues if the BLAH registry becomes very large: the submission process can get slower and there can be problems with memory usage.

Until the fix is released, there are two possible workarounds:

   * Reduce the number of concurrent blahpd instances (the default value is 50). This means changing the value of =cream_concurrency_level= in =cream-config.xml=. To apply the change, you then need to restart tomcat. This should help with the issue, but it also means fewer parallel instances interacting with the batch system, and so a possible reduction of the submission throughput to the batch system. Click [[http://wiki.italiangrid.org/twiki/bin/view/CREAM/SystemAdministratorGuideForEMI1#1_4_7_1_Tune_the_number_of_concu][here]] for more details.
   * Reduce the value of =purge_interval= in =blah.config=. This value is expressed in seconds. A job is removed from the BLAH registry (and is therefore no longer managed by BLAH, and hence by CREAM) =purge_interval= seconds after its submission. To apply the change, you then need to restart the blparser (=/etc/init.d/glite-ce-blahparser restart=).

Relevant bug: https://savannah.cern.ch/bugs/index.php?75854

---+++ qsub crashes

With some Torque versions, qsub was observed to crash with glibc detecting a double free or corruption. Although this is a problem to be addressed in Torque, adding:

<verbatim>
export MALLOC_CHECK_=0
</verbatim>

to =/etc/blah.config= should help.

---+++ CREAM CE not Torque master: communication errors when the maui server and client are not of the same builds.

   * [[https://savannah.cern.ch/bugs/?61968][Bug #61698]]: when the CREAM CE is not a Torque server, there can be communication errors if the maui (and probably torque) server and client are NOT of the same build.

A common scenario in which this can happen:

   * The maui server is a 32bit binary deployed on a 32bit LCG-CE
   * The 64bit maui client is deployed on a 64bit CREAM-CE

From the CREAM-CE node run:

<verbatim>
[root@cream-ce]# diagnose -g
</verbatim>

If you see:

<verbatim>
ERROR: lost connection to server
ERROR: cannot request service (status)
</verbatim>

you are affected by the problem.

A possible workaround is the following. On the LCG-CE create a cron file to dump the =diagnose -g= output to a file:

<verbatim>
[root@lcg-ce]# cat <<EOF >> /etc/cron.d/diagnose-for-cream
*/5 * * * * root /usr/bin/diagnose -g > /export/dir/to/cream-ce/diagnose.out
EOF
</verbatim>

The interval defined in the =/etc/cron.d/diagnose-for-cream= file has to be set by the site experts; the value shown here is only an example.
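
Since the CREAM-CE will rely on the file written by this cron job, it can be useful to check that a dump is usable before trusting it. The following is a minimal sketch of such a check, assuming that an unusable dump is either empty or contains the two error messages quoted above. The function name =diagnose_dump_ok= is hypothetical and is not part of maui or CREAM.

```shell
# Return 0 if the dumped diagnose output looks usable, 1 otherwise.
# diagnose_dump_ok is a hypothetical helper; the error strings are the
# ones quoted in the issue description.
diagnose_dump_ok() {
    dump="$1"
    # A missing or empty file means no output has been produced.
    [ -s "$dump" ] || return 1
    # Messages reported when the client cannot talk to the server.
    if grep -q -e 'ERROR: lost connection to server' \
               -e 'ERROR: cannot request service' "$dump"; then
        return 1
    fi
    return 0
}

# Possible usage (path taken from the cron example above):
# diagnose_dump_ok /export/dir/to/cream-ce/diagnose.out || echo "dump not usable"
```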

Then export the directory where the file is located over NFS:

<verbatim>
[root@lcg-ce]# cat /etc/exports
/export/dir/to/cream-ce cream-ce(rw,map_identity,no_root_squash,sync)
</verbatim>

On the CREAM-CE mount the remote directory on a local one:

<verbatim>
[root@cream-ce]# grep cream-ce /etc/fstab
lcg-ce:/export/dir/to/cream-ce /import/dir/to/cream-ce nfs defaults,bg 0 0
</verbatim>

Then feed the =lcg-info-dynamic-scheduler= with the diagnose output file:

<verbatim>
[root@cream-ce]# grep vomaxjobs-maui /opt/glite/etc/lcg-info-dynamic-scheduler.conf
vo_max_jobs_cmd: /opt/lcg/libexec/vomaxjobs-maui -h lcg-ce --infile /import/dir/to/cream-ce/diagnose.out
</verbatim>

---+++ Reconfiguration after update

After an update of the CREAM RPM, it is mandatory to reconfigure (via yaim).

---+++ Special characters in CREAM_DB_USER and CREAM_DB_PASSWORD

Don't use special characters in the =CREAM_DB_USER= and =CREAM_DB_PASSWORD= yaim variables.

---+++ Problems with OS language different than US English

Problems have been reported when jobs are submitted through the WMS to a CREAM CE deployed on a machine installed using a non-English language. This is because of different representations of decimal numbers. The workaround in this case is to uncomment the line:

<verbatim>
LANG=en_US
</verbatim>

in =$CATALINA_HOME/conf/tomcat5.conf= and then restart tomcat.

---++ Old known issues

Problems in the CREAM software, or in other software modules affecting a CREAM based CE, that have already been fixed (i.e. they do not affect the latest release of the software in EMI).

---+++ Problems affecting users with certificates signed by the GermanGrid CA

Because of a bug in trustmanager, users with certificates signed by the GermanGrid CA can't submit jobs to CREAM.
The error message is something like:

<verbatim>
Failed to create a delegation id for job https://grid-lb0.desy.de:9000/ADkeOt6tc0Rfi8oP-pzUrQ: reason is Client 'O=GermanGrid,OU=DESY,CN=Alexander Fomenko' is not issuer of proxy 'O=GermanGrid,OU=DESY,CN=Alexander Fomenko,CN=proxy,CN=proxy'.
</verbatim>

   * Relevant bug: https://savannah.cern.ch/bugs/?83426
   * Fix released with CREAM CE 1.13.2 (http://savannah.cern.ch/task/?21573), released with EMI-1 Update 4

---+++ Problems with SubCAs when Argus is used as authorization system

There are problems with sub-CAs (e.g. CERN-TCA, UKeScienceCA) when the CREAM CE is configured to use Argus.

   * Relevant bug: https://savannah.cern.ch/bugs/?82567
   * Fix released with CREAM CE 1.13.1 (http://savannah.cern.ch/task/?20813)

-- Main.MassimoSgaravatto - 2011-05-05
Topic revision: r7 - 2011-09-01 - MassimoSgaravatto