APEL Deployment

We are migrating the accounting system used in our infrastructure from DGAS to APEL. The procedure is straightforward.

Each resource centre needs to install a new node (the APEL Publisher), which receives the accounting information sent by the APEL parser running on the CE(s). The APEL Publisher then sends the data to the EGI central database through the messaging infrastructure.

In the past months we tested two scenarios:

  • the accounting data are sent directly to the central database (canonical installation)
  • the accounting data are sent to FAUST and to the EGI central database

We chose the second one.

Registration

You need to register the APEL publisher in GOCDB: the service endpoint to add is glite-APEL, and the certificate subject information must be filled in as well. This is necessary to authorize the publisher host to use the broker network. Changes in GOCDB can take up to 4 hours to reach the message brokers. Leave the existing APEL service endpoint untouched, otherwise Nagios will not monitor the accounting data publication.

(for reference https://wiki.egi.eu/wiki/MAN09_Accounting_data_publishing)

APEL Publisher Installation and Configuration

Follow the EMI3 generic installation guide and the APEL Publisher guide: https://twiki.cern.ch/twiki/pub/EMI/EMI3APELClient/APEL_Publisher_System_Administrator_Guide.pdf

Use the production queue of the broker network:

# Queue to which SSM will send messages (use this)
destination: /queue/global.accounting.cpu.central 
Comment out or delete the testing one.
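
A quick way to confirm that only the production queue is active is to list the uncommented destination lines of the sender configuration. The sketch below is illustrative only: the function name is hypothetical, and the configuration path in the usage comment is an assumption that may differ on your installation.

```shell
# Print every uncommented "destination:" line in an SSM sender config.
# Exactly one line, naming the production queue, should be printed.
active_destinations() {
    grep -E '^[[:space:]]*destination:' "$1"
}

# Hypothetical usage (the path is an assumption; adjust to your install):
# active_destinations /etc/apel/sender.cfg
```

If more than one line is printed, a testing destination is probably still enabled.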

Sending the data to FAUST

To send the accounting data to FAUST as well, first install and configure the APEL Publisher as explained in the section above, then follow the instructions at https://github.com/andreaguarise/ssm-dupl-send

This means that you have to use the ssm-dupl-send.sh script instead of the apelclient one.

Among the important parameters to set in faust-sender.cfg are the following:

host: dgas-broker.to.infn.it
port: 61613

use_ssl: false

destination: apel.input

Create the following directory with mkdir:

/var/spool/faust/outgoing
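
As a minimal sketch, the directory can be created together with any missing parent directories. The function name is hypothetical; ownership and permissions are not specified in this page, so they are left at the defaults.

```shell
# Create the FAUST outgoing spool directory, including missing parents.
# -p makes the call safe to repeat if the directory already exists.
make_faust_spool() {
    dir="${1:-/var/spool/faust/outgoing}"
    mkdir -p "$dir" && ls -ld "$dir"
}

# Run as root on the publisher host:
# make_faust_spool
```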

IMPORTANT: Tier1 and Tier2 sites will use a dedicated queue:

destination: apel.<SITE-NAME>.input

For instance, for INFN-PISA, the file faust-sender.cfg will contain:

destination: apel.INFN-PISA.input
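
The naming rule for the two kinds of queue can be sketched as a small helper that prints the line to put in faust-sender.cfg. The function name is hypothetical; it only formats the string and does not edit the configuration file.

```shell
# Print the FAUST destination line.
# With a site name: the dedicated Tier1/Tier2 queue; without: the shared queue.
faust_destination() {
    site="${1:-}"
    if [ -n "$site" ]; then
        printf 'destination: apel.%s.input\n' "$site"
    else
        printf 'destination: apel.input\n'
    fi
}

# faust_destination INFN-PISA   # prints: destination: apel.INFN-PISA.input
```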

IMPORTANT: Run the FAUST script only after the parser script has been run on the CEs.

  • Example of the cron entry:
cat /etc/cron.d/ssm-dupl-send 
# Run APEL client once daily
05 01 * * * root /root/bin/ssm-dupl-send.sh

APEL Parser Installation and Configuration

Install and configure the APEL parser on each computing element of your resource centre.

Follow the EMI3 generic installation guide and the APEL Parsers guide: https://twiki.cern.ch/twiki/pub/EMI/EMI3APELClient/APEL_Parsers_System_Administrator_Guide.pdf

IMPORTANT: Send accounting data only from September onwards. The earlier data have already been sent by DGAS to APEL, and sending them again would overwrite those records and cause inconsistencies. Configure the parser so that it processes only the proper files (or move the old logs to another directory).
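
Moving the old logs aside could look like the sketch below. It assumes that rotated log files end in -YYYYMMDD (as in blahp.log-20140811) and are not compressed; the function name, the archive directory and the cutoff date are illustrative, and batch system logs with a different naming scheme would need their own pattern.

```shell
# Move rotated logs dated before a cutoff into an archive directory,
# so that the parser no longer finds them.
# Arguments: log directory, archive directory, cutoff as YYYYMMDD.
archive_old_logs() {
    log_dir=$1
    archive_dir=$2
    cutoff=$3
    mkdir -p "$archive_dir"
    for f in "$log_dir"/*-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        [ -f "$f" ] || continue      # skip when the glob matches nothing
        stamp=${f##*-}               # text after the last hyphen
        if [ "$stamp" -lt "$cutoff" ]; then
            mv "$f" "$archive_dir"/
        fi
    done
}

# Hypothetical usage (paths are placeholders):
# archive_old_logs /path/to/blah/logs /path/to/blah/logs/pre-september 20140901
```

Which rotated file holds the first records to be published depends on the local rotation schedule, so check the contents around the cutoff before moving anything.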

You can launch the apelparser script once the apelclient database has been set up.

An example of the cron entry:

cat /etc/cron.d/apelparser 
# Run APEL parser once daily
04 22 * * * root /usr/bin/apelparser

NOTE: empty log files cause a CRITICAL error during parsing:

2014-09-12 11:02:08,683 - apel.common.exceptions - CRITICAL - Unhandled exception raised!
2014-09-12 11:02:08,683 - apel.common.exceptions - CRITICAL - Please send a bug report with following information:
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - UnboundLocalError: local variable 'line_number' referenced before assignment
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - parse_file [/usr/bin/apelparser 139]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - scan_dir in /usr/bin/apelparser [187]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - handle_parsing in /usr/bin/apelparser [296]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - main in /usr/bin/apelparser [380]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - ? in /usr/bin/apelparser [392]
Check for these empty files and delete them.
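
A minimal sketch of that check, using GNU find, is given below. The function names are hypothetical and the log directory is passed as an argument, since its location depends on the site configuration.

```shell
# List zero-length regular files directly inside a log directory.
list_empty_logs() {
    find "$1" -maxdepth 1 -type f -empty
}

# Delete them, printing each file name as it is removed.
delete_empty_logs() {
    find "$1" -maxdepth 1 -type f -empty -print -delete
}

# Hypothetical usage (path is a placeholder):
# list_empty_logs /path/to/accounting/logs
# delete_empty_logs /path/to/accounting/logs
```

Listing first and deleting afterwards makes it easy to spot a log that is empty only because it is still being written.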

Bug in APEL 1.2.1 - Apply the EMI-3 update 20

EMI-3 update 20 includes a fix for a bug in the APEL parser that prevents it from working in the most common case: the parser is unable to open uncompressed accounting logs.

Sites with this problem will have version 1.2.1 installed and see many log messages like this:

2014-08-11 12:54:11,819 - parser - ERROR - Cannot open file blahp.log-20140811: Not a gzipped file
in their parser log file - usually at /var/log/apel/parser.log.

Sites that have installed version 1.2.1 should upgrade to 1.2.2 immediately: http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-20-12-09-2014-v-3-11-0-1
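
To see whether a CE is affected, the error can be counted in the parser log. This is only a sketch: the function name is hypothetical, and the default path is the usual location mentioned above, which may differ on your host.

```shell
# Count "Not a gzipped file" errors in the APEL parser log.
count_gzip_errors() {
    log=${1:-/var/log/apel/parser.log}
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so the exit status is deliberately ignored.
    grep -c 'Not a gzipped file' "$log" || true
}

# count_gzip_errors                      # uses the default log path
# count_gzip_errors /some/other/parser.log
```

A count that keeps growing after each parser run suggests that version 1.2.1 is still installed.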

What about DGAS sensors?

Because of the DGAS server problems that occurred at the beginning of September, stop the DGAS sensors on your computing element(s).

Fast checks

When you launch the apelparser script for the first time, and there are no errors, it fills in the BlahdRecords and EventRecords tables (database "apelclient"), so check that they really contain data.
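
This first check can be sketched as one row count per table. The helper below only prints the SQL; the function name and the column aliases are illustrative, and piping the output to the mysql client assumes that access credentials for the apelclient database are already configured on the host.

```shell
# Emit one row-count query for each table that apelparser should fill.
parser_table_counts_sql() {
    for table in BlahdRecords EventRecords; do
        printf "SELECT '%s' AS tbl, COUNT(*) AS n FROM %s;\n" "$table" "$table"
    done
}

# Hypothetical usage, assuming MySQL credentials are already set up:
# parser_table_counts_sql | mysql apelclient
```

A count of zero for either table after a parser run without errors suggests that the parser did not find any log files to process.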

Then, when the ssm-dupl-send.sh script runs on the publisher host, those tables are joined (filling in the JobRecords and VJobRecords tables) and the data are sent to FAUST and to EGI. You can then perform the following check:

mysql> use apelclient

mysql> SELECT year(EndTime),Month(EndTime),InfrastructureDescription,count(*) FROM VJobRecords GROUP BY 1,2,3;
+---------------+----------------+---------------------------+----------+
| year(EndTime) | Month(EndTime) | InfrastructureDescription | count(*) |
+---------------+----------------+---------------------------+----------+
|          2014 |              5 | APEL-CREAM-PBS            |     3747 |
|          2014 |              6 | APEL-CREAM-PBS            |     7243 |
|          2014 |              7 | APEL-CREAM-PBS            |     4852 |
|          2014 |              8 | APEL-CREAM-PBS            |     4770 |
|          2014 |              9 | APEL-CREAM-PBS            |     3882 |
+---------------+----------------+---------------------------+----------+
5 rows in set (0.13 sec)

STATUS

SITE NAME             TICKET  STATUS       INFO
BIOCOMP               17459   SOLVED
CIRMMP                17460   OPEN
CNR-ILC-PISA          17461   SOLVED
CRS4                  17462   OPEN
FBF-Brescia-IT        17463   OPEN
GARR-01-DIR           17464   SOLVED
GRISU-COMETA-INFN-CT  17465   IN PROGRESS
GRISU-UNINA           17466   IN PROGRESS
INAF-TS               17467   IN PROGRESS  problems accessing the EGI brokers. FAUST: OK
INFN-BARI             17448   OPEN
INFN-BOLOGNA          17468   SOLVED
INFN-BOLOGNA-T3       17469   IN PROGRESS  will send its data together with INFN-T1 and INFN-CNAF-LHCB
INFN-CATANIA          17449   IN PROGRESS  configuration problems
INFN-CNAF-LHCB        17470   OPEN
INFN-COSENZA          17471   IN PROGRESS
INFN-FERRARA          17472   IN PROGRESS  EGI: OK. FAUST: unknown
INFN-FRASCATI         17452   IN PROGRESS  EGI: OK. FAUST: almost
INFN-GENOVA           17473   IN PROGRESS  EGI: waiting for the Nagios result. FAUST: OK
INFN-LECCE            17474   IN PROGRESS  data are not arriving at FAUST
INFN-LNL-2            17450   IN PROGRESS  waiting for the first run
INFN-MILANO-ATLASC    17448   OPEN
INFN-NAPOLI-ARGO      17477   OPEN
INFN-NAPOLI-ATLAS     17453   IN PROGRESS
INFN-NAPOLI-CMS       17478   OPEN
INFN-NAPOLI-PAMELA    17479   OPEN
INFN-PADOVA           17480   IN PROGRESS
INFN-PAVIA            17482   OPEN
INFN-PERUGIA          17483   IN PROGRESS  data are not arriving at FAUST
INFN-PISA             17454   SOLVED
INFN-ROMA1            17455   IN PROGRESS  will send its data together with INFN-ROMA1-VIRGO and INFN-ROMA1-CMS
INFN-ROMA1-CMS        17456   OPEN
INFN-ROMA1-VIRGO      17484   OPEN
INFN-ROMA2            17485   IN PROGRESS  data received
INFN-ROMA3            17486   IN PROGRESS  EGI: data missing. FAUST: data partially received
INFN-T1               17457   IN PROGRESS
INFN-TORINO           17458   IN PROGRESS
INFN-TRIESTE          17487   SOLVED
RECAS-NAPOLI          17488   IN PROGRESS
SNS-PISA              17489   IN PROGRESS  installation not completed
TRIGRID-INFN-CATANIA  17490   IN PROGRESS  configuration problems
UNI-PERUGIA           17491   OPEN
UNINA-EGEE            17492   IN PROGRESS

-- AlessandroPaolini - 2014-06-13


Topic revision: r29 - 2014-09-25 - AlessandroPaolini
 