APEL Deployment

We are changing the accounting system used in our infrastructure from DGAS to APEL. The procedure is quite simple.

Each resource centre needs to install a new node (the APEL Publisher), which receives the accounting information sent from the CE(s) by the APEL parser. The APEL Publisher then sends the data to the EGI central database using the messaging infrastructure.

In the past months we tested two scenarios:

  • the accounting data are sent directly to the central database (canonical installation)
  • the accounting data are sent to FAUST and to the EGI central database

We chose the second one.

Registration

You need to register the APEL publisher in GOCDB: the service endpoint to add is glite-APEL, and you must also fill in the certificate subject information. This is necessary to authorize the publisher host to use the broker network. Changes in GOCDB can take up to 4 hours to reach the message brokers. Do not touch the APEL service endpoint, otherwise Nagios won't monitor the accounting data publication.

(for reference https://wiki.egi.eu/wiki/MAN09_Accounting_data_publishing)

APEL Publisher Installation and Configuration

Follow the EMI3 generic installation guide and the APEL one https://twiki.cern.ch/twiki/pub/EMI/EMI3APELClient/APEL_Publisher_System_Administrator_Guide.pdf

Use the production queue of the broker network:

# Queue to which SSM will send messages (use this)
destination: /queue/global.accounting.cpu.central 
Comment out or delete the testing one.

Sending the data to FAUST

To send the accounting data to FAUST as well, first install and configure the APEL Publisher as explained in the section above, then follow the instructions at https://github.com/andreaguarise/ssm-dupl-send

This means that you have to use the ssm-dupl-send.sh script instead of the apelclient one.

Among the important parameters to set in faust-sender.cfg are the following:

host: dgas-broker.to.infn.it
port: 61613

use_ssl: false

destination: apel.input

Create the following directory with mkdir:

/var/spool/faust/outgoing

IMPORTANT: Tier1 and Tier2 sites will use a dedicated queue:

destination: apel.<SITE-NAME>.input

For instance, for INFN-PISA the file faust-sender.cfg will contain:

destination: apel.INFN-PISA.input
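
The queue naming rule above can be checked with a few lines of Python before the sender is put into cron. This is only a sketch: the helper names `expected_destination` and `read_settings` are hypothetical, and the parsing assumes that faust-sender.cfg uses plain `key: value` lines as in the fragments shown above (any `[section]` headers are simply skipped).

```python
def expected_destination(site_name, dedicated_queue):
    """Queue name as described above: dedicated for Tier1/Tier2, shared otherwise."""
    if dedicated_queue:
        return "apel.{0}.input".format(site_name)
    return "apel.input"


def read_settings(cfg_text):
    """Collect 'key: value' lines, skipping blanks, comments and [section] headers."""
    settings = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Sample text built from the values quoted in this page (not a complete file).
sample = """\
host: dgas-broker.to.infn.it
port: 61613
use_ssl: false
destination: apel.INFN-PISA.input
"""

settings = read_settings(sample)
matches = settings["destination"] == expected_destination("INFN-PISA", True)
print("destination matches the site queue:", matches)
```

To check a real file, read its content and pass it to `read_settings` instead of the sample string.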

IMPORTANT: Run the FAUST script only after the parser script has been launched on the CEs.

  • Example cron entry:
cat /etc/cron.d/ssm-dupl-send 
# Run APEL client once daily
05 01 * * * root /root/bin/ssm-dupl-send.sh

APEL Parser Installation and Configuration

Install and configure the APEL parser on each computing element of your resource centre.

Follow the EMI3 generic installation guide and the APEL one https://twiki.cern.ch/twiki/pub/EMI/EMI3APELClient/APEL_Parsers_System_Administrator_Guide.pdf

IMPORTANT: Send only the accounting data from September onwards, because the earlier data have already been sent by DGAS to APEL; otherwise they will be overwritten, causing inconsistencies. Configure the parser so that it processes only the proper files (or move the old logs to another directory).
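
One way to decide which logs to move aside is to sort them by the date in their file name. The sketch below assumes that rotated logs end in `-YYYYMMDD` (as in the `blahp.log-20140811` example further down this page), optionally followed by `.gz`, and it assumes a cut-off of 1 September 2014; the function name `split_logs` is hypothetical. The rotation date in a file name is not necessarily the date of the jobs inside, so files close to the cut-off should be checked by hand.

```python
import re
from datetime import date, datetime

# Assumed naming scheme: <name>-YYYYMMDD, optionally followed by .gz
DATE_SUFFIX = re.compile(r"-(\d{8})(?:\.gz)?$")


def split_logs(filenames, cutoff):
    """Split file names into (keep, set_aside, undated) relative to the cut-off date."""
    keep, set_aside, undated = [], [], []
    for name in filenames:
        match = DATE_SUFFIX.search(name)
        if match is None:
            undated.append(name)  # no date in the name: review manually
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            undated.append(name)  # eight digits that are not a valid date
            continue
        if stamp >= cutoff:
            keep.append(name)
        else:
            set_aside.append(name)
    return keep, set_aside, undated


names = ["blahp.log-20140811", "blahp.log-20140905", "blahp.log"]
keep, set_aside, undated = split_logs(names, date(2014, 9, 1))
print("parse:", keep)
print("move elsewhere:", set_aside)
print("check by hand:", undated)
```

The function only classifies names and does not move anything, so the result can be reviewed before touching the log directory.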

You can launch the apelparser script once the apelclient database has been set up.

An example cron entry:

cat /etc/cron.d/apelparser 
# Run APEL parser once daily
04 22 * * * root /usr/bin/apelparser

NOTE: empty log files cause a CRITICAL error during parsing:

2014-09-12 11:02:08,683 - apel.common.exceptions - CRITICAL - Unhandled exception raised!
2014-09-12 11:02:08,683 - apel.common.exceptions - CRITICAL - Please send a bug report with following information:
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - UnboundLocalError: local variable 'line_number' referenced before assignment
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - parse_file [/usr/bin/apelparser 139]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - scan_dir in /usr/bin/apelparser [187]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - handle_parsing in /usr/bin/apelparser [296]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - main in /usr/bin/apelparser [380]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - ? in /usr/bin/apelparser [392]
Check for such empty files and delete them before running the parser.
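
A minimal sketch for locating such files is shown below. The function name `find_empty_files` is hypothetical, and the directory to scan is whatever log directory your parser is configured to read; the script only lists zero-length files, so you can review the list before deleting anything.

```python
import os


def find_empty_files(directory):
    """Return the paths of zero-length regular files directly inside `directory`."""
    empty = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(path)
    return empty
```

For example, `for path in find_empty_files(log_dir): print(path)` prints the candidates, where `log_dir` is a placeholder for your own accounting log directory.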

Bug in APEL 1.2.1 - Apply the EMI-3 update 20

EMI-3 update 20 includes a fix for a bug in the APEL parser that prevents it from working in the most common cases: the parser is unable to open uncompressed accounting logs.

Sites with this problem have version 1.2.1 installed and see many messages like the following in their parser log file (usually /var/log/apel/parser.log):

2014-08-11 12:54:11,819 - parser - ERROR - Cannot open file blahp.log-20140811: Not a gzipped file
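
To see quickly whether a site is affected, the parser log can be scanned for this message. The sketch below assumes the usual log location mentioned above and matches only on the text "Not a gzipped file"; the function name `count_gzip_errors` is hypothetical.

```python
import os

PARSER_LOG = "/var/log/apel/parser.log"  # usual location; adjust if yours differs


def count_gzip_errors(lines):
    """Count the log lines that report the 'Not a gzipped file' error."""
    marker = "Not a gzipped file"
    return sum(1 for line in lines if marker in line)


# Only scan the log if it exists on this host.
if os.path.exists(PARSER_LOG):
    with open(PARSER_LOG, errors="replace") as handle:
        print("gzip errors found:", count_gzip_errors(handle))
```

A non-zero count on a host running version 1.2.1 suggests that the upgrade described in this section is needed.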

Sites that have installed version 1.2.1 should upgrade to 1.2.2 immediately: http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-20-12-09-2014-v-3-11-0-1

What about DGAS sensors?

Because of the DGAS server problems that occurred at the beginning of September, stop the DGAS sensors on your computing element(s).

Fast checks

When you launch the apelparser script for the first time, if there are no errors it fills in the BlahdRecords and EventRecords tables (database "apelclient"), so check that they really contain data.

Then, when the ssm-dupl-send.sh script runs on the publisher host, those tables are joined (filling in the JobRecords and VJobRecords ones) and the data are sent to FAUST and to EGI, so you can perform the following check:

mysql> use apelclient

mysql> SELECT year(EndTime),Month(EndTime),InfrastructureDescription,count(*) FROM VJobRecords GROUP BY 1,2,3;
+---------------+----------------+---------------------------+----------+
| year(EndTime) | Month(EndTime) | InfrastructureDescription | count(*) |
+---------------+----------------+---------------------------+----------+
|          2014 |              5 | APEL-CREAM-PBS            |     3747 |
|          2014 |              6 | APEL-CREAM-PBS            |     7243 |
|          2014 |              7 | APEL-CREAM-PBS            |     4852 |
|          2014 |              8 | APEL-CREAM-PBS            |     4770 |
|          2014 |              9 | APEL-CREAM-PBS            |     3882 |
+---------------+----------------+---------------------------+----------+
5 rows in set (0.13 sec)

STATUS

SITE NAME             TICKET  STATUS       INFO
INFN-BARI             17448   OPEN
INFN-MILANO-ATLASC    17448   OPEN
INFN-CATANIA          17449   OPEN
INFN-LNL-2            17450   IN PROGRESS
INFN-FRASCATI         17452   IN PROGRESS
INFN-NAPOLI-ATLAS     17453   IN PROGRESS
INFN-PISA             17454   SOLVED
INFN-ROMA1            17455   OPEN
INFN-ROMA1-CMS        17456   OPEN
INFN-T1               17457   OPEN
INFN-TORINO           17458   OPEN
BIOCOMP               17459   IN PROGRESS
CIRMMP                17460   OPEN
CNR-ILC-PISA          17461   SOLVED
CRS4                  17462   OPEN
FBF-Brescia-IT        17463   OPEN
GARR-01-DIR           17464   SOLVED
GRISU-COMETA-INFN-CT  17465   IN PROGRESS
GRISU-UNINA           17466   IN PROGRESS
INAF-TS               17467   OPEN
INFN-BOLOGNA          17468   IN PROGRESS
INFN-BOLOGNA-T3       17469   IN PROGRESS
INFN-CNAF-LHCB        17470   OPEN
INFN-COSENZA          17471   OPEN
INFN-FERRARA          17472   IN PROGRESS
INFN-GENOVA           17473   OPEN
INFN-LECCE            17474   IN PROGRESS
INFN-NAPOLI-ARGO      17477   OPEN
INFN-NAPOLI-CMS       17478   OPEN
INFN-NAPOLI-PAMELA    17479   OPEN
INFN-PADOVA           17480   IN PROGRESS
INFN-PAVIA            17482   OPEN
INFN-PERUGIA          17483   IN PROGRESS
INFN-ROMA1-VIRGO      17484   OPEN
INFN-ROMA2            17485   IN PROGRESS
INFN-ROMA3            17486   IN PROGRESS
INFN-TRIESTE          17487   SOLVED
RECAS-NAPOLI          17488   IN PROGRESS
SNS-PISA              17489   IN PROGRESS
TRIGRID-INFN-CATANIA  17490   OPEN
UNI-PERUGIA           17491   OPEN
UNINA-EGEE            17492   IN PROGRESS

-- AlessandroPaolini - 2014-06-13


Topic revision: r26 - 2014-09-23 - AlessandroPaolini
 