APEL Deployment

We are changing the accounting system used in our infrastructure from DGAS to APEL. The procedure is quite simple.

Each resource centre needs to install a new node (the APEL Publisher) which receives the accounting information sent from the CE(s) by the APEL parser. Then the APEL publisher sends the data to the EGI central database using the messaging infrastructure.

In the past months we tested two scenarios:

  • the accounting data are sent directly to the central database (canonical installation)
  • the accounting data are sent to FAUST and to the EGI central database

We chose the second one.

Registration

You need to register the APEL publisher in the GOCDB: the service endpoint to add is glite-APEL, and you must also fill in the certificate subject information. This is necessary to authorize the publisher host to use the broker network. Changes in the GOCDB can take up to 4 hours to reach the message brokers. Do not touch the APEL service endpoint, otherwise Nagios will not monitor the accounting data publication.

(for reference https://wiki.egi.eu/wiki/MAN09_Accounting_data_publishing)

APEL Publisher Installation and Configuration

Follow the EMI3 generic installation guide and the APEL Publisher guide: https://twiki.cern.ch/twiki/pub/EMI/EMI3APELClient/APEL_Publisher_System_Administrator_Guide.pdf

Use the production queue of the broker network:

# Queue to which SSM will send messages (use this)
destination: /queue/global.accounting.cpu.central 

Comment out or delete the testing queue.

Sending the data to FAUST

To send the accounting data to FAUST as well, first install and configure the APEL Publisher as explained in the section above, then follow the instructions at https://github.com/andreaguarise/ssm-dupl-send

This means that you have to use the ssm-dupl-send.sh script instead of the apelclient one.

The important parameters to set in faust-sender.cfg include the following:

host: dgas-broker.to.infn.it
port: 61613

use_ssl: false

destination: apel.input

Create the following directory with mkdir:

/var/spool/faust/outgoing

IMPORTANT: Tier1 and Tier2 sites use a dedicated queue:

destination: apel.<SITE-NAME>.input

For instance, for INFN-PISA the file faust-sender.cfg contains:

destination: apel.INFN-PISA.input

IMPORTANT: Run the FAUST script only after the parser script has run on the CEs.

Example of the cron job:

cat /etc/cron.d/ssm-dupl-send 
# Run APEL client once daily
05 01 * * * root /root/bin/ssm-dupl-send.sh

APEL Parser Installation and Configuration

Install and configure the APEL parser on each computing element of your resource centre.

Follow the EMI3 generic installation guide and the APEL Parsers guide: https://twiki.cern.ch/twiki/pub/EMI/EMI3APELClient/APEL_Parsers_System_Administrator_Guide.pdf

IMPORTANT: Send only the accounting data from September onwards. The earlier records have already been sent to APEL by DGAS, and sending them again would overwrite them and cause inconsistencies. Configure the parser so that it processes only the proper files (or move the old logs to another directory).
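
As an illustration of the "move the old logs" option, here is a minimal sketch. It assumes that the rotated log names end in -YYYYMMDD, as in blahp.log-20140811. The function name, the directories and the cutoff date are hypothetical and must be adapted to your site. Note that the suffix is the date in the file name, so a file stamped just after the cutoff may still hold a few older records.

```shell
#!/bin/sh
# Sketch only: assumes rotated logs are named <something>-YYYYMMDD.
# move_old_logs SRC_DIR DEST_DIR CUTOFF
# Moves every file whose date suffix is earlier than CUTOFF (YYYYMMDD)
# from SRC_DIR to DEST_DIR, creating DEST_DIR if needed.
move_old_logs() {
    src=$1
    dest=$2
    cutoff=$3
    mkdir -p "$dest"
    for f in "$src"/*-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        # Skip the literal pattern when nothing matches.
        [ -e "$f" ] || continue
        # Keep only the text after the last dash: the date stamp.
        stamp=${f##*-}
        if [ "$stamp" -lt "$cutoff" ]; then
            mv "$f" "$dest"/
        fi
    done
}

# Usage (hypothetical paths), keeping logs from 1 September 2014 onwards:
#   move_old_logs /path/to/logs /path/to/logs/pre-september 20140901
```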

You can launch the apelparser script once the apelclient database has been set up.

An example of the cron job:

cat /etc/cron.d/apelparser 
# Run APEL parser once daily
04 22 * * * root /usr/bin/apelparser

NOTE: empty log files cause a CRITICAL error during parsing:

2014-09-12 11:02:08,683 - apel.common.exceptions - CRITICAL - Unhandled exception raised!
2014-09-12 11:02:08,683 - apel.common.exceptions - CRITICAL - Please send a bug report with following information:
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - UnboundLocalError: local variable 'line_number' referenced before assignment
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - parse_file [/usr/bin/apelparser 139]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - scan_dir in /usr/bin/apelparser [187]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - handle_parsing in /usr/bin/apelparser [296]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - main in /usr/bin/apelparser [380]
2014-09-12 11:02:08,684 - apel.common.exceptions - CRITICAL - ? in /usr/bin/apelparser [392]

Check for these empty files and delete them before running the parser.
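
A possible way to find and remove the empty files is sketched below. The function names are hypothetical, and the directory to scan is whichever one your parser is configured to read. Only standard find options are used.

```shell
#!/bin/sh
# Sketch only: pass the directory your parser is configured to scan.

# Print every zero-length regular file under the given directory.
list_empty_logs() {
    find "$1" -type f -size 0 -print
}

# Remove the same files. Review the output of list_empty_logs first.
delete_empty_logs() {
    find "$1" -type f -size 0 -exec rm -f {} +
}

# Usage (hypothetical path):
#   list_empty_logs /path/to/accounting/logs
#   delete_empty_logs /path/to/accounting/logs
```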

Bug in APEL 1.2.1 - Apply the EMI-3 update 20

EMI-3 update 20 includes a fix for an APEL parser bug that prevents the parser from working in the most common case: it is unable to open uncompressed accounting logs for parsing.

Sites with this problem have version 1.2.1 installed and see many messages like the following in their parser log file (usually /var/log/apel/parser.log):

2014-08-11 12:54:11,819 - parser - ERROR - Cannot open file blahp.log-20140811: Not a gzipped file
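
To see quickly whether a site is affected, one option is to count these messages in the parser log. The sketch below uses a hypothetical function name. It prints 0 when the message never occurs.

```shell
#!/bin/sh
# Count the parser log lines that carry the "Not a gzipped file" error.
# grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
count_gzip_errors() {
    grep -c 'Not a gzipped file' "$1" || true
}

# Usage, with the usual log location mentioned above:
#   count_gzip_errors /var/log/apel/parser.log
```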

Sites that have installed version 1.2.1 should upgrade to 1.2.2 immediately: http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-20-12-09-2014-v-3-11-0-1

What about DGAS sensors?

Because of the DGAS server problems that occurred at the beginning of September, stop the DGAS sensors on your computing element(s).

Fast checks

When you launch the apelparser script for the first time and there are no errors, it fills in the BlahdRecords and EventRecords tables (database "apelclient"). Check that they really contain data.

Then, when the ssm-dupl-send.sh script runs on the publisher host, these tables are joined (filling in the JobRecords and VJobRecords ones) and the data are sent to FAUST and to EGI. You can then perform the following check:

mysql> use apelclient

mysql> SELECT year(EndTime),Month(EndTime),InfrastructureDescription,count(*) FROM VJobRecords GROUP BY 1,2,3;
+---------------+----------------+---------------------------+----------+
| year(EndTime) | Month(EndTime) | InfrastructureDescription | count(*) |
+---------------+----------------+---------------------------+----------+
|          2014 |              5 | APEL-CREAM-PBS            |     3747 |
|          2014 |              6 | APEL-CREAM-PBS            |     7243 |
|          2014 |              7 | APEL-CREAM-PBS            |     4852 |
|          2014 |              8 | APEL-CREAM-PBS            |     4770 |
|          2014 |              9 | APEL-CREAM-PBS            |     3882 |
+---------------+----------------+---------------------------+----------+
5 rows in set (0.13 sec)
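
The first check (BlahdRecords and EventRecords not empty) can also be scripted. This is a minimal sketch with hypothetical function names. It assumes that the mysql command-line client on the database host can connect without extra options, for example because the credentials are in ~/.my.cnf.

```shell
#!/bin/sh
# Sketch only: assumes "mysql" can reach the apelclient database
# without extra connection options.

# Print the row count of one table in the apelclient database.
# -N suppresses the column header, -B selects plain batch output.
table_count() {
    mysql -N -B -e "SELECT COUNT(*) FROM $1" apelclient
}

# Report the row counts of the two tables filled in by the parser.
# Returns 0 if both contain rows, 1 if one is empty, 2 if a query fails.
check_parser_tables() {
    status=0
    for t in BlahdRecords EventRecords; do
        n=$(table_count "$t") || return 2
        echo "$t: $n rows"
        [ "$n" -gt 0 ] || status=1
    done
    return "$status"
}

# Usage:
#   check_parser_tables || echo "parser tables are empty or unreachable"
```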

STATUS

SITE NAME             TICKET  STATUS       INFO
BIOCOMP               17459   SOLVED
CIRMMP                12      SOLVED
CNR-ILC-PISA          17461   SOLVED
CRS4                  13      SOLVED
FBF-Brescia-IT        14      SOLVED
GARR-01-DIR           17464   SOLVED
GILDA-INFN-CATANIA    27      SOLVED
GILDA-SIRIUS          29      SOLVED
GRISU-COMETA-INFN-CT  17465   SOLVED
GRISU-UNINA           17466   SOLVED
ICEAGE-CATANIA        28      SOLVED
INAF-TS               17467   SOLVED
INFN-BARI             15      IN PROGRESS  waiting for the first run
INFN-BOLOGNA          17468   SOLVED
INFN-BOLOGNA-T3       17469   SOLVED
INFN-CATANIA          17449   SOLVED
INFN-CNAF-LHCB        17470   SOLVED
INFN-COSENZA          17471   SOLVED
INFN-FERRARA          17472   SOLVED
INFN-FRASCATI         17452   SOLVED
INFN-GENOVA           17473   SOLVED
INFN-LECCE            17474   SOLVED
INFN-LNL-2            17450   SOLVED
INFN-MILANO-ATLASC    16      IN PROGRESS  waiting for the first publication
INFN-NAPOLI-ARGO      17477   SOLVED
INFN-NAPOLI-ATLAS     17453   SOLVED
INFN-NAPOLI-CMS       17478   SOLVED
INFN-NAPOLI-PAMELA    17479   SOLVED
INFN-PADOVA           17480   SOLVED
INFN-PAVIA            17      OPEN
INFN-PERUGIA          17483   SOLVED
INFN-PISA             17454   SOLVED
INFN-ROMA1            26      IN PROGRESS  will send the data together with INFN-ROMA1-VIRGO and INFN-ROMA1-CMS
INFN-ROMA1-CMS        17456   SOLVED
INFN-ROMA1-VIRGO      17484   OPEN
INFN-ROMA2            17485   SOLVED
INFN-ROMA3            17486   SOLVED
INFN-T1               17457   SOLVED
INFN-TORINO           17458   SOLVED
INFN-TRIESTE          17487   SOLVED
RECAS-NAPOLI          17488   SOLVED
SNS-PISA              17489   SOLVED       site suspended
TRIGRID-INFN-CATANIA  17490   SOLVED
UNI-PERUGIA           38      SOLVED
UNINA-EGEE            17492   SOLVED

-- AlessandroPaolini - 2014-06-13

Topic revision: r59 - 2014-11-13 - AlessandroPaolini
 