Logs

Logs have the levels FATAL, ERROR, WARN, INFO and DEBUG. The log level is set in the configuration file config.js:

/**
 * Default log level. Can be one of: 'DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL'
 * @type {string}
 */
config.logLevel = "INFO";

For logs to be able to raise and cease alarms, the level set in the configuration file should be ERROR or a more detailed one (WARN, INFO or DEBUG). As a general rule, use at least ERROR: with a less detailed level such as FATAL, important information can be lost.
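The effect of the configured level can be sketched as a simple threshold check. This is only an illustration, not perseo's logger code: the function name `isEmitted` is hypothetical, and the ordering of levels from most to least detailed (DEBUG, INFO, WARN, ERROR, FATAL) is an assumption based on the list in config.js.

```javascript
// Levels ordered from most detailed to least detailed (assumed ordering).
const LEVELS = ['DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL'];

// Hypothetical helper: returns true if a message logged at `messageLevel`
// would be written when the logger is configured with `configuredLevel`.
function isEmitted(configuredLevel, messageLevel) {
    const configured = LEVELS.indexOf(configuredLevel);
    const message = LEVELS.indexOf(messageLevel);
    if (configured === -1 || message === -1) {
        throw new Error('Unknown log level');
    }
    return message >= configured;
}

console.log(isEmitted('INFO', 'ERROR'));  // true: alarm logs are written
console.log(isEmitted('FATAL', 'ERROR')); // false: alarm logs are lost
```

Under this assumption, a configuration of FATAL hides the ERROR logs that alarms depend on.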

Each log line contains several fields of the form name=value, separated by |

  • time time of the log
  • lvl log level
  • from IP from X-Real-IP header field or client's IP if missing
  • corr correlator from the incoming request that caused the current transaction. It allows tracing a global operation through several systems. If it is not present as a header field in the incoming request, the internal transaction ID is used.
  • trans internal transaction ID
  • op description of the operation being done. Generally, the path of the URL invoked.
  • msg message

Example:

time=2014-12-16T12:01:46.487Z | lvl=ERROR | corr=62d2f662-37de-4dcf-ba02-013642501a2d | trans=62d2f662-37de-4dcf-ba02-013642501a2d | op=/actions/do | msg=emailAction.SendMail connect ECONNREFUSED
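A tool that consumes these logs first has to split a line into its fields. The following is a minimal sketch of such a parser; the function name `parseLogLine` is hypothetical and is not part of perseo.

```javascript
// Hypothetical helper: splits a log line into an object mapping
// field name -> value. Fields are separated by '|' and have the form name=value.
function parseLogLine(line) {
    const fields = {};
    for (const part of line.split('|')) {
        const eq = part.indexOf('=');
        if (eq === -1) {
            continue; // not a name=value pair, skip it
        }
        const name = part.slice(0, eq).trim();
        const value = part.slice(eq + 1).trim();
        fields[name] = value;
    }
    return fields;
}

const example = parseLogLine(
    'time=2014-12-16T12:01:46.487Z | lvl=ERROR | ' +
    'corr=62d2f662-37de-4dcf-ba02-013642501a2d | ' +
    'trans=62d2f662-37de-4dcf-ba02-013642501a2d | ' +
    'op=/actions/do | msg=emailAction.SendMail connect ECONNREFUSED'
);
console.log(example.lvl); // ERROR
console.log(example.msg); // emailAction.SendMail connect ECONNREFUSED
```

This sketch assumes that the message itself does not contain the | character; a message that does would be cut at the first one.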

Error logs can show additional information in the message that hints at the root cause of the problem (ECONNREFUSED, ENOTFOUND, ECONNRESET, ...).

The log level can be changed at run-time with an HTTP PUT request:

 curl --request PUT <host>:<port>/admin/log?level=<FATAL|ERROR|WARN|INFO|DEBUG>

The log level can be retrieved at run-time with an HTTP GET request:

 curl --request GET <host>:<port>/admin/log
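The same PUT request can be sent from a Node.js script, for example from an operations tool. This is a sketch under assumptions: the function names `logLevelRequestOptions` and `setLogLevel` are hypothetical, and the host and port are whatever your perseo instance listens on.

```javascript
const http = require('http');

// Levels accepted by the /admin/log endpoint, as listed in the curl example.
const ALLOWED_LEVELS = ['FATAL', 'ERROR', 'WARN', 'INFO', 'DEBUG'];

// Hypothetical helper: builds the options for the PUT request.
function logLevelRequestOptions(host, port, level) {
    if (!ALLOWED_LEVELS.includes(level)) {
        throw new Error('Invalid log level: ' + level);
    }
    return {
        host: host,
        port: port,
        method: 'PUT',
        path: '/admin/log?level=' + encodeURIComponent(level)
    };
}

// Hypothetical helper: sends the request and resolves with the HTTP status code.
function setLogLevel(host, port, level) {
    return new Promise((resolve, reject) => {
        const req = http.request(logLevelRequestOptions(host, port, level), (res) => {
            res.resume(); // discard the response body
            resolve(res.statusCode);
        });
        req.on('error', reject);
        req.end();
    });
}

// Usage (host and port are placeholders for your deployment):
// setLogLevel('<host>', <port>, 'DEBUG').then((status) => console.log(status));
```

Validating the level before sending avoids a round trip for a value the endpoint does not list.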

Alarms

Alarm levels

  • Critical - The system is not working
  • Major - The system has a problem that degrades the service and must be addressed
  • Warning - Something is happening that must be notified

Alarms are typically inferred from logs. For each alarm, a 'detection strategy' and a 'stop condition' are provided (the stop condition is not shown in the next table, but it is included in the detailed description of each alarm below). These conditions are used to detect the logs that should raise the alarm and cease it, respectively. The log level for alarms is ERROR unless another level is stated. The message in a condition should be treated as a prefix of the message in the log. We recommend ignoring leading spaces in each field; otherwise a log that should meet the condition could be missed.
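The matching rules described above (default level ERROR, message treated as a prefix, leading spaces ignored) can be sketched as follows. The function name `matchesCondition` and the representation of a condition as an object are assumptions made for illustration; comparing fields other than msg for exact equality is also an assumption.

```javascript
// Hypothetical helper: checks whether the parsed fields of a log line
// meet a condition such as { msg: 'ALARM-ON [CORE]' } or
// { lvl: 'FATAL', op: 'perseo' }.
function matchesCondition(condition, fields) {
    // The level is ERROR unless the condition states another one.
    const expected = Object.assign({ lvl: 'ERROR' }, condition);
    return Object.keys(expected).every((name) => {
        const raw = fields[name] || '';
        if (name === 'msg') {
            // The condition message is a prefix; leading spaces are ignored.
            return raw.trimStart().startsWith(expected.msg);
        }
        // Other fields are compared exactly after trimming (assumption).
        return raw.trim() === expected[name];
    });
}

const coreOn = { msg: 'ALARM-ON [CORE]' };
// The text after the prefix is a hypothetical message, not a real perseo log.
const sampleFields = { lvl: 'ERROR', msg: ' ALARM-ON [CORE] cannot reach core' };
console.log(matchesCondition(coreOn, sampleFields)); // true
console.log(matchesCondition({ lvl: 'FATAL', op: 'perseo' }, sampleFields)); // false
```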

Some errors cause perseo to fail to start up. They have FATAL level and are caused by:

  • Lack of connection to database
  • Lack of connection to perseo-core

They should be solved in order to get perseo running.

Alarm conditions

Alarm ID        Severity  Description
START           Critical  Impossible to start perseo
CORE            Major     Refreshing of rules at core is failing.
POST_EVENT      Critical  Sending an event to core is failing.
EMAIL           Critical  Trying to execute an email action is failing.
SMS             Critical  Trying to execute an SMS action is failing.
SMPP            Critical  Trying to execute an SMPP action is failing.
ORION           Critical  Trying to execute an update action is failing
DATABASE        Critical  A problem in connection to DB.
DATABASE_ORION  Critical  A problem in connection to Orion DB (accessed by no-signal checker)
AUTH            Major     A problem in connection to Keystone. Update-actions to Orion through PEP are not working
LOOP            Major     Some rules can be provoking an infinite loop of triggered actions

Alarm START

Severity: Critical

Detection strategy: lvl:FATAL op:perseo

Stop condition: N/A

Description: Starting perseo is failing.

Action: Check HTTP connectivity from perseo to perseo-core and connectivity to MongoDB, as set in the config file.


Alarm CORE

Severity: Major

Detection strategy: msg:ALARM-ON [CORE]

Stop condition: msg:ALARM-OFF [CORE]

Description: Communication with core is failing.

Action: Check HTTP connectivity from perseo to perseo-core. Also check that perseo-core is deployed at the right URL path.


Alarm POST_EVENT

Severity: Critical

Detection strategy: msg:ALARM-ON [POST_EVENT]

Stop condition: msg:ALARM-OFF [POST_EVENT]

Description: Sending an event to core is failing.

Action: Check HTTP connectivity from perseo to perseo-core. Also check that perseo-core is deployed at the right URL path.


Alarm EMAIL

Severity: Critical

Detection strategy: msg:ALARM-ON: [EMAIL]

Stop condition: msg:ALARM-OFF: [EMAIL]

Description: Trying to execute an email action is failing.

Action: Check the configured SMTP Server is accessible and working properly


Alarm SMS

Severity: Critical

Detection strategy: msg:ALARM-ON: [SMS]

Stop condition: msg:ALARM-OFF: [SMS]

Description: Trying to execute an SMS action is failing.

Action: Check the configured SMPP Adapter Server is accessible and working properly


Alarm SMPP

Severity: Critical

Detection strategy: msg:ALARM-ON: [SMPP]

Stop condition: msg:ALARM-OFF: [SMPP]

Description: Trying to execute an SMPP action is failing.

Action: Check the configured SMPP Server is accessible and working properly


Alarm ORION

Severity: Critical

Detection strategy: msg:ALARM-ON: [ORION]

Stop condition: msg:ALARM-OFF: [ORION]

Description: Trying to execute an update action is failing.

Action: Check the configured Orion path for updating is accessible and working properly


Alarm DATABASE

Severity: Critical

Detection strategy: msg:ALARM-ON: [DATABASE]

Stop condition: msg:ALARM-OFF: [DATABASE]

Description: There is a problem in connection to DB.

Action: Check that the configured MongoDB is up and running and is accessible from perseo. Check that the databases exist.

You can find more information about DB dynamics in the database aspects documentation.


Alarm DATABASE_ORION

Severity: Critical

Detection strategy: msg:ALARM-ON: [DATABASE_ORION]

Stop condition: msg:ALARM-OFF: [DATABASE_ORION]

Description: There is a problem in connection to Orion DB (accessed by no-signal checker)

Action: Check that the configured MongoDB is up and running and is accessible from perseo. Check that the databases exist.


Alarm AUTH

Severity: Major

Detection strategy: msg:ALARM-ON: [AUTH]

Stop condition: msg:ALARM-OFF: [AUTH]

Description: There is a problem in connection to Keystone. Update-actions to Orion through PEP are not working.

Action: Check HTTP connectivity to Keystone. Check provisioned user and roles/grants.


Alarm LOOP

Severity: Major

Detection strategy: msg:check infinite loop

Stop condition: N/A

Description: Some rules may be causing an infinite loop of triggered actions.

Action: Report the possible loop to the client/product, indicating the rule pointed out in the log. Search the logs for the correlator shown in the log message.
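Most of the alarms above are raised and ceased by messages that start with ALARM-ON and ALARM-OFF followed by the alarm ID in brackets, with or without a colon. A monitoring script could keep track of the active alarms as sketched below. The class name `AlarmTracker` and its methods are hypothetical; START and LOOP are not handled here because they use other detection strategies and have no stop condition.

```javascript
// Matches messages starting with "ALARM-ON [ID]" or "ALARM-ON: [ID]"
// (and the ALARM-OFF forms), ignoring leading spaces.
const ALARM_PATTERN = /^\s*ALARM-(ON|OFF):? \[([A-Z_]+)\]/;

// Hypothetical helper class: keeps the set of currently active alarm IDs.
class AlarmTracker {
    constructor() {
        this.active = new Set();
    }

    // Updates the active set from one log message.
    // Returns { id, raised } for alarm messages, or null for any other message.
    handleMessage(msg) {
        const match = ALARM_PATTERN.exec(msg);
        if (!match) {
            return null;
        }
        const state = match[1];
        const id = match[2];
        if (state === 'ON') {
            this.active.add(id);
        } else {
            this.active.delete(id);
        }
        return { id: id, raised: state === 'ON' };
    }
}

const tracker = new AlarmTracker();
tracker.handleMessage('ALARM-ON: [EMAIL]');
tracker.handleMessage('ALARM-ON [CORE]');
tracker.handleMessage('ALARM-OFF: [EMAIL]');
console.log(Array.from(tracker.active)); // [ 'CORE' ]
```

The msg value passed to handleMessage is assumed to come from an ERROR-level log line that has already been split into its fields.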