Troubleshooting Alerts

Overview

There are a number of issues which may occur with Highlight Alerting. This page reviews the more common symptoms and potential causes. All alerts sent by Highlight are recorded in the Alert Log.

Symptom 1. No alert received

Assuming there is no issue with the receiving email system or trap receiver, check the following to determine why no alert was received.

Check that you have a valid alert action defined...
- which is active in the location
- with a valid email address
- with the correct area of interest (Stability/Load/Health)
- also check whether the issue occurred outside normal business hours. An action with the Alert period set to "During business hours only" does not generate an email for such an issue, either when the issue starts or when business hours next begin, because alerts are not cached. To receive alerts for future out-of-hours issues, set the Alert period to "Any time of day".
Check that the watch in question...
- was not in maintenance, which would have suppressed the alert. See Maintenance indicators on Details page
- has not had "alerting" turned off (an admin function). See Suppressing alerts on bearers point 1
Check for other alert suppression
- if one or more bearers at the location are set as a "site link" and all site links are down, then alerts from other watches are suppressed. See Suppressing alerts on bearers point 2 for more detail
Check if the number of bad samples received would trigger an alert
- Highlight does not alert immediately if a bad sample is received. It uses a sensitivity measure which is visible for those with the Highlight permission "Manage folders/locations". You can check the number of samples required before Highlight reacts to an outage by referring to the Status Alerting page.
For link health issues, the processed samples can be extracted from the Details page in "day" view by clicking on the download button. From this file you can, for example, compare the number of consecutive samples without traffic with the sensitivity stability setting, and so determine if Highlight would have reacted.
For example, if there were 4 consecutive samples without traffic and the sensitivity threshold is 5, no alert would have been generated.
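The comparison described above can be sketched in a few lines of Python. This is only an illustration: it assumes the traffic values from the downloaded file have already been loaded into a list, that a value of 0 means "no traffic", and that Highlight reacts once the run of consecutive bad samples reaches the sensitivity threshold. The function names are hypothetical and are not part of Highlight.

```python
from typing import Iterable


def longest_zero_run(samples: Iterable[float]) -> int:
    """Return the length of the longest run of consecutive zero-traffic samples."""
    longest = 0
    current = 0
    for value in samples:
        if value == 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def would_have_alerted(samples: Iterable[float], sensitivity_threshold: int) -> bool:
    """Assumption: an alert needs at least `sensitivity_threshold` consecutive bad samples."""
    return longest_zero_run(samples) >= sensitivity_threshold


# Illustrative values only: 4 consecutive samples without traffic.
traffic = [120.0, 98.5, 0, 0, 0, 0, 110.2, 101.7]
print(longest_zero_run(traffic))        # 4
print(would_have_alerted(traffic, 5))   # False: 4 is below the threshold of 5
```

Check the actual sensitivity value for the watch on the Status Alerting page before drawing conclusions from a comparison like this.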

Symptom 2. An alert received but watch is green

If an alert is received for a watch in green state, it is likely that one of the Highlight pollers no longer has access to the device. This results in the Highlight system getting mixed messages about whether the device is up or down.
If you suspect this may be the case, please contact us to investigate.

Symptom 3. Multiple alerts received - a single watch

If multiple alerts are received for the same watch:

- check whether the watch is in an unstable state, flip-flopping across an alert threshold and back
- check that only one alert is set up for the email address (Status tab, alerts)
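One rough way to confirm that a watch is moving back and forth across a threshold is to count the crossings in a series of readings. The sketch below assumes you have a list of readings for the watch and know the threshold value; both are illustrative, and the function name is hypothetical. Highlight also applies its sensitivity setting, so not every crossing necessarily produces an alert.

```python
def count_threshold_crossings(values, threshold):
    """Count how many times consecutive readings move across the threshold."""
    crossings = 0
    previous_bad = None
    for value in values:
        bad = value >= threshold  # assumption: readings at or above the threshold are "bad"
        if previous_bad is not None and bad != previous_bad:
            crossings += 1
        previous_bad = bad
    return crossings


# Illustrative readings hovering around a threshold of 80.
readings = [70, 85, 78, 83, 79, 90, 60]
print(count_threshold_crossings(readings, 80))  # 6
```

A high crossing count over a short period suggests the repeated alerts come from an unstable watch rather than from duplicate alert actions.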

Symptom 4. Multiple alerts received - a cluster of watches at a location

If an alert is received for every watch in the same location, it is likely that the location is unreachable. To suppress alerts on all but the primary bearer(s) serving the site, make those bearers site links. See Suppressing alerts on bearers point 2
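The site link rule can be modelled as follows. This is a simplified sketch of the behaviour described on this page, not Highlight's implementation; the Watch class, its fields, and the watch names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Watch:
    name: str
    is_site_link: bool
    is_down: bool


def watches_that_alert(watches: List[Watch]) -> List[str]:
    """Return names of down watches whose alerts are not suppressed by the site link rule."""
    site_links = [w for w in watches if w.is_site_link]
    # Suppression applies only if at least one site link exists and all of them are down.
    all_site_links_down = bool(site_links) and all(w.is_down for w in site_links)
    alerting = []
    for watch in watches:
        if not watch.is_down:
            continue
        if all_site_links_down and not watch.is_site_link:
            continue  # alert suppressed: the location itself is unreachable
        alerting.append(watch.name)
    return alerting


location = [
    Watch("wan-primary", is_site_link=True, is_down=True),
    Watch("lan-switch", is_site_link=False, is_down=True),
    Watch("wifi", is_site_link=False, is_down=True),
]
print(watches_that_alert(location))  # ['wan-primary']
```

With no bearer marked as a site link, the same outage would produce one alert per watch, which is the symptom described here.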

Symptom 5. Multiple alerts received for many locations

If a cluster of alerts is received for multiple locations, there are a number of possible causes to consider:

- there has been a major service outage in the provider's network
- if there is a traffic-limited link between the Highlight poller and the target network, SNMP monitoring traffic may be dropped under high traffic load

Symptom 6. SNMP trap alert issues

Problem receiving traps?

You may find that the trap receiver is not receiving any messages. If so, ensure that:

- the trap receiver is listening on UDP port 162
- you have an inbound firewall rule that permits UDP 162
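If it is unclear whether datagrams are reaching the receiving host at all, a minimal UDP listener can help separate firewall problems from trap receiver configuration problems. The sketch below uses only the Python standard library and does not decode SNMP; the function names are hypothetical. Stop the real trap receiver first, since only one process can normally bind the port, and note that binding a port below 1024 such as 162 usually requires administrator privileges.

```python
import socket


def open_udp_listener(host="0.0.0.0", port=162):
    """Bind a UDP socket on the given address and port (162 is the standard trap port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one_datagram(sock, timeout=5.0):
    """Wait up to `timeout` seconds for one datagram; return (data, sender) or None."""
    sock.settimeout(timeout)
    try:
        data, sender = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return data, sender


# Example use (run manually on the receiving host):
#   listener = open_udp_listener()
#   result = receive_one_datagram(listener, timeout=60.0)
#   print("nothing received" if result is None else f"{len(result[0])} bytes from {result[1]}")
```

If nothing arrives while a test trap is being sent, the firewall rule or network path is the more likely cause; if datagrams do arrive, look at the trap receiver's own configuration.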

Symptom 7. Webhook alert issues

During initial set up, refer to webhook output format for notes on the expected contents of each field.

If you get "Unable to connect" when testing the webhook URL, possible reasons are:

Validation issues

- https:// is missing from the start of the URL
- the format of the URL is invalid

Testing issues

- the URL is formatted correctly but the endpoint doesn't exist
- the site being posted to times out
- the URL entered does not accept POST requests
- the SSL certificate of the site is invalid
- the webhook receiver expects a different webhook format
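To narrow down which of these reasons applies, you can post to the endpoint yourself and see how it fails. The sketch below uses only the Python standard library. It is not the check Highlight performs; the function names are hypothetical, and the payload is a placeholder that should be replaced with a body matching the webhook output format.

```python
import json
import socket
import ssl
import urllib.error
import urllib.request
from urllib.parse import urlparse


def validate_webhook_url(url):
    """Return a list of validation problems; an empty list means the URL looks well formed."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("URL does not start with https://")
    if not parsed.netloc:
        problems.append("URL has no host name")
    return problems


def probe_webhook(url, payload, timeout=10.0):
    """POST a JSON payload to the URL and return a short description of the outcome."""
    problems = validate_webhook_url(url)
    if problems:
        return "validation: " + "; ".join(problems)
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"ok: HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # For example, 404 suggests a missing endpoint and 405 that POST is not accepted.
        return f"http error: {err.code} {err.reason}"
    except urllib.error.URLError as err:
        if isinstance(err.reason, ssl.SSLError):
            return f"ssl error: {err.reason}"
        if isinstance(err.reason, socket.timeout):
            return "timeout"
        return f"connection error: {err.reason}"
    except socket.timeout:
        return "timeout"


print(probe_webhook("example.com/hook", {"placeholder": True}))
# validation: URL does not start with https://; URL has no host name
```

A result starting with "validation:" corresponds to the validation issues above; the other results map roughly onto the testing issues.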