Understanding Heat Tiles
What are heat tiles?
Highlight’s unique heat tiles instantly show a clear status of both network and application services. They are more powerful than a simple traffic light or dashboard display because they show trends too. Heat tiles are time-based: a single incident does not make a network link bad, nor does a link become good the minute a long-standing problem is resolved.
Highlight uniquely measures performance using problem levels: an ongoing rating that smooths the display and indicates whether a situation is improving or deteriorating. There are two types of heat tile:
Location tiles: Every location within your network is shown as an individual summary tile, with chevrons to indicate improving or deteriorating issues across stability, load and health metrics.
Service tiles: Group watches from any location in your network into a single tile, and define custom thresholds to change the tile colour based on the total number of issues.
Warning Triangle: Number of issues
Impact Meter: Percent of total tile elements with issues (see examples below)
|Explanation||1 amber issue||at least 1 red issue, 3 total issues||at least 1 red issue, 3 total issues|
|Number of issues||1||3||3|
|Total elements||many (50)||6||3|
|Percent of total||approx. 2%||50%||100%|
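As a rough illustration, the Impact Meter figures in the examples above follow from dividing the number of issues by the total number of elements in the tile. The function name `impact_percent` and the rounding to a whole number are assumptions made for this sketch, not part of Highlight itself.

```python
def impact_percent(issues: int, total_elements: int) -> int:
    """Share of tile elements with issues, as a whole-number percentage.

    Hypothetical helper: Highlight's own rounding rule is not documented here.
    """
    if total_elements <= 0:
        raise ValueError("total_elements must be positive")
    if not 0 <= issues <= total_elements:
        raise ValueError("issues must be between 0 and total_elements")
    return round(100 * issues / total_elements)


# The three example columns from the table above: (issues, total elements)
for issues, total in [(1, 50), (3, 6), (3, 3)]:
    print(f"{issues} of {total} elements: {impact_percent(issues, total)}%")
```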
These tiles are created by adding watches to a container and setting it to display as a tile.
Service Tile Example
A service tile for email could comprise the summary status of:
- A host watch for the email server (CPU/memory/disk)
- TCP tests from the local router to the server listening ports (application status)
- Link Health watches for the Internet service
- WAN tests from remote locations to the Data Centre (to assess impact of network)
The mechanism works on the principle that if an error condition is encountered in any element, a fuel-gauge-style counter is decreased; in all other circumstances it is increased. When the counter falls below pre-set thresholds, Highlight signals an amber condition and then a red one. The conditions under which the counters are decreased are listed below, grouped by Stability, Load and Health, which we refer to as level 3 metrics.
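The counter behaviour described above can be sketched as follows. This is a minimal model only: the class name `ProblemLevel`, the step size of 1, the maximum of 10, the amber and red thresholds, and the clamping at empty and full are all assumptions chosen for illustration, not Highlight's actual values.

```python
from dataclasses import dataclass


@dataclass
class ProblemLevel:
    # Every number below is an illustrative assumption.
    level: int = 10        # current "fuel" in the gauge (starts full)
    maximum: int = 10      # gauge cannot rise above this
    amber_below: int = 7   # amber once the level drops below this
    red_below: int = 4     # red once the level drops below this

    def record_sample(self, error: bool) -> None:
        """Decrease the counter on an error, increase it otherwise."""
        if error:
            self.level = max(0, self.level - 1)
        else:
            self.level = min(self.maximum, self.level + 1)

    @property
    def colour(self) -> str:
        if self.level < self.red_below:
            return "red"
        if self.level < self.amber_below:
            return "amber"
        return "green"


# Seven bad samples followed by four good ones
stability = ProblemLevel()
history = []
for error in [True] * 7 + [False] * 4:
    stability.record_sample(error)
    history.append(stability.colour)
print(history)
```

In this sketch a single error leaves the tile green, and a tile that has gone red passes back through amber as good samples arrive, which mirrors the smoothing behaviour described in the introduction.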
Decrease the counter if any of the following conditions are met, which we refer to as level 2 metrics affecting stability:
- The device loses connection with either of the Highlight pollers
- The monitored interface is down or indicating a brief outage or has been taken out of service or no longer exists
- There has been a device restart
- A switch port designated as critical is down
- Performance tests: ICMP Ping, UDP Echo and TCP Open: 100% packet loss of all tests in a sample
- Performance tests: HTTP and HTTPS: Application failure, HTTP response 4XX or timed out
- Performance test: MOS: Application failure, MOS is less than 1.0
- Performance test: Precision Delay: 100% packet loss of all tests in a sample. Health index is also affected.
Decrease the counter if any of these conditions are met, which we refer to as level 2 metrics affecting load:
- Link utilisation (traffic in or out) exceeds threshold (default is 80%)
- CPU for a router exceeds 60%
- CPU for a host exceeds threshold (default is 75%)
- Client Count exceeds threshold (default is 30 client devices)
- Wireless Utilisation exceeds threshold (default is 50%)
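The load checks above amount to comparing each measured value against its threshold. The sketch below uses the figures quoted in the list; the dictionary keys and the function name `load_breaches` are hypothetical names invented for this example and do not correspond to any Highlight API.

```python
# Threshold values quoted in the list above; key names are made up for this sketch.
# The router CPU figure is stated as a fixed 60%, the others are defaults.
DEFAULT_LOAD_THRESHOLDS = {
    "link_utilisation_pct": 80,
    "router_cpu_pct": 60,
    "host_cpu_pct": 75,
    "client_count": 30,
    "wireless_utilisation_pct": 50,
}


def load_breaches(sample, thresholds=DEFAULT_LOAD_THRESHOLDS):
    """Return the sorted names of metrics in `sample` that exceed their threshold."""
    return sorted(
        name
        for name, value in sample.items()
        if name in thresholds and value > thresholds[name]
    )


sample = {"link_utilisation_pct": 91.5, "host_cpu_pct": 75, "client_count": 31}
print(load_breaches(sample))  # host CPU at exactly 75% does not exceed the threshold
```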
Decrease the counter if any of these conditions are met, which we refer to as level 2 metrics affecting health:
- Link errors exceed threshold (default is 1% or 10,000 packets per million)
- Link congestion occurs:
  - Queue length exceeds 0
  - Discards exceed threshold (default is 1% or 10,000 packets per million)
  - Class drops exceed 0
- Broadband Clarity:
  - Connection speed of the broadband service drops below the speed threshold, which is auto-learned or manually set
- Physical Memory utilisation exceeds threshold (default is 80%)
- Disk utilisation exceeds threshold (default is 80%)
- Congestion (discards) exceed threshold (default is 1% or 10,000 packets per million) or
- Signal Problems exceeds threshold (default is 25%)
- Performance tests - ICMP Ping, UDP Echo and TCP Open:
  - Any one of the tests in a sample shows response exceeds target
  - At least one test in a sample fails to respond (lost packet)
Note: One sample can contain up to six test results; if all tests in a sample fail it affects stability and health.
- Performance tests - HTTP and HTTPS:
  - Page load response exceeds target
- Performance tests - Precision Delay and MOS:
|Condition||Precision Delay||MOS|
|Average response from the burst exceeds response target||Yes||Yes|
|Percentage of lost packets exceeds packet loss target||Yes||Yes|
|Jitter measured over the burst exceeds jitter target||Yes||Yes|
|MOS Score is less than target||N/A||Yes|
Note that each level 2 metric above can trigger an alert. For example, you may get an alert when a heat tile goes red because of inbound link utilisation, then another alert caused by outbound link utilisation, even though the tile is already red. The tile colour represents the worst case of all level 2 metrics associated with it.
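The worst-case rule can be illustrated with a short sketch. The names `SEVERITY` and `tile_colour`, and the choice to show an empty tile as green, are assumptions made for this example.

```python
# Ordering of colours from best to worst; names are hypothetical.
SEVERITY = {"green": 0, "amber": 1, "red": 2}


def tile_colour(metric_colours: dict[str, str]) -> str:
    """Return the worst colour among a tile's level 2 metrics."""
    if not metric_colours:
        return "green"  # assumption: a tile with no metrics has no issues
    return max(metric_colours.values(), key=SEVERITY.__getitem__)


metrics = {
    "inbound_link_utilisation": "red",
    "outbound_link_utilisation": "amber",
    "link_errors": "green",
}
print(tile_colour(metrics))
```

If the outbound metric later turns red as well, the tile colour stays red, but that metric can still raise its own alert, as described above.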
Open the options panel using the chevron button.
Heat tiles can be resized using the slider control. The available sizes range from 100 to 400 pixels wide. Depending on your browser, you may be able to increase tile widths incrementally using the UP and RIGHT arrows on your keyboard, or decrease them using the DOWN and LEFT arrows.
Use the reset button to restore the tile size to the default width:
- Service tiles: 185 pixels
- Location tiles: 150 pixels
Note: Tile size changes are applied automatically; click away to close the dialog.
The Status Heat tiles page refreshes every 180 seconds by default, with Auto Refresh On.
Alternatively, switching Auto Refresh Off keeps the page in view constant until you select a different position in the Network Explorer tree.
Open the options panel using the chevron button.
The chevron button changes to bright blue to indicate settings have changed from the default.
Only selected folder
By default, all service tiles that contain any watch from the currently selected folder or below are shown. When this option is checked, only services defined in the selected folder are displayed.
Only red & amber tiles
No green tiles will be displayed, only red or amber tiles (indicating services with issues).
Group by folder
By default, all locations below the selected folder are shown individually. When this option is checked, multiple location tiles are shown as a single folder tile (see image), which simplifies the heat tile view for those with large networks.
Only red & amber tiles
No green tiles will be displayed, only red or amber tiles (indicating locations with issues); select Stability, Load, Health or any combination of these three.
Arrange by duration
By default, tiles are displayed in alphabetical order by folder. When this option is checked, all options under "Only red & amber tiles" are automatically checked and tiles are displayed by duration of issues in these three groups:
- Last 24 Hours
- Last 7 Days
- Older than 7 days
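The three duration groups can be pictured as simple time buckets. The sketch below assumes the duration is measured from when the issue started and that the boundaries are inclusive; both points, along with the function name `duration_group`, are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def duration_group(issue_started: datetime, now: datetime) -> str:
    """Place an issue in one of the three duration groups (hypothetical helper)."""
    age = now - issue_started
    if age <= timedelta(hours=24):
        return "Last 24 Hours"
    if age <= timedelta(days=7):
        return "Last 7 Days"
    return "Older than 7 days"


now = datetime(2024, 1, 15, 12, 0)
print(duration_group(datetime(2024, 1, 15, 9, 0), now))   # started 3 hours ago
print(duration_group(datetime(2024, 1, 12, 12, 0), now))  # started 3 days ago
print(duration_group(datetime(2024, 1, 1, 12, 0), now))   # started 14 days ago
```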
Use the reset button to return all options to their defaults (all unchecked). This button is unavailable if all check boxes are already in their default (unchecked) state.
Use the apply button to make your selections take effect. This button may be temporarily unavailable while your criteria are applied.
Clicking either of the above buttons closes the dialog.