Follow the steps in Set up meshIQ-Grafana integration to set up Grafana integration. After alerts, contacts, and notification policies are set up, you can turn off alerts temporarily or at scheduled times using Silences and Mute timings.
After you have collected data using a jKQL query, you can then use the data as a basis for setting up an alert. We will look at three alert examples:
- Example 1: An alert that indicates whether a fact (that is, a value) has reached a threshold that you specify.
- Example 2: An alert that indicates whether the number of messages in a queue exceeds a specified threshold.
- Example 3: An alert that indicates whether the number of logs exceeds a specified threshold.
The examples above require only the jKQL query language and the REST endpoint that version 11 makes available as an expert, which provides the data. No streaming or data replication is needed to view data generated by meshIQ core services, and Solr is not used to return results.
Although we have used very simple examples, keep in mind when you create your own queries that in addition to fact values, derived metrics are also available. You can request any of the analytics that meshIQ core services produce.
Alert Rule Example 1
In this example we set up a "counter" fact that increments every three seconds. We then set up an alert to indicate when that counter exceeds 100. The image below shows the counter fact.
To create this alert:
- First, run the jKQL query in a test panel to make sure that it collects the data that is required for your alert.
After entering the query, press Enter on your keyboard to run it.
- Copy the query from the panel.
- To create an alert, from the Grafana menu, expand the Alerting section and select Alert rules.
- Click Create alert rule.
- Enter a name for the alert rule:
- At the jKQL prompt, paste the query from steps 1 and 2:
Get ExpertFact field ExpertFactValue where ExpertName = 'MyCounter'
- Run the query by pressing Enter on your keyboard.
By default, Grafana creates two expressions: B (Reduce) and C (Threshold).
- Update Expression C (Threshold) to 100.
- Click Preview.
- If the condition in B hasn't been met, Series 1 in C has a green Normal indicator.
- If the condition in B has been met, Series 1 in C has a red Firing indicator.
Since the ExpertFactValue is greater than 100, Expression C is updated to show that the alert is Firing.
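The way the Reduce and Threshold expressions combine can be pictured with a minimal Python sketch. It illustrates the logic only and is not Grafana's implementation; the function names, the choice of "last value" as the reducer, and the sample counter values are all assumptions made for the example.

```python
from typing import Sequence


def reduce_last(series: Sequence[float]) -> float:
    """Expression B (Reduce): collapse a time series to a single number.

    Assumption: the reducer keeps the most recent sample.
    """
    if not series:
        raise ValueError("series is empty")
    return series[-1]


def is_above(value: float, threshold: float) -> bool:
    """Expression C (Threshold): true when the reduced value exceeds the threshold."""
    return value > threshold


def evaluate(series: Sequence[float], threshold: float = 100.0) -> str:
    """Return the indicator that a preview would show for one series."""
    return "Firing" if is_above(reduce_last(series), threshold) else "Normal"


# Hypothetical samples of the MyCounter fact.
print(evaluate([97, 98, 99]))    # counter has not passed 100
print(evaluate([99, 100, 101]))  # counter has passed 100
```

Because the comparison is strictly "greater than", a counter value of exactly 100 stays Normal in this sketch.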
Alert Rule Example 2
In this example we look at the number of messages in a series of queues. We then set up an alert to indicate whether the number of messages in each queue is greater than 50. The images below show the queues.
Get WgsEmsQueue
Get WgsEmsQueue fields all
Using the same steps from example 1, do the following:
- Run the jKQL query in a test panel to make sure that it gathers the data that is required for your alert.
Get WgsEmsQueue fields Name, inStatTotalMessages
After entering the query, press Enter on your keyboard to run it.
- Copy the query from the test panel.
- From the Grafana menu, expand the Alerting section and select Alert rules.
- Click Create alert rule.
- Enter a name for the alert rule:
- At the jKQL prompt, paste the query from steps 1 and 2:
Get WgsEmsQueue fields Name, inStatTotalMessages
- Run the query by pressing Enter on your keyboard.
- Update Expression C (Threshold) to indicate that you are looking for queues with more than 50 messages.
- Click Preview.
The alert rule has been applied to each of the values.
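Applying one threshold to every queue returned by the query can be sketched as follows. This is an illustration under assumptions, not how Grafana is implemented: the function name and the sample queue names and counts are hypothetical.

```python
from typing import Dict


def evaluate_queues(message_counts: Dict[str, int], threshold: int = 50) -> Dict[str, str]:
    """Apply the same threshold to every queue and report one state per queue."""
    return {
        name: "Firing" if count > threshold else "Normal"
        for name, count in message_counts.items()
    }


# Hypothetical query result: queue Name -> inStatTotalMessages.
sample = {"orders.in": 12, "orders.out": 75, "audit": 50}
for name, state in evaluate_queues(sample).items():
    print(f"{name}: {state}")
```

Each queue is evaluated independently, so one queue can be Firing while the others remain Normal.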
Alert Rule Example 3
In this example we set up an alert that evaluates the number of logs.
To create this alert rule:
- Select the Grafana menu, expand the Alerting section, and select Alert rules.
- Click Create alert rule.
- Enter a name for the alert rule.
- Choose the time range to which your query applies. In this example, we are evaluating the previous ten minutes.
- At the jKQL prompt, enter the query:
get number of logs
- Run the query by pressing Enter on your keyboard.
By default, Grafana creates two expressions: B (Reduce) and C (Threshold). Expression B is a way to tell Grafana how to handle NaN (not a number) results. Since it is not applicable to the query, we can delete it using the delete icon in the upper-right corner, as shown below.
- To indicate that you are looking for cases when there are more than 500 logs, update Expression C (Threshold) to 500.
- Click Preview to find out what would happen if you created the rule now.
- If the condition in B hasn't been met, Series 1 in C has a green Normal indicator.
- If the condition in B has been met, Series 1 in C has a red Firing indicator.
- Alerts can be set up in evaluation groups. An evaluation group is a group of rules that use the same time interval. In this example, the results of the query will be evaluated every 30 seconds. To configure this interval, enter 30s in the Evaluation Interval.
- In the "for" field, enter the length of time that a condition needs to be met before a notification is sent. In this example, "5m" means that if the alert is still in the firing state after 5 minutes, the alert will send a notification.
- To inform alert recipients what the alert email is about, enter a Summary of what the alert indicates.
Note: When the Set dashboard and panel option is used, notification recipients can click a link in the notification email to view the dashboard and panel.
- Labels tie alert rules to notifications, contact points, and more. You can add new labels here. An example of a label "team" with a value "Devs" is shown below.
- Click Save rule and exit to return to the list of rules. An alert's state is Normal by default. When an evaluation first finds the condition met, the state changes to Pending. If the condition is still met once the "for" period has elapsed, the state proceeds to Firing.
When the condition is no longer met, the alert state returns to Normal.
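The interplay between the evaluation interval, the "for" field, and the Normal, Pending, and Firing states can be sketched in a few lines of Python. This is a simplified model for illustration, not Grafana's actual scheduler; the function name and parameters are hypothetical, and the defaults mirror the 30s interval and 5m "for" value used in this example.

```python
from typing import Iterable, List, Optional


def alert_states(
    breaches: Iterable[bool],
    interval_s: int = 30,
    for_s: int = 300,
) -> List[str]:
    """Return the alert state after each evaluation.

    breaches: one boolean per evaluation, True when the threshold condition is met.
    interval_s: seconds between evaluations (the evaluation interval).
    for_s: seconds the condition must keep holding before the alert fires.
    """
    states: List[str] = []
    held_s: Optional[int] = None  # how long the condition has held; None when not met
    for breached in breaches:
        if not breached:
            held_s = None
            states.append("Normal")
            continue
        held_s = 0 if held_s is None else held_s + interval_s
        states.append("Firing" if held_s >= for_s else "Pending")
    return states


# Shortened "for" period so the transitions fit in four evaluations.
print(alert_states([False, True, True, False], interval_s=30, for_s=30))
# prints ['Normal', 'Pending', 'Firing', 'Normal']
```

With the defaults (30s interval, 5m "for"), this model keeps the alert Pending for ten consecutive evaluations and reports Firing on the eleventh, provided the condition is met every time.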