This post is part of the OCI Logging – Complete Hands-on Series. Make sure to check out the other posts as well.
Intro
Let’s say we have our app running on OCI and have already configured OCI Logging to collect all the necessary logs from our VMs. Now is a good time to look at how to set up alerts on specific log queries, such as “*ERROR*” or any other pattern, so we get notified when matching messages occur.
Configuration Overview
Let’s look at a high-level overview of the configuration:
- We have an application running on a VM (can be in OKE, Functions, On-prem, etc.) that produces logs
- Create a Custom Log to collect those logs from the VM (in this case, the “/var/log/messages” log file)
- Create a Service Connector to collect and send log messages to the Monitoring service
- Collect and filter logs based on a pattern like “*ERROR*”
- Send logs to a new custom metric
- Use the new custom metric to create Alarms based on your specific thresholds to get notified
Configuration Details
Log Configuration
I have configured a Custom Log to get the “/var/log/messages” logs from a Compute VM.
I won’t get into details in this article on how to configure it, so if you need help check out How to configure custom logs in OCI for any type of workload.
If I connect to the VM and I write something to the log, it should appear shortly in the OCI Console as well:
[opc@playground ~]$ logger Hello from cmd!
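Before switching to the Console, it can help to confirm that the message actually landed in the local file, so that a missing entry in OCI points to the agent or log configuration rather than to syslog. The following is a minimal sketch, assuming the default syslog setup where logger writes to /var/log/messages; last_match is a hypothetical helper name, not part of any OCI tooling:

```shell
# Hypothetical helper: print the most recent line in FILE that contains TEXT.
# grep -F treats TEXT as a literal string, so '!' or '*' are not patterns.
last_match() {
  grep -F -- "$1" "$2" | tail -n 1
}

# Usage on the VM (reading /var/log/messages usually requires root):
#   last_match 'Hello from cmd!' /var/log/messages
# Equivalent one-liner when not running as root:
#   sudo grep -F 'Hello from cmd!' /var/log/messages | tail -n 1
```

If the line shows up locally but not in the Console after a few minutes, the Custom Log or agent configuration is the more likely place to look.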
Connector Configuration
Let’s create a Service Connector to move the data from Logging to Monitoring.
From the main menu -> Observability and Management -> Service Connectors under the Logging Category, click on Create Service Connector:
Give the Service Connector a name and a description, then select Logging as Source and Monitoring as Target.
First, select the Log Group and the Log you want to collect from. You can add multiple sources. You can also select only a Log Group if you want to collect from every log configured within it.
Second, filter your logs using the available properties. In the example above I’m using the data.message property (which is the log message) and a value of *ERROR*. On the right you can see the Query syntax, and that 173 log records matched my filter in the last 6 hours.
You can also Switch to advanced mode and write the query yourself. Check out the query documentation.
Create a new metric namespace and a new metric for your target.
You can add dimensions to narrow down the data that gets sent even further, for example to only get messages from a particular VM or a particular compartment.
Don’t forget to click on Create Policy when asked and Create the connector.
Test Connector
Let’s now test and see if our Service Connector Configuration works fine.
Connect to your instance and write a message to the log file with the pattern *ERROR*:
[opc@playground ~]$ logger This is an Error message
Now let’s find this message in the Log:
Don’t forget to wait a few minutes before the log is available in OCI.
Let’s now check the Service Connector Metrics:
I’ve sent a few Error messages to the log, and we can clearly see that 3 messages were written to the target, in our case Monitoring.
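Sending a few messages like this can be scripted, which makes it easier to compare the number you sent with the number the connector reports as written to the target. Here is a rough sketch: send_test_errors and the LOGGER_CMD override are hypothetical names introduced for illustration, and the numbered suffix is only there so individual messages can be told apart in the log.

```shell
# Hypothetical helper: send COUNT numbered test messages containing KEYWORD.
# LOGGER_CMD is an assumption added here so the loop can be dry-run with echo
# instead of writing to syslog; by default it calls the real logger command.
send_test_errors() (
  count="${1:-3}"
  keyword="${2:-Error}"
  i=1
  while [ "$i" -le "$count" ]; do
    "${LOGGER_CMD:-logger}" "This is an $keyword message #$i"
    i=$((i + 1))
  done
)

# Usage on the VM:
#   send_test_errors 3                     # writes 3 messages via logger
#   LOGGER_CMD=echo send_test_errors 3     # dry run, prints the messages
```

The function body runs in a subshell, so its loop variables do not leak into your interactive session.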
Alert Configuration
Now that our new custom metric is getting data, we can create an alert on it from the main menu -> Observability and Management -> Alarm Definitions under Monitoring:
Give your alarm a proper name and description, and select the Severity.
Select the custom Metric Namespace and Metric Name.
I’ve set my Trigger Rule to greater than or equal to 1.
This means that if there are one or more error messages, my alarm will be triggered.
Select your Notifications Topic and Save.
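For reference, the same alarm can be expressed as a Monitoring Query Language (MQL) expression and created from the OCI CLI. Treat the following as a sketch under stated assumptions rather than the exact configuration used in this post: the sum() statistic and the 1m interval are my assumptions (the post only states the threshold), and the alarm name, namespace, metric name and the COMPARTMENT_OCID and TOPIC_OCID variables are placeholders for your own values.

```shell
# Build an MQL expression of the form:
#   <metric>[<interval>].sum() >= <threshold>
alarm_query() {
  printf '%s[%s].sum() >= %s' "$1" "$2" "$3"
}

# Hypothetical wrapper around the OCI CLI; it is only defined here, not run.
# "log-errors-alarm", "custom_log_metrics" and "ErrorCount" are placeholder
# names; COMPARTMENT_OCID and TOPIC_OCID must be set to your own OCIDs.
create_error_alarm() {
  oci monitoring alarm create \
    --compartment-id "$COMPARTMENT_OCID" \
    --metric-compartment-id "$COMPARTMENT_OCID" \
    --display-name "log-errors-alarm" \
    --namespace "custom_log_metrics" \
    --query-text "$(alarm_query ErrorCount 1m 1)" \
    --severity "CRITICAL" \
    --destinations "[\"$TOPIC_OCID\"]" \
    --is-enabled true
}

# Preview the query without touching OCI:
#   alarm_query ErrorCount 1m 1
```

Keeping the query in its own function makes it easy to inspect or adjust the threshold before anything is created in your tenancy.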
Test Alarm
Let’s create some more error messages in the log:
[opc@playground ~]$ logger This is an Error message
Don’t forget, there will be a delay of a few minutes from log message creation to alarm trigger.
We can see that the alarm is Firing now and I’ve received an email with the notification.
Of course, this depends on how you set up your Notification Topic.