Alerting in Splunk with a Purpose
Aaron Dobrzeniecki, Senior Splunk Consultant
If you are just getting started with Splunk, one of the first and most powerful features you will come across is alerting. Alerts are extremely useful for staying on top of important events in your data. First, what is a Splunk alert? A Splunk alert is a way to automatically notify you when certain conditions are met in your data. Quick examples:
- Too many login failures? ALERT!
- Server is down? ALERT!
- Unusual spikes in network traffic? ALERT!
You set up a Splunk search using the Search Processing Language (SPL), add conditions (like thresholds or patterns), and Splunk will watch your data and let you know when those conditions are met. The most common way to receive the results of an alert is by email: Splunk sends the results, a link to the alert, and a link to the results to the necessary email addresses.
If you receive an email, or use one of the other ways to receive the results when an alert kicks off, there is NO REASON to set the alert's Expire setting higher than the default of 24 hours. When you create an alert in Splunk, there is a setting called “Expire After” or “Expire”. This tells Splunk how long the alert results will stay in the system after the alert triggers. Why does this setting matter so much? Mainly, it keeps your system clean and free of “junk” buildup from old or stale alert results just sitting on the Splunk Search Head(s).
Let’s say you set the expire time to 24 days. This means:
- After the alert fires, its results will stay visible in the “Jobs” list (Activity > Jobs in the Splunk bar) for 24 days.
- After the 24 days, the alert results will disappear.
- Every time that alert triggers, Splunk keeps the results for EVERY RUN in the system, and the MBs those results consume add up.
- Let’s say your search quota is 800MB. Depending on the size of the results, you will only be able to run and keep X number of searches before your searches get queued up.
As you can see from the image above, these alerts are set to expire in mid and late August. Every single time these alerts run, the results are set to expire ~3 months out, with each result being ~3.14MB. Do the math: 800 / 3.14 = ~254 results that can exist before the user’s searches start to get queued up. The most common interval for alerts is anywhere from 5 minutes to 1+ hours. Let’s say you have an alert running every 10 minutes; that is 6 times an hour, or 144 times per day. Within 2 days of running this alert and keeping the results every time, the user will run into a quota issue and will not be able to run Splunk searches until their search jobs are cleaned out. Imagine having an alert run every 5 minutes, or even every minute!
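To make that math concrete, here is a minimal Python sketch of the same calculation. It is only an estimate under a few assumptions: every run triggers and keeps a result of the same size, nothing expires before the quota fills, and no other jobs count against the quota. The function names are made up for illustration and are not part of Splunk.

```python
import math


def runs_until_quota(quota_mb: float, result_mb: float) -> int:
    """Number of retained result sets that fit inside the quota."""
    return math.floor(quota_mb / result_mb)


def hours_until_quota(quota_mb: float, result_mb: float, interval_minutes: float) -> float:
    """Hours of scheduled runs before the retained results fill the quota."""
    return runs_until_quota(quota_mb, result_mb) * interval_minutes / 60


if __name__ == "__main__":
    quota_mb = 800      # example quota from the article
    result_mb = 3.14    # example size of one alert result
    for interval in (1, 5, 10, 60):  # alert interval in minutes
        runs = runs_until_quota(quota_mb, result_mb)
        hours = hours_until_quota(quota_mb, result_mb, interval)
        print(f"every {interval} min: {runs} runs fit, quota reached in ~{hours:.1f} hours")
```

With the numbers from this example, a 10-minute alert fills the quota in roughly 42 hours, which lines up with the "within 2 days" estimate. A 1-minute alert gets there in a little over 4 hours.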
Key points to remember about Splunk alerts and the expire setting:
- Alerts notify you when something important/critical happens in your environment/data
- The “Expire” setting controls how long a triggered alert’s results stay in the Jobs list (on the Search Heads)
- The “Expire” setting, when set low, helps keep the system clean and performing efficiently
- If you are receiving an email with the results, you should set the expire setting to something like 30 minutes (OR EVEN LOWER). You already have the results in the email, so there is no reason to have a duplicate just sitting there taking up valuable space
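To see why a low expire value matters, the sketch below estimates how much space one alert's results occupy at any given time for different expire values. It is a rough illustration, not a Splunk API: the helper names are hypothetical, it assumes the alert triggers on every run with a fixed result size, and it assumes expire values are written as a number plus a unit suffix (s, m, h, d) such as "30m" or "24h".

```python
# Seconds per unit suffix (assumed format: <integer><unit>, e.g. "30m")
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def expire_to_seconds(expire: str) -> int:
    """Convert a value such as '30m' or '24h' to seconds."""
    return int(expire[:-1]) * UNIT_SECONDS[expire[-1]]


def retained_mb(expire: str, interval_minutes: int, result_mb: float) -> float:
    """Approximate steady-state size of the results kept for one alert."""
    # Only the runs that fired within the expire window are still on disk.
    retained_runs = expire_to_seconds(expire) // (interval_minutes * 60)
    return retained_runs * result_mb


if __name__ == "__main__":
    # Same example numbers as above: 10-minute alert, ~3.14MB per result
    for expire in ("30m", "24h", "24d"):
        print(f"expire={expire}: ~{retained_mb(expire, 10, 3.14):.1f} MB kept")
```

Under these assumptions, a 30-minute expire keeps only about 3 result sets (~9.4MB) around, while the 24-hour default keeps about 144 (~452MB). The 24-day figure ignores the quota entirely; in practice the user would hit the 800MB limit long before reaching it.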
In conclusion, Splunk alerting is a great way to automate monitoring of important/critical events, and understanding the “Expire” setting helps you keep the environment tidy and efficient.
About the Author
Aaron Dobrzeniecki has over 9 years of experience in Information Technology. He first worked for an MSP, broadening his skills with Tier 1 and Tier 2 Help Desk support for just over 40 small companies. Aaron has been working with Splunk for 7 years, starting on the apps and add-ons team troubleshooting issues with Splunk apps and add-ons. He then worked on the Admin on Demand team creating content and assisting with Splunk core issues or tasks, as well as providing Splunk Best Practices. He has provided excellent service to the Splunk customers he has worked with and continues to get praise from his customers. Aaron was also a Customer Success Manager for 14 companies, working with their Splunk teams to better their environments. As a Certified Splunk Consultant, Aaron continues to increase his knowledge and follow the Splunk Best Practices while maintaining excellence in communication. Aaron graduated from Indiana University in 2015 with a bachelor’s degree in Security Informatics.