Alarm Language - Generic Thresholds Made Easy

  • Last updated on: 2016-01-22
  • Authored by: Daniel Dispaltro

You don't need to edit configuration files on a Nagios server. Rackspace Monitoring lets you:

  • Set thresholds with an easy-to-use alarm language:

    if (metric['code'] != "200") {
      return CRITICAL, "Bad HTTP Status: #{code}"
    }
    
  • Create expressive alarms that validate multiple criteria while keeping an easy-to-use, JavaScript-like interface:

    if (metric['duration'] > 2000) {
      return CRITICAL, "HTTP request took more than 2 seconds, it took #{duration} milliseconds."
    }
    if (metric['duration'] > 1000) {
      return WARNING, "HTTP request took more than 1 second, it took #{duration} milliseconds."
    }
    # Check for an empty body match
    if (metric['body_match'] == "") {
      return CRITICAL, "Body match missing"
    }
    return OK, "HTTP connection time is normal"
    
  • See the solution patterns in our best practices documentation, and then create your own complex alarms. Our API makes it simple to stay up to date with these examples.
  • Put developers in control by letting them build thresholds the same way they write application code.
  • Test thresholds before you configure them. Take data from our Test Check API and feed it into our Test Alarm API to simulate an alerting scenario.
  • Use alert policies to evaluate alarm criteria across multiple data centers:

The graph above shows a check running in three monitoring zones. The yellow and red areas show when an alarm is in the WARNING and CRITICAL states, respectively.
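
As a rough sketch of how such a policy might be expressed in an alarm, the example below assumes that the alarm language's `:set consistencyLevel` directive and its QUORUM value (change state only when a majority of monitoring zones agree) behave as described in the Rackspace Monitoring alarm language documentation. Confirm both there before relying on them. The threshold and messages are illustrative.

    :set consistencyLevel=QUORUM

    # Assumption: with the directive above, a single slow zone does not change the alarm state
    if (metric['duration'] > 2000) {
      return CRITICAL, "HTTP request took #{duration} milliseconds"
    }
    return OK, "HTTP connection time is normal"

With a check running in three zones, a majority policy of this kind would keep the alarm in OK when only one zone reports a slow response.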


Key Takeaways

  • Don’t run a DIY Nagios server.
  • Our flexible alerting language puts you in control, so you don’t need an awkward JSON API for defining thresholds.
  • Monitor over both IPv4 and IPv6 with dual-stack support.
  • Send an alert to different notification addresses depending on severity.
  • Monitor your website from up to 5 different locations. Set the policy to apply when those locations report mixed results.
  • Reduce false alerts caused by network hiccups.
  • Start monitoring faster and spend less sysadmin time making sure that your server stays up.
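
One way to reduce false alerts from brief network hiccups is sketched below. It assumes that the alarm language's `:set consecutiveCount` directive is available as described in the Rackspace Monitoring alarm language documentation; the value 3 is an arbitrary illustration, not a recommendation.

    :set consecutiveCount=3

    # Assumption: with the directive above, the alarm changes state only after
    # three consecutive evaluations return the same result
    if (metric['code'] != "200") {
      return CRITICAL, "Bad HTTP Status: #{code}"
    }
    return OK, "HTTP status is normal"

Under that assumption, a single failed request would not raise a CRITICAL alert on its own.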

Continue the conversation in the Rackspace Community.