Sensu Checks to report Metrics
Once you understand the basics of creating a Sensu check, creating a check that reports metrics is actually pretty simple. Here’s the lowdown.
The differences vs a standard check
Sensu needs to be told that this is a metrics check. In the check definition, that means setting the type attribute to metric.
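As a rough illustration, here is a small Ruby sketch that builds such a check definition and prints it as JSON. Only the type attribute comes from the point above; the check name, command, subscription, interval and handler name are placeholder assumptions you would replace with your own.

```ruby
require 'json'

# Illustrative values only: the check name, command, subscription,
# interval and handler name are assumptions, not taken from the article.
check_definition = {
  checks: {
    disk_usage_metrics: {
      type: 'metric',                  # marks this check as a metrics check
      command: 'disk-usage-metrics.rb',
      subscribers: ['production'],
      interval: 60,
      handlers: ['graphite']           # hypothetical handler name
    }
  }
}

# Print the definition as JSON, ready to drop into a Sensu config file.
puts JSON.pretty_generate(check_definition)
```

Generating the JSON from Ruby is just a convenience for the example; writing the JSON file by hand works equally well.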
The exit status (ok, warning, critical) can still be monitored by Sensu, so no change there. My understanding, though, is that most metrics checks simply take care of the metrics aspect and aren’t built for alerting with critical and warning statuses.
The textual output is expected to be in Graphite format. Most of the time it’s spread over multiple lines, to output multiple data points.
The Graphite format is pretty simple: a path of words separated by dots, a numeric value and an optional timestamp (defaults to the time of reception if not supplied).
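To make the format concrete, here is a minimal Ruby sketch that prints two data points. The graphite_line helper and the metric names are made up for illustration. Note that this helper fills in the current time on the sending side when no timestamp is given, which is an assumption of the sketch rather than part of the format.

```ruby
# Hypothetical helper: format one data point as "path value timestamp".
def graphite_line(path, value, timestamp = Time.now.to_i)
  "#{path} #{value} #{timestamp}"
end

now = Time.now.to_i

# One line per data point; the metric names below are invented examples.
puts graphite_line('www1.disk_usage.root.used', 5_123_456, now)
puts graphite_line('www1.disk_usage.root.avail', 9_876_544, now)
```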
Ruby is nicer
Once again, you can use the sensu-plugin gem to get a few things handled automatically for you. Here are the basics for building a metrics check.
- You inherit from Sensu::Plugin::Metric::CLI::Graphite.
- You still implement run().
- You still describe configurations with option and access them with config[].
- You output each stat with output(name, value, timestamp).
- You need to at least end with ok(); you can also use the other exit helpers if you want.
disk-usage-metrics.rb is an example of a check built this way.
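What follows is not the original disk-usage-metrics.rb, only a minimal sketch of the points listed above, under a few assumptions: the class name DiskUsageSketch and the disk_metrics helper are invented, values come from df -Pk (so they are in KiB), and filesystem names are assumed to contain no spaces. The require is wrapped in a rescue only so the parsing helper can be tried on a machine without the sensu-plugin gem; a real plugin would simply require it.

```ruby
#!/usr/bin/env ruby
# Sketch of a Graphite metrics check; NOT the original disk-usage-metrics.rb.
require 'socket'

begin
  require 'sensu-plugin/metric/cli'
rescue LoadError
  # sensu-plugin gem not installed: only the parsing helper below is usable.
end

# Hypothetical helper: turn `df -Pk` output into [metric_name, value] pairs.
def disk_metrics(df_output, scheme)
  df_output.lines.drop(1).flat_map do |line|
    # Columns: filesystem, 1024-blocks, used, available, capacity, mount point
    _fs, _blocks, used, avail, _capacity, mount = line.strip.split(/\s+/, 6)
    next [] unless mount
    # "/" becomes "root", "/var/log" becomes "var_log"
    name = mount == '/' ? 'root' : mount.sub(%r{\A/}, '').gsub(/\W+/, '_')
    [["#{scheme}.#{name}.used", used.to_i],
     ["#{scheme}.#{name}.avail", avail.to_i]]
  end
end

if defined?(Sensu::Plugin::Metric::CLI::Graphite)
  class DiskUsageSketch < Sensu::Plugin::Metric::CLI::Graphite
    # Declared with option, read back through config[:scheme].
    option :scheme,
           description: 'Metric naming scheme, prepended to .<mount>.<stat>',
           long: '--scheme SCHEME',
           default: "#{Socket.gethostname}.disk_usage"

    def run
      timestamp = Time.now.to_i
      disk_metrics(`df -Pk`, config[:scheme]).each do |name, value|
        output name, value, timestamp
      end
      ok
    end
  end
end
```

Keeping the parsing in a plain method makes it easy to feed it canned df output, while the class itself stays a thin wrapper around option, output and ok.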
Conventions
A few conventions have evolved in the sensu-community-plugins. Most standard checks are named check-xxx and most metrics checks are named xxx-metrics.
A more interesting convention has also evolved around the metrics checks. If you look at the disk-usage-metrics.rb code, you’ll notice the --scheme option defaults to hostname + plugin name.
Take a metric name as it appears in the output, for example www1.disk_usage.root.used on a host named www1. The first two components (hostname and plugin name) can be overridden with --scheme; the rest is decided by the plugin.
There are a few scenarios where you might want to override the scheme.
- If your systems have properly set FQDNs, Socket.gethostname will return that, which may give you metrics named like www1.app-name.phoenix-1.example.com.disk_usage.root.used. If that’s too unwieldy for you, you could for example invoke the check with --scheme $(hostname --short).disk_usage.
- You may want to nest your Sensu-generated metrics deeper into an existing Graphite ontology: --scheme system.$(hostname).disk_usage.
- If you’re using a cloud Graphite provider like HostedGraphite, you may need to prepend your account’s API key to the metric name you’re sending: --scheme deadbeef4242.$(hostname).disk_usage or, even better, --scheme :::hostedgraphite.apikey:::.$(hostname).disk_usage, which pulls the key from the client’s attributes through Sensu’s token substitution.
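The scenarios above all boil down to changing the prefix of the metric name. Here is a small Ruby sketch of how those prefixes are composed; metric_scheme is a hypothetical helper written for this example, not something provided by sensu-plugin.

```ruby
# Hypothetical helper, not part of sensu-plugin: build a --scheme value
# from a hostname, a plugin name and an optional leading prefix.
def metric_scheme(hostname, plugin, short: false, prefix: nil)
  # Keep only the first label of the FQDN when a short name is wanted.
  host = short ? hostname.split('.').first : hostname
  [prefix, host, plugin].compact.join('.')
end

fqdn = 'www1.app-name.phoenix-1.example.com'

puts metric_scheme(fqdn, 'disk_usage')                    # FQDN kept as-is
puts metric_scheme(fqdn, 'disk_usage', short: true)       # like $(hostname --short)
puts metric_scheme(fqdn, 'disk_usage', prefix: 'system')  # nested under system
```

In a real check you would pass the resulting string on the command line with --scheme rather than compute it inside the plugin.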
Most metrics checks have this option. If you want to share your new metrics check, I would strongly recommend adding this same option as well.
Conclusion
So that’s it! I hope my lengthy prose still let me demonstrate clearly how easy it is to create standard and metrics Sensu checks. Let me know if you have questions or if you think I’ve overlooked anything!