Monitoring is big. Having an automated daemon watch your stuff and make sure it’s running properly can let you sleep at night, knowing that if something blows up, there’s an ever-watchful guardian ready to wake you up so you can fix it.
There are a number of monitoring solutions that are popular these days, such as monit, god, and Nagios. They’re all fantastic, but sometimes you just want something simple and to-the-point, right?
With Jabberish in your project, this becomes a no-brainer. I'm already using Jabberish in my project, so I whipped up a little script that checks system load, available memory, and any changes in swap usage, and shoots me an IM under certain conditions. My monitoring still handles automated maintenance in the case of a runaway process or whatnot, but this keeps me instantly informed of any problems that my system might be running into.
require 'rubygems'
require 'drb'
require 'daemons'

MAX_MEMORY = 95   # alert above this percentage of physical memory in use
MAX_LOAD = 4.0    # alert above this 1-minute load average
DELIVER_TO = "cheald@gmail.com"
JABBERISH_SERVER = "druby://localhost:35505"

$warned = {}      # remembers which alerts are active so they aren't repeated
$hostname = `hostname`.strip

# Lazily connect to the Jabberish DRb service.
def im
  $im_service ||= DRbObject.new(nil, JABBERISH_SERVER)
end

def deliver(msg)
  im.deliver DELIVER_TO, "[#{$hostname}] #{msg}"
end

def check_stats
  # /proc/meminfo reports values in kB
  meminfo = File.read("/proc/meminfo")
  mtotal = meminfo.match(/MemTotal:\s+(\d+)/)[1].to_i
  mfree = meminfo.match(/MemFree:\s+(\d+)/)[1].to_i
  mused = mtotal - mfree
  stotal = meminfo.match(/SwapTotal:\s+(\d+)/)[1].to_i
  sfree = meminfo.match(/SwapFree:\s+(\d+)/)[1].to_i
  sused = stotal - sfree

  begin
    # Warn whenever swap usage has grown since the previous check
    if $warned[:swap] and sused > $warned[:swap] then
      deliver "WARNING: Swap has increased from #{$warned[:swap]} to #{sused}"
    end
    $warned[:swap] = sused

    pct = mused / mtotal.to_f * 100.0
    if pct > MAX_MEMORY then
      unless $warned[:memory]
        deliver sprintf("ALERT: Memory free: %2.2fmb (%2.2f%% used)", mfree / 1024.0, pct)
        $warned[:memory] = true
      end
    else
      $warned[:memory] = false
    end

    # The first field of /proc/loadavg is the 1-minute load average.
    # Convert it to a Float; comparing the raw String with MAX_LOAD raises an error.
    load_avg = File.read("/proc/loadavg").split(" ").first.to_f
    if load_avg > MAX_LOAD then
      unless $warned[:load]
        deliver sprintf("WARNING: Load average is %.2f", load_avg)
        $warned[:load] = true
      end
    else
      $warned[:load] = false
    end
  rescue => e
    puts "Error: #{e}"
  end
end

Daemons.daemonize(:backtrace => true)

loop {
  check_stats
  sleep(10)
}
Not too bad, huh? This is written for a CentOS installation, so you may need to change things like the meminfo regexes depending on your system. It could probably use a YAML config file to be truly correct. Keeping configuration options in constants works, but it's a little ugly.
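For the curious, here is a rough sketch of what the YAML approach could look like. Nothing here comes from the script itself beyond the default values: the `load_config` helper, the `monitor.yml` filename, and the key names are all made up for illustration, so treat it as one possible shape rather than the way to do it.

```ruby
require 'yaml'

# Defaults mirror the constants in the script above.
# The key names are hypothetical; pick whatever suits your project.
DEFAULTS = {
  "max_memory"       => 95,
  "max_load"         => 4.0,
  "deliver_to"       => "cheald@gmail.com",
  "jabberish_server" => "druby://localhost:35505",
  "interval"         => 10
}

# Hypothetical helper: merge settings from a YAML file over the defaults.
# A missing, empty, or non-mapping file simply yields the defaults.
def load_config(path)
  return DEFAULTS.dup unless File.exist?(path)
  overrides = YAML.load_file(path)
  overrides = {} unless overrides.is_a?(Hash)
  DEFAULTS.merge(overrides)
end

# "monitor.yml" is an assumed filename, looked up in the working directory.
CONFIG = load_config("monitor.yml")
```

With something like this in place, the script would read `CONFIG["max_load"]` instead of `MAX_LOAD`, and a `monitor.yml` containing only `max_load: 8.0` would override that one value and leave the rest at their defaults.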
Now I get alerts like these via instant message:
[iceman.tagteam] WARNING: Load average is 4.44
[iceman.tagteam] ALERT: Memory free: 99.82mb (93.38% used)
[polaris.tagteam] ALERT: Memory free: 72.20mb (95.14% used)
This lets me respond to changing system conditions extremely rapidly, and it serves as a high-level alert log when I'm not at the keyboard. When I get back, I check my messages from blippr and can see when and how often certain marginal conditions are being met.
Hope it’s useful!
4 Comments
Hey,
I think Jabberish is a great idea! And I would really love to see more mixed XMPP and HTTP apps… but, honestly, XMPP4R (and its threaded architecture) really can't handle the load of thousands of messages per minute! I am working on a project that hopes to make XMPP better for Ruby: http://github.com/julien51/babylon/tree/master
Do you think you could "port" Jabberish to use it?
Very interesting work there. I can't say I've needed super high-performance XMPP in Rails yet, but having the option is always very good. Jabberish could very easily be wrapped around Babylon – it's really just a mechanism to abstract the actual XMPP setup away from the user, so they just write messages to it and don't worry about concurrency and the like.
I'll see about Babylon support!
I might try to do it myself ;)
Anyway, I see that you use DRb for communication between your mongrels and the XMPP process. How "scalable" is that?
I mean, can you give us some hints on performance?
I haven't run into an upper limit yet, even when I managed to introduce an errant logging mechanism into my code that caused me to send about 80,000 messages in 60 seconds, heh.
It would be fairly easy to cut out the DRb piece if it became a limiting factor. Swapping the DRb daemon for an EventMachine daemon would be straightforward, though not quite as brainlessly easy as using DRb.