Coffee Powered

code and content

Quick tip: Strip URLs before parsing!

Rather than roll my own URL regexes, I prefer to let the existing libraries do the heavy lifting. Ruby has a uri library which is fantastic for parsing (and validating) URLs.

For example, something like this might be used in a model validation:

require 'uri'

def validate_url(url)
	parsed_uri = URI::parse(url)
rescue URI::InvalidURIError
	errors.add :url, "Sorry, that doesn't look like a valid URL"
end

I noticed a bit ago that I started getting invalid URL errors where there shouldn’t be any. After far too long spent in the library’s code, I realized my error: the URLs were being pasted with a trailing space. Stripping the string before attempting to parse it fixed it right up.

I’d argue that URI::parse should likely strip any incoming strings, but in the meantime, remember to strip your user input before trying to determine whether it’s valid or not, or you may end up with frustrated users.
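
A minimal sketch of that fix, assuming a standalone helper rather than the model validation above. valid_url? is a hypothetical name, and the to_s call is only there so a nil value doesn't blow up before it reaches the parser:

```ruby
require 'uri'

# Hypothetical helper: returns true when the trimmed input parses as a URI.
def valid_url?(raw)
  # Strip first, so whitespace pasted around the URL can't trip the parser.
  URI.parse(raw.to_s.strip)
  true
rescue URI::InvalidURIError
  false
end

puts valid_url?("http://example.com/page \n")
puts valid_url?("http://exa mple.com/")
```

Keep in mind that URI.parse accepts plenty of strings that aren't useful web addresses (a bare word parses as a relative reference), so you may still want to check the scheme or host afterwards.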

Announcing Scrap

I do a lot of memory and garbage analysis on my Rails apps, and in upgrading to Rails 2.3, I discovered a practical use for the new Rails Metal middleware. Dumping memory stats to my log was just sorta unreadable in a practical scenario, and was more or less entirely unusable in production. Fortunately, Metal provides a really easy way to output readable information to the browser without invoking the full Rails stack. (It’s also an excuse to write a Metal endpoint because it’s new and shiny, but that’s beside the point.)

It’s up at GitHub – installation is dead easy (assuming you’re on Rails 2.3+, of course) – just install the plugin, restart your app, and hit <your url>/stats/scrap in your browser. Bam, instant juicy memory goodness about your app at your fingertips. If you’d like an example of the output, good news! Check it out at http://tachyonsix.com/scrap.htm.

You can use it to troubleshoot heap leaks – just run a few requests, hit your Scrap URL, and see what your deltas look like. Seeing a huge growth in a certain type of object? Chances are pretty good that you have a heap leak, and can start tracking it down.
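
If you want a feel for what a delta is, here's a rough sketch using nothing but ObjectSpace from core Ruby. This is not Scrap's code; instance_counts and growth are made-up names for illustration:

```ruby
# Count live objects per class by walking the heap.
def instance_counts
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Classes whose live instance count grew between two snapshots.
def growth(before, after)
  deltas = {}
  after.each do |klass, count|
    diff = count - before[klass]
    deltas[klass] = diff if diff > 0
  end
  deltas
end

before = instance_counts
# ... exercise the code you suspect here ...
after = instance_counts

# Print the ten classes that grew the most between the snapshots.
growth(before, after).sort_by { |_, diff| -diff }.first(10).each do |klass, diff|
  puts "#{klass}: +#{diff}"
end
```

A class that keeps climbing across repeated snapshots is the sort of thing worth chasing.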

The request history can help you locate certain actions that might be causing spikes in memory usage. It’ll show the last N requests, along with memory and heap statistics before each request. If there’s a consistent memory usage leap after a certain action, chances are that it’s doing something naughty.

Want to get a bigger picture on what objects are hanging around? You can use the config/scrap.yml file to get Scrap to spit out more detailed reports on instances of a given class. There’s full documentation on it in the README.

Anyhow, give it a shot, let me know what you think.

Things to do when upgrading to Rails 2.3

I’m upgrading blippr to Rails 2.3. Here are some of the things that had to be changed to upgrade:

Switch the application entirely to LibXML for all its XML parsing needs

In config/environment.rb: Add the following

ActiveSupport::XmlMini.backend = 'LibXML'

This means that the faster_xml_simple monkeypatch is no longer needed. I don’t think we’re doing much else with XML on blippr, but it’ll be nice to have libxml-backed parsing all around. I must not use REXML. REXML is the app-killer. REXML is the little-death that brings total obliteration.

Fixes for will_paginate and SQL errors when counting records with a custom :select clause

Upgrade will_paginate. Even after the upgrade, something about 2.3’s named scope handling was still breaking my app. I have a named scope like so:

  :select => "*, (blips.vote_score+2)/WEIGHT_FACTOR as weighted_score",
  :order => "weighted_score desc"

This was causing .paginate calls with this named scope to fail with an invalid SQL error. will_paginate should automatically clobber :select clauses before attempting to count records, but it wasn’t. The solution is to pass a :count option to my .paginate calls with the right select clause.

Blip.best.paginate(:page => current_page, :per_page => 30, :count => {:select => "blips.id"})

In general, any paginate call with a :select specified seems to break. The :count clause fixes them.

Upgrade my libmemcached plugin

A lot of the internal session stuff has changed. We use Evan Weaver’s libmemcached client, and an upgraded copy of 37signals’ libmemcached store for Rails. The plugin’s been upgraded to work with 2.3, and provides a session store on top of the general Rails store.

Our caching config now looks something like this:

GENERAL_CACHE_SERVERS = ["localhost:11211"]
GENERAL_CACHE_OPTIONS = {:untaint => true}
SESSION_CACHE_SERVERS = ["localhost:11212"]
SESSION_CACHE_OPTIONS = { :prefix_key => "session:blippr" }
SESSION_MEMCACHE_CLIENT = Memcached.new(SESSION_CACHE_SERVERS, SESSION_CACHE_OPTIONS)

config.cache_store = :libmemcached_store, GENERAL_CACHE_SERVERS, GENERAL_CACHE_OPTIONS
config.action_controller.session_store = :libmemcached_store
config.action_controller.session = {
	:cache => SESSION_MEMCACHE_CLIENT,
	:expires_after => 86400
}

Works great with libmemcached, with separate memcached instances for fragments and sessions (so that an over-populated fragment store won’t start clobbering sessions).

Update query parsing

I parse query parameters for some funky filtering. In 2.2.2 I used:

ActionController::AbstractRequest.parse_query_parameters(query_string)

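In 2.3, that becomes the following. Note that Rack::Utils.parse_query returns flat keys; if you rely on bracketed parameters like filter[tag]=ruby being expanded into nested hashes, Rack::Utils.parse_nested_query is the call that does that: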

Rack::Utils.parse_query(query_string)

That’s about it for now, but as problems arise I’ll be sure to add them.

Monitoring Rails: Getting instant monitoring alerts

Monitoring is big. Having an automated daemon watch your stuff and make sure it’s running properly can let you sleep at night, knowing that if something blows up, there’s an ever-watchful guardian ready to wake you up so you can fix it.

There are a number of monitoring solutions that are popular these days, such as monit, god, and Nagios. They’re all fantastic, but sometimes you just want something simple and to-the-point, right?


Installing the fauna libmemcached gem on Fedora Core 6

This is mostly for my own reference, but also because I couldn’t find any great help while googling the problem.

I’m working on switching from memcache-client to Evan Weaver’s libmemcached gem, and it’s gone well, except for one nagging error:

libmemcached.so.2: cannot open shared object file: No such file or directory - /opt/ruby-enterprise-1.8.6-20081215/lib/ruby/gems/1.8/gems/memcached-0.13/lib/rlibmemcached.so

libmemcached.so.2 was absolutely there, in my /usr/local/lib path. However, ldd was showing that rlibmemcached.so wasn’t properly linked to that library. The solution was the following:

[root@polaris libmemcached-0.25.14]# ./configure --prefix=/usr
[root@polaris libmemcached-0.25.14]# make && make install

ldd now shows the proper reference, and everything works. All better!

Syntactic sugar will occasionally kick your puppies.

Ruby’s awesome. It has sweet, concise syntax that makes for clean, readable code. One of these constructs is the trailing condition. In most languages where you might have to write something like:

if foo then
	do_stuff
end

Ruby will let you clean that up with:

do_stuff if foo

This works nearly all the time, but I ran into an odd problem today, where the trailing conditions were producing behavior I didn’t want.

>> foobar
NameError: undefined local variable or method `foobar' for #<Object:0x92bc998>
        from (irb#1):2
>> foobar = true unless defined?(foobar)
=> nil
>> foobar
=> nil
>> unless defined?(foobar); foobar = true; end
=> true
>> foobar
=> true

Wait, what? Using the trailing conditional changes the order in which Ruby parses the statement, resulting in something like the following operations:

  1. Define foobar because its assignment has been parsed, set it to nil
  2. Parse the unless conditional
  3. If the condition is true, set foobar to true

The kicker here is that because foobar’s assignment is the first thing parsed, it’s always initialized before you ever get to the defined? statement. So instead, we run the second piece of code:

unless defined?(foobar); foobar = true; end

This runs something like the following:

  1. Parse the unless condition.
  2. Define foobar because its assignment has been parsed, set it to nil
  3. If the condition is true, set foobar to true

Obviously this is the desired behavior. Several lessons here:

  • Ruby initializes variables when they are parsed, not when the code path that contains them is run (in fact, it’ll even initialize variables that are in unreachable code paths!)
  • if condition then do_stuff end is not always the same as do_stuff if condition
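
Here's a small script version of the irb session above, for anyone who wants to poke at it. The method and variable names are made up; each method gets a fresh local scope, which stands in for a fresh irb prompt:

```ruby
# Trailing form: the assignment is parsed first, so flag is already a
# local (holding nil) when defined?(flag) is evaluated.
def trailing_form
  flag = true unless defined?(flag)
  flag
end

# Block form: defined?(flag) is parsed before any assignment to flag,
# so it's treated as an unknown method call and the body runs.
def block_form
  unless defined?(flag)
    flag = true
  end
  flag
end

# The assignment never runs, but the parser still creates the local.
def unreachable_form
  if false
    ghost = 1
  end
  defined?(ghost)
end

p trailing_form     # nil
p block_form        # true
p unreachable_form  # "local-variable"
```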

It’s a bit of an edge case, but it’s an edge case that had me baffled. Hopefully this post saves you some frustration.

Powerful, easy, DRY, multi-format REST APIs: Part 2

Back in September, I wrote about making your REST APIs more flexible and easier to maintain. I’ve been working with this code with great success for the past few months, and have improved and tweaked it. It’s changed enough that it’s time for another blog post about it.

First off, the render method signature has changed. This is for full Rails compatibility, including Rails 2.3. It’ll work as you’d expect without any unpleasant surprises. Secondly, there are some other optional render targets, designed for leveraging existing #to_xml handlers where you already like them.


Graceful degradation: Using Gravatar as a fallback avatar with Paperclip

Lots of people use Paperclip for stuff like letting their users upload avatars. This is great – Paperclip is easy to use, quick to integrate, and painless to maintain.

However, Gravatar has a great selling point: The user gets an avatar without ever having to go set one on your site. They have an identity established the moment they sign on. You’ve seen it in WordPress blogs (including this one!) and in products like Redmine – you just enter your name and email to comment, and you automagically have your Gravatar show up next to your post.

Fortunately, Paperclip is flexible enough to let us integrate Gravatars without too much of a hassle.


Desuckifying Experts-Exchange

Experts Exchange minus the suck!

If you’ve ever searched for an answer to a programming problem, chances are good that you’ve run into results from experts-exchange. Everyone hates them. The information usually isn’t that good, and even if it is, you have to scroll past sixteen pages of ads and spam to get to them. Unfortunately, there’s the occasional nugget of info that’s what you’re looking for. There’s just too much crap to dig through to get to it.

We’re gonna fix that.

Mass inserting data in Rails without killing your performance

Mass inserting is one of those operations that isn’t really well-supported by ActiveRecord, but which has to be done nonetheless. You might say, “Well hey, I’ll just run a loop and create a bunch of AR objects, no sweat.”

That’ll work, but if speed is a factor, it might not be your best option.

ActiveRecord makes interfacing with the DB very easy, but it doesn’t necessarily make it fast. Instantiating an ActiveRecord object is costly, and if you do a lot of ‘em, that’s going to cause you to bump up against the garbage collector, which will significantly hinder performance. There are several options, though, depending on how much speed you need.

There are benchmarks at the bottom of the post, so if you’re just interested in those, scroll down.