
Coffee Powered

code and content

Syntactic sugar will occasionally kick your puppies.

Ruby’s awesome. It has sweet, concise syntax that makes for clean, readable code. One of its constructs is the trailing condition. Where in most languages you might have to write something like:

if foo then
	do_stuff
end

Ruby will let you clean that up with:

do_stuff if foo

This works nearly all the time, but I ran into an odd problem today, where the trailing conditions were producing behavior I didn’t want.

>> foobar
NameError: undefined local variable or method `foobar' for #<Object:0x92bc998>
        from (irb#1):2
>> foobar = true unless defined?(foobar)
=> nil
>> foobar
=> nil
>> unless defined?(foobar); foobar = true; end
=> true
>> foobar
=> true

Wait, what? Using the trailing conditional changes the order in which Ruby parses the statement, resulting in something like the following operations:

  1. Define foobar because it’s referenced, set it to nil
  2. Parse the unless conditional
  3. If defined?(foobar) returned nil, set foobar to true

The kicker here is that because foobar’s assignment is the first thing parsed, it’s always initialized before you ever get to the defined? statement. So instead, we run the second piece of code:

unless defined?(foobar); foobar = true; end

This runs something like the following:

  1. Parse the unless condition.
  2. Define foobar because it’s referenced, set it to nil
  3. If defined?(foobar) returned nil, set foobar to true

Obviously this is the desired behavior. Several lessons here:

  • Ruby initializes variables when they are parsed, not when the code path that contains them is run (in fact, it’ll even initialize variables that are in unreachable code paths!)
  • if condition then do_stuff end is not always the same as do_stuff if condition
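Here’s a small sketch of both lessons that you can run outside of irb. The three method names are made up for illustration; each one just wraps one of the forms discussed above so the local variable starts out fresh.

```ruby
# Trailing form: the parser sees the assignment first, so foobar already
# exists (as nil) by the time defined?(foobar) is evaluated.
def trailing_form
  foobar = true unless defined?(foobar)
  foobar
end

# Block form: the condition is parsed before the assignment, so
# defined?(foobar) returns nil and the assignment runs.
def block_form
  unless defined?(foobar)
    foobar = true
  end
  foobar
end

# Unreachable assignment: the branch never runs, but the variable is
# still created at parse time and defaults to nil.
def unreachable_form
  if false
    foobar = 42
  end
  foobar
end

p trailing_form     # nil
p block_form        # true
p unreachable_form  # nil
```

Calling unreachable_form returns nil instead of raising a NameError, which is the parse-time initialization in action.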

It’s a bit of an edge case, but it’s an edge case that had me baffled. Hopefully this post saves you some frustration.

Powerful, easy, DRY, multi-format REST APIs: Part 2

Back in September, I wrote about making your REST APIs more flexible and easier to maintain. I’ve been working with this code with great success for the past few months, and have improved and tweaked it. It’s changed enough that it’s time for another blog post about it.

First off, the render method signature has changed. This is for full Rails compatibility, including Rails 2.3. It’ll work as you’d expect without any unpleasant surprises. Secondly, there are some other optional render targets, designed for leveraging existing #to_xml handlers where you already like them.

Read More »

Graceful degradation: Using Gravatar as a fallback avatar with Paperclip

Lots of people use Paperclip for stuff like letting their users upload avatars. This is great – Paperclip is easy to use, quick to integrate, and painless to maintain.

However, Gravatar has a great selling point: The user gets an avatar without ever having to go set one on your site. They have an identity established the moment they sign on. You’ve seen it in WordPress blogs (including this one!) and in products like Redmine – you just enter your name and email to comment, and you automagically have your Gravatar show up next to your post.

Fortunately, Paperclip is flexible enough to let us integrate Gravatars without too much of a hassle.

Read More »

Desuckifying Experts-Exchange

Experts Exchange minus the suck!


If you’ve ever searched for an answer to a programming problem, chances are good that you’ve run into results from Experts Exchange. Everyone hates them. The information usually isn’t that good, and even if it is, you have to scroll past sixteen pages of ads and spam to get to it. Unfortunately, there’s the occasional nugget of info that’s what you’re looking for. There’s just too much crap to dig through to get to it.

We’re gonna fix that.
Read More »

Mass inserting data in Rails without killing your performance

Mass inserting is one of those operations that isn’t really well-supported by ActiveRecord, but which has to be done nonetheless. You might say, “Well hey, I’ll just run a loop and create a bunch of AR objects, no sweat”.

That’ll work, but if speed is a factor, it might not be your best option.

ActiveRecord makes interfacing with the DB very easy, but it doesn’t necessarily make it fast. Instantiating an ActiveRecord object is costly, and if you do a lot of ‘em, that’s going to cause you to bump up against the garbage collector, which will significantly hinder performance. There are several options, though, depending on how much speed you need.

There are benchmarks at the bottom of the post, so if you’re just interested in those, scroll down.
Read More »

Quick tip – use anonymous blocks!

In tracking down a memory leak in one of our Rails apps today, I ran across an interesting post detailing the difference between anonymous and named blocks in Ruby, and the performance differences therein.

It’s definitely worth a look, especially if you’re running in a complex environment, where new closures will be large and unwieldy. It’s very easy, too. Say you have a helper like this:

def note(text, options = {}, &block)
  options[:class] = ((options[:class] || "") + " form-note").strip
  content_tag(:div, text, options, &block)
end

Instead, don’t explicitly name the block parameter; just yield to it, and you prevent all the messiness of creating a new Proc object.

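def note(text, options = {})
  options[:class] = ((options[:class] || "") + " form-note").strip
  if block_given? # forward a block only when the caller passed one, so text isn't ignored
    content_tag(:div, text, options) {|*block_args| yield(*block_args) }
  else
    content_tag(:div, text, options)
  end
end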

I don’t have benchmarks just yet, but anecdotally it has definitely slowed the growth of memory consumption in my apps. It’s worth taking a look at!
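If you want to try the pattern without a Rails app handy, here’s a minimal sketch. The wrap method is a hypothetical stand-in for a helper like content_tag, and note_named and note_anonymous are made-up names. It only shows that the two styles behave the same from the caller’s side; it doesn’t measure anything, and how expensive a named block is depends on your Ruby version.

```ruby
# Hypothetical stand-in for a tag helper: uses the block's result when a
# block is given, otherwise the text argument.
def wrap(tag, text)
  body = block_given? ? yield : text
  "<#{tag}>#{body}</#{tag}>"
end

# Named block parameter: &block asks Ruby for a Proc object that is then
# passed along.
def note_named(text, &block)
  wrap(:div, text, &block)
end

# Anonymous block: no &block parameter. We yield to the caller's block and
# only forward a block when one was actually given.
def note_anonymous(text)
  if block_given?
    wrap(:div, text) { |*args| yield(*args) }
  else
    wrap(:div, text)
  end
end

puts note_named("plain")                         # <div>plain</div>
puts note_anonymous("plain")                     # <div>plain</div>
puts note_anonymous("ignored") { "from block" }  # <div>from block</div>
```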

Re: Simple RoR+MySQL optimization

I recently ran across a rather bare post espousing some generic “optimization” techniques for Rails apps. It offered no education, no explanation, no benchmarks. So, I thought, why not put those claims to the test?
Read More »

Your friendly neighborhood DNS man

So there’s been this gossipy story making the rounds on the social news sites, that the McCain camp has unbelievably registered voteforthemilf.com and are redirecting it to their site! They’ve got traceroutes and everything! Oh! The sexism! Oh! The gall! Oh! The huge manatee!

Or, wait, no, maybe everyone running around screaming about this just doesn’t have a clue as to what that really means.

To be perfectly clear, I write this without any intent to provide any political bias, but to explain a technical subtlety that is apparently lost on many people, and which therefore bears a need for some edumacation.

Domain names are like nicknames. They’re for our convenience, since we remember google.com, but we’d have a hard time remembering 64.233.187.99 every time we wanted to search for something. However, we have to have systems in place that translate those nicknames into IP addresses. IP addresses are like phone numbers for computers. To place a call to someone, you would look up their name in a phone book, and then get their phone number, and then dial the phone number on your phone. This is, in effect, what DNS is – a giant phone book.

First, a few terms.

  • DNS – Domain Name System. A system that turns domain names into IP addresses. Think of it like your cell phone’s phone book. You look up “Mom” and it knows which phone number you want to call.
  • Top Level Domain Name Server – The servers that all computers get in contact with to find out which Authoritative Server holds the information they’re looking for.
  • Authoritative Domain Name Server – The server that actually holds the IP address you’re looking for.

So, what happens when you type in a domain name?

  1. Your computer issues a request to a top level DNS server, asking for the DNS server that holds information for that domain.
  2. The DNS server you’re pointed to says “Oh, I know what IP this domain belongs to, here’s the IP address”
  3. Your computer makes a connection to the IP address specified by the Authoritative DNS Server

Imagine this scenario, then. You need a phone number for your friend, Joe. You don’t have it, but you know that Jenny, your socialite friend, would know someone who does. So, you call Jenny and say “Hey, do you know who has Joe’s phone number?” Jenny gives you Jane’s phone number. You then call Jane and say, “Hey, do you have Joe’s phone number?” Jane does, and gives it to you. You can then call Joe directly. This is basically how a DNS lookup works.
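The Jenny-and-Jane lookup can be sketched as a toy model in Ruby. Everything in it is made up for illustration: the TOP_LEVEL and AUTHORITATIVE tables, the server name, and the address (which comes from a range reserved for documentation). Real lookups involve caching and more layers, so treat this as a sketch of the two-step idea only.

```ruby
# Toy model of the two-step lookup. All names and addresses are hypothetical.

# Step 1: the "top level" server only knows WHICH server is authoritative
# for a domain (Jenny knows who has Joe's number).
TOP_LEVEL = {
  "example.com" => "ns1.dns-host.test"
}

# Step 2: each authoritative server holds the actual domain -> IP records
# (Jane has Joe's number).
AUTHORITATIVE = {
  "ns1.dns-host.test" => { "example.com" => "192.0.2.10" }
}

def resolve(domain)
  authoritative_server = TOP_LEVEL[domain]
  return nil if authoritative_server.nil?
  AUTHORITATIVE[authoritative_server][domain]
end

puts resolve("example.com")  # 192.0.2.10
p resolve("unknown.test")    # nil
```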

Now, the trick here is that IP addresses don’t get to decide what domain names point to them. So, I can register any domain name I want, and tell the DNS server responsible for that domain name that “Hey, this domain points to this IP address”. Then, that DNS server will return that IP address any time someone asks what IP address that domain belongs to.

It’s this subtlety that lets the above smear work. I can register any domain I want, and make it seamlessly redirect to any IP address I want. I could register diggsucks.com and point it to reddit.com, and as far as anyone could tell, diggsucks.com would go to reddit.com.
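To see why the target of a domain has no say in the matter, here’s another toy sketch. RECORDS, point, and domains_for are hypothetical names, and the domains and address are placeholders. The point is only that the record is written by whoever controls the domain name, not by whoever owns the IP address.

```ruby
# Hypothetical record table: whoever controls a domain picks the address
# it points to. Nothing asks the owner of the address for permission.
RECORDS = {}

def point(domain, ip)
  RECORDS[domain] = ip
end

def domains_for(ip)
  RECORDS.select { |_domain, address| address == ip }.keys.sort
end

point("official-site.example", "203.0.113.5")  # set by the site's owner
point("rude-nickname.example", "203.0.113.5")  # set by a total stranger

p domains_for("203.0.113.5")
# ["official-site.example", "rude-nickname.example"]
```

Nothing in the table records who created each entry, which is exactly why “the domain points at their server” proves nothing about who registered it.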

To go back to the voteforthemilf.com issue, if I may return to the phone book analogy, this is as if I called up the phone company and said “Hey, my name is voteforthemilf.com and my phone number is 555-123-4567”. They print it, and someone discovers it. They dial the number, and are connected to McCain campaign headquarters. The story here is that people immediately assume that because the phone number is that for the McCain campaign, it must have been the McCain campaign that put it there – a logical leap that is both bold and wrong.

For kicks, check out this domain that I’ve told my DNS server to redirect to John McCain’s website:

http://clearlymccain.coffeepowered.net

This is a subdomain, obviously, but it’s just as trivial to do it with any real domain.

And now you know how domain names work.

Stop using social information as passwords and security questions!

Hacked! I have a friend who recently had several of his online accounts compromised. The attackers weren’t particularly clever, didn’t use any special tools, didn’t install any viruses on his computer. All they needed was to see his public Facebook profile. From that, they were able to divine his birthday and security question answer – all that was needed to get into the mail account – and from there, they had access to every account online that he’d registered to that email address.

The problem is this: Your email is a de facto master password for everything you do online that’s tied to it. This means your MySpace account, your instant messenger account, maybe even your bank account. Once someone knows your email, they can start to attack weak points in your account. Hiding your email isn’t practical, so you need to make sure that your passwords and security question are rock solid secure.

A “security question” is a question that you provide an answer to, so that you can recover your password if you ever forget it. For example, a site might ask you to answer the question “What is your favorite color?” in order to start the password reset process. The concept is that by providing an answer to a question that only you would know, you create a “backup” password that you’ll be able to remember.

An extremely common “security question” might be something like “What is your mother’s middle name?” or “What street did you grow up on?” While these might have been reasonably secure in the past, they’re horribly insecure these days. The saturation of social information on the web makes it extremely easy to research these questions and arrive at answers in seconds. Facebook notes connections between yourself and your relatives, so how hard would it be to find your mom on Facebook, and then Google her name, or find one of her parents on Facebook?

In the case of a “favorite color” question, the answer may not show up explicitly on your social networking profiles, but an attacker has two direct attacks. First, look at how you’ve customized colors on your profile page. Do you use purple heavily? Maybe you have a fondness for green text. That has a strong potential to betray your answer. If that fails, the attacker has a pretty small set of potential answers – most people would answer that question with one answer in the set “black, blue, green, orange, pink, purple, red, yellow, or white”. Nine attempts at most, and an attacker will be into the account in no time.

Famously, Sarah Palin, the Republican vice presidential nominee, recently had her Yahoo! email account compromised. The attacker simply had to answer a single security question to gain access to her account: “What is my zip code?” How hard do you think it would be to find the address of a government official with Google? Might take six, maybe seven seconds, tops?

If you’re at all active online, there’s a lot of info about your social connections, family, pets, housing history, the works. Someone determined to get into one of your accounts won’t have a hard time finding your dog’s name (Ever put their name into a caption on one of your Facebook or Photobucket pictures?). Many, many people use pet names, pastimes, relatives’ names, and other social information as passwords and security questions. It’s no longer secure.

The solution is to choose passwords and security questions that are nonsensical. The best passwords are 8+ characters, and consist of upper and lower case letters, and a number, and even possibly some punctuation. If you have problems remembering nonsensical passwords, then use a transformation scheme. Presume that your password is “jeremy”, your brother-in-law’s name. Maybe your rule is that you shift each letter two down, and capitalize the third letter.

j -> l
e -> g
r -> t
e -> g
m -> o
y -> a (wrapping around past z)

Your new password is “lgtgoa”, and if you capitalize the third letter, that becomes “lgTgoa”. It’s still not optimally secure, but it’s far, far better than it was before. Nobody is going to guess it, and you can still remember it by just remembering “jeremy” and your rule.
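If you’d rather let a script apply the rule, here’s a minimal sketch. The shift_letter and transform names are made up, and it assumes the shift wraps around the alphabet, so y shifted by two lands on a. It’s a memory aid for the scheme described above, not a substitute for a properly random password.

```ruby
# Shift a single lowercase letter down the alphabet, wrapping past z.
# Anything that isn't a-z is returned unchanged.
def shift_letter(ch, by)
  return ch unless ch =~ /\A[a-z]\z/
  (((ch.ord - "a".ord + by) % 26) + "a".ord).chr
end

# Apply the rule from the post: shift every letter, then capitalize the
# letter at capitalize_index (index 2 is the third letter).
def transform(word, shift = 2, capitalize_index = 2)
  letters = word.downcase.chars.map { |c| shift_letter(c, shift) }
  if letters[capitalize_index]
    letters[capitalize_index] = letters[capitalize_index].upcase
  end
  letters.join
end

puts transform("jeremy")  # lgTgoa
```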

Ideally, you’ll use a different password for every login, always pick 8+ character passwords with a diverse character set, and have them be completely randomized. In reality, people don’t want to bother with that kind of maintenance, but at the very least, stop using social information for your security questions, passwords, and the like.
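To put rough numbers on the difference, here’s a quick back-of-the-envelope sketch. It assumes an attacker simply tries every possibility, and that “diverse character set” means upper case, lower case, and digits (62 symbols); both are simplifications of mine. The search_space name is made up.

```ruby
# Number of candidates an attacker must try in the worst case:
# alphabet_size possibilities for each of `length` positions.
def search_space(alphabet_size, length)
  alphabet_size ** length
end

puts 9                    # the nine "favorite color" answers
puts search_space(26, 6)  # 308915776 six-letter lowercase passwords
puts search_space(62, 8)  # 218340105584896 eight-character mixed passwords
```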

Powerful, easy, DRY, multi-format REST APIs

Rails’ baked-in REST support is great. Build your app right, and you can expose a programmatic interface to your users for free.

That said, many times providing views in non-HTML formats tends to be bulky and unwieldy. You end up with either very brittle representations of your data, or extremely bulky respond_to blocks in your controllers.

Fortunately, there’s a better way! We’re going to provide two new render targets, :to_yaml and :to_json which will let us write a single XML builder view, and then provide that view in XML, YAML, and JSON formats according to the consuming developer’s preferences.

In application.rb you’ll want to override the render method.

def render(opts = {}, &block)
  if opts[:to_yaml] then
    # Render the XML builder template, parse it into a Hash, and emit YAML.
    headers["Content-Type"] = "text/plain;"
    render :text => Hash.from_xml(render_to_string(:template => opts[:to_yaml], :layout => false)).to_yaml, :layout => false
  elsif opts[:to_json] then
    # Same pipeline, but emit JSON.
    content = Hash.from_xml(render_to_string(:template => opts[:to_json], :layout => false)).to_json
    # Wrap the JSON in a callback when ?callback= or ?jsonp= is supplied.
    cbparam = params[:callback] || params[:jsonp]
    content = "#{cbparam}(#{content})" unless cbparam.blank?
    render :json => content, :layout => false
  else
    # Anything else falls through to the stock Rails render.
    super opts, &block
  end
end

As you can see, we render a single XML view, load it into a hash with Hash.from_xml, and use Rails’ built-in Hash#to_json and Hash#to_yaml methods to provide the data in the desired format. There is a single glaring problem with this approach: Hash.from_xml is dog slow because it uses REXML. There’s a fantastic solution, though!

Courtesy of a blog post over at cobravsmongoose, we have a libxml drop-in for Hash.from_xml.

First, install libxml and then faster_xml_simple.

Second, include a monkeypatch to Hash.from_xml with the following:

require 'faster_xml_simple'
class Hash
  def self.from_xml(xml)
    undasherize_keys(typecast_xml_value(FasterXmlSimple.xml_in(xml,
      'forcearray'   => false,
      'forcecontent' => true,
      'keeproot'     => true,
      'contentkey'   => '__content__')
    ))
  end
end

You can run the benchmarks if you’d like, but it’s orders of magnitude faster than REXML. Seriously. Don’t use REXML. It’s like trying to run a Ferrari off of a 9-volt battery.

Now, let’s say you have an action you want to provide HTML, XML, JSON, and YAML views for.

def index
  ...
  respond_to do |wants|
    wants.html
    wants.xml  { render :layout => false }
    wants.json { render :to_json => "posts/index.xml.builder" }
    wants.yaml { render :to_yaml => "posts/index.xml.builder" }
  end
end

Finally, throw together your index.xml.builder file as you best see fit.

xml.instruct! :xml, :version=>"1.0", :encoding=>"UTF-8"
xml.posts do
  @posts.each do |post|
    xml.post(:id => post.id) do
      xml.user(:id => post.user.id)
      # Builder ignores a block's return value, so pass the text as an argument.
      xml.content(post.post_body)
    end
  end
end

And all of a sudden, bam! You’ve got your posts available in HTML…

/posts/index

…and in XML, YAML, and JSON, along with the associated User. By using an XML builder, you can make the serialized data as complex and customized as you’d like. No more funky respond_to blocks, no more exposing data you don’t want to. Expose what you want, and just what you want, in several formats.


/posts/index.xml
/posts/index.yaml
/posts/index.json
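The last step of the pipeline, turning one hash into several formats, is easy to try outside of Rails. In this sketch the POSTS hash is a made-up stand-in for what parsing the XML view would produce, and serialize is a hypothetical helper; it uses the standard json and yaml libraries instead of Rails’ own extensions.

```ruby
require "json"
require "yaml"

# Hypothetical helper: one data structure in, the requested format out.
def serialize(data, format)
  case format
  when :json then data.to_json
  when :yaml then data.to_yaml
  else raise ArgumentError, "unsupported format: #{format}"
  end
end

# Made-up stand-in for the hash parsed out of the XML view.
POSTS = { "posts" => [{ "id" => 1, "content" => "Hello" }] }

puts serialize(POSTS, :json)  # {"posts":[{"id":1,"content":"Hello"}]}
puts serialize(POSTS, :yaml)
```

Because both outputs come from the same hash, whatever you choose to expose in the XML view is the only thing the other formats can expose.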

One final trick is that the JSON views accept an optional callback or jsonp parameter, which will cause the content to be passed to a JavaScript function matching the passed parameter, as per the JSONP spec.

For example, if you have a /foo/bar.json view that would render the following JSON:

{"foo":"bar"}

Calling /foo/bar.json?jsonp=returnFunc would return the following:

returnFunc({"foo":"bar"})
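The wrapping itself is just string interpolation, which you can try in plain Ruby. The jsonp_wrap name is made up for this sketch, and it skips any validation of the callback name, which you would probably want before exposing this publicly.

```ruby
require "json"

# Wrap a JSON document in a callback invocation when a callback name is
# supplied; otherwise return the JSON untouched.
def jsonp_wrap(json, callback)
  return json if callback.nil? || callback.strip.empty?
  "#{callback}(#{json})"
end

json = { "foo" => "bar" }.to_json

puts jsonp_wrap(json, nil)           # {"foo":"bar"}
puts jsonp_wrap(json, "returnFunc")  # returnFunc({"foo":"bar"})
```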

Check out the JSONP spec for more.