
Coffee Powered

code and content

When you have to store user passwords…

Today we got word of yet-another-database-hack-with-plaintext-passwords. This time, it’s RockYou, purveyor of many of those Facebook and Myspace apps you use. Oops.

Every time this comes up, everyone says “How naive! They should have been using salted hashed passwords!” This is true in any case where you don’t need to use the password again on an external service. With OAuth solutions becoming more and more popular, the need to collect and store user passwords is fortunately becoming more and more rare. However, it does need to happen sometimes, so how do you take the proper precautions when you do need to?

The first step is to encrypt your data before it is persisted into your database. This is pretty easy to do, and there are a number of methods for it. Here’s an example of something I used in a Rails app to provide encryption services.

require 'openssl'
require 'base64'
module Encryption
	class OpenSSL_Key
		PUBLIC_KEY_FILE = "#{RAILS_ROOT}/config/public.pem"
		PRIVATE_KEY_FILE = "#{RAILS_ROOT}/config/private.pem"

		def self.encrypt(data)
			@@public_key ||= OpenSSL::PKey::RSA.new(File.read(PUBLIC_KEY_FILE))
			encrypted_data = @@public_key.public_encrypt(data)
			Base64.encode64(encrypted_data)
		end

		def self.decrypt(data)
			@@private_key ||= OpenSSL::PKey::RSA.new(File.read(PRIVATE_KEY_FILE))
			decoded_data = Base64.decode64(data)
			@@private_key.private_decrypt(decoded_data)
		end
	end

	class OpenSSL_RSA
		IV64 = "xxxxxxxxxxxxxxxxxxxxxxxxxx==\n"
		KEY64 = "xxxxxxxxxxxxxxxxxxxxxxxxxx=\n"
		CIPHER = 'aes-256-cbc'

		def self.encrypt(data)
			@@iv ||= Base64.decode64(IV64)
			@@key ||= Base64.decode64(KEY64)

			cipher = OpenSSL::Cipher::Cipher.new(CIPHER)
			cipher.encrypt
			cipher.key = @@key
			cipher.iv = @@iv
			encrypted_data = cipher.update(data)
			encrypted_data << cipher.final
			Base64.encode64(encrypted_data)
		end

		def self.decrypt(data)
			@@iv ||= Base64.decode64(IV64)
			@@key ||= Base64.decode64(KEY64)

			cipher = OpenSSL::Cipher::Cipher.new(CIPHER)
			cipher.decrypt
			cipher.key = @@key
			cipher.iv = @@iv
			decrypted_data = cipher.update(Base64.decode64(data))
			decrypted_data << cipher.final
		end
	end
end

This provides two classes, Encryption::OpenSSL_Key and Encryption::OpenSSL_RSA, which may be used to encrypt arbitrary strings. The OpenSSL_Key class uses an RSA public/private keypair (in our example, read out of the Rails config directory), and the OpenSSL_RSA class, despite its name, does symmetric AES-256-CBC encryption with an initialization vector and secret key. The latter is probably easier, since it means you don’t have to worry about keypairs, and since all the encryption and decryption is done locally, there isn’t any need for public-key encryption.

Once you have that file in your project, using it is pretty simple.

# Our database is going to have a field called encrypted_password. The plaintext password only ever lives in memory.

class MySecretModel < ActiveRecord::Base
	before_save :encrypt_fields
	attr_writer :password

	# Return the freshly assigned password, or decrypt the stored one on first access
	def password
		@password ||= decrypt_field(:password)
	end

private

	# Only (re-)encrypt when there is a plaintext password in memory
	def encrypt_fields
		write_attribute :encrypted_password, Encryption::OpenSSL_RSA.encrypt(@password) if @password
	end

	def decrypt_field(field)
		Encryption::OpenSSL_RSA.decrypt read_attribute("encrypted_#{field}")
	end
end

The net result is that we can still get access to the raw password if we need to, but the content in the database will be AES-encrypted with a secret key held in our application. This is still vulnerable if the attacker gains access to the file containing your AES IV/key, or to your public/private keypair if you went the RSA route, but it is extremely resilient in the case that an attacker manages to simply dump your users table via SQL injection. You still need to practice good key management, and you absolutely should not use a technique this simplistic for storing financial data; there is a whole set of guidelines and procedures for that kind of information. However, for adding an extra layer of defense to save yourself and your customers from excess embarrassment in the case of a database breach, this is a quick, easy, and effective technique for hardening your data.
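One thing a dumped table does still give away with the symmetric class as written: the IV is a constant, so two users with the same password end up with the same ciphertext. Here’s a minimal sketch of one way around that, assuming a modern Ruby (1.9+) and a hypothetical RandomIvEncryption module that is not part of the code above. It generates a fresh IV per value and stores it in front of the ciphertext. The key is generated on the spot purely for the demo; in a real app you’d load it from config like before.

```ruby
require 'openssl'
require 'base64'

# Hypothetical variant of the symmetric class above: a fresh random IV for
# every value, stored in front of the ciphertext so decrypt can find it.
module RandomIvEncryption
  CIPHER = 'aes-256-cbc'

  def self.encrypt(data, key)
    cipher = OpenSSL::Cipher.new(CIPHER)
    cipher.encrypt
    cipher.key = key
    iv = cipher.random_iv # generates an IV and sets it on the cipher
    Base64.strict_encode64(iv + cipher.update(data) + cipher.final)
  end

  def self.decrypt(encoded, key)
    raw = Base64.strict_decode64(encoded)
    cipher = OpenSSL::Cipher.new(CIPHER)
    cipher.decrypt
    cipher.key = key
    cipher.iv = raw[0, cipher.iv_len]                     # leading bytes are the IV
    cipher.update(raw[cipher.iv_len..-1]) + cipher.final  # the rest is ciphertext
  end
end

# Demo only: a throwaway 32-byte key. A real key belongs in your config.
key = OpenSSL::Random.random_bytes(32)
token = RandomIvEncryption.encrypt("hunter2", key)
puts RandomIvEncryption.decrypt(token, key) # hunter2
```

The IV isn’t a secret, so storing it next to the ciphertext is fine; the key is still the thing you have to guard.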

This is a rather raw implementation, and there are ways you could package it up so that you could transparently apply it to any number of models or fields, but the basic technique is solid. You could even use something like sql_crypt to easily protect sensitive fields. The technology is there, and “We needed to be able to re-use the password!” isn’t an excuse anymore. Stop storing plaintext passwords – just like backups, it’s just extra work until you need it, and then you’ll be glad you put that extra work in.

Multibyte string slicing for fun and profit

Ran into a small issue in one of my user models. I was using a helper to display a user’s first name, last initial. It looked something like this:

def display_name(user)
  "#{user.first_name} #{user.last_name.slice(0,1)}"
end

Seems innocent enough, sure. Except…it doesn’t work in multibyte character sets. The first Cyrillic speaker to sign up blew that all up. When parsing an XML fragment with a name like this included, I was getting the following error:

ActionView::TemplateError: premature end of regular expression: /^\s*Елена\ �/

nokogiri (1.4.0) lib/nokogiri/xml/fragment_handler.rb:53:in `characters'

The issue, as it turned out, is that in Ruby 1.8 String#slice is a bytewise operation, not a character-wise operation like I’d so naively assumed. It’s pretty easy to observe:

>> "Журинова".slice(0, 1)
=> "\320"

Fortunately, Rails has multibyte support baked in already, so it’s an easy mistake to correct:

def display_name(user)
  "#{user.first_name} #{user.last_name.chars.first}"
end

And now…

>> "Журинова".chars.first
=> "Ж"

It’s very easy to make mistakes like this, and many times you may not even realize that they’re made unless you try to do something funny, like using it as a part of a regex. The safe operation is to never use String#slice or string subscripting on user data, but to instead treat all strings as multibyte strings. Very subtle, but the effects can be pretty nasty if you don’t.
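If you want to see the difference side by side on a newer Ruby (1.9 and up, where strings carry an encoding and String#slice counts characters), here’s a small sketch. String#byteslice stands in for the old bytewise behavior, and the User struct is a hypothetical stand-in for the real model.

```ruby
# encoding: utf-8

name = "Журинова"

first_byte = name.byteslice(0, 1) # what Ruby 1.8's slice(0, 1) handed back
first_char = name[0, 1]           # character-aware on Ruby 1.9+

puts name.bytesize               # 16, two bytes per Cyrillic letter
puts name.length                 # 8
puts first_byte.valid_encoding?  # false, it's half of a character
puts first_char                  # Ж

# Hypothetical stand-in for the user model
User = Struct.new(:first_name, :last_name)

def display_name(user)
  "#{user.first_name} #{user.last_name[0, 1]}"
end

puts display_name(User.new("Елена", "Журинова")) # Елена Ж
```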

System date considered important

I’ve been slamming my head against the wall for the past two hours. I had an OAuth connection to a remote service working just dandy in development, but as soon as I tried to use that exact same code with the exact same config and exact same gems in production…I was getting “401 unauthorized” errors back from the remote service when attempting to get a request token.

After an extremely tedious series of debugger checks to make sure my OAuth signature was right, I decided to just edit the oauth gem on my production box and add a little debugging statement to dump the HTTP request to stdout. What I found was…surprising.

>> OAuthConsumers::Netflix.new.consumer.get_request_token
opening connection to api.netflix.com...
opened
<- "POST /oauth/request_token HTTP/1.1\r\nAccept: */*\r\nConnection: close\r\nUser-Agent: OAuth gem v0.3.6\r\nAuthorization: OAuth oauth_nonce=\"E73sq4XMkG547EbuCB9GUfG4AtsjD2QFySwLPKj0tI\", oauth_callback=\"oob\", oauth_signature_method=\"HMAC-SHA1\", oauth_timestamp=\"1260049119\", oauth_consumer_key=\"xxxxxxxxxxxxx\", oauth_signature=\"QD5b5Oy8LFLvXWl%2B3R%2BQI0xlIcg%3D\", oauth_version=\"1.0\"\r\nContent-Length: 0\r\nHost: api.netflix.com\r\n\r\n"
-> "HTTP/1.1 401 Unauthorized\r\n"
-> "X-Lighty-Magnet-Uri-Path: /oauth/request_token\r\n"
-> "X-Mashery-Responder: proxyworker-i-e23bae8a.mashery.com\r\n"
-> "X-Mashery-Error-Code: ERR_401_TIMESTAMP_IS_INVALID\r\n"
-> "Content-Type: text/plain\r\n"
-> "Accept-Ranges: bytes\r\n"
-> "Content-Length: 20\r\n"
-> "Date: Sat, 05 Dec 2009 21:27:08 GMT\r\n"
-> "Server: Mashery Proxy\r\n"
-> "\r\n"
reading 20 bytes...
-> "Timestamp Is Invalid"
read 20 bytes
Conn close
OAuth::Unauthorized: 401 Unauthorized
        from /opt/ruby-enterprise-1.8.7-2009.10/lib/ruby/gems/1.8/gems/oauth-0.3.6/lib/oauth/consumer.rb:200:in `token_request'
        from /opt/ruby-enterprise-1.8.7-2009.10/lib/ruby/gems/1.8/gems/oauth-0.3.6/lib/oauth/consumer.rb:128:in `get_request_token'
        from (irb):1

Whoa there, there’s some info that the OAuth gem wasn’t giving back to me. “Timestamp is invalid.” Well then, a quick check of system time, and…oh, hey, it turns out that my system has drifted to about 10 minutes fast. Easily corrected, at least.

# ntpdate -b 0.centos.pool.ntp.org && service ntpd start

With that all done…

>> OAuthConsumers::Netflix.new.consumer.get_request_token
opening connection to api.netflix.com...
opened
<- "POST /oauth/request_token HTTP/1.1\r\nAccept: */*\r\nConnection: close\r\nUser-Agent: OAuth gem v0.3.6\r\nAuthorization: OAuth oauth_nonce=\"YIh5R3CBtAicneNREF5ZUcX80kao1zqRLLA5u8bQWA\", oauth_callback=\"oob\", oauth_signature_method=\"HMAC-SHA1\", oauth_timestamp=\"1260048573\", oauth_consumer_key=\"ksfa9rxmb8dzkxg4npwr74zv\", oauth_signature=\"%2B%2Fyd5sRsJ7qmmZWNRqSlCvByYxw%3D\", oauth_version=\"1.0\"\r\nContent-Length: 0\r\nHost: api.netflix.com\r\n\r\n"
-> "HTTP/1.1 200 OK\r\n"
-> "X-Lighty-Magnet-Uri-Path: /oauth/request_token\r\n"
-> "X-Mashery-Responder: proxyworker-i-7c31a414.mashery.com\r\n"
-> "Content-Type: text/plain\r\n"
-> "Server: Mashery_Server_Adapter_Query\r\n"
-> "Date: Sat, 05 Dec 2009 21:29:32 GMT\r\n"
-> "Accept-Ranges: bytes\r\n"
-> "Content-Length: 194\r\n"
-> "\r\n"
reading 194 bytes...
-> "oauth_token=xxxxxxxx&oauth_token_secret=xxxxxxxxx&application_name=xxxxx&login_url=https%3A%2F%2Fapi-user.netflix.com%2Foauth%2Flogin%3Foauth_token%3Dczjsmzw74nk2wy274g6drmwt"
read 194 bytes
Conn close

All better. Keep those datetimes synched, sports fans. Web services are becoming more and more interconnected, and if there’s one thing I’ve learned from heist movies, it’s that the first step in any successful job is to make sure your watches are synchronized. Nobody likes that guy who shows up 10 minutes late to everything!
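In hindsight, everything I needed was already in that first failed exchange: the oauth_timestamp I sent and the Date header the server sent back. A quick sketch of comparing the two, where clock_skew is a hypothetical helper of my own and not something the oauth gem provides:

```ruby
require 'time'

# Seconds by which the local clock leads the server, judged by the server's
# HTTP Date header. Positive means the local clock is running fast.
def clock_skew(local_epoch, date_header)
  local_epoch - Time.httpdate(date_header).to_i
end

# Values taken from the failed request above
skew = clock_skew(1260049119, "Sat, 05 Dec 2009 21:27:08 GMT")
puts "local clock is #{skew} seconds ahead" # 691 seconds, about eleven and a half minutes
```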

Sweet-ass performance hacks: better_assets

HTTP overhead is expensive. DNS lookups are expensive. Start dropping a bunch of Twitter widgets, Google ads, and GetSatisfaction buttons into your killer new Web 2.0 social networking site and you’ll find that your painstakingly-optimized site has slowed to a crawl while the browser sits there waiting on Amazon S3 to get its act together and serve you a 300-byte CSS file.

That sorta blows. Let’s not do it.

Introducing better_assets

better_assets is a monkeypatch to the Rails 2.3.2 AssetTagHelper to enable some additional functionality. The key points are:

* Time-based expiry of cached asset files, which is primarily useful for…
* Caching and combining of remote assets
* Finally, you can post-process combined assets with blocks passed to javascript_include_tag and stylesheet_link_tag.

Examples

It’s easy. You use it just like normal:

<%=javascript_include_tag(
  "jquery-1.3.2",
  "foo",
  :cache => "all") {|text| Packr.pack(text, :base62 => true) } %>

Whoa! What is this block madness? Why, that’s an extension to allow you to do whatever you want. In this example, we’re using jcoglan’s Packr library to automatically pack our generated Javascript. This can result in filesize being reduced by pretty massive amounts, and will result in appreciable performance benefits.

Well, that’s all fine and dandy, but it’s not my combined Javascript that’s killing me, it’s all those pesky DNS lookups for all my widget code and CSS. Never fear, you’re covered there, too.

<%=javascript_include_tag(
  "http://rpxnow.com/openid/v2/widget",
  "http://partner.googleadservices.com/gampad/google_service.js",
  "http://s3.amazonaws.com/getsatisfaction.com/feedback/feedback.js",
  "http://blippr.tags.crwdcntrl.net/cc.js",
  :cache => "remote", :lifetime => 12.hours) %>

Madness! Sheer madness! All those remote Javascript files are sucked down, combined, and cached as “remote.js”. It’ll automatically expire after 12 hours, and be re-cached after that. That way, you can get all the performance benefits of serving a single combined JS file without having to stress out that someone over at WidgetHeadquarters is going to change a piece of code and completely screw you over until you notice that your local Javascript file doesn’t match theirs six weeks later.
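The time-based expiry boils down to comparing the cached file’s mtime against the lifetime. This is a minimal sketch of that idea only, not the plugin’s actual code; cache_stale? is a hypothetical helper, and the temp directory is just there so the example runs on its own.

```ruby
require 'tmpdir'

# Hypothetical helper: true when the cached file is missing or older than
# the given lifetime (in seconds).
def cache_stale?(path, lifetime, now = Time.now)
  !File.exist?(path) || (now - File.mtime(path)) > lifetime
end

Dir.mktmpdir do |dir|
  cached = File.join(dir, "remote.js")
  twelve_hours = 12 * 3600

  puts cache_stale?(cached, twelve_hours)                         # true, nothing cached yet
  File.write(cached, "/* combined widget code */")
  puts cache_stale?(cached, twelve_hours)                         # false, just written
  puts cache_stale?(cached, twelve_hours, Time.now + 13 * 3600)   # true, 13 hours later
end
```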

This, oddly enough, works for CSS files, too.

<%=stylesheet_link_tag(
  "http://s3.amazonaws.com/getsatisfaction.com/feedback/feedback.css",
  "http://s3.amazonaws.com/getsatisfaction.com/feedback/widget.css",
  :cache => "remote", :lifetime => 12.hours
) %>

No more stalling out at requests to Amazon’s S3 for CSS files! No more extraneous DNS requests or HTTP connections! No fuss, no muss, no headaches for you or your users.

All this, and it makes crispy bacon, too.*

To get it, just…wait for it. Very complex procedure ahead:

script/plugin install git://github.com/cheald/better_assets.git

Restart your app, and that’s it. Your assets are now approximately 163% more awesome, while being leaner and looking better in that fabulous summer swimsuit at the same time.

Score.

* Not really.

Fine tuning your garbage collector

If you’re familiar with Ruby at all, you know that it can be a little wacky when it comes to memory usage. Most of us have observed a Mongrel/Passenger instance that starts out small and then grows by leaps and bounds, eventually settling on some uncomfortably high number. We’re going to fix that with Ruby Enterprise Edition and Scrap.

Quick tip: Strip URLs before parsing!

Rather than roll my own URL regexes, I prefer to let the existing libraries do the heavy lifting. Ruby has a uri library which is fantastic for parsing (and validating) URLs.

For example, something like this might be used in a model validation:

require 'uri'

def validate_url(url)
	parsed_uri = URI::parse(url)
rescue URI::InvalidURIError
	errors.add :url, "Sorry, that doesn't look like a valid URL"
end

I noticed a bit ago that I started getting invalid URL errors where there shouldn’t be any. After far too long spent in the library’s code, I realized my error: the URLs were being pasted with a trailing space. Stripping the string before attempting to parse it fixed it right up.

I’d argue that URI::parse should likely strip any incoming strings, but in the meantime, remember to strip your user input before trying to determine whether it’s valid or not, or you may end up with frustrated users.
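Here’s roughly what the stripped version looks like outside of a model, as a standalone sketch. valid_url? is a hypothetical helper name; in the validation above you’d strip the value before handing it to URI::parse in the same way.

```ruby
require 'uri'

# Hypothetical helper: strip surrounding whitespace, then let the uri
# library decide whether the string parses.
def valid_url?(url)
  URI.parse(url.to_s.strip)
  true
rescue URI::InvalidURIError
  false
end

puts valid_url?("http://example.com/ ") # true, the pasted trailing space is stripped first
puts valid_url?("http://exa mple.com/") # false, a space in the middle is still invalid
```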

Announcing Scrap

I do a lot of memory and garbage analysis on my Rails apps, and in upgrading to Rails 2.3, I discovered a practical use for the new Rails Metal middleware. Dumping memory stats to my log was just sorta unreadable in a practical scenario, and was more or less entirely unusable in production. Fortunately, Metal provides a really easy way to output readable information to the browser without invoking the full Rails stack. (It’s also an excuse to write a Metal endpoint because it’s new and shiny, but that’s beside the point.)

It’s up at github – installation is dead easy (assuming you’re on Rails 2.3+, of course) – just install the plugin, restart your app, and hit <your url>/stats/scrap in your browser. Bam, instant juicy memory goodness about your app at your fingertips. If you’d like an example of the output, good news! Check it out at http://tachyonsix.com/scrap.htm.

You can use it to troubleshoot heap leaks – just run a few requests, hit your Scrap URL, and see what your deltas look like. Seeing a huge growth in a certain type of object? Chances are pretty good that you have a heap leak, and can start tracking it down.
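If you want to poke at the same kind of deltas by hand, here’s a rough sketch. It assumes MRI 1.9 or newer, where ObjectSpace.count_objects is available, and it is not how Scrap itself does its bookkeeping; count_deltas is a hypothetical helper. The exact numbers will vary from run to run.

```ruby
# Hypothetical helper, not Scrap's own code: per-type change in object
# counts between two ObjectSpace.count_objects snapshots.
def count_deltas(before, after)
  after.each_with_object({}) do |(type, count), deltas|
    change = count - before.fetch(type, 0)
    deltas[type] = change unless change.zero?
  end
end

before = ObjectSpace.count_objects
leak = Array.new(50_000) { |i| "row #{i}" } # stand-in for a leaky action
after = ObjectSpace.count_objects

# Print the five types that grew the most
count_deltas(before, after).sort_by { |_, change| -change }.first(5).each do |type, change|
  puts "#{type}: #{change > 0 ? '+' : ''}#{change}"
end
```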

The request history can help you locate certain actions that might be causing spikes in memory usage. It’ll show the last N requests, along with memory and heap statistics before each request. If there’s a consistent memory usage leap after a certain action, chances are that it’s doing something naughty.

Want to get a bigger picture on what objects are hanging around? You can use the config/scrap.yml file to get Scrap to spit out more detailed reports on instances of a given class. There’s full documentation on it in the README.

Anyhow, give it a shot, let me know what you think.

Things to do when upgrading to Rails 2.3

I’m upgrading blippr to Rails 2.3. Here are some of the things that had to be changed to upgrade:

Switch the application entirely to LibXML for all its XML parsing needs

In config/environment.rb: Add the following

ActiveSupport::XmlMini.backend = 'LibXML'

This means that the faster_xml_simple monkeypatch is no longer needed. I don’t think we’re doing much else with XML on blippr, but it’ll be nice to have libxml-backed parsing all around. I must not use REXML. REXML is the app-killer. REXML is the little-death that brings total obliteration.

Fixes for will_paginate and SQL errors when counting records with a custom :select clause

Upgrade will_paginate. Even after the upgrade, something about 2.3’s named scope handling was still breaking my app. I have a named scope like so:

  :select => "*, (blips.vote_score+2)/WEIGHT_FACTOR as weighted_score",
  :order => "weighted_score desc"

This was causing .paginate calls with this named scope to fail with an invalid SQL error. will_paginate should automatically clobber :select clauses before attempting to count records, but it wasn’t. The solution is to pass a :count option to my .paginate calls with the right select clause.

Blip.best.paginate(:page => current_page, :per_page => 30, :count => {:select => "blips.id"})

In general, any paginate call with a :select specified seems to break. The :count clause fixes them.

Upgrade my libmemcached plugin

A lot of the internal session stuff has changed. We use Evan Weaver’s libmemcached client, and an upgraded copy of 37signals’ libmemcached store for Rails. The plugin’s been upgraded to work with 2.3, and provides a session store on top of the general Rails store.

Our caching config now looks something like this:

GENERAL_CACHE_SERVERS = ["localhost:11211"]
GENERAL_CACHE_OPTIONS = {:untaint => true}
SESSION_CACHE_SERVERS = ["localhost:11212"]
SESSION_CACHE_OPTIONS = { :prefix_key => "session:blippr" }
SESSION_MEMCACHE_CLIENT = Memcached.new(SESSION_CACHE_SERVERS, SESSION_CACHE_OPTIONS)

config.cache_store = :libmemcached_store, GENERAL_CACHE_SERVERS, GENERAL_CACHE_OPTIONS
config.action_controller.session_store = :libmemcached_store
config.action_controller.session = {
	:cache => SESSION_MEMCACHE_CLIENT,
	:expires_after => 86400
}

Works great with libmemcached, with separate memcached instances for fragments and sessions (so that an over-populated fragment store won’t start clobbering sessions).

Update query parsing

I parse query parameters for some funky filtering. In 2.2.2 I used:

ActionController::AbstractRequest.parse_query_parameters(query_string)

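In 2.3, that becomes the following (if you depend on nested parameters like filter[tag]=x, Rack::Utils.parse_nested_query is the closer match):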

Rack::Utils.parse_query(query_string)

That’s about it for now, but as problems arise I’ll be sure to add them.

Monitoring Rails: Getting instant monitoring alerts

Monitoring is big. Having an automated daemon watch your stuff and make sure it’s running properly can let you sleep at night, knowing that if something blows up, there’s an ever-watchful guardian ready to wake you up so you can fix it.

There are a number of monitoring solutions that are popular these days, such as monit, god, and Nagios. They’re all fantastic, but sometimes you just want something simple and to-the-point, right?


Installing the fauna libmemcached gem on Fedora Core 6

This is mostly for my own reference, but also because I couldn’t find any great help while googling the problem.

I’m working on switching from memcache-client to Evan Weaver’s libmemcached gem, and it’s gone well, except for one nagging error:

libmemcached.so.2: cannot open shared object file: No such file or directory - /opt/ruby-enterprise-1.8.6-20081215/lib/ruby/gems/1.8/gems/memcached-0.13/lib/rlibmemcached.so

libmemcached.so.2 was absolutely there, in my /usr/local/lib path. However, ldd was showing that rlibmemcached.so wasn’t properly linked to that library. The solution was the following:

[root@polaris libmemcached-0.25.14]# ./configure --prefix=/usr
[root@polaris libmemcached-0.25.14]# make && make install

ldd now shows the proper reference, and everything works. All better!