<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Coffee Powered &#187; Uncategorized</title>
	<atom:link href="http://www.coffeepowered.net/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.coffeepowered.net</link>
	<description>code and content</description>
	<lastBuildDate>Sun, 05 Sep 2010 20:38:39 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Write-once read-only fixtures for Rails tests</title>
		<link>http://www.coffeepowered.net/2010/09/05/write-once-read-only-fixtures-for-rails-tests/</link>
		<comments>http://www.coffeepowered.net/2010/09/05/write-once-read-only-fixtures-for-rails-tests/#comments</comments>
		<pubDate>Sun, 05 Sep 2010 20:38:39 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=330</guid>
		<description><![CDATA[In the project I&#8217;m currently working on, I&#8217;m heavily using factory_girl to generate test data, rather than using the old Rails fixtures standby. However, I still have a set of read-only fixtures (which are used for testing read-only models against a legacy database). I&#8217;m using these in my tests, but since they are read only [...]]]></description>
			<content:encoded><![CDATA[<p>In the project I&#8217;m currently working on, I&#8217;m heavily using <a href="http://github.com/thoughtbot/factory_girl">factory_girl</a> to generate test data, rather than using the old Rails fixtures standby. However, I still have a set of read-only fixtures (which are used for testing read-only models against a legacy database). I&#8217;m using these in my tests, but since they are read-only (like, seriously &#8211; the models are marked as such by using <code>after_find</code> to call <code>readonly!</code>, ensuring that records will not be accidentally written), there&#8217;s no need to wipe and re-insert them per-test.</p>
<p>It&#8217;s not too hard to set up fixtures so that they&#8217;re inserted once per test suite run.</p>
<p>In your test_helper.rb, above the <code>class ActiveSupport::TestCase</code> definition, add the following:</p>
<pre class="brush: ruby;">
Fixtures.reset_cache
fixtures_folder = File.join(RAILS_ROOT, 'test', 'fixtures')
fixtures = Dir[File.join(fixtures_folder, '*.yml')].map {|f| File.basename(f, '.yml') }
Fixtures.create_fixtures(fixtures_folder, fixtures)
Fixtures.reset_cache
</pre>
<p>Next, turn off transactional fixtures and comment out the fixtures macro:</p>
<pre class="brush: ruby;">
self.use_transactional_fixtures = false
# fixtures :all
</pre>
<p>That&#8217;s all there is to it. Your fixtures will be inserted into your test database once when test_helper is included for the first time, and then not again for the rest of the test suite run. This should speed your tests up substantially.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/09/05/write-once-read-only-fixtures-for-rails-tests/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FlexAuth: Portable authentication for Battle.net</title>
		<link>http://www.coffeepowered.net/2010/06/10/flexauth-portable-authentication-for-battle-net/</link>
		<comments>http://www.coffeepowered.net/2010/06/10/flexauth-portable-authentication-for-battle-net/#comments</comments>
		<pubDate>Thu, 10 Jun 2010 22:58:14 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=257</guid>
		<description><![CDATA[I&#8217;ve just released my first Android app, called FlexAuth. It&#8217;s mostly an excuse to learn Android development, but it does something useful, too &#8211; it serves as a souped-up mobile authenticator for Blizzard&#8217;s Battle.net login infrastructure. If you&#8217;d like the gory details, there&#8217;s a specification floating around on the internet that&#8217;ll help you understand the [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve just released my first Android app, called <a href="http://www.cyrket.com/p/android/com.chrisheald.flexauth/">FlexAuth</a>. It&#8217;s mostly an excuse to learn Android development, but it does something useful, too &#8211; it serves as a souped-up mobile authenticator for Blizzard&#8217;s Battle.net login infrastructure. If you&#8217;d like the gory details, <a href="http://bnetauth.freeportal.us/specification.html">there&#8217;s a specification floating around on the internet</a> that&#8217;ll help you understand the protocol.</p>
<p><a href="http://i.imgur.com/yW496.png"><img src="http://i.imgur.com/yW496.png" align="right" style="margin: 1em; width:200px;" /></a><br />
Mobile authenticators work by transforming a seed value (called the &#8220;token secret&#8221;) + the current time into your 8-digit authentication code. FlexAuth lets you set up multiple authenticators by providing the secret, or will let you have Blizzard generate one for you.</p>
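<p>As a rough illustration of the idea, here is a minimal Ruby sketch of a generic time-based one-time-password scheme in the style of RFC 6238. It is not FlexAuth&#8217;s code and not necessarily the exact Battle.net protocol: the helper name <code>auth_code</code>, the 30-second interval, and the HMAC-SHA1 truncation step are assumptions made for the example; only the 8-digit length comes from the description above.</p>
<pre class="brush: ruby;">
require "openssl"

# Hypothetical helper (not FlexAuth's code): derive a numeric code from a
# shared secret and the current time, RFC 6238 style.
# Assumptions: 30-second interval, HMAC-SHA1; 8 digits as described above.
def auth_code(secret, time = Time.now, interval = 30, digits = 8)
  # Whole intervals since the Unix epoch, packed as an 8-byte big-endian value.
  counter = time.to_i / interval
  message = [counter].pack("Q>")

  # HMAC-SHA1 of the counter, keyed with the token secret.
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA1"), secret, message)

  # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window,
  # and the top bit of that window is dropped.
  offset = digest.bytes.last % 16
  window = digest[offset, 4].unpack("N").first % 2**31

  # Keep the last `digits` decimal digits, zero-padded on the left.
  (window % 10**digits).to_s.rjust(digits, "0")
end

puts auth_code("12345678901234567890", Time.at(59))  # prints 94287082 (RFC 6238 test vector)
</pre>
<p>Because the code depends only on the secret and the clock, any device that holds the same secret produces the same code at the same moment.</p>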
<p>Why would you need this?</p>
<ul>
<li>You want to use a mobile authenticator, but don&#8217;t want to be locked out if you ever lose your phone (just set up a new token with your registered token secret).</li>
<li>You want to use multiple mechanisms to log in &#8211; maybe you need token authentication in a script, or you want to have the same authenticator values on multiple mobile phones.</li>
<li>You already have a token secret from another source and want to use it on your mobile phone.</li>
</ul>
<p>Obviously, these won&#8217;t apply to most people, but some folks will definitely find it useful.</p>
<hr style="clear: both;" />
<p><a href="http://i.imgur.com/NbAGQ.png"><img src="http://i.imgur.com/NbAGQ.png" align="right" style="margin: 1em; width:200px;" /></a><br />
Using it:</p>
<ol>
<li>Menu -> Add Account</li>
<li>Enter a name for this token/account. It can be whatever you&#8217;d like.</li>
<li>Enter an existing serial + secret, use the one already provided, or generate a new one.</li>
<li>Save the token. You&#8217;ll notice that auth codes start generating right away.</li>
<li>It is highly recommended that you back up your token secret. If you uninstall the app, wipe your phone, etc., then you will lose the secret, and consequently lose the ability to generate auth codes. To back up a secret, click into the token&#8217;s details, and long press on the secret to copy it. You can then paste it into a note or email or whatnot. To restore a token, simply create a new token and use your backed-up secret. It will generate compatible auth codes.</li>
</ol>
<p>All that said, <span style="color: #ff0000;">a word of caution</span>: <b>Never ever ever run authenticator software on the same machine that you&#8217;re logging in on.</b> It&#8217;s bad, it&#8217;s dumb, and you shouldn&#8217;t do it. Keep your authentication token generation on a separate device if you value your account.</p>
<p>If <a href="http://www.wowwiki.com/Battle.net_Mobile_Authenticator#Desktop_port">any particular same-machine authentication scheme</a> gained any measure of popularity, it would be targeted by malware and your authenticator would be useless. Don&#8217;t do it.</p>
<p>Other than that, enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/06/10/flexauth-portable-authentication-for-battle-net/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Serving files out of GridFS, part 2</title>
		<link>http://www.coffeepowered.net/2010/02/24/serving-files-out-of-gridfs-part-2/</link>
		<comments>http://www.coffeepowered.net/2010/02/24/serving-files-out-of-gridfs-part-2/#comments</comments>
		<pubDate>Wed, 24 Feb 2010 11:44:24 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=244</guid>
		<description><![CDATA[Since my initial experiments with GridFS and nginx-gridfs, I discovered a rather downer of a dealbreaker: compiling Passenger and nginx-gridfs into the same nginx binary makes nginx very unhappy. It hard-freezes (as in, blocks forever) when you request a GridFS file with Passenger enabled. Oops. So, I sat down and fixed gridfs-fuse. You can grab [...]]]></description>
			<content:encoded><![CDATA[<p>Since my initial experiments with <a href="http://www.mongodb.org/display/DOCS/GridFS+Specification">GridFS</a> and <a href="http://github.com/mdirolf/nginx-gridfs">nginx-gridfs</a>, I&#8217;ve discovered a real downer of a dealbreaker: compiling <a href="http://www.modrails.com/">Passenger</a> and nginx-gridfs into the same <a href="http://nginx.org/">nginx</a> binary makes nginx very unhappy. It hard-freezes (as in, blocks forever) when you request a GridFS file with Passenger enabled. Oops.</p>
<p>So, I sat down and fixed gridfs-fuse. You can grab <a href="http://github.com/cheald/gridfs-fuse">my branch at GitHub</a>. I made a few changes that make it ideal for serving files out of a GridFS DB, with a few caveats.<br />
<span id="more-244"></span></p>
<h2>Installation and Configuration</h2>
<p>Building it is relatively simple.</p>
<ol>
<li>Install SCons, the Python-based build tool that reads the project&#8217;s SConstruct file (on Fedora/CentOS/RHEL, <code>yum install scons</code>)</li>
<li>Extract or symlink a copy of your <a href="http://www.mongodb.org/display/DOCS/Home">mongodb</a> install to <code>/opt/mongo</code></li>
<li>Run <code>scons</code></li>
<li>If all builds well, yay. If not, fix any missing dependencies or path issues. Edit SConstruct to change any paths that you need to.</li>
<li>Create a mount point for your GridFS filesystem; I used /mnt/gridfs (<code>sudo mkdir /mnt/gridfs</code>)</li>
<li>chown your mount point to your webserver&#8217;s user. If you run Apache, this is probably <code>apache</code>. If you run nginx, it&#8217;s probably <code>nobody</code>. (<code>sudo chown nobody:nobody /mnt/gridfs</code>)</li>
<li>Mount the database to the mount point.
<pre class="brush: bash;">
sudo -u nobody ./mount_gridfs --db=your_database --host=localhost /mnt/gridfs
</pre>
<p>Change the user and db parameters as required.</p>
</li>
<li>Configure your webserver to serve files appropriately. In my case, I have <a href="http://github.com/jnicklas/carrierwave">carrierwave</a> set up to write files to <code>uploads/model/_id/filename.png</code>, and carrierwave is configured to use <code>/images/gfs</code> as my base URL. This means that for a given file, I might end up with a path like <code>/images/gfs/uploads/user/avatar/4b8475cc69e0dc57e7000005/thumb_untitled-20.png</code>. To cause the GridFS files to be served off of the mount point, I just symlinked the mount to /images/gfs.
<pre class="brush: bash;">
cd public/images
ln -s /mnt/gridfs gfs
</pre>
</li>
</ol>
<p>Once that&#8217;s all set up, you should be able to use your webserver to serve images directly out of your Mongo database, and at pretty fair rates, too!</p>
<h2>143% Unscientific Benchmarks</h2>
<pre class="brush: plain;">
[chris@polaris gridfs-fuse]# ab -n 5000 -c 25 http://advice:81/images/gfs/uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png

Server Software:        nginx/0.8.33
Server Hostname:        advice
Server Port:            81

Document Path:          /images/gfs/uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png
Document Length:        14332 bytes

Concurrency Level:      25
Time taken for tests:   5.029 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      72725000 bytes
HTML transferred:       71660000 bytes
Requests per second:    994.22 [#/sec] (mean)
Time per request:       25.145 [ms] (mean)
Time per request:       1.006 [ms] (mean, across all concurrent requests)
Transfer rate:          14121.93 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    16   25   1.4     25      52
Waiting:        2   24   1.4     24      52
Total:         17   25   1.4     25      53

Percentage of the requests served within a certain time (ms)
  50%     25
  66%     25
  75%     25
  80%     25
  90%     25
  95%     26
  98%     27
  99%     32
 100%     53 (longest request)
</pre>
<h2>Caveats</h2>
<p>To get this working, I had to hack in directory support. GridFS stores files with paths, but doesn&#8217;t store them in a hierarchy; FUSE navigates a filesystem, which is hierarchical. To overcome this, I made gridfs-fuse respond to directory requests as valid. For a given file, gridfs-fuse will walk the following path hierarchy:</p>
<p><code>GET /uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png</code><br />
Check for <code>uploads</code>, directory exists<br />
Check for <code>uploads/user</code>, directory exists<br />
Check for <code>uploads/user/avatar</code>, directory exists<br />
Check for <code>uploads/user/avatar/4b8347a698db740b30000057</code>, directory exists<br />
Check for <code>uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png</code>, file exists, return file.</p>
<p>There are two things to be aware of here:</p>
<ol>
<li>The deeper your path hierarchy, the more steps gridfs-fuse will take to find your file. Less directory nesting means faster file serving. The performance difference won&#8217;t be massive, but it&#8217;s there.</li>
<li><strong>/!\ Big giant hack. /!\</strong> <em>gridfs-fuse assumes that any path part with a period in it is the path leaf</em>. This is done so that we don&#8217;t have to keep querying the DB with regexes, which degrades performance by about 90% in my testing. Always, always, always make sure your filenames have a period in them, and make sure your directories do not have a period in them. This is a rather hefty set of caveats, but if you&#8217;ll stick to them, you will be rewarded with easy GridFS file serving.</li>
</ol>
<h3>What happens if I don&#8217;t follow those rules?</h3>
<p>A few things happen. If you put periods in directory names, you&#8217;ll get 404s. They&#8217;ll be fast 404s, but they&#8217;ll be 404s. Even if a filepath is valid, like <code>/images/foo.bar/baz/bin.png</code>, gridfs-fuse will short-circuit at <code>images/foo.bar</code>, assuming that is the leaf of the hierarchy.</p>
<p>If you don&#8217;t put a period in your filenames, then gridfs-fuse will keep returning &#8220;yup, that&#8217;s a directory&#8221;, even when your webserver requests <code>/images/foo.bar/baz/bin.png/index.html</code> and then <code>/images/foo.bar/baz/bin.png/index.html/index.html</code> and then <code>/images/foo.bar/baz/bin.png/index.html/index.html/index.html</code>, and so forth. There&#8217;s a built-in stop at 10 levels deep &#8211; at 10 levels, gridfs-fuse gives up and just returns a 404, but it&#8217;ll take you a relatively long time to get there, and it&#8217;s really very highly recommended that you don&#8217;t do that.</p>
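<p>The period rule can be sketched in a few lines of Ruby. This is only an illustration of the behaviour described above, not gridfs-fuse&#8217;s actual code: the <code>classify</code> helper, its return symbols, and the exact way depth is counted are assumptions made for the example.</p>
<pre class="brush: ruby;">
MAX_DEPTH = 10 # assumed to mirror the 10-level stop described above

# Hypothetical illustration of the lookup rule, not gridfs-fuse's actual code.
# Returns :file_lookup, :directory, or :not_found for a request path.
def classify(path)
  parts = path.split("/").reject { |part| part.empty? }

  parts.each_with_index do |part, depth|
    # depth is zero-based, so this gives up on the 10th path part.
    return :not_found if depth + 1 >= MAX_DEPTH

    # Parts without a period are answered as directories; keep walking.
    next unless part.include?(".")

    # The first part with a period is assumed to be the leaf. If more parts
    # follow it, the walk short-circuits with a 404.
    return (depth == parts.length - 1 ? :file_lookup : :not_found)
  end

  # No period anywhere: every part was treated as a directory.
  :directory
end

puts classify("/uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png")  # file_lookup
puts classify("/images/foo.bar/baz/bin.png")                                         # not_found
puts classify("/uploads/user/avatar")                                                # directory
</pre>
<p>Only the <code>:file_lookup</code> case would need to touch the database, which is the point of the hack.</p>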
<h2>What about when gridfs-fuse isn&#8217;t running?</h2>
<p>Never fear, that&#8217;s easily fixed. Just use a Rack or Rails Metal middleware to serve images from GridFS. This is <strong>massively</strong> slower than serving files through gridfs-fuse, but at least your visitors won&#8217;t be treated to a site full of broken images if your mount point goes away for whatever reason. I&#8217;m using the following Metal endpoint. Just throw it into <code>app/metal/gridfs.rb</code>, add <code>config.metals = ["Gridfs"]</code> into your environment.rb, and you&#8217;re off to the races.</p>
<pre class="brush: ruby;">
# rails metal to be used with carrierwave (gridfs) and MongoMapper

require 'mongo'
require 'mongo/gridfs'

# Allow the metal piece to run in isolation
require(File.dirname(__FILE__) + &quot;/../../config/environment&quot;) unless defined?(Rails)

class Gridfs
  def self.call(env)
    if env[&quot;PATH_INFO&quot;] =~ /^\/images\/gfs\/(.+)$/
      key = $1
      if ::GridFS::GridStore.exist?(MongoMapper.database, key)
        ::GridFS::GridStore.open(MongoMapper.database, key, 'r') do |file|
          [200, {'Content-Type' =&gt; file.content_type}, [file.read]]
        end
      else
        [404, {'Content-Type' =&gt; 'text/plain'}, ['File not found.']]
      end
    else
      [404, {'Content-Type' =&gt; 'text/plain'}, ['File not found.']]
    end
  end
end
</pre>
<p>(I didn&#8217;t write that, but I can&#8217;t find the source to give credit at the moment).</p>
<p>That gives you a highly performant front-end solution with a reliable fallback. For any given request, the following should happen:</p>
<ol>
<li>Your webserver attempts to load the file from the gridfs-fuse mount. If it can&#8217;t be found (likely due to a missing mountpoint), then&#8230;</li>
<li>The request will fall through to your Metal handler, which will attempt to read the file directly from GridFS.</li>
<li>If it still can&#8217;t be found, the request falls through to your Rails app.</li>
</ol>
<p>To prevent step 3 from happening, you might want to change line 18 of the Metal handler (the 404 returned when the file isn&#8217;t in GridFS) to return a 200 and read out a generic &#8220;missing image&#8221; image of some sort. That&#8217;ll prevent 404s from invoking a hit to your app.</p>
<p>Stick a CDN in front of it all, and you have a high-performance file upload solution with automatic replication and sharding that you can treat like any other piece of web data. Hooray!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/02/24/serving-files-out-of-gridfs-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Serving files out of GridFS</title>
		<link>http://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/</link>
		<comments>http://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/#comments</comments>
		<pubDate>Wed, 17 Feb 2010 20:54:11 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=233</guid>
		<description><![CDATA[GridFS is a nifty little feature in MongoDB that allows you to store files of all shapes and sizes in Mongo itself, getting the benefits of Mongo&#8217;s sharding and replication. However, since they&#8217;re in a database, and not on the filesystem directly, how do we serve them? There are lots of benchmarks and numbers under [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.mongodb.org/display/DOCS/GridFS+Specification">GridFS</a> is a nifty little feature in <a href="http://www.mongodb.org/display/DOCS/Home">MongoDB</a> that allows you to store files of all shapes and sizes in Mongo itself, getting the benefits of Mongo&#8217;s sharding and replication. However, since they&#8217;re in a database, and not on the filesystem directly, how do we serve them?</p>
<p>There are lots of benchmarks and numbers under the cut. Keep reading!</p>
<p><span id="more-233"></span></p>
<p>Right now, there are three options:</p>
<ol>
<li>Use a &#8220;low-level&#8221; script handler, like a Rack script or Rails Metal handler to serve them out of the database</li>
<li>Use something like <a href="http://github.com/mikejs/gridfs-fuse/">gridfs-fuse</a> to mount the database as a filesystem, and let your webserver read the files from it directly</li>
<li>Use something like <a href="http://github.com/mdirolf/nginx-gridfs">nginx-gridfs</a> to talk directly to MongoDB from your webserver.</li>
</ol>
<p>I wasn&#8217;t able to get gridfs-fuse to build on my system, but I was able to build the nginx module. The question, of course, is how fast are you going to be serving files with each solution?</p>
<h2>Filesystem read through Apache</h2>
<p>First, I&#8217;ll establish a baseline. I&#8217;m running Apache as my frontend server, and we&#8217;ll use ab to benchmark its throughput.</p>
<pre class="brush: ruby;">[chris@polaris conf]# ab -n 5000 -c 10 http://advice/images/embed/normal_alliance-60.png

Server Software:        Apache/2.2.13
Server Hostname:        advice
Server Port:            80

Document Path:          /images/embed/normal_alliance-60.png
Document Length:        31596 bytes

Concurrency Level:      10
Time taken for tests:   1.904 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      159463760 bytes
HTML transferred:       158043192 bytes
Requests per second:    2625.37 [#/sec] (mean)
Time per request:       3.809 [ms] (mean)
Time per request:       0.381 [ms] (mean, across all concurrent requests)
Transfer rate:          81767.87 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   0.4      1       4
Processing:     1    3   0.5      3       6
Waiting:        0    1   0.4      1       4
Total:          2    4   0.4      4       8

Percentage of the requests served within a certain time (ms)
  50%      4
  66%      4
  75%      4
  80%      4
  90%      4
  95%      4
  98%      5
  99%      5
 100%      8 (longest request)
</pre>
<p>Nice and fast, like we&#8217;d expect.</p>
<h2>Filesystem read through nginx</h2>
<pre class="brush: ruby;">[chris@polaris conf]# ab -n 50000 -c 10 http://advice:81/images/embed/normal_alliance-60.png

Server Software:        nginx/0.8.33
Server Hostname:        advice
Server Port:            81

Document Path:          /images/embed/normal_alliance-60.png
Document Length:        31596 bytes

Concurrency Level:      10
Time taken for tests:   7.623 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Total transferred:      1590513618 bytes
HTML transferred:       1579863192 bytes
Requests per second:    6559.31 [#/sec] (mean)
Time per request:       1.525 [ms] (mean)
Time per request:       0.152 [ms] (mean, across all concurrent requests)
Transfer rate:          203763.10 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       9
Processing:     1    1   0.4      1      11
Waiting:        0    0   0.1      0       9
Total:          1    1   0.5      1      12

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      2
  90%      2
  95%      2
  98%      3
  99%      3
 100%     12 (longest request)
</pre>
<p>nginx <i>screams</i>. At 6500 requests/sec, it&#8217;s blisteringly fast.</p>
<h2>GridFS read through nginx-gridfs</h2>
<pre class="brush: ruby;">[chris@polaris conf]# ab -n 5000 -c 10 http://advice:81/images/gfs/uploads/user/avatar/4b7b2c0e98db7475fc000003/normal_alliance-60.png

Server Software:        nginx/0.8.33
Server Hostname:        advice
Server Port:            81

Document Path:          /images/gfs/uploads/user/avatar/4b7b2c0e98db7475fc000003/normal_alliance-60.png
Document Length:        31596 bytes

Concurrency Level:      10
Time taken for tests:   4.613 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      158580000 bytes
HTML transferred:       157980000 bytes
Requests per second:    1083.88 [#/sec] (mean)
Time per request:       9.226 [ms] (mean)
Time per request:       0.923 [ms] (mean, across all concurrent requests)
Transfer rate:          33570.65 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     1    9   4.7      9     103
Waiting:        1    9   4.7      9     102
Total:          2    9   4.7      9     103

Percentage of the requests served within a certain time (ms)
  50%      9
  66%      9
  75%      9
  80%      9
  90%      9
  95%      9
  98%      9
  99%     11
 100%    103 (longest request)
</pre>
<p>Definitely a lot slower, but still very respectable. Roughly 1,080 requests/sec is going to be more than adequate for most purposes, particularly if fronted with a CDN.</p>
<p>And finally&#8230;</p>
<h2>Rails Metal handler</h2>
<p>The nice thing about the Rails metal handler solution is that it&#8217;s easy. No recompiling, just drop the handler into your project and you&#8217;re off to the races. That said&#8230;</p>
<pre class="brush: ruby;">[chris@polaris nginx-gridfs]$ ab -n 250 -c 4  http://advice/images/gfs/uploads/user/avatar/4b7b2c0e98db7475fc000003/normal_alliance-60.png

Server Software:        Apache/2.2.13
Server Hostname:        advice
Server Port:            80

Document Path:          /images/gfs/uploads/user/avatar/4b7b2c0e98db7475fc000003/normal_alliance-60.png
Document Length:        31596 bytes

Concurrency Level:      4
Time taken for tests:   4.646 seconds
Complete requests:      250
Failed requests:        0
Write errors:           0
Total transferred:      7960000 bytes
HTML transferred:       7899000 bytes
Requests per second:    53.81 [#/sec] (mean)
Time per request:       74.338 [ms] (mean)
Time per request:       18.585 [ms] (mean, across all concurrent requests)
Transfer rate:          1673.10 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    15   74  75.6     34     287
Waiting:        0   72  75.8     30     276
Total:         15   74  75.6     34     288

Percentage of the requests served within a certain time (ms)
  50%     34
  66%     39
  75%    139
  80%    192
  90%    201
  95%    210
  98%    239
  99%    245
 100%    288 (longest request)
</pre>
<p>I ran far fewer requests this go-round, and the reason is pretty obvious &#8211; running 5000 requests through the Ruby stack would have taken approximately <em>forever</em>. At 53 requests per second, this is not an attractive solution, particularly if you consider the CPU overhead that it&#8217;s incurring.</p>
<h2>Conclusions</h2>
<table class='data' border='1'>
<tr>
<th>Solution</th>
<th>Requests/second</th>
<th>% of Apache FS</th>
<th>% of Nginx FS</th>
<th>% of Nginx GridFS</th>
<th>% of Apache Ruby</th>
</tr>
<tr>
<td>Filesystem via Apache</td>
<td>2625.37</td>
<td>-</td>
<td>40.03%</td>
<td>242.22%</td>
<td>4,878.96%</td>
</tr>
<tr>
<td>Filesystem via Nginx</td>
<td>6559.31</td>
<td>249.84%</td>
<td>-</td>
<td>605.17%</td>
<td>12,189.76%</td>
</tr>
<tr>
<td>GridFS via nginx module</td>
<td>1083.88</td>
<td>41.28%</td>
<td>16.52%</td>
<td>-</td>
<td>2,014.27%</td>
</tr>
<tr>
<td>Rails metal handler via Passenger</td>
<td>53.81</td>
<td>2.05%</td>
<td>0.82%</td>
<td>4.96%</td>
<td>-</td>
</tr>
</table>
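<p>For reference, each percentage in the table is the row&#8217;s requests/second divided by the column&#8217;s requests/second. A minimal Ruby sketch of that arithmetic follows; the <code>RATES</code> hash and the <code>relative_percent</code> helper are names made up for this example, and the numbers are copied from the ab runs above.</p>
<pre class="brush: ruby;">
# Requests per second, copied from the ab output above.
RATES = {
  "Filesystem via Apache"             => 2625.37,
  "Filesystem via Nginx"              => 6559.31,
  "GridFS via nginx module"           => 1083.88,
  "Rails metal handler via Passenger" => 53.81
}

# Hypothetical helper: express one throughput as a percentage of another.
def relative_percent(row_rate, column_rate)
  format("%.2f%%", row_rate / column_rate * 100)
end

apache = RATES["Filesystem via Apache"]
nginx  = RATES["Filesystem via Nginx"]
gridfs = RATES["GridFS via nginx module"]

puts relative_percent(apache, nginx)   # 40.03%
puts relative_percent(gridfs, apache)  # 41.28%
</pre>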
<p>If you&#8217;re looking to abstract away from storing files on a filesystem, GridFS is a feasible solution. It can really crank out some mean numbers, and though it&#8217;s not up to par with a raw filesystem read, consider that in many production environments such a read might be happening via an NFS or GFS share, which is going to massively degrade the performance of that request. Given the no-hassle store-and-forget-about-it solution that GridFS offers, even when faced with the challenge of multi-server replication, it seems that you can get enough performance out of it to justify it as a solution.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>counter_cache for MongoMapper</title>
		<link>http://www.coffeepowered.net/2010/02/15/counter_cache-for-mongomapper/</link>
		<comments>http://www.coffeepowered.net/2010/02/15/counter_cache-for-mongomapper/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 02:30:48 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=229</guid>
		<description><![CDATA[I&#8217;ve started playing with MongoMapper, and it&#8217;s quite excellent, but it does suffer very much from being young. There are lots of pieces missing that veterans of ActiveRecord will take for granted. I&#8217;ve been working around or patching them, for the most part, but I felt that my solution to `:counter_cache` deserved a post. In [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve started playing with <a href="http://github.com/jnunemaker/mongomapper">MongoMapper</a>, and it&#8217;s quite excellent, but it does suffer very much from being young. There are lots of pieces missing that veterans of ActiveRecord will take for granted. I&#8217;ve been working around or patching them, for the most part, but I felt that my solution to <code>:counter_cache</code> deserved a post.</p>
<p>In short, I didn&#8217;t want to hack around with the MongoMapper associations code, so I just implemented my own little ride-along version.</p>
<pre class="brush: ruby;">
module SecretProject
  module CounterCache
    module ClassMethods
      def counter_cache(field)
        class_eval &lt;&lt;-EOF
          after_create &quot;increment_counter_for_#{field}&quot;
          after_destroy &quot;decrement_counter_for_#{field}&quot;
        EOF
      end
    end

    module InstanceMethods
      def method_missing(method, *args)
        if matches = method.to_s.match(/^(in|de)crement_counter_for_(.*)$/) then
          dir = matches[1] == &quot;in&quot; ? 1 : -1
          parent_association = matches[2]
          if parent = self.send(parent_association) then
            name = &quot;#{self.class.to_s.tableize}_count&quot;
            if parent.respond_to?(name)
              parent.collection.update({:_id =&gt; parent._id}, {&quot;$inc&quot; =&gt; {name =&gt; dir}})
            end
          end
        else
          super
        end
      end
    end

    def self.included(receiver)
      receiver.extend         ClassMethods
      receiver.send :include, InstanceMethods
    end
  end
end
</pre>
<p>Throw that into your lib directory, load it with an initializer, and then you can use it something like so:</p>
<pre class="brush: ruby;">
class Foo
  include MongoMapper::Document
  include SecretProject::CounterCache

  belongs_to :user
  counter_cache :user  # Will cause a foos_count field on the owning user to be maintained when a Foo is created or deleted.
end
</pre>
<p>This&#8217;ll only increment a counter if you&#8217;ve defined one on your parent object, via <code>key :foos_count, Integer</code> or similar, just so that it doesn&#8217;t go around updating every model you might associate it with.</p>
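<p>To make the dispatch a little more concrete, here is a small plain-Ruby sketch of how a callback name is turned into an association and a <code>$inc</code> delta. The <code>parse_counter_callback</code> helper is made up for this example; it only mirrors the regex used in <code>method_missing</code> above and does not touch MongoMapper.</p>
<pre class="brush: ruby;">
# Hypothetical helper for illustration; it reuses the regex from method_missing above.
# Returns nil for method names that are not counter callbacks.
def parse_counter_callback(method_name)
  matches = method_name.to_s.match(/^(in|de)crement_counter_for_(.*)$/)
  return nil unless matches

  # "in" means the record was created (+1), "de" means it was destroyed (-1).
  { :association => matches[2], :delta => (matches[1] == "in" ? 1 : -1) }
end

created   = parse_counter_callback(:increment_counter_for_user)
destroyed = parse_counter_callback(:decrement_counter_for_user)

puts created[:association]  # user
puts created[:delta]        # 1
puts destroyed[:delta]      # -1
</pre>
<p>Anything that doesn&#8217;t match the pattern falls through to <code>super</code> in the real module, so ordinary missing methods still raise as usual.</p>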
<p>Yay.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/02/15/counter_cache-for-mongomapper/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Safe action caching with Memcached</title>
		<link>http://www.coffeepowered.net/2010/02/10/safe-action-caching-with-memcached/</link>
		<comments>http://www.coffeepowered.net/2010/02/10/safe-action-caching-with-memcached/#comments</comments>
		<pubDate>Thu, 11 Feb 2010 04:04:04 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=223</guid>
		<description><![CDATA[I&#8217;ve started using action caching more aggressively, to handle a large volume of not-signed-in search traffic. It composes a significant chunk of my site&#8217;s total traffic, but there&#8217;s no good reason to be recomputing full pages for all those long-tail hits. So, the obvious thing is to just implement a quick action cache. # Controller [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve started using action caching more aggressively, to handle a large volume of not-signed-in search traffic. It composes a significant chunk of my site&#8217;s total traffic, but there&#8217;s no good reason to be recomputing full pages for all those long-tail hits. So, the obvious thing is to just implement a quick action cache.</p>
<pre class="brush: ruby;">
# Controller
caches_action :show, :unless =&gt; :user?, :expires_in =&gt; 24.hours
</pre>
<pre class="brush: ruby;">
# Sweeper
expire_action :controller =&gt; &quot;nodes&quot;, :action =&gt; &quot;show&quot;, :id =&gt; record.to_param
</pre>
<p>This all works dandy, but I generate pretty URLs, which means sometimes there are characters in the URL that Memcached doesn&#8217;t like. A few minutes after deploying my patch, I started getting IMs from my logger bot telling me things were unhappy.</p>
<pre class="brush: ruby;">
blippr. com: [#1265856785] ArgumentError: illegal character in key &quot;views/m.blippr.com/apps/346562-PicFo g.mobile&quot;
blippr. com: [#1265857710] ArgumentError: illegal character in key &quot;views/www.blippr.com/apps/336714-µTorrent  &quot;
blippr. com: [#1265857897] ArgumentError: illegal character in key &quot;views/www.blippr.com/apps/337076-ustre am&quot;
blippr. com: [#1265857924] ArgumentError: illegal character in key &quot;views/www.blippr.com/apps/336714-µTorrent  &quot;
</pre>
<p>That&#8217;s memcached complaining about the hash keys we&#8217;re giving to it. This just won&#8217;t do. We could just regex out &#8220;bad&#8221; characters, but that means potential collisions, and potentially leaves edge cases. Why not just hash it instead?</p>
<p>A quick monkey patch later:</p>
<pre class="brush: ruby;">
class ActionController::Caching::Actions::ActionCachePath
	def path
		@cached_path ||= Digest::SHA1.hexdigest(@path)
	end
end
</pre>
<p>And we&#8217;re all dandy. Now, rather than caching by path, the path is hashed, and the hash is used as the cache key. Since a SHA1 hex digest is always 40 hexadecimal characters, we know that it&#8217;ll never make memcached unhappy.</p>
<pre class="brush: ruby;">
Path is blippr.com/movies/6696-The-Silence-of-the-Lambs...
Cached fragment hit: views/9111cdefca4a52cb0e3a5ebac4f618127a30efd0 (1.1ms)
</pre>
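<p>As a standalone illustration of why the digest is always a safe key, here is a small sketch in plain Ruby. <code>safe_cache_key</code> is a made-up helper for this example, not something Rails provides, and the path is one of the offenders from the log above. Memcached keys can&#8217;t contain whitespace or control characters, which is what the trailing spaces were tripping over.</p>
<pre class="brush: ruby;">
require 'digest/sha1'

# Hypothetical helper: hash any path into 'views/' plus a 40-character hex digest.
def safe_cache_key(path)
  'views/' + Digest::SHA1.hexdigest(path)
end

key = safe_cache_key('www.blippr.com/apps/336714-µTorrent  ')
puts key
puts key.length  # 'views/' (6 characters) + 40 hex characters = 46
</pre>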
<p>There is an argument for not using this technique if you&#8217;re using file-based caching, since it means your cached bits won&#8217;t be segregated into directories, but memcached doesn&#8217;t support expiry by regex anyhow, so there&#8217;s no good reason to not use it in this case.</p>
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/02/10/safe-action-caching-with-memcached/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Multibyte string slicing for fun and profit</title>
		<link>http://www.coffeepowered.net/2009/12/06/multibyte-string-slicing-for-fun-and-profit/</link>
		<comments>http://www.coffeepowered.net/2009/12/06/multibyte-string-slicing-for-fun-and-profit/#comments</comments>
		<pubDate>Sun, 06 Dec 2009 20:53:33 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=200</guid>
		<description><![CDATA[Ran into a small issue in one of my user models. I was using a helper to display a user&#8217;s first name, last initial. It looked something like this: def display_name(user) &#34;user.first_name #{user.last_name.slice(0,1)}&#34; end Seems innocent enough, sure. Except&#8230;it doesn&#8217;t work in multibyte character sets. The first Cyrillic speaker to sign up blew that all [...]]]></description>
			<content:encoded><![CDATA[<p>Ran into a small issue in one of my user models. I was using a helper to display a user&#8217;s first name, last initial. It looked something like this:</p>
<pre class="brush: ruby;">
def display_name(user)
  &quot;#{user.first_name} #{user.last_name.slice(0,1)}&quot;
end
</pre>
<p>Seems innocent enough, sure. Except&#8230;it doesn&#8217;t work in multibyte character sets. The first Cyrillic speaker to sign up blew that all up. When parsing an XML fragment with a name like this included, I was getting the following error: </p>
<pre class="brush: ruby;">
ActionView::TemplateError: premature end of regular expression: /^\s*Елена\ �/

nokogiri (1.4.0) lib/nokogiri/xml/fragment_handler.rb:53:in `characters'</pre>
<p>The issue, as it turned out, is that String#slice is a bytewise operation in Ruby 1.8, not a character-wise operation like I&#8217;d so naively assumed. It&#8217;s pretty easy to observe:</p>
<pre class="brush: ruby;">&gt;&gt; &quot;Журинова&quot;.slice(0, 1)
=&gt; &quot;\320&quot;</pre>
<p>Fortunately, Rails has multibyte support baked in already, so it&#8217;s an easy mistake to correct:</p>
<pre class="brush: ruby;">
def display_name(user)
  &quot;#{user.first_name} #{user.last_name.chars.first}&quot;
end
</pre>
<p>And now&#8230;</p>
<pre class="brush: ruby;">&gt;&gt; &quot;Журинова&quot;.chars.first
=&gt; &quot;Ж&quot;</pre>
<p>It&#8217;s very easy to make mistakes like this, and many times you may not even realize that they&#8217;re made unless you try to do something funny, like using it as a part of a regex. The safe operation is to never use String#slice or string subscripting on user data, but to instead treat all strings as multibyte strings. Very subtle, but the effects can be pretty nasty if you don&#8217;t.</p>
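<p>For comparison, Ruby 1.9 and later attach an encoding to every string, and <code>String#slice</code> works on characters there. The sketch below assumes a UTF-8 source file and uses <code>byteslice</code> to reproduce the old bytewise behaviour next to the character-wise one:</p>
<pre class="brush: ruby;">
name = 'Журинова'

first_byte = name.byteslice(0, 1)  # one raw byte, like slice(0, 1) on Ruby 1.8
first_char = name.chars.first      # one whole character

puts name.bytesize               # 16: each Cyrillic letter takes two bytes in UTF-8
puts name.length                 # 8
puts first_byte.valid_encoding?  # false: half of a two-byte character
puts first_char                  # Ж
</pre>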
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2009/12/06/multibyte-string-slicing-for-fun-and-profit/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>System date considered important</title>
		<link>http://www.coffeepowered.net/2009/12/05/system-date-considered-important/</link>
		<comments>http://www.coffeepowered.net/2009/12/05/system-date-considered-important/#comments</comments>
		<pubDate>Sat, 05 Dec 2009 21:38:10 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=193</guid>
		<description><![CDATA[I&#8217;ve been slamming my head against the wall for the past two hours. I had an OAuth connection to a remote service working just dandy in development, but as soon as I tried to use that exact same code with the exact same config and exact same gems in production&#8230;I was getting &#8220;401 unauthorized&#8221; errors [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been slamming my head against the wall for the past two hours. I had an OAuth connection to a remote service working just dandy in development, but as soon as I tried to use that exact same code with the exact same config and exact same gems in production&#8230;I was getting &#8220;401 unauthorized&#8221; errors back from the remote service when attempting to get a request token.</p>
<p>After an extremely tedious series of debugger checks to make sure my OAuth signature was right, I decided to just edit the oauth gem on my production box and add a little debugging statement to dump the HTTP request to stdout. What I found was&#8230;surprising.</p>
<pre class="brush: ruby;">
&gt;&gt; OAuthConsumers::Netflix.new.consumer.get_request_token
opening connection to api.netflix.com...
opened
&lt;- &quot;POST /oauth/request_token HTTP/1.1\r\nAccept: */*\r\nConnection: close\r\nUser-Agent: OAuth gem v0.3.6\r\nAuthorization: OAuth oauth_nonce=\&quot;E73sq4XMkG547EbuCB9GUfG4AtsjD2QFySwLPKj0tI\&quot;, oauth_callback=\&quot;oob\&quot;, oauth_signature_method=\&quot;HMAC-SHA1\&quot;, oauth_timestamp=\&quot;1260049119\&quot;, oauth_consumer_key=\&quot;xxxxxxxxxxxxx\&quot;, oauth_signature=\&quot;QD5b5Oy8LFLvXWl%2B3R%2BQI0xlIcg%3D\&quot;, oauth_version=\&quot;1.0\&quot;\r\nContent-Length: 0\r\nHost: api.netflix.com\r\n\r\n&quot;
-&gt; &quot;HTTP/1.1 401 Unauthorized\r\n&quot;
-&gt; &quot;X-Lighty-Magnet-Uri-Path: /oauth/request_token\r\n&quot;
-&gt; &quot;X-Mashery-Responder: proxyworker-i-e23bae8a.mashery.com\r\n&quot;
-&gt; &quot;X-Mashery-Error-Code: ERR_401_TIMESTAMP_IS_INVALID\r\n&quot;
-&gt; &quot;Content-Type: text/plain\r\n&quot;
-&gt; &quot;Accept-Ranges: bytes\r\n&quot;
-&gt; &quot;Content-Length: 20\r\n&quot;
-&gt; &quot;Date: Sat, 05 Dec 2009 21:27:08 GMT\r\n&quot;
-&gt; &quot;Server: Mashery Proxy\r\n&quot;
-&gt; &quot;\r\n&quot;
reading 20 bytes...
-&gt; &quot;Timestamp Is Invalid&quot;
read 20 bytes
Conn close
OAuth::Unauthorized: 401 Unauthorized
        from /opt/ruby-enterprise-1.8.7-2009.10/lib/ruby/gems/1.8/gems/oauth-0.3.6/lib/oauth/consumer.rb:200:in `token_request'
        from /opt/ruby-enterprise-1.8.7-2009.10/lib/ruby/gems/1.8/gems/oauth-0.3.6/lib/oauth/consumer.rb:128:in `get_request_token'
        from (irb):1
</pre>
<p>Whoa there, there&#8217;s some info that the OAuth gem wasn&#8217;t giving back to me. &#8220;Timestamp is invalid.&#8221; Well then, a quick check of system time, and&#8230;oh, hey, it turns out that my system has drifted to about 10 minutes fast. Easily corrected, at least.</p>
<pre class="brush: ruby;"># ntpdate -b 0.centos.pool.ntp.org &amp;&amp; service ntpd start</pre>
<p>With that all done&#8230;</p>
<pre class="brush: ruby;">&gt;&gt; OAuthConsumers::Netflix.new.consumer.get_request_token
opening connection to api.netflix.com...
opened
&lt;- &quot;POST /oauth/request_token HTTP/1.1\r\nAccept: */*\r\nConnection: close\r\nUser-Agent: OAuth gem v0.3.6\r\nAuthorization: OAuth oauth_nonce=\&quot;YIh5R3CBtAicneNREF5ZUcX80kao1zqRLLA5u8bQWA\&quot;, oauth_callback=\&quot;oob\&quot;, oauth_signature_method=\&quot;HMAC-SHA1\&quot;, oauth_timestamp=\&quot;1260048573\&quot;, oauth_consumer_key=\&quot;ksfa9rxmb8dzkxg4npwr74zv\&quot;, oauth_signature=\&quot;%2B%2Fyd5sRsJ7qmmZWNRqSlCvByYxw%3D\&quot;, oauth_version=\&quot;1.0\&quot;\r\nContent-Length: 0\r\nHost: api.netflix.com\r\n\r\n&quot;
-&gt; &quot;HTTP/1.1 200 OK\r\n&quot;
-&gt; &quot;X-Lighty-Magnet-Uri-Path: /oauth/request_token\r\n&quot;
-&gt; &quot;X-Mashery-Responder: proxyworker-i-7c31a414.mashery.com\r\n&quot;
-&gt; &quot;Content-Type: text/plain\r\n&quot;
-&gt; &quot;Server: Mashery_Server_Adapter_Query\r\n&quot;
-&gt; &quot;Date: Sat, 05 Dec 2009 21:29:32 GMT\r\n&quot;
-&gt; &quot;Accept-Ranges: bytes\r\n&quot;
-&gt; &quot;Content-Length: 194\r\n&quot;
-&gt; &quot;\r\n&quot;
reading 194 bytes...
-&gt; &quot;oauth_token=xxxxxxxx&amp;oauth_token_secret=xxxxxxxxx&amp;application_name=xxxxx&amp;login_url=https%3A%2F%2Fapi-user.netflix.com%2Foauth%2Flogin%3Foauth_token%3Dczjsmzw74nk2wy274g6drmwt&quot;
read 194 bytes
Conn close</pre>
<p>All better. Keep those datetimes synched, sports fans. Web services are becoming more and more interconnected, and if there&#8217;s one thing I&#8217;ve learned from heist movies, it&#8217;s that the first step in any successful job is to make sure your watches are synchronized. Nobody likes that guy who shows up 10 minutes late to everything!</p>
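<p>If you want to see the drift without logging into the box, the failed trace already has both clocks in it: our <code>oauth_timestamp</code> and the server&#8217;s <code>Date</code> header. A quick sketch using only the values from that trace:</p>
<pre class="brush: ruby;">
require 'time'

# Both values are copied from the failed request above.
oauth_timestamp = 1260049119
server_date = Time.httpdate('Sat, 05 Dec 2009 21:27:08 GMT')

drift = oauth_timestamp - server_date.to_i
puts "Local clock is #{drift} seconds ahead of the server"  # 691 seconds, about eleven and a half minutes
</pre>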
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2009/12/05/system-date-considered-important/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sweet-ass performance hacks: better_assets</title>
		<link>http://www.coffeepowered.net/2009/07/29/sweet-ass-performance-hacks-better_assets/</link>
		<comments>http://www.coffeepowered.net/2009/07/29/sweet-ass-performance-hacks-better_assets/#comments</comments>
		<pubDate>Wed, 29 Jul 2009 10:29:18 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=183</guid>
		<description><![CDATA[HTTP overhead is expensive. DNS lookups are expensive. Start dropping a bunch of Twitter widgets, Google ads, and GetSatisfaction buttons into your killer new Web 2.0 social networking site and you&#8217;ll find that your painstakingly-optimized site has slowed to a crawl while the server sits there waiting on Amazon S3 to get its act together [...]]]></description>
			<content:encoded><![CDATA[<p>HTTP overhead is expensive. DNS lookups are expensive. Start dropping a bunch of Twitter widgets, Google ads, and GetSatisfaction buttons into your killer new Web 2.0 social networking site and you&#8217;ll find that your painstakingly-optimized site has slowed to a crawl while the server sits there waiting on Amazon S3 to get its act together and serve you a 300-byte CSS file.</p>
<p>That sorta blows. Let&#8217;s not do it.</p>
<h2>Introducing better_assets</h2>
<p><a href="http://github.com/cheald/better_assets/tree/master">better_assets</a> is a monkeypatch to the Rails 2.3.2 AssetTagHelper to enable some additional functionality. The key points are:</p>
<p>* Time-based expiry of cached asset files, which is primarily useful for&#8230;<br />
* Caching and combining of remote assets<br />
* Finally, you can post-process combined assets with blocks passed to <code>javascript_include_tag</code> and <code>stylesheet_link_tag</code>.</p>
<h3>Examples</h3>
<p>It&#8217;s easy. You use it just like normal:</p>
<pre class="brush: ruby;">&lt;%=javascript_include_tag(
  &quot;jquery-1.3.2&quot;,
  &quot;foo&quot;,
  :cache =&gt; &quot;all&quot;) {|text| Packr.pack(text, :base62 =&gt; true) } %&gt;</pre>
<p>Whoa! What is this block madness? Why, that&#8217;s an extension to allow you to do whatever you want. In this example, we&#8217;re using <a href="http://blog.jcoglan.com/2009/02/22/packr-31-improved-compression-and-private-variable-support/">jcoglan&#8217;s Packr library</a> to automatically pack our generated Javascript. This can result in filesize being reduced by pretty massive amounts, and will result in appreciable performance benefits.</p>
<p>Well, that&#8217;s all fine and dandy, but it&#8217;s not my combined Javascript that&#8217;s killing me, it&#8217;s all those pesky DNS lookups for all my widget code and CSS. Never fear, you&#8217;re covered there, too.</p>
<pre class="brush: ruby;">
&lt;%=javascript_include_tag(
  &quot;http://rpxnow.com/openid/v2/widget&quot;,
  &quot;http://partner.googleadservices.com/gampad/google_service.js&quot;,
  &quot;http://s3.amazonaws.com/getsatisfaction.com/feedback/feedback.js&quot;,
  &quot;http://blippr.tags.crwdcntrl.net/cc.js&quot;,
  :cache =&gt; &quot;remote&quot;, :lifetime =&gt; 12.hours) %&gt;
</pre>
<p>Madness! Sheer madness! All those remote Javascript files are sucked down, combined, and cached as &#8220;remote.js&#8221;. It&#8217;ll automatically expire after 12 hours, and be re-cached after that. That way, you can get all the performance benefits of serving a single combined JS file without having to stress out that someone over at WidgetHeadquarters is going to change a piece of code and completely screw you over until you notice that your local Javascript file doesn&#8217;t match theirs six weeks later.</p>
<p>This, oddly enough, works for CSS files, too.</p>
<pre class="brush: ruby;">
&lt;%=stylesheet_link_tag(
  &quot;http://s3.amazonaws.com/getsatisfaction.com/feedback/feedback.css&quot;,
  &quot;http://s3.amazonaws.com/getsatisfaction.com/feedback/widget.css&quot;,
  :cache =&gt; &quot;remote&quot;, :lifetime =&gt; 12.hours
) %&gt;
</pre>
<p>No more stalling out at requests to Amazon&#8217;s S3 for CSS files! No more extraneous DNS requests or HTTP connections! No fuss, no muss, no headaches for you or your users.</p>
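<p>The time-based expiry boils down to a freshness check on the cached file. Here&#8217;s a rough sketch of that idea in plain Ruby. It is not the plugin&#8217;s actual code, and <code>cache_stale?</code> is a name invented for this example:</p>
<pre class="brush: ruby;">
require 'tempfile'

# Hypothetical helper: true when the cached file is missing or older than
# lifetime (in seconds), meaning it should be fetched and combined again.
def cache_stale?(path, lifetime, now = Time.now)
  return true unless File.exist?(path)
  now - File.mtime(path) >= lifetime
end

twelve_hours = 12 * 60 * 60
Tempfile.create('remote.js') do |file|
  puts cache_stale?(file.path, twelve_hours)                        # false: just written
  puts cache_stale?(file.path, twelve_hours, Time.now + 13 * 3600)  # true: 13 hours later
end
</pre>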
<p>All this, and it makes crispy bacon, too.*</p>
<p>To get it, just&#8230;wait for it. Very complex procedure ahead:</p>
<pre class="brush: ruby;">
script/plugin install git://github.com/cheald/better_assets.git
</pre>
<p>Restart your app, and that&#8217;s it. Your assets are now approximately 163% more awesome, while being leaner and looking better in that fabulous summer swimsuit at the same time.</p>
<p>Score.</p>
<p>* Not really.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2009/07/29/sweet-ass-performance-hacks-better_assets/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Fine tuning your garbage collector</title>
		<link>http://www.coffeepowered.net/2009/06/13/fine-tuning-your-garbage-collector/</link>
		<comments>http://www.coffeepowered.net/2009/06/13/fine-tuning-your-garbage-collector/#comments</comments>
		<pubDate>Sun, 14 Jun 2009 02:52:13 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Rails]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=173</guid>
		<description><![CDATA[If you&#8217;re familiar with Ruby at all, you know that it can be a little wacky when it comes to memory usage. Most of us have observed a Mongrel/Passenger instance that starts out small and then grows by leaps and bounds, eventually settling on some uncomfortably high number. We&#8217;re going to fix that with Ruby [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;re familiar with Ruby at all, you know that it can be a little wacky when it comes to memory usage. Most of us have observed a Mongrel/Passenger instance that starts out small and then grows by leaps and bounds, eventually settling on some uncomfortably high number. We&#8217;re going to fix that with <a href="http://www.rubyenterpriseedition.com/">Ruby Enterprise Edition</a> and <a href="http://github.com/cheald/scrap/tree/master">Scrap</a>.<br />
<span id="more-173"></span><br />
The Ruby garbage collector&#8217;s behavior is controlled by a number of constants. In the MRI, these are compiled into Ruby itself, and don&#8217;t change. However, if you&#8217;re using REE you can override them with environment variables on startup. It&#8217;s terribly handy.</p>
<h3>First, the boring documentation</h3>
<p>All the juicy information is available <a href="http://www.rubyenterpriseedition.com/documentation.html#_garbage_collector_performance_tuning">in the documentation</a>, but I&#8217;m going to just go over the key points real quick.</p>
<p><code>RUBY_HEAP_MIN_SLOTS</code>: This is the number of &#8220;heap slots&#8221; that each Ruby instance starts up with. One heap slot can hold one Ruby object. By default, this is 10,000. By controlling this value, we can get our apps to stabilize very quickly. More on this later.</p>
<p><code>RUBY_HEAP_SLOTS_INCREMENT</code>: Once Ruby has allocated <code>RUBY_HEAP_MIN_SLOTS</code> objects on its first heap, it will have to allocate a second heap to make room for more. This variable controls the size of this second heap, and sets the baseline for future heaps, as well.</p>
<p><code>RUBY_HEAP_SLOTS_GROWTH_FACTOR</code>: For heaps #3 and onward, Ruby uses <code>RUBY_HEAP_SLOTS_INCREMENT</code> and this value to determine the size to allocate for the new heap. By default, this is 1.8, meaning that your third heap will end up with 10,000 * 1.8 = 18,000 slots in it.</p>
<p><code>RUBY_HEAP_FREE_MIN</code>: After each garbage collection run, if the number of free slots is less than <code>RUBY_HEAP_FREE_MIN</code>, a new heap will be allocated. The default is 4096.</p>
<p>So, let&#8217;s look at this practically. Presume that we have a Rails process that is going to require 50,000 Ruby objects before it&#8217;s fully initialized. The allocation process, when at defaults, will look something like this:</p>
<p>Allocate 10,000 slots (10,000 total available)<br />
Allocate 10,000 slots (20,000 total available)<br />
Allocate 18,000 slots (38,000 total available)<br />
Allocate 32,400 slots (70,400 total available)</p>
<p>So, we end up with about 41% more slots than we actually needed, and it took us four heap allocations to even boot the process. Surely we can do better.</p>
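<p>To play with these numbers yourself, here&#8217;s a tiny simulation. It assumes each new heap after the second is the previous heap&#8217;s size multiplied by the growth factor, which is how the REE documentation describes it. <code>simulate_heaps</code> and its parameters are names made up for this sketch, not anything in Ruby itself.</p>
<pre class="brush: ruby;">
# Returns the list of heap sizes allocated before the object count fits.
def simulate_heaps(objects, min_slots: 10_000, increment: 10_000, growth: 1.8)
  heaps = [min_slots]
  next_size = increment
  until heaps.sum >= objects
    heaps.push(next_size)
    next_size = (next_size * growth).round
  end
  heaps
end

heaps = simulate_heaps(50_000)
puts heaps.inspect  # [10000, 10000, 18000, 32400]
puts heaps.sum      # 70400
</pre>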
<h3>Enter Scrap.</h3>
<p><a href="http://github.com/cheald/scrap/tree/master">Scrap</a> is a little <a href="http://weblog.rubyonrails.org/2008/12/17/introducing-rails-metal">Metal</a> handler I wrote for tracking memory usage and garbage statistics over an instance&#8217;s lifetime. Installing it is trivial &#8211; just drop it into your vendor directory, restart your app, and navigate to <code>http://yoururl.com/stats/scrap</code>.</p>
<p>With this in hand, we can peek at our memory usage and see what we can see.</p>
<p>There are some stats at the top, but for our purposes, we&#8217;re interested in the per-request garbage statistics. The newest request is near the top of the file, and the oldest request is at the bottom of the file. The last 50 requests are tracked. Each request looks something like this:</p>
<pre><code>
[71.92 MB] GET /apps/176568-WordPress

Number of objects    : 817571 (658305 AST nodes, 80.52%)
Heap slot size       : 20
GC cycles so far     : 503
Number of heaps      : 7
Total size of objects: 15968.18 KB
Total size of heaps  : 18036.81 KB (2068.63 KB = 11.47% unused)
Leading free slots   : 27104 (529.38 KB = 2.93%)
Trailing free slots  : 1 (0.02 KB = 0.00%)
Number of contiguous groups of 16 slots: 2829 (4.90%)
Number of terminal objects: 4307 (0.47%)
</code></pre>
<p>Key points here for the time being are <code>Number of objects</code> and <code>Number of heaps</code>. When we look at the number of objects &#8211; in this case, about 817,000 &#8211; it&#8217;s obvious that we&#8217;re going to have to allocate a number of heaps to handle all those objects. Rails&#8217; boot-up cost is fairly significant, and the default Ruby settings just really don&#8217;t cut it here. As you can see, we&#8217;ve allocated 7 heaps, and we&#8217;re using 15.9 of 18.0 MB allocated to the heap. Once a heap is allocated, it&#8217;s never de-allocated, so we&#8217;re perma-stuck at 18 MB of heap usage. Note that this isn&#8217;t the size of all the data in the program &#8211; just the space allocated for objects. A string that contains 100MB of data will only consume 20 bytes on the heap (that&#8217;s the &#8220;heap slot size&#8221; &#8211; the amount of memory each object on the heap consumes).</p>
<p>However, what if we could just allocate the whole startup cost in the initial heap, and save ourselves the problems of having to reallocate so often?</p>
<p>We note that we have 891k slots allocated, so we can guesstimate at a number to set our initial allocation to. In my production app, I set mine to 1,250,000 &#8211; I was observing peaks around the 1,100,000 mark, and just increased it by 10% and rounded up.</p>
<p>So, my first custom environment variable is </p>
<p><code>RUBY_HEAP_MIN_SLOTS=1250000</code></p>
<p>And it results in something like this on the app&#8217;s first boot:</p>
<p>[137.99 MB] GET /movies/7505-Star-Wars-Episode-V-The-Empire-Strikes-Back</p>
<pre><code>Number of objects    : 933037 (664785 AST nodes, 71.25%)
Heap slot size       : 20
GC cycles so far     : 12
Number of heaps      : 1
Total size of objects: 18223.38 KB
Total size of heaps  : 24414.08 KB (6190.70 KB = 25.36% unused)
Leading free slots   : 316963 (6190.68 KB = 25.36%)
Trailing free slots  : 0 (0.00 KB = 0.00%)
Number of contiguous groups of 16 slots: 19810 (25.36%)
Number of terminal objects: 25941 (2.08%)</code></pre>
<p>Yowza, a full 25% of my heap is unused after boot. But&#8230;well, that&#8217;s okay. We&#8217;ve only allocated 1 heap, and later on, my object allocation grows to around 1,100,000. This is still 150k under the heap size, and I&#8217;ve set <code>RUBY_HEAP_FREE_MIN=12500</code> (1% of the initial size), so if I have fewer than 12,500 heap slots free after a GC cycle, a new heap will be allocated. Stabilizing there means that I end up with 1 heap for the lifetime of my app, and I end up sitting under the threshold that&#8217;d cause a new heap to be born. If I have a leak, or a super heavy action or something, though, that might kick me over my limit and require a new heap. So, we come to&#8230;</p>
<p><code>RUBY_HEAP_SLOTS_INCREMENT=100000</code></p>
<p>This value says &#8220;Hey, if you have to allocate a second heap, start with this many slots&#8221;. If we go over our limit of 1.25 million slots, we&#8217;ll allocate a second heap that&#8217;s about 8% the size of the original. That seems awfully small, but consider that we&#8217;re hoping to never get to that heap.</p>
<p>Should we end up using that entire second heap, then we have to worry about our third setting, <code>RUBY_HEAP_SLOTS_GROWTH_FACTOR=1</code>. This says &#8220;Each new heap should be 1.0 as large as the previous heap.&#8221; In this case, it means I&#8217;ll keep allocating 100k-slot heaps until the cows come home. In an untuned environment, this could be bad &#8211; we would either end up having to do a <em>ton</em> of allocations to get to our target, or we would overallocate very badly. However, because we know our app&#8217;s memory requirements, and know about where we want it to end up, a relatively small, linear growth factor is just what the doctor ordered here.</p>
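<p>The arithmetic behind those settings is simple enough to sanity-check in a few lines. This is just a worked calculation using the figures quoted in this post (20-byte slots, a peak of roughly 1,100,000 objects); the variable names are mine.</p>
<pre class="brush: ruby;">
SLOT_SIZE = 20            # bytes per heap slot, from the Scrap output
min_slots = 1_250_000     # RUBY_HEAP_MIN_SLOTS
increment = 100_000       # RUBY_HEAP_SLOTS_INCREMENT
peak_objects = 1_100_000  # observed peak, rounded

heap_kb = min_slots * SLOT_SIZE / 1024.0
headroom = min_slots - peak_objects
second_heap_pct = 100.0 * increment / min_slots

puts format('first heap: %.2f KB', heap_kb)                   # 24414.06 KB, close to what Scrap reported
puts "headroom at peak: #{headroom} slots"                    # 150000
puts format('second heap: %.0f%% of the first', second_heap_pct)  # 8%
</pre>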
<h3>Okay, now what?</h3>
<p>So, we have a collection of settings with which to run our app. Great! Now, how do we use it?</p>
<p>Fortunately, it&#8217;s easy.</p>
<pre><code>
pushd `which ruby | xargs dirname`
sudo vim ruby-with-env
</code></pre>
<p>We&#8217;re going to create a little bash script with the following:</p>
<pre class="brush: ruby;">
#!/bin/bash
export RUBY_HEAP_MIN_SLOTS=1250000
export RUBY_HEAP_SLOTS_INCREMENT=100000
export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
export RUBY_GC_MALLOC_LIMIT=30000000
export RUBY_HEAP_FREE_MIN=12500
exec &quot;/opt/ree/bin/ruby&quot; &quot;$@&quot;
</pre>
<p>Note that last line &#8211; the path will have to match the path to your Ruby executable, which fortunately, should be in the directory that you&#8217;re in.</p>
<p>Save it, don&#8217;t forget to <code>chmod a+x ruby-with-env</code>, and then edit your Apache or nginx configuration.</p>
<p>Under nginx, you&#8217;ll have a line like this:</p>
<p><code>passenger_ruby /opt/ruby-enterprise-1.8.6-20090610/bin/ruby;</code></p>
<p>Just change it to use your new wrapper script, like so:</p>
<p><code>passenger_ruby /opt/ruby-enterprise-1.8.6-20090610/bin/ruby-with-env;</code></p>
<p>The process is similarly easy for Apache &#8211; the line you need to change is something like:</p>
<p><code>PassengerRuby /opt/ruby-enterprise-1.8.6-20090610/bin/ruby</code></p>
<p>It might be in either your <code>httpd.conf</code> or <code>conf.d/passenger.conf</code>. Point it at <code>ruby-with-env</code> in the same way.</p>
<p>Once you&#8217;re all edited up, restart your webserver, and congratulations, you&#8217;ve got a fine-tuned garbage collector humming along with your app.</p>
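<p>If you want to double-check that the wrapper is actually in effect, one option is to print the variables from inside the running app (a console, or a throwaway action). A minimal sketch; <code>gc_env_report</code> is a hypothetical helper, and it only shows what the process inherited, not what the garbage collector did with it:</p>
<pre class="brush: ruby;">
GC_VARS = %w[
  RUBY_HEAP_MIN_SLOTS
  RUBY_HEAP_SLOTS_INCREMENT
  RUBY_HEAP_SLOTS_GROWTH_FACTOR
  RUBY_GC_MALLOC_LIMIT
  RUBY_HEAP_FREE_MIN
]

# Hypothetical helper: one NAME=value line per tuning variable.
def gc_env_report(env = ENV)
  GC_VARS.map { |name| "#{name}=#{env.fetch(name, '(not set)')}" }
end

puts gc_env_report
</pre>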
<h3>Taking out the garbage</h3>
<p>&#8220;But Chris!&#8221;, you say, &#8220;There&#8217;s a variable in there that you didn&#8217;t talk about! What gives?&#8221; You are indeed correct, astute reader. We&#8217;ve thus far avoided the <code>RUBY_GC_MALLOC_LIMIT</code> variable. This is a handy little setting that lets you tell Ruby how often to clean up after itself. Ruby is written in C, and C uses <code>malloc</code> to allocate memory. Ruby keeps a running tally of how much memory it has allocated with malloc, and it runs its garbage collector once that tally passes this limit. I haven&#8217;t found a great way to tune this one yet, except via experimentation, but here&#8217;s what to know about it:</p>
<ol>
<li>The lower this value is, the more often your garbage collector runs. Garbage collection is slow. Garbage collection is painfully slow. If a user is waiting on garbage collection, they are going to become impatient. You want as few users waiting on garbage collection as possible.</li>
<li>The higher this value is, the more memory Ruby will allocate before it tries to clean up after itself. If this value is too high, you&#8217;ll have dead objects hanging around eating up heap space, and possibly causing Ruby to crap itself and allocate a new heap. This is bad.</li>
<li>To tune this value, you want to find the happy medium, wherein you stabilize under your initial heap allocation value, but with as few garbage collection passes as possible. Read up on <a href="http://blog.evanweaver.com/articles/2009/04/09/ruby-gc-tuning/">Evan Weaver&#8217;s blog</a> for some more in-depth analysis of what garbage collection frequency tuning can do to your app&#8217;s performance.</li>
<li>If you have excess memory and want a faster app, err on the side of this being too high. If you are on a tight memory budget, and would prefer slower actions in exchange for not blowing your heap and allocating a whole new one, err on the side of this being too low.</li>
<li>Recommended values for this are all over the board. Evan recommends a setting of 50 million. I&#8217;m using a setting of 30 million. The Ruby default is 8 million. You&#8217;ll have to play around and find what works best for you. Just pay attention to how many requests there are in between that &#8220;GC cycles so far&#8221; number incrementing in Scrap, and you&#8217;ll be able to measure approximately how often you&#8217;re entering a GC cycle.</li>
</ol>
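<p>If you&#8217;d rather measure GC frequency directly than eyeball it in Scrap, newer Rubies (1.9 and up, so not the 1.8-based REE used in this post) expose a counter as <code>GC.count</code>. A small sketch; <code>gc_runs_during</code> is a made-up helper name:</p>
<pre class="brush: ruby;">
# Hypothetical helper: how many GC cycles happened while the block ran.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

runs = gc_runs_during { 200_000.times { 'x' * 100 } }
puts "GC ran #{runs} times while allocating 200,000 strings"
</pre>
<p>Wrap a representative request or action in it, then compare the count across different <code>RUBY_GC_MALLOC_LIMIT</code> values.</p>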
<p>Good luck with it, and have fun!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2009/06/13/fine-tuning-your-garbage-collector/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
