<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Coffee Powered &#187; Chris Heald</title>
	<atom:link href="http://www.coffeepowered.net/author/cheald/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.coffeepowered.net</link>
	<description>code and content</description>
	<lastBuildDate>Sun, 05 Sep 2010 20:38:39 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Write-once read-only fixtures for Rails tests</title>
		<link>http://www.coffeepowered.net/2010/09/05/write-once-read-only-fixtures-for-rails-tests/</link>
		<comments>http://www.coffeepowered.net/2010/09/05/write-once-read-only-fixtures-for-rails-tests/#comments</comments>
		<pubDate>Sun, 05 Sep 2010 20:38:39 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=330</guid>
		<description><![CDATA[In the project I&#8217;m currently working on, I&#8217;m heavily using factory_girl to generate test data, rather than using the old Rails fixtures standby. However, I still have a set of read-only fixtures (which are used for testing read-only models against a legacy database). I&#8217;m using these in my tests, but since they are read only [...]]]></description>
			<content:encoded><![CDATA[<p>In the project I&#8217;m currently working on, I&#8217;m heavily using <a href="http://github.com/thoughtbot/factory_girl">factory_girl</a> to generate test data, rather than using the old Rails fixtures standby. However, I still have a set of read-only fixtures (which are used for testing read-only models against a legacy database). I&#8217;m using these in my tests, but since they are read only (like, seriously &#8211; the models are marked as such by using <code>after_find</code> to call <code>readonly!</code>, ensuring that records will not be accidentally written), there&#8217;s no need to wipe and re-insert them per-test.</p>
<p>It&#8217;s not too hard to set up fixtures to be inserted once per test suite run.</p>
<p>In your test_helper.rb, above the <code>class ActiveSupport::TestCase</code> definition, add the following:</p>
<pre class="brush: ruby;">
Fixtures.reset_cache
fixtures_folder = File.join(RAILS_ROOT, 'test', 'fixtures')
fixtures = Dir[File.join(fixtures_folder, '*.yml')].map {|f| File.basename(f, '.yml') }
Fixtures.create_fixtures(fixtures_folder, fixtures)
Fixtures.reset_cache
</pre>
<p>Next, turn off transactional fixtures and comment out the fixtures macro:</p>
<pre class="brush: ruby;">
self.use_transactional_fixtures = false
# fixtures :all
</pre>
<p>That&#8217;s all there is to it. Your fixtures will be inserted into your test database once when test_helper is included for the first time, and then not again for the rest of the test suite run. This should speed your tests up substantially.</p>
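<p>The reason this runs only once is that <code>require</code> evaluates a file a single time per process, so top-level code in test_helper.rb is not re-run by every test file that requires it. Here&#8217;s a minimal plain-Ruby sketch of that behavior. The throwaway helper file, the <code>count_loads</code> method, and the <code>$fixture_loads</code> counter are all made up for illustration; the counter simply stands in for the fixture insertion:</p>
<pre class="brush: ruby;">
require "tmpdir"

# Hypothetical stand-in: writes a throwaway "test helper" whose body bumps a
# counter, requires it several times, and reports how often the body ran.
def count_loads(attempts)
  $fixture_loads = 0
  Dir.mktmpdir do |dir|
    helper = File.join(dir, "fake_test_helper.rb")
    File.write(helper, "$fixture_loads += 1\n")
    attempts.times { require helper }
  end
  $fixture_loads
end

puts count_loads(3)  # 1
</pre>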
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/09/05/write-once-read-only-fixtures-for-rails-tests/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pain-free CSS3 with Sass and CSSPie</title>
		<link>http://www.coffeepowered.net/2010/09/02/pain-free-css3-with-sass-and-csspie/</link>
		<comments>http://www.coffeepowered.net/2010/09/02/pain-free-css3-with-sass-and-csspie/#comments</comments>
		<pubDate>Thu, 02 Sep 2010 18:20:20 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[CSS]]></category>
		<category><![CDATA[chrome]]></category>
		<category><![CDATA[css]]></category>
		<category><![CDATA[css3]]></category>
		<category><![CDATA[csspie]]></category>
		<category><![CDATA[firefox]]></category>
		<category><![CDATA[internet explorer]]></category>
		<category><![CDATA[sass]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=316</guid>
		<description><![CDATA[So, you have a great design for a site. Lots of rounded corners, soft shadows, and beautiful gradients. &#8220;This&#8217;ll be fun!&#8221;, you think. Enter IE. &#8220;Oh, crap&#8221;, you think. Modern web design in IE is a pain in the rear. Fortunately, we have modern tools that make it a not-pain. SASS (Syntactically Awesome Stylesheets) is [...]]]></description>
			<content:encoded><![CDATA[<p>So, you have a great design for a site. Lots of rounded corners, soft shadows, and beautiful gradients. &#8220;This&#8217;ll be fun!&#8221;, you think.</p>
<p>Enter IE.</p>
<p>&#8220;Oh, crap&#8221;, you think.</p>
<p>Modern web design in IE is a pain in the rear. Fortunately, we have modern tools that make it a not-pain.</p>
<ul>
<li><a href="http://sass-lang.com/">SASS</a> (Syntactically Awesome Stylesheets) is a macro language for CSS. It lets you express CSS as nested rules, and gives you mix-ins, functionality extensions, variables, partials, and a whole lot more.</li>
<li><a href="http://css3pie.com/">CSSPie</a> is a set of behaviors for Internet Explorer that gives you CSS3 visual styles without really slow Javascript hacks like <a href="http://www.curvycorners.net/">CurvyCorners</a>.</li>
</ul>
<p>When combined, the two are a shot of <em>liquid awesome</em> injected directly into your brain.</p>
<p>I&#8217;ve settled on a fairly standard setup for my projects. I have:</p>
<ul>
<li>My <code>application.sass</code> file.</li>
<li>My <code>_mixins.sass</code> partial.</li>
<li>My <code>PIE.htc</code> behavior file.</li>
</ul>
<p>The <code>_mixins.sass</code> partial is very straightforward:</p>
<pre class="brush: css;">
@mixin pie
  behavior: url(/behaviors/PIE.htc)
.pie
  +pie

@mixin shadows($color: #aaa, $x: 1px, $y: 2px, $spread: 2px)
  @extend .pie
  -moz-box-shadow: $color $x $y $spread
  -webkit-box-shadow: $color $x $y $spread
  box-shadow: $color $x $y $spread

@mixin inset-shadows($color: #aaa, $x: 1px, $y: 1px, $spread: 1px)
  @extend .pie
  -moz-box-shadow: inset $x $y $spread $color
  -webkit-box-shadow: inset $x $y $spread $color
  box-shadow: inset $x $y $spread $color

@mixin corners($tl: 5px, $tr: nil, $br: nil, $bl: nil)
  @extend .pie
  @if $tr == nil
    $tr: $tl
  @if $br == nil
    $br: $tl
  @if $bl == nil
    $bl: $tl
  -moz-border-radius: $tl $tr $br $bl
  -webkit-border-top-left-radius: $tl
  -webkit-border-bottom-left-radius: $bl
  -webkit-border-top-right-radius: $tr
  -webkit-border-bottom-right-radius: $br
  border-radius: $tl $tr $br $bl

@mixin vertical-gradient($start: #000, $end: #ccc)
  @extend .pie
  background: $end
  background: -webkit-gradient( linear, left top, left bottom, color-stop(0, $start), color-stop(1, $end) )
  background: -moz-linear-gradient(center top, $start 0%, $end 100%)
  -pie-background: linear-gradient(270deg, $start, $end)
</pre>
<p>What&#8217;s going on here? We&#8217;re defining several mix-ins for Sass:</p>
<pre class="brush: css;">
+shadows([color, [x, [y, [spread]]]])

+inset-shadows([color, [x, [y, [spread]]]])

+corners(size)

+corners(topleft, topright, bottomright, bottomleft)

+vertical-gradient(start, end)
</pre>
<p>Now, in your CSS, you can just do the following:</p>
<pre class="brush: css;">body
  font:
    family: Arial
    size: 10pt

.box
  +corners
  +shadows(#ccc)
  +vertical-gradient(#eee, #fff)

  h3
    color: #444

.dark-box
  +corners(20px)
  +shadows(#888, 4px, 4px, 4px)
  +vertical-gradient(#444, #000)
  color: #fff
  h3
    color: #fff

.box, .dark-box
  padding: 1em
  margin-bottom: 1em
</pre>
<p>This expands to:</p>
<pre class="brush: css;">
.pie, .box, .dark-box {
  behavior: url(/behaviors/PIE.htc);
}

body {
  font-family: Arial;
  font-size: 10pt;
}

.box {
  -moz-border-radius: 5px 5px 5px 5px;
  -webkit-border-top-left-radius: 5px;
  -webkit-border-bottom-left-radius: 5px;
  -webkit-border-top-right-radius: 5px;
  -webkit-border-bottom-right-radius: 5px;
  border-radius: 5px 5px 5px 5px;
  -moz-box-shadow: #cccccc 1px 2px 2px;
  -webkit-box-shadow: #cccccc 1px 2px 2px;
  box-shadow: #cccccc 1px 2px 2px;
  background: white;
  background: -webkit-gradient(linear, left top, left bottom, color-stop(0, #eeeeee), color-stop(1, white));
  background: -moz-linear-gradient(center top, #eeeeee 0%, white 100%);
  -pie-background: linear-gradient(270deg, #eeeeee, white);
}
.box h3 {
  color: #444444;
}

.dark-box {
  -moz-border-radius: 20px 20px 20px 20px;
  -webkit-border-top-left-radius: 20px;
  -webkit-border-bottom-left-radius: 20px;
  -webkit-border-top-right-radius: 20px;
  -webkit-border-bottom-right-radius: 20px;
  border-radius: 20px 20px 20px 20px;
  -moz-box-shadow: #888888 4px 4px 4px;
  -webkit-box-shadow: #888888 4px 4px 4px;
  box-shadow: #888888 4px 4px 4px;
  background: black;
  background: -webkit-gradient(linear, left top, left bottom, color-stop(0, #444444), color-stop(1, black));
  background: -moz-linear-gradient(center top, #444444 0%, black 100%);
  -pie-background: linear-gradient(270deg, #444444, black);
  color: white;
}
.dark-box h3 {
  color: white;
}

.box, .dark-box {
  padding: 1em;
  margin-bottom: 1em;
}
</pre>
<p><a href="http://www.coffeepowered.net/projects/sass-mixins.php">Check out the live demo.</a></p>
<p>Here are screenshots of the demo in Chrome 6, Firefox 4.0b3, and Internet Explorer 8. Can you tell which browser is which?</p>
<p><a href="http://www.coffeepowered.net/wp-content/uploads/2010/09/comps.png"><img src="http://www.coffeepowered.net/wp-content/uploads/2010/09/comps.png" alt="" title="comps" width="467" height="798" class="aligncenter size-full wp-image-323" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/09/02/pain-free-css3-with-sass-and-csspie/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Debugging memory leaks in Ruby with GDB, round 2.</title>
		<link>http://www.coffeepowered.net/2010/08/23/debugging-memory-leaks-in-ruby-with-gdb-round-2/</link>
		<comments>http://www.coffeepowered.net/2010/08/23/debugging-memory-leaks-in-ruby-with-gdb-round-2/#comments</comments>
		<pubDate>Mon, 23 Aug 2010 23:02:22 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[gdb]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=310</guid>
		<description><![CDATA[In part 1, I described how I located leaky Sets in MongoMapper by diffing the Ruby ObjectSpace with GDB. Today, I&#8217;m going to show you how to solve the problems that those sorts of diffs can reveal. In today&#8217;s example, we&#8217;re tracking leaky sets. In particular, a set is holding onto class references. We are [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://www.coffeepowered.net/2010/08/23/mongomapper-development-mode-and-memory-leaks/">part 1</a>, I described how I located leaky <code>Set</code>s in MongoMapper by diffing the Ruby ObjectSpace with GDB. Today, I&#8217;m going to show you how to solve the problems that those sorts of diffs can reveal. In today&#8217;s example, we&#8217;re tracking leaky sets. In particular, a set is holding onto class references. We are going to:</p>
<ol>
<li>Set up object creation tracking on the Set class</li>
<li>Find the leaky instance of our set using GDB</li>
<li>Locate what code created that Set instance</li>
</ol>
<p><span id="more-310"></span><br />
First things first. Check out Sean Bradly&#8217;s <a href="http://drunkhobo.com/~sean/gctrack.rb.html">GCTrack</a> module. I&#8217;m using a modified version of it to include object IDs in the tracker struct, so that we can perform reverse-lookups on an instance.</p>
<pre class="brush: ruby;">
# Sean Bradly (rhythmx at gmail) - 2010
# Modified by Chris Heald (cheald@gmail.com) - August 2010
module GCTrack
  # Class to track an object's lifecycle and where it was created
  class Tracepoint &lt; Struct.new(:num, :call, :insp, :id)
    include Comparable
    # make it so the obj.num works as a simple key
    def &lt;=&gt;(obj)
      if obj.class &lt;= self.class
        self.num &lt;=&gt; obj.num
      else
        self.num &lt;=&gt; obj
      end
    end
  end

  def self.included(klass)

    # Hook initialize to track this object
    klass.class_eval do
      alias_method :old_init, :initialize

      def initialize(*args)
        # Call original init
        old_init(*args)
        # trace creation and setup finalizer callback
        final_proc = self.class.gct_setup_finalizer(self, caller, inspect)
        # define the finalizer
        ObjectSpace.define_finalizer(self, final_proc)
      end
    end

    # Add some metaclass foo to track all the objects.
    # You can't track in an instance w/o creating circular
    # references and breaking Ruby's GC

    (
    class &lt;&lt; klass;
      self;
    end).class_eval do
      def gct_setup_finalizer(obj, caller, insp)
        # create/log a tracepoint for this object
        num = gct_objnum
        tpoint = Tracepoint.new(num, caller, insp, obj.object_id)
        gct_add(tpoint)
        # call optional init cb
        if self.respond_to?(:created)
          self.send(:created, tpoint)
        end
        obj = nil # don't let the proc track a ref to 'obj'
        # this callback happens at GC time
        final_proc = Proc.new do
          gct_del(tpoint)
          # call optional final cb
          if self.respond_to?(:deleted)
            self.send(:deleted, tpoint)
          end
        end
      end

      def gct_objnum
        @num ||= -1
        @num  +=  1
      end

      def gct_addlog
        @addlog ||= []
      end

      def gct_dellog
        @dellog ||= []
      end

      def gct_add(num)
        gct_addlog &lt;&lt; num
      end

      def gct_del(num)
        gct_dellog &lt;&lt; num
      end

      def gct_active
        gct_addlog - gct_dellog
      end

      def gct_origin(id)
        x = gct_addlog.select {|g| g.id == id }.first
        x.nil? ? nil : x.call.join(&quot;\n&quot;)
      end

      def gct_orphan_report
        # Get the leaked objects
        rpt     = &quot;&quot;
        leaked  = gct_active
        callers = leaked.map { |o| o.call }.sort.uniq
        # Iterate over source lines that have leaked objs
        callers.each do |calla|
          leaked_here = leaked.find_all { |o| o.call == calla }
          rpt &lt;&lt; &quot;==== #{leaked_here.size} leaked objects from:\n\n&quot;
          calla.each { |l| rpt &lt;&lt; &quot;    #{l}\n&quot; }
          rpt &lt;&lt; &quot;\n&quot;
          rpt &lt;&lt; &quot;    == object data:\n\n&quot;

          leaked_here.each do |o|
            rpt &lt;&lt; &quot;    num: #{o.num}, inspect: #{o.insp}\n&quot;
          end
          rpt &lt;&lt; &quot;\n&quot;
        end
        rpt
      end

    end
  end
end
</pre>
<p>Additionally, you&#8217;ll want to patch the object you&#8217;re interested in tracking.</p>
<pre class="brush: ruby;">
class Set
  include GCTrack
end
</pre>
<p>You&#8217;ll want to include both of those at the top of your <code>config/environment.rb</code>, before your initializer block. If you&#8217;re tracking an object defined in your Rails app, just include GCTrack in the object definition, rather than in a monkeypatch.</p>
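<p>If the full module is more than you need, the core trick can be sketched in a few lines: record <code>caller</code> against the object&#8217;s ID when it is created, and look the trace up by ID later. <code>CreationLog</code> and <code>build_widget</code> are hypothetical names, not part of GCTrack, and this version assumes you can call <code>track</code> yourself wherever the object gets built:</p>
<pre class="brush: ruby;">
# Hypothetical, stripped-down tracker; not part of GCTrack.
module CreationLog
  # Maps object_id to the stack trace captured at creation time.
  # Only the integer ID is stored, so the log never keeps the object alive.
  def self.origins
    @origins ||= {}
  end

  # Built in its own method so the proc cannot capture a reference to the
  # tracked object through a local variable.
  def self.finalizer_for(table)
    proc { |id| table.delete(id) }
  end

  def self.track(obj)
    origins[obj.object_id] = caller
    ObjectSpace.define_finalizer(obj, finalizer_for(origins))
    obj
  end

  def self.origin_of(id)
    trace = origins[id]
    trace.nil? ? nil : trace.join("\n")
  end
end

def build_widget
  CreationLog.track(Object.new)
end

widget = build_widget
puts CreationLog.origin_of(widget.object_id)
</pre>
<p>The printed trace starts at the line inside <code>build_widget</code> that called <code>track</code>, which is the same kind of information <code>gct_origin</code> gives you.</p>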
<p>Next, you need to be sure that you have your GDB macros set up properly. /root/.gdbinit should have the following:</p>
<pre class="brush: bash;">
define eval
  call(rb_p(rb_eval_string_protect($arg0,(int*)0)))
end

define redirect_stdout
  call rb_eval_string(&quot;$_old_stdout, $stdout = $stdout, File.open('/tmp/ruby-debug.' + Process.pid.to_s, 'a'); $stdout.sync = true&quot;)
end
</pre>
<p>(Re)start your application, hit the leaky action a few times, and find its PID using <code>ps ax</code> or <code>top</code>. Once you have that, attach to the process with gdb, redirect your process&#8217;s stdout to the tmp file with <code>redirect_stdout</code>, and dump all <code>Set</code>s in your object space.</p>
<pre class="brush: bash;">
(gdb) attach 12345
(gdb) redirect_stdout
$8 = 2
(gdb) eval &quot;GC.start&quot;
(gdb) eval &quot;ObjectSpace.each_object {|o| puts \&quot;#{o.class.name}, #{o.inspect} -- #{o.object_id}\&quot; if o.is_a?(Set) }; puts '----'&quot;
</pre>
<p>Now that we have that, let&#8217;s look at the tmp file, and locate the leaked set instance.</p>
<pre class="brush: plain;">
Set, #&lt;Set: {Achievement, Achievement, EmbeddedComment, Achievement, EmbeddedComment, EmbeddedComment}&gt; -- 78376680
</pre>
<p>There, at the end, is our object ID. Now back over to gdb.</p>
<pre class="brush: bash;">
(gdb) eval &quot;puts Set.gct_addlog.select {|g| g.id == 78376680}.first.call.join(\&quot;\n\&quot;)&quot;
</pre>
<p>Your tmp file should now have a stack trace for that object&#8217;s allocation. Track it down and beat it into submission.</p>
<pre class="brush: plain;">
/opt/ruby-enterprise-1.8.7-2010.02/lib/ruby/gems/1.8/gems/mongo_mapper-0.8.3/lib/mongo_mapper/support/descendant_appends.rb:15:in `new'
</pre>
<p>You can do this for just about any object you&#8217;d like. Just mix in your GCTrack and easily find where in code a particular instance of an object was created. Just like that, debugging memory leaks goes from hunting guppies in the Atlantic to shooting fish in a barrel.</p>
<p>Have fun!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/08/23/debugging-memory-leaks-in-ruby-with-gdb-round-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MongoMapper, Development Mode, and Memory Leaks</title>
		<link>http://www.coffeepowered.net/2010/08/23/mongomapper-development-mode-and-memory-leaks/</link>
		<comments>http://www.coffeepowered.net/2010/08/23/mongomapper-development-mode-and-memory-leaks/#comments</comments>
		<pubDate>Mon, 23 Aug 2010 11:15:24 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[fixes]]></category>
		<category><![CDATA[gdb]]></category>
		<category><![CDATA[memory leak]]></category>
		<category><![CDATA[mongomapper]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=308</guid>
		<description><![CDATA[If you&#8217;ve worked with MongoMapper for a while, you&#8217;ve probably noticed that in complex apps, there are horrific memory leaks in development that magically disappear in production mode. While this is all well and good, and it&#8217;s rather handy that things Just Work in production, don&#8217;t you wish you didn&#8217;t have to restart your app [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;ve worked with MongoMapper for a while, you&#8217;ve probably noticed that in complex apps, there are horrific memory leaks in development that magically disappear in production mode. While this is all well and good, and it&#8217;s rather handy that things Just Work in production, don&#8217;t you wish you didn&#8217;t have to restart your app server every 15 requests in development?</p>
<p>I set out to track down the cause tonight, and have both fixed the problem and gotten some handy experience debugging Rails apps with gdb.</p>
<p><span id="more-308"></span></p>
<h2>The Solution</h2>
<p>First off, if you just want the fix, here it is. You probably have a middleware to clear your identity maps already. We&#8217;re just going to modify that. In my case, it&#8217;s <code>lib/mongo_mapper/per_request_identity_map_clear.rb</code>.</p>
<pre class="brush: ruby;">
module MongoMapper
  class PerRequestMapClear
    def initialize(app)
      @app = app
    end

    def call(env)
      if Rails.configuration.cache_classes
        MongoMapper::Plugins::IdentityMap.clear
      else
        MongoMapper::Document.descendants.each do |m|
          m.descendants.clear if m.respond_to? :descendants
        end
        MongoMapper::Document.descendants.clear
        MongoMapper::EmbeddedDocument.descendants.clear
        MongoMapper::Plugins::IdentityMap.clear
        MongoMapper::Plugins::IdentityMap.models.clear
      end
      @app.call(env)
    end
  end
end
</pre>
<p>In particular, these two lines:</p>
<pre class="brush: ruby;">
  MongoMapper::Document.descendants.clear
  MongoMapper::Plugins::IdentityMap.models.clear
</pre>
<p>Make sure you get that into your middleware stack and all your MongoMapper memory leaking issues will magically disappear.</p>
<h2>The Problem</h2>
<p>You see, MongoMapper is doing some <em>dag-nasty evil stuff</em> with class references. Namely, it&#8217;s holding onto them in class variables that don&#8217;t get reloaded per-request in Rails. Simply:</p>
<ol>
<li>Rails loads your <code>User</code> model. This loads up a ton of Validations, MongoMapper::Keys, procs, and whatever else you have in your model definition. It&#8217;s not really light. That&#8217;s okay because we load it only once in production.</li>
<li>This is a <code>MongoMapper::Document</code>, which causes User (the class reference) to be inserted into the <code>MongoMapper::Document.descendants</code> class attribute, which happens to be a Set object.</li>
<li>Next time you refresh a page, Rails re-loads your <em>User</em> model. This creates a <em>new and separate User class reference</em>. It does not replace the previous <em>User</em> class reference. It is not equal to your previous User class reference. They are, as far as Ruby cares, separate objects.</li>
<li>MongoMapper happily sticks that class reference into its <code>descendants</code> class attribute. You now have two separate copies of User. Since MongoMapper is holding onto a reference to your old User class, Ruby can never garbage collect it. The old User class and all of its huge cascading chain of referenced objects are now &#8220;leaked&#8221;.</li>
<li>Your memory usage is now increasing by 4 MB per request.</li>
</ol>
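<p>You can reproduce the steps above in plain Ruby, without Rails or MongoMapper. This is only a sketch: <code>DESCENDANTS</code> stands in for <code>MongoMapper::Document.descendants</code>, and <code>reload_user_model</code> is a made-up helper that imitates what development-mode reloading does to a constant.</p>
<pre class="brush: ruby;">
require "set"

# Hypothetical stand-in for MongoMapper::Document.descendants.
DESCENDANTS = Set.new

# Imitates a development-mode reload: drop the constant, define the class
# again, and register the new class object.
def reload_user_model
  Object.send(:remove_const, :User) if Object.const_defined?(:User)
  klass = Object.const_set(:User, Class.new)
  DESCENDANTS.add(klass)
  klass
end

first_load  = reload_user_model
second_load = reload_user_model

puts first_load.equal?(second_load)              # false
puts DESCENDANTS.size                            # 2
puts DESCENDANTS.map { |k| k.name }.inspect      # ["User", "User"]
</pre>
<p>Both entries call themselves <code>User</code>, but they are different objects, so the set keeps both and the old one can never be collected.</p>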
<p>I had app instances reaching nearly 1GB of RAM usage after light testing. I finally noticed it when my development machine kicked into swap and actions that took 80ms to run were taking 8000ms to run. Hm. That might be a problem!</p>
<h2>Debugging leaks with GDB</h2>
<p>GDB is an amazing tool. With a few macros, you can make hacking around in a live Ruby instance pretty painless. Crack open your <code>/root/.gdbinit</code> file and add a few macros:</p>
<pre class="brush: bash;">
define eval
  call(rb_p(rb_eval_string_protect($arg0,(int*)0)))
end

define redirect_stdout
  call rb_eval_string(&quot;$_old_stdout, $stdout = $stdout, File.open('/tmp/ruby-debug.' + Process.pid.to_s, 'a'); $stdout.sync = true&quot;)
end
</pre>
<p>Now, we&#8217;re going to attach to your running Ruby process. This needs to be done as root.</p>
<pre class="brush: bash;">
[root@polaris ~]# gdb
(gdb) attach 12019
(gdb) redirect_stdout
(gdb) eval &quot;ObjectSpace.each_object {|o| puts \&quot;#{o.class.name}, #{o.inspect} -- #{o.object_id}\&quot; unless o.is_a?(String) }; puts '----'&quot;
</pre>
<p>This will effectively dump all non-String objects in the attached Ruby process to <code>/tmp/ruby-debug.12019</code>. This takes a little bit, but it lets you come up with some handy data for parsing later.</p>
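<p>If you already suspect a particular class, the same dump can be narrowed down from inside Ruby by passing the class to <code>ObjectSpace.each_object</code>. Here&#8217;s a rough sketch; <code>dump_instances</code> is a hypothetical helper (not something gdb or Ruby ships with) that writes the same &#8220;class, inspect, object ID&#8221; line format to any IO object:</p>
<pre class="brush: ruby;">
require "set"
require "stringio"

# Hypothetical helper: writes one line per live instance of klass, followed
# by the same "----" separator used between dumps. Returns the line count.
def dump_instances(klass, io)
  GC.start
  count = 0
  ObjectSpace.each_object(klass) do |obj|
    io.puts "#{obj.class.name}, #{obj.inspect} -- #{obj.object_id}"
    count += 1
  end
  io.puts "----"
  count
end

suspect = Set.new([:a, :b])
out = StringIO.new
dump_instances(Set, out)
puts out.string.include?("-- #{suspect.object_id}")  # true
</pre>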
<p>To get data we can compare, we&#8217;ll need to dump the environment for multiple requests:</p>
<pre class="brush: bash;">
[root@polaris ~]# gdb
(gdb) attach 12019
(gdb) redirect_stdout
(gdb) eval &quot;GC.start&quot;
(gdb) eval &quot;ObjectSpace.each_object {|o| puts \&quot;#{o.class.name}, #{o.inspect} -- #{o.object_id}\&quot; unless o.is_a?(String) }; puts '----'&quot;
(gdb) detach

(run some requests)

(gdb) attach 12019
(gdb) redirect_stdout
(gdb) eval &quot;GC.start&quot;
(gdb) eval &quot;ObjectSpace.each_object {|o| puts \&quot;#{o.class.name}, #{o.inspect} -- #{o.object_id}\&quot; unless o.is_a?(String) }; puts '----'&quot;
(gdb) detach
</pre>
<p>At this point, you&#8217;ll have two ObjectSpace dumps in your temp file. For my purposes, I hacked up a quick little script to parse those dumps, and to output all objects that were not present in both dumps. Since I&#8217;m invoking GC.start, in theory, this should help me find my leaked objects.</p>
<pre class="brush: ruby;">
runs = [[]]
open(ARGV[0]).each do |line|
  if line == &quot;----\n&quot; then
    runs &lt;&lt; []
  elsif line.match &quot;--&quot; then
    runs.last &lt;&lt; line.strip
  end
end

diff = []
runs.each do |run|
  diff = (diff - run) | (run - diff)
end
diff.sort.each {|d| puts d }</pre>
<p>Not very pretty, but it does the job. Just a quick invocation to <code>ruby find_leaked.rb /tmp/ruby-debug.12019 > leaked</code> (well, not that quick, it took a minute to run) and I effectively had an ObjectSpace diff I could pore through.</p>
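<p>The heart of that script is a symmetric difference: keep every line that shows up in one dump but not the other. Here&#8217;s a tiny sketch with made-up dump lines (simplified, not real <code>inspect</code> output) and a hypothetical <code>dump_diff</code> helper:</p>
<pre class="brush: ruby;">
# Lines present in exactly one of the two dumps.
def dump_diff(a, b)
  (a - b) | (b - a)
end

# Made-up sample dumps for illustration only.
before = ["Array, [] -- 101", "Set, {User} -- 202"]
after  = ["Array, [] -- 101", "Set, {User, User} -- 202", "Hash, {} -- 303"]

puts dump_diff(before, after).sort
# Hash, {} -- 303
# Set, {User, User} -- 202
# Set, {User} -- 202
</pre>
<p>Note that an object whose <code>inspect</code> output changed between dumps is listed twice with the same object ID, once for each state. A set that keeps growing will show up exactly that way.</p>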
<p>There&#8217;s a lot of stuff in there. In particular, you&#8217;re going to notice that you have a LOT of Array, Hash, and MatchData objects (perhaps potential optimization points for future Rails releases?). While we may be interested in those, try to cull out the things that obviously aren&#8217;t a problem just for readability&#8217;s sake.</p>
<p>I pored through the diff looking for things related to MongoMapper or my models. After not too long, I came across these lines:</p>
<pre class="brush: ruby;">
Set, #&lt;Set: {Achievement, EmbeddedComment, Achievement, EmbeddedComment, EmbeddedComment, Achievement, Achievement, EmbeddedComment, EmbeddedComment, Achievement, EmbeddedComment, EmbeddedComment, Achievement, Achievement, Achievement, EmbeddedComment, EmbeddedComment, Achievement}&gt; -- 92034680
Set, #&lt;Set: {Achievement, EmbeddedComment, Achievement, EmbeddedComment, EmbeddedComment, Achievement, Achievement, EmbeddedComment, EmbeddedComment, Achievement, EmbeddedComment, EmbeddedComment, Achievement, Achievement, EmbeddedComment, Achievement, Achievement, EmbeddedComment, EmbeddedComment, Achievement}&gt; -- 92034680
</pre>
<p>Whoa there. What? Why do I have Sets with multiple references to <code>Achievement</code> and <code>EmbeddedComment</code>? That doesn&#8217;t smell right. I suspect the problem lies in MongoMapper, so let&#8217;s grep the MongoMapper codebase for Set.</p>
<pre class="brush: bash;">
[chris@polaris lib]$ grep Set * -R
mongo_mapper/plugins/identity_map.rb:        @models ||= Set.new
mongo_mapper/plugins/modifiers.rb:          modifier_update('$addToSet', args)
mongo_mapper/plugins/protected.rb:          self.write_inheritable_attribute(:attr_protected, Set.new(attrs) + (protected_attributes || []))
mongo_mapper/plugins/accessible.rb:          self.write_inheritable_attribute(:attr_accessible, Set.new(attrs) + (accessible_attributes || []))
mongo_mapper/support/descendant_appends.rb:        @descendants ||= Set.new
mongo_mapper/connection.rb:      raise 'Set config before connecting. MongoMapper.config = {...}' unless defined?(@@config)
mongo_mapper/connection.rb:      raise 'Set config before connecting. MongoMapper.config = {...}' if config.blank?
mongo_mapper/extensions/set.rb:    module Set
mongo_mapper/extensions/set.rb:class Set
mongo_mapper/extensions/set.rb:  extend MongoMapper::Extensions::Set
</pre>
<p>Great, we have a hit list to look through. IdentityMap is worth looking at; a new set is created there, and its naming indicates that it&#8217;s for holding models, probably model references (it is). <code>mongo_mapper/support/descendant_appends.rb</code> is much the same deal. We can ignore <code>mongo_mapper/plugins/accessible.rb</code> since we can guess that the <code>attrs</code> being passed are symbols, rather than those class references we saw in the ObjectSpace diff.</p>
<p>Let&#8217;s crack open <code>descendant_appends.rb</code>:</p>
<pre class="brush: ruby;">
module MongoMapper
  module Support
    module DescendantAppends
      def included(model)
        extra_extensions.each { |extension| model.extend(extension) }
        extra_inclusions.each { |inclusion| model.send(:include, inclusion) }
        descendants &lt;&lt; model
      end

      # @api public
      def descendants
        @descendants ||= Set.new
      end
</pre>
<p>Oh dear. And there it is. Every time <code>MongoMapper::Support::DescendantAppends</code> is included in a model (which is via <code>MongoMapper::Document</code> and <code>MongoMapper::EmbeddedDocument</code>), <em>a reference to the including class</em> is stored in a <em>class-level instance variable</em>.</p>
<p>Since we know that Rails reloads models per-request in development mode, and we know that each copy of a model&#8217;s class is not considered equivalent to the other copies of that class, it&#8217;s easy to see what happens here: We end up with sets like in our object dump, with however many orphaned old copies of our models, and all their various and sundry associated models.</p>
<p>And so, we arrive at our solution. By clearing the <em>descendants</em> set on every request where we are reloading our models, we ensure that there are not references to old copies of models left hanging around leaking memory.</p>
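<p>To see the effect in isolation, here&#8217;s a toy version of the fix. It is only a sketch: <code>Registry</code> stands in for the descendants set, <code>ClearRegistry</code> is a made-up Rack-style middleware, and the fake app &#8220;reloads&#8221; a model on every request by registering a brand-new class.</p>
<pre class="brush: ruby;">
require "set"

# Hypothetical registry standing in for MongoMapper's descendants set.
module Registry
  def self.models
    @models ||= Set.new
  end
end

# Rack-style middleware: empties the registry before each request, but only
# when classes are being reloaded.
class ClearRegistry
  def initialize(app, reloading)
    @app = app
    @reloading = reloading
  end

  def call(env)
    Registry.models.clear if @reloading
    @app.call(env)
  end
end

# Fake app: every request registers a freshly created class.
app = lambda do |env|
  Registry.models.add(Class.new)
  [200, {}, ["ok"]]
end

stack = ClearRegistry.new(app, true)
5.times { stack.call({}) }
puts Registry.models.size  # 1
</pre>
<p>Pass <code>false</code> instead of <code>true</code> and the registry grows by one class per request, which is the leak in miniature.</p>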
<p>My development instances are now running solid at 68 MB apiece, rather than 1 GB apiece. As you can imagine, the difference in response speed (and thus, productivity) is substantial.</p>
<p>Hope this helps. Have fun with gdb &#8211; it&#8217;s an obscenely powerful tool, and used properly, can give you a purely nutty amount of information which can be invaluable in tracking down memory leaks and related problems.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/08/23/mongomapper-development-mode-and-memory-leaks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>WillPaginate and custom paging.</title>
		<link>http://www.coffeepowered.net/2010/08/07/willpaginate-and-custom-paging/</link>
		<comments>http://www.coffeepowered.net/2010/08/07/willpaginate-and-custom-paging/#comments</comments>
		<pubDate>Sat, 07 Aug 2010 22:57:19 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Rails]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[will_paginate]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=300</guid>
		<description><![CDATA[will_paginate is the de facto Rails paging plugin, and with good reason &#8211; it&#8217;s solid, fast, and reliable. Everyone I know uses it, but a lot of people don&#8217;t use it to its full power. I recently discovered some very cool functionality it includes &#8211; the WillPaginate::Collection class can be used as a custom paginator [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://wiki.github.com/mislav/will_paginate/">will_paginate</a> is the de facto Rails paging plugin, and with good reason &#8211; it&#8217;s solid, fast, and reliable. Everyone I know uses it, but a lot of people don&#8217;t use it to its full power.</p>
<p>I recently discovered some <em>very</em> cool functionality it includes &#8211; the <code>WillPaginate::Collection</code> class can be used as a custom paginator for effectively any enumerable collection. It&#8217;s very simple, too. I recently used it to build pages of the most popular tags on posts in my database. My data store is MongoDB, and I&#8217;m fetching an array consisting of two-element arrays, <code>[tag, tag_count]</code>. To use will_paginate&#8217;s functionality with this, I just use the following:</p>
<pre class="brush: ruby;">
tags = Post.tag_counts(nil, {:sort =&gt; [&quot;value&quot;, &quot;descending&quot;]}) # Return an array of tag/count pairs. Custom function, so it can't leverage the finder on Post.
@topics = WillPaginate::Collection.create(current_page, 20, tags.length) do |pager|
	pager.replace(tags.slice(pager.offset, pager.per_page))
end
</pre>
<p><code>current_page</code> is a helper that derives the current page from the request parameters. The rest of it is self-explanatory. I can now use <code>@topics</code> in my page just as I&#8217;d use a paginated result set from the database.</p>
<pre class="brush: ruby;">
- @topics.each do |topic|
    # ...
=will_paginate @topics
</pre>
<p>Bam. Doesn&#8217;t get much easier than that. You can get exceptionally creative with it, too. Effectively, all you need to know is:</p>
<ul>
<li><code>WillPaginate::Collection.create</code> takes 3 parameters: the current page, the per-page count, and optionally, the total number of entries.</li>
<li>The <code>pager</code> block variable exposes <code>offset</code> and <code>per_page</code> properties, ready to pass into a DB query or to slice an enumerable with.</li>
<li>Call pager.replace(sub-array) with the current page&#8217;s set of elements.</li>
</ul>
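<p>If you want to see the slicing arithmetic without will_paginate in the way, here is a minimal plain-Ruby sketch. <code>page_slice</code> is a hypothetical helper of my own (not part of will_paginate), and it assumes 1-based page numbers, which is how the offset is usually derived.</p>

```ruby
# Plain-Ruby stand-in for the numbers the pager block variable exposes.
# page_slice is a hypothetical helper, not part of will_paginate.
def page_slice(items, page, per_page)
  offset = (page - 1) * per_page      # assumes 1-based page numbers
  items.slice(offset, per_page) || [] # Array#slice takes (start, length)
end

tags = (1..45).map { |i| ["tag#{i}", 100 - i] } # fake [tag, tag_count] pairs

first_page = page_slice(tags, 1, 20) # 20 pairs
last_page  = page_slice(tags, 3, 20) # the 5 pairs left over
past_end   = page_slice(tags, 5, 20) # nothing there, so an empty array
```

<p>The second argument to <code>slice</code> is a length, not an end index, so it should always be the per-page count.</p>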
<p>That&#8217;s literally all there is to it. Now you can have easy pagination on just about any collection you can conceive of. Let WillPaginate handle all the heavy lifting and such. If you&#8217;ve done enough pagination by hand, you&#8217;ll probably appreciate the easy beauty of this particular method.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/08/07/willpaginate-and-custom-paging/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Setting up replica sets with MongoDB 1.6</title>
		<link>http://www.coffeepowered.net/2010/08/06/setting-up-replica-sets-with-mongodb-1-6/</link>
		<comments>http://www.coffeepowered.net/2010/08/06/setting-up-replica-sets-with-mongodb-1-6/#comments</comments>
		<pubDate>Fri, 06 Aug 2010 09:26:13 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[MongoDB]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=284</guid>
		<description><![CDATA[Introduction MongoDB 1.6 was released today, and it includes, among other things, support for the incredibly sexy replica sets feature &#8211; basically master/slave replication on crack with automatic failover and the like. I&#8217;m setting it up, and figured I&#8217;d document the pieces as I walk through them. My test deploy is going to [...]]]></description>
			<content:encoded><![CDATA[<h3>Introduction</h3>
<p><a href="http://www.mongodb.org/">MongoDB</a> 1.6 <a href="http://www.mongodb.org/downloads">was released today</a>, and it includes, among other things, support for the incredibly sexy <a href="http://www.mongodb.org/display/DOCS/Replica+Sets">replica sets feature</a> &#8211; basically master/slave replication on crack with automatic failover and the like. I&#8217;m setting it up, and figured I&#8217;d document the pieces as I walk through them.</p>
<p>My test deploy is going to consist of two nodes and one arbiter; production will have several more potential nodes. We aren&#8217;t worrying about sharding at this point, but 1.6 brings automatic sharding with it, as well, so we can enable that at a later point if we need to.<br />
<span id="more-284"></span></p>
<h3>Installation</h3>
<p>Installation is very easy. 10gen offers a <a href="http://www.mongodb.org/display/DOCS/CentOS+and+Fedora+Packages">yum repo</a>, so it&#8217;s as easy as adding the repo to <code>/etc/yum.repos.d</code> and then running <code>yum install mongo-stable mongo-server-stable</code>.</p>
<p>Once installed, <code>mongo --version</code> confirms that we&#8217;re on 1.6. Time to boot up our nodes.</p>
<h3>Configuration</h3>
<p>For staging, we&#8217;re going to run both replica nodes and the arbiter on a single machine. This means 3 configs.</p>
<p>I have 3 config files in <code>/etc/mongod/</code> &#8211; <code>mongo.node1.conf</code>, <code>mongo.node2.conf</code>, and <code>mongo.arbiter.conf</code>. As follows:</p>
<pre class="brush: plain;">
# mongo.node1.conf
replSet=my_replica_set
logpath=/var/log/mongo/mongod.node1.log
port = 27017
logappend=true
dbpath=/var/lib/mongo/node1
fork = true
rest = true
</pre>
<pre class="brush: plain;">
# mongo.node2.conf
replSet=my_replica_set
logpath=/var/log/mongo/mongod.node2.log
port = 27018
logappend=true
dbpath=/var/lib/mongo/node2
fork = true
</pre>
<pre class="brush: plain;">
# mongo.arbiter.conf
replSet=my_replica_set
logpath=/var/log/mongo/mongod.arbiter.log
port = 27019
logappend=true
dbpath=/var/lib/mongo/arbiter
fork = true
oplogSize = 1
</pre>
<h3>Starting it up</h3>
<p>Then we just fire up our daemons:</p>
<pre class="brush: plain;">
mongod -f /etc/mongod/mongo.node1.conf
mongod -f /etc/mongod/mongo.node2.conf
mongod -f /etc/mongod/mongo.arbiter.conf
</pre>
<p>Once we spin up the servers, they need a bit to allocate files and start listening. I tried to connect a bit too early, and got the following:</p>
<pre class="brush: bash;">
[root@261668-db3 mongo]# mongo
MongoDB shell version: 1.6.0
connecting to: test
Fri Aug  6 03:48:40 Error: couldn't connect to server 127.0.0.1} (anon):1137
exception: connect failed
</pre>
<h3>Configuring replica set members</h3>
<p>Once you can connect to the mongo console, we need to set up the replica set. If you have a compliant configuration, then you can just call <code>rs.initiate()</code> and everything will get spun up. If you don&#8217;t, though, you&#8217;ll need to specify your initial configuration.</p>
<p>This is where I hit my first problem; the hostname as the system defines it didn&#8217;t resolve. This was resulting in the following:</p>
<pre class="brush: jscript;">
[root@261668-db3 init.d]# mongo --port 27017
MongoDB shell version: 1.6.0
connecting to: 127.0.0.1:27017/test
&gt; rs.initiate();
{
        &quot;info2&quot; : &quot;no configuration explicitly specified -- making one&quot;,
        &quot;errmsg&quot; : &quot;couldn't initiate : need members up to initiate, not ok : 261668-db3.db3.domain.com:27017&quot;,
        &quot;ok&quot; : 0
}
</pre>
<p>The solution, then, is to specify the members, and to use a resolvable internal name. Note that you do NOT include the arbiter&#8217;s information; you don&#8217;t want to add it to the replica set early as a full-fledged member.</p>
<pre class="brush: jscript;">
&gt; cfg = {_id: &quot;my_replica_set&quot;, members: [{_id: 0, host: &quot;db3:27017&quot;}, {_id: 1, host: &quot;db3:27018&quot;}] }
&gt; rs.initiate(cfg);
{
        &quot;info&quot; : &quot;Config now saved locally.  Should come online in about a minute.&quot;,
        &quot;ok&quot; : 1
}
</pre>
<p>Bingo. We&#8217;re in business.</p>
<h3>Configuring the replica set arbiter</h3>
<p>If the replica set master fails, a new master is elected. To be elected, a node needs at least floor(<em>n</em> / 2) + 1 votes, where <em>n</em> is the number of voting nodes configured in the set (not just the ones currently up). In a paired setup, if the master were to fail, the remaining slave wouldn&#8217;t be able to elect itself as the new master, since it would only have 1 of the 2 votes it needs. Thus, we run an arbiter, which is a special lightweight node that holds no data and whose only job is to be a tiebreaker. It will vote with the orphaned slave and elect it as the new master, so that the slave can continue duties while the old master is offline.</p>
<pre class="brush: jscript;">
&gt; rs.addArb(&quot;db3:27019&quot;)
{
        &quot;startupStatus&quot; : 6,
        &quot;errmsg&quot; : &quot;Received replSetInitiate - should come online shortly.&quot;,
        &quot;ok&quot; : 0
}
</pre>
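<p>The vote arithmetic is easy to check by hand, but here it is as a small Ruby sketch. It treats <em>n</em> as the number of voting members configured in the set; <code>votes_needed</code> and <code>can_elect?</code> are names I made up for illustration, not anything MongoDB exposes.</p>

```ruby
# Votes a node needs to become master in a set with n voting members.
def votes_needed(voting_members)
  voting_members / 2 + 1 # integer division is floor(n / 2) for non-negative n
end

# Can the surviving members still reach a majority after `failed` go down?
def can_elect?(voting_members, failed)
  (voting_members - failed) >= votes_needed(voting_members)
end

pair_survives          = can_elect?(2, 1) # two data nodes, master dies
pair_plus_arb_survives = can_elect?(3, 1) # two data nodes + arbiter, master dies
```

<p>With only the pair, the lone survivor has 1 vote and needs 2. Add the arbiter and the survivor plus the arbiter make 2 of 3, which is enough.</p>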
<h3>Updated driver usage</h3>
<p>Once we&#8217;re set up, the Ruby Mongo connection code is updated to connect to a replica set rather than a single server.</p>
<p>Before:</p>
<pre class="brush: ruby;">
MongoMapper.connection = Mongo::Connection.new(&quot;db3&quot;, 27017)
</pre>
<p>After:</p>
<pre class="brush: ruby;">
MongoMapper.connection = Mongo::Connection.multi([[&quot;db3&quot;, 27017], [&quot;db3&quot;, 27018]])
</pre>
<p>This will attempt to connect to each of the defined servers, and get a list of all the visible nodes, then find the master. Since you don&#8217;t have to specify the full list, you don&#8217;t have to update your connection info each time you change the machines in the set. All it needs is at least one connectable server (even a slave) and the driver will figure out the master from there.</p>
<h3>Conclusion</h3>
<p>That&#8217;s about all there is to it! We&#8217;re now up and running with a replica set. We can add new slaves to the replica set, force a new master, take nodes in the cluster down, and all that jazz without impacting your app. You can even set up replica slaves in other data centers for zero-effort offsite backup. If your DB server exploded, you could point your app at the external datacenter&#8217;s node and keep running while you replace your local database server. Once your new server is up, just bring it online and re-add its node back into your replica set. Data will be transparently synched back to your local node. Once the sync is complete, you can re-elect your local node as the master, and all is well again.</p>
<p>Congratulations &#8211; enjoy your new replica set!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/08/06/setting-up-replica-sets-with-mongodb-1-6/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Click mapping with HTML5 and node.js</title>
		<link>http://www.coffeepowered.net/2010/08/03/click-mapping-with-html5-and-node-js/</link>
		<comments>http://www.coffeepowered.net/2010/08/03/click-mapping-with-html5-and-node-js/#comments</comments>
		<pubDate>Tue, 03 Aug 2010 21:36:50 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[jQuery]]></category>
		<category><![CDATA[node.js]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=273</guid>
		<description><![CDATA[I was recently in need of a click mapping solution, and didn&#8217;t like most of the solutions I came across. They had huge dependency chains and were generally unwieldy, or they didn&#8217;t work that well, or they were external services that I had to pay for&#8230;until I ran across heatmapthing. Now we&#8217;re talking. Client-side rendering [...]]]></description>
			<content:encoded><![CDATA[<p>I was recently in need of a click mapping solution, and didn&#8217;t like most of the solutions I came across. They had huge dependency chains and were generally unwieldy, or they didn&#8217;t work that well, or they were external services that I had to pay for&#8230;until I ran across <a href="http://heatmapthing.heroku.com/">heatmapthing</a>. Now we&#8217;re talking. Client-side rendering of JSON location data &#8211; we&#8217;re in business!</p>
<p>First things first. If this is TL;DR for you, <a href="http://coffeepowered.net/projects/clickhax/">here&#8217;s the demo</a>, or click the &#8220;Click Heatmap&#8221; button in the corner of this page.</p>
<p>My first iteration was an endpoint in my current Rails app, which handled saving/sending of click data. That worked fine, but for something as lightweight and common as a click, I didn&#8217;t want to be invoking my full Rails stack. I&#8217;ve also been meaning to play with <a href="http://nodejs.org/">node.js</a>&#8230;well hey, there&#8217;s an opportunity here!</p>
<p>First, I had to get the client code working. I modified the existing heatmap code into a jQuery plugin, which handled all the setup/transmission/rendering of data. This enables you to do something like so:</p>
<pre><code>$("#body_wrapper").clickhax({ trigger: "#showHeatmap", endpoint: "/map" });</code></pre>
<p>What that does is attach the handlers to your wrapper element, and sets up the HTML5 canvas to display over that. Events will be sent to <code>/map</code> (which, in this example, is ProxyPassed to my node.js daemon), and clicking an element with an id of <code>showHeatmap</code> causes the heatmap data to be fetched and rendered on the client. The client itself just takes a raw JSON dataset and performs smoothing and rendering with it. It&#8217;s fairly basic canvas work &#8211; the majority of the heavy lifting is non-graphical &#8211; but still, it&#8217;ll only work in browsers that support the canvas tag. Sorry, oldschool IE users.</p>
<p>Okay, great, that&#8217;s working, what about the backend now?</p>
<p>Node.js is remarkably easy to get up and running on, and with the addition of the <a href="http://expressjs.com/">Express</a> package, it behaves an awful lot like Sinatra. I&#8217;m using MongoDB as my backend store for this, which is handy, since it natively speaks JSON, and there are client libraries for node. Using the npm utility, I quickly had them installed and was up and running.</p>
<p>You can <a href="http://github.com/cheald/clickhax/blob/master/clickhax-daemon.js">see the code on GitHub</a>, but I&#8217;ll touch on the key points here first.</p>
<p>The biggest gotcha I ran into was in my handling of the database connection. It took me a little while to recognize that the calls being made are <strong>asynchronous</strong>. This is important. This is very <em>important</em>. Rather than writing it top-down like a Ruby script, I had to, as you can see, use the provided callback chains. In particular, in my get(&#8220;/&#8221;) handler, I was performing the query and then immediately trying to iterate the cursor &#8211; this doesn&#8217;t work! You have to iterate in the callback. (In my defense, it was late and my brain was foggy!)</p>
<p>The code is pretty straightforward, though. When you post to your endpoint, it accepts x and y parameters and parses out the referring URL as the click target page. The plugin computes the click as an offset from the top left corner of your wrapped element, so if you have a fixed-width wrapper, your click data remains consistent even with differing monitor sizes. Data is quantized to 5px before storage, and storage is done with upserts and MongoDB&#8217;s atomic increment; multiple clicks in the same 5px square will simply increment a counter in that record, rather than saving a record per click.</p>
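<p>To make the bucketing concrete, here is a rough Ruby sketch of the idea. It is not the daemon&#8217;s code: a plain Hash stands in for the MongoDB collection, and <code>quantize</code> and <code>record_click</code> are hypothetical names.</p>

```ruby
GRID = 5 # pixel size of one bucket, per the text

# Snap a coordinate down to the top-left corner of its 5px square.
def quantize(value, grid = GRID)
  (value / grid) * grid
end

# In-memory stand-in for an upsert plus atomic increment: Hash.new(0)
# supplies a zero for a missing bucket, and += 1 bumps the counter.
def record_click(counts, page, x, y)
  key = [page, quantize(x), quantize(y)]
  counts[key] += 1
end

counts = Hash.new(0)
record_click(counts, "/demo", 101, 203) # lands in the (100, 200) square
record_click(counts, "/demo", 104, 200) # same square, counter goes to 2
record_click(counts, "/demo", 105, 200) # next square over
```

<p>Three clicks end up as two stored buckets, which is the whole point: storage grows with the number of distinct squares, not the number of clicks.</p>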
<p>Positions are encoded by assuming a maximum width of 3000px. This allows us to store positions as single integers, rather than position pairs or strings. The client plugin is aware of this, and can reverse a given index into an x/y pair accordingly. The getter simply constructs a hash of <code>{position: click_count}</code> and sends that to the client. The client then applies a blur pattern on top of those points to generate a smoothed heightfield, and then normalizes that heightfield to the 0..255 range. Those heights are then mapped to colors and rendered onto the canvas. That&#8217;s all there is to it!</p>
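<p>One plausible way to pack an x/y pair into a single integer with a 3000px maximum width is plain row-major indexing. This is a sketch under that assumption; the actual plugin may lay things out differently, and <code>encode_position</code> and <code>decode_position</code> are hypothetical names.</p>

```ruby
MAX_WIDTH = 3000 # assumed maximum page width in pixels, from the text

# Row-major packing: each row of pixels occupies MAX_WIDTH consecutive indexes.
def encode_position(x, y)
  y * MAX_WIDTH + x
end

# Reverse the packing back into an [x, y] pair.
def decode_position(index)
  [index % MAX_WIDTH, index / MAX_WIDTH]
end

index = encode_position(100, 200)
pair  = decode_position(index)
```

<p>The round trip only works while x stays below the maximum width, which is why the width has to be agreed on by both the client and the server.</p>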
<p>Quantizing to 5px squares means that for my 600&#215;400 demo, I have 9600 potential squares, and each square takes 6-11 bytes of JSON to represent. Thus, even for a fully saturated clickmap, I should only ever have to receive/compute/render 103kb worth of data. That number obviously increases as you increase the size of the target area &#8211; 960&#215;2000 would be a maximum of 825kb of data for a fully saturated clickmap. However, in practice, full saturation should be a non-concern. Your clicks will be focused around interactable elements, and due to the atomic increment counters, heatmaps should remain light and snappy both for inserts and fetches, regardless of the number of clicks in a page.</p>
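<p>If you want to size this for your own page, the worst-case arithmetic is short enough to script. This sketch takes the 5px grid and the 11-byte upper bound per square from the numbers above; <code>max_payload</code> is just an illustrative helper.</p>

```ruby
# Worst case: every square has at least one click and costs the full 11 bytes.
def max_payload(width, height, grid: 5, bytes_per_square: 11)
  squares = (width / grid) * (height / grid)
  { squares: squares, bytes: squares * bytes_per_square }
end

demo = max_payload(600, 400)
tall = max_payload(960, 2000)

demo_kb = demo[:bytes] / 1024.0 # a little over 103
tall_kb = tall[:bytes] / 1024.0 # 825
```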
<p>If you don&#8217;t already have node.js and mongodb, the setup may be a bit more involved, but you could use PHP/MySQL, or Rails with SQLite or whatever as your endpoint server. The front and back ends are relatively decoupled, and can be re-used independently of each other.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/08/03/click-mapping-with-html5-and-node-js/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FlexAuth: Portable authentication for Battle.net</title>
		<link>http://www.coffeepowered.net/2010/06/10/flexauth-portable-authentication-for-battle-net/</link>
		<comments>http://www.coffeepowered.net/2010/06/10/flexauth-portable-authentication-for-battle-net/#comments</comments>
		<pubDate>Thu, 10 Jun 2010 22:58:14 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=257</guid>
		<description><![CDATA[I&#8217;ve just released my first Android app, called FlexAuth. It&#8217;s mostly an excuse to learn Android development, but it does something useful, too &#8211; it serves as a souped-up mobile authenticator for Blizzard&#8217;s Battle.net login infrastructure. If you&#8217;d like the gory details, there&#8217;s a specification floating around on the internet that&#8217;ll help you understand the [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve just released my first Android app, called <a href="http://www.cyrket.com/p/android/com.chrisheald.flexauth/">FlexAuth</a>. It&#8217;s mostly an excuse to learn Android development, but it does something useful, too &#8211; it serves as a souped-up mobile authenticator for Blizzard&#8217;s Battle.net login infrastructure. If you&#8217;d like the gory details, <a href="http://bnetauth.freeportal.us/specification.html">there&#8217;s a specification floating around on the internet</a> that&#8217;ll help you understand the protocol.</p>
<p><a href="http://i.imgur.com/yW496.png"><img src="http://i.imgur.com/yW496.png" align="right" style="margin: 1em; width:200px;" /></a><br />
Mobile authenticators work by transforming a seed value (called the &#8220;token secret&#8221;) + the current time into your 8-digit authentication code. FlexAuth lets you set up multiple authenticators by providing the secret, or will let you have Blizzard generate one for you.</p>
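<p>For a feel of how a secret plus the current time turns into an 8-digit code, here is a generic time-based one-time password sketch in the style of RFC 6238. Treat it as an illustration only: the 30-second step, the SHA1 digest, and the way the secret is fed in are assumptions on my part, not details taken from the Battle.net protocol.</p>

```ruby
require 'openssl'

# Generic RFC 6238-style code generation: HMAC the current time step with
# the secret, then truncate the result to a fixed number of decimal digits.
def time_code(secret, time = Time.now.to_i, step: 30, digits: 8)
  counter = [time / step].pack('Q>')            # time step as 8 big-endian bytes
  hmac    = OpenSSL::HMAC.digest('SHA1', secret, counter)
  offset  = hmac.bytes.last & 0x0f              # dynamic truncation offset
  binary  = hmac.byteslice(offset, 4).unpack1('N') & 0x7fffffff
  format("%0#{digits}d", binary % (10**digits)) # zero-padded string
end

# Published RFC 6238 test inputs (ASCII secret, fixed timestamps).
code_a = time_code("12345678901234567890", 59)
code_b = time_code("12345678901234567890", 1111111109)
```

<p>Two devices holding the same secret and roughly the same clock will produce the same code, which is why restoring a token from a backed-up secret works.</p>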
<p>Why would you need this?</p>
<ul>
<li>You want to use a mobile authenticator, but don&#8217;t want to be locked out if you ever lose your phone (just set up a new token with your registered token secret).</li>
<li>You want to use multiple mechanisms to log in &#8211; maybe you need token authentication in a script, or you want to have the same authenticator values on multiple mobile phones.</li>
<li>You already have a token secret from another source and want to use it on your mobile phone.</li>
</ul>
<p>Obviously, these won&#8217;t apply to most people, but some folks will definitely find it useful.</p>
<hr style="clear: both;" />
<p><a href="http://i.imgur.com/NbAGQ.png"><img src="http://i.imgur.com/NbAGQ.png" align="right" style="margin: 1em; width:200px;" /></a><br />
Using it:</p>
<ol>
<li>Menu -> Add Account</li>
<li>Enter a name for this token/account. It can be whatever you&#8217;d like.</li>
<li>Either enter a serial + secret, or you can use the already-provided one, or generate a new one.</li>
<li>Save the token. You&#8217;ll notice that auth codes start generating right away.</li>
<li>It is highly recommended that you back up your token secret. If you uninstall the app, wipe your phone, etc., then you will lose the secret, and consequently lose the ability to generate auth codes. To back up a secret, click into the token&#8217;s details, and long press on the secret to copy it. You can then paste it into a note or email or whatnot. To restore a token, simply generate a new token and use your backed up secret. It will generate compatible auth codes.</li>
</ol>
<p>All that said, <span style="color: #ff0000;">a word of caution</span>: <b>Never ever ever run authenticator software on the same machine that you&#8217;re logging in on.</b> It&#8217;s bad, it&#8217;s dumb, and you shouldn&#8217;t do it. Keep your authentication token generation on a separate device if you value your account.</p>
<p>If <a href="http://www.wowwiki.com/Battle.net_Mobile_Authenticator#Desktop_port">any particular same-machine authentication scheme</a> gained any measure of popularity, it would be targeted by malware and your authenticator would be useless. Don&#8217;t do it.</p>
<p>Other than that, enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/06/10/flexauth-portable-authentication-for-battle-net/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Serving files out of GridFS, part 2</title>
		<link>http://www.coffeepowered.net/2010/02/24/serving-files-out-of-gridfs-part-2/</link>
		<comments>http://www.coffeepowered.net/2010/02/24/serving-files-out-of-gridfs-part-2/#comments</comments>
		<pubDate>Wed, 24 Feb 2010 11:44:24 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=244</guid>
		<description><![CDATA[Since my initial experiments with GridFS and nginx-gridfs, I discovered a rather downer of a dealbreaker: compiling Passenger and nginx-gridfs into the same nginx binary makes nginx very unhappy. It hard-freezes (as in, blocks forever) when you request a GridFS file with Passenger enabled. Oops. So, I sat down and fixed gridfs-fuse. You can grab [...]]]></description>
			<content:encoded><![CDATA[<p>Since my initial experiments with <a href="http://www.mongodb.org/display/DOCS/GridFS+Specification">GridFS</a> and <a href="http://github.com/mdirolf/nginx-gridfs">nginx-gridfs</a>, I discovered a rather downer of a dealbreaker: compiling <a href="http://www.modrails.com/">Passenger</a> and nginx-gridfs into the same <a href="http://nginx.org/">nginx</a> binary makes nginx very unhappy. It hard-freezes (as in, blocks forever) when you request a GridFS file with Passenger enabled. Oops.</p>
<p>So, I sat down and fixed gridfs-fuse. You can grab <a href="http://github.com/cheald/gridfs-fuse">my branch at GitHub</a>. I made a few changes that make it ideal for serving files out of a GridFS DB, with a few caveats.<br />
<span id="more-244"></span></p>
<h2>Installation and Configuration</h2>
<p>Building it is relatively simple.</p>
<ol>
<li>Install scons, the Python-based build tool that reads the SConstruct file (on Fedora/CentOS/RHEL, <code>yum install scons</code>)</li>
<li>Extract or symlink a copy of your <a href="http://www.mongodb.org/display/DOCS/Home">mongodb</a> install to <code>/opt/mongo</code></li>
<li>Run <code>scons</code></li>
<li>If all builds well, yay. If not, fix any missing dependencies or path issues. Edit SConstruct to change any paths that you need to.</li>
<li>Create a mount point for your GridFS filesystem; I used /mnt/gridfs (<code>sudo mkdir /mnt/gridfs</code>)</li>
<li>chown your mount point to your webserver&#8217;s user. If you run Apache, this is probably <code>apache</code>. If you run nginx, it&#8217;s probably <code>nobody</code>. (<code>sudo chown nobody:nobody /mnt/gridfs</code>)</li>
<li>Mount the database to the mount point.
<pre class="brush: bash;">
sudo -u nobody ./mount_gridfs --db=your_database --host=localhost /mnt/gridfs
</pre>
<p>Change the user and db parameters as required.</p>
</li>
<li>Configure your webserver to serve files appropriately. In my case, I have <a href="http://github.com/jnicklas/carrierwave">carrierwave</a> set up to write files to <code>uploads/model/_id/filename.png</code>, and carrierwave is configured to use <code>/images/gfs</code> as my base URL. This means that for a given file, I might end up with a path like <code>/images/gfs/uploads/user/avatar/4b8475cc69e0dc57e7000005/thumb_untitled-20.png</code>. To cause the GridFS files to be served off of the mount point, I just symlinked the mount to /images/gfs.
<pre class="brush: bash;">
cd public/images
ln -s /mnt/gridfs gfs
</pre>
</li>
</ol>
<p>Once that&#8217;s all set up, you should be able to use your webserver to serve images directly out of your Mongo database, and at pretty fair rates, too!</p>
<h2>143% Unscientific Benchmarks</h2>
<pre class="brush: plain;">
[chris@polaris gridfs-fuse]# ab -n 5000 -c 25 http://advice:81/images/gfs/uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png

Server Software:        nginx/0.8.33
Server Hostname:        advice
Server Port:            81

Document Path:          /images/gfs/uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png
Document Length:        14332 bytes

Concurrency Level:      25
Time taken for tests:   5.029 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      72725000 bytes
HTML transferred:       71660000 bytes
Requests per second:    994.22 [#/sec] (mean)
Time per request:       25.145 [ms] (mean)
Time per request:       1.006 [ms] (mean, across all concurrent requests)
Transfer rate:          14121.93 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    16   25   1.4     25      52
Waiting:        2   24   1.4     24      52
Total:         17   25   1.4     25      53

Percentage of the requests served within a certain time (ms)
  50%     25
  66%     25
  75%     25
  80%     25
  90%     25
  95%     26
  98%     27
  99%     32
 100%     53 (longest request)
</pre>
<h2>Caveats</h2>
<p>To get this working, I had to hack in directory support. GridFS stores files with paths, but doesn&#8217;t store them in a hierarchy; Fuse navigates a filesystem, which is hierarchical. In order to overcome this, I made gridfs-fuse respond to directory requests as valid. For a given file, gridfs-fuse will walk the following path hierarchy:</p>
<p><code>GET /uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png</code><br />
Check for <code>uploads</code>, directory exists<br />
Check for <code>uploads/user</code>, directory exists<br />
Check for <code>uploads/user/avatar</code>, directory exists<br />
Check for <code>uploads/user/avatar/4b8347a698db740b30000057</code>, directory exists<br />
Check for <code>uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png</code>, file exists, return file.</p>
<p>There are two things to be aware of here:</p>
<ol>
<li>The deeper your path hierarchy, the more steps gridfs-fuse will take to find your file. Less directory nesting means faster file serving. The performance difference won&#8217;t be massive, but it&#8217;s there.</li>
<li><strong>/!\ Big giant hack. /!\</strong> <em>gridfs-fuse assumes that any path part with a period in it is the path leaf</em>. This is done so that we don&#8217;t have to keep querying the DB with regexes, which degrades performance by about 90% in my testing. Always, always, always make sure your filenames have a period in them, and make sure your directories do not have a period in them. This is a rather hefty set of caveats, but if you&#8217;ll stick to them, you will be rewarded with easy GridFS file serving.</li>
</ol>
<h3>What happens if I don&#8217;t follow those rules?</h3>
<p>A few things happen. If you put periods in directory names, you&#8217;ll get 404s. They&#8217;ll be fast 404s, but they&#8217;ll be 404s. Even if a filepath is valid, like <code>/images/foo.bar/baz/bin.png</code>, gridfs-fuse will short-circuit at <code>images/foo.bar</code>, assuming that is the leaf of the hierarchy.</p>
<p>If you don&#8217;t put a period in your filenames, then gridfs-fuse will keep returning &#8220;yup, that&#8217;s a directory&#8221;, even when your webserver requests <code>/images/foo.bar/baz/bin.png/index.html</code> and then <code>/images/foo.bar/baz/bin.png/index.html/index.html</code> and then <code>/images/foo.bar/baz/bin.png/index.html/index.html/index.html</code>, and so forth. There&#8217;s a built-in stop at 10 levels deep &#8211; at 10 levels, gridfs-fuse gives up and just returns a 404, but it&#8217;ll take you a relatively long time to get there, and it&#8217;s really very highly recommended that you don&#8217;t do that.</p>
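<p>Here is how I would model that behavior in a few lines of Ruby. This is a sketch of the heuristic as described above, not a port of the gridfs-fuse source; <code>walk</code> and its status symbols are made up for illustration, and the exact point where the depth limit kicks in is an assumption.</p>

```ruby
MAX_DEPTH = 10 # the built-in stop described in the text

# Classify each prefix of a path the way the period heuristic would.
# :directory  means "answered as a directory without touching the DB"
# :file       means "looked up in GridFS as a file"
# :not_found  means "gave up, 404"
def walk(path)
  parts = path.split("/").reject { |part| part.empty? }
  steps = []
  parts.each_with_index do |part, i|
    prefix = parts[0..i].join("/")
    if i >= MAX_DEPTH
      steps.push([prefix, :not_found])
      break
    elsif part.include?(".")
      # A part with a period is assumed to be the leaf, so it can only
      # resolve if it really is the last part of the path.
      steps.push([prefix, i == parts.length - 1 ? :file : :not_found])
      break
    else
      steps.push([prefix, :directory])
    end
  end
  steps
end

good   = walk("/uploads/user/avatar/4b8347a698db740b30000057/thumb_adrine-big.png")
dotted = walk("/images/foo.bar/baz/bin.png")
deep   = walk("/" + (["dir"] * 12).join("/"))
```

<p>The well-formed path resolves in five steps, the dotted directory 404s on its second step, and the period-free path keeps getting &#8220;directory&#8221; answers until the depth limit stops it.</p>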
<h2>What about when gridfs-fuse isn&#8217;t running?</h2>
<p>Never fear, that&#8217;s easily fixed. Just use a Rack or Rails Metal middleware to serve images from GridFS. This is <strong>massively</strong> slower than serving files through gridfs-fuse, but at least your visitors won&#8217;t be treated to a site full of broken images if your mount point goes away for whatever reason. I&#8217;m using the following Metal endpoint. Just throw it into app/metal/gridfs.rb, add <code>config.metals = ["Gridfs"]</code> into your environment.rb, and you&#8217;re off to the races.</p>
<pre class="brush: ruby;">
# rails metal to be used with carrierwave (gridfs) and MongoMapper

require 'mongo'
require 'mongo/gridfs'

# Allow the metal piece to run in isolation
require(File.dirname(__FILE__) + &quot;/../../config/environment&quot;) unless defined?(Rails)

class Gridfs
  def self.call(env)
    if env[&quot;PATH_INFO&quot;] =~ /^\/images\/gfs\/(.+)$/
      key = $1
      if ::GridFS::GridStore.exist?(MongoMapper.database, key)
        ::GridFS::GridStore.open(MongoMapper.database, key, 'r') do |file|
          [200, {'Content-Type' =&gt; file.content_type}, [file.read]]
        end
      else
        [404, {'Content-Type' =&gt; 'text/plain'}, ['File not found.']]
      end
    else
      [404, {'Content-Type' =&gt; 'text/plain'}, ['File not found.']]
    end
  end
end
</pre>
<p>(I didn&#8217;t write that, but I can&#8217;t find the source to give credit at the moment).</p>
<p>That gives you a highly performant front-end solution with a reliable fallback. For any given request, the following should happen:</p>
<ol>
<li>Your webserver attempts to load the file out of GridFS. If it can&#8217;t be found (likely due to a missing mountpoint), then&#8230;</li>
<li>The request will fall through to your Metal handler. It will then attempt to serve it from GridFS.</li>
<li>If it still can&#8217;t be found, the request falls through to your Rails app.</li>
</ol>
<p>To prevent step 3 from happening, you might want to change line 18 of the Metal handler to return a 200 and read out a generic &#8220;missing image&#8221; image of some sort. That&#8217;ll prevent 404s from invoking a hit to your app.</p>
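<p>A minimal sketch of that replacement response might look like the following. The placeholder path and the <code>missing_image_response</code> helper are hypothetical; the only real constraint is that it returns the same kind of status/headers/body triple the handler already returns.</p>

```ruby
# Hypothetical placeholder location; point this at whatever image you ship.
MISSING_IMAGE_PATH = "public/images/missing.png"

# Build a Rack-style [status, headers, body] triple that serves the
# placeholder with a 200 instead of letting the request fall through.
def missing_image_response(path = MISSING_IMAGE_PATH)
  body = File.binread(path)
  headers = {
    'Content-Type'   => 'image/png',
    'Content-Length' => body.bytesize.to_s
  }
  [200, headers, [body]]
end
```

<p>You would return <code>missing_image_response</code> from the inner &#8220;file not found&#8221; branch. In a real app you would probably read the placeholder once at boot rather than on every miss.</p>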
<p>Stick a CDN in front of it all, and you have a high-performance file upload solution with automatic replication and sharding that you can treat like any other piece of web data. Hooray!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/02/24/serving-files-out-of-gridfs-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Serving files out of GridFS</title>
		<link>http://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/</link>
		<comments>http://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/#comments</comments>
		<pubDate>Wed, 17 Feb 2010 20:54:11 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=233</guid>
		<description><![CDATA[GridFS is a nifty little feature in MongoDB that allows you to store files of all shapes and sizes in Mongo itself, getting the benefits of Mongo&#8217;s sharding and replication. However, since they&#8217;re in a database, and not on the filesystem directly, how do we serve them? There are lots of benchmarks and numbers under [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.mongodb.org/display/DOCS/GridFS+Specification">GridFS</a> is a nifty little feature in <a href="http://www.mongodb.org/display/DOCS/Home">MongoDB</a> that allows you to store files of all shapes and sizes in Mongo itself, getting the benefits of Mongo&#8217;s sharding and replication. However, since they&#8217;re in a database, and not on the filesystem directly, how do we serve them?</p>
<p>There are lots of benchmarks and numbers under the cut. Keep reading!</p>
<p><span id="more-233"></span></p>
<p>Right now, there are three options:</p>
<ol>
<li>Use a &#8220;low-level&#8221; script handler, like a Rack script or Rails Metal handler, to serve them out of the database</li>
<li>Use something like <a href="http://github.com/mikejs/gridfs-fuse/">gridfs-fuse</a> to mount the database as a filesystem, and let the web server read files from it directly</li>
<li>Use something like <a href="http://github.com/mdirolf/nginx-gridfs">nginx-gridfs</a> to talk directly to MongoDB from your webserver.</li>
</ol>
<p>I wasn&#8217;t able to get gridfs-fuse to build on my system, but I was able to build the nginx module. The question, of course, is how fast are you going to be serving files with each solution?</p>
<h2>Filesystem read through Apache</h2>
<p>First, I&#8217;ll establish a baseline. I&#8217;m running Apache as my frontend server, and we&#8217;ll use ab to benchmark its throughput.</p>
<pre class="brush: ruby;">[chris@polaris conf]# ab -n 5000 -c 10 http://advice/images/embed/normal_alliance-60.png

Server Software:        Apache/2.2.13
Server Hostname:        advice
Server Port:            80

Document Path:          /images/embed/normal_alliance-60.png
Document Length:        31596 bytes

Concurrency Level:      10
Time taken for tests:   1.904 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      159463760 bytes
HTML transferred:       158043192 bytes
Requests per second:    2625.37 [#/sec] (mean)
Time per request:       3.809 [ms] (mean)
Time per request:       0.381 [ms] (mean, across all concurrent requests)
Transfer rate:          81767.87 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   0.4      1       4
Processing:     1    3   0.5      3       6
Waiting:        0    1   0.4      1       4
Total:          2    4   0.4      4       8

Percentage of the requests served within a certain time (ms)
  50%      4
  66%      4
  75%      4
  80%      4
  90%      4
  95%      4
  98%      5
  99%      5
 100%      8 (longest request)
</pre>
<p>Nice and fast, just like we&#8217;d expect.</p>
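<p>As an aside, the two &#8220;Time per request&#8221; lines in ab&#8217;s output follow from the request rate and the concurrency level. A small sketch of that arithmetic, using the Apache numbers above (the method names are mine, not part of ab):</p>

```ruby
# Mean time per request as ab reports it, in milliseconds: roughly the
# latency each of the concurrent clients sees for a single request.
def time_per_request_ms(requests_per_second, concurrency)
  concurrency * 1000.0 / requests_per_second
end

# The same figure averaged across all concurrent requests.
def time_per_request_across_all_ms(requests_per_second)
  1000.0 / requests_per_second
end

apache_rps = 2625.37
puts time_per_request_ms(apache_rps, 10).round(3)        # 3.809
puts time_per_request_across_all_ms(apache_rps).round(3) # 0.381
```

<p>That is worth keeping in mind when comparing runs made at different concurrency levels: requests per second is the number that compares cleanly.</p>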
<h2>Filesystem read through nginx</h2>
<pre class="brush: ruby;">[chris@polaris conf]# ab -n 50000 -c 10 http://advice:81/images/embed/normal_alliance-60.png

Server Software:        nginx/0.8.33
Server Hostname:        advice
Server Port:            81

Document Path:          /images/embed/normal_alliance-60.png
Document Length:        31596 bytes

Concurrency Level:      10
Time taken for tests:   7.623 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Total transferred:      1590513618 bytes
HTML transferred:       1579863192 bytes
Requests per second:    6559.31 [#/sec] (mean)
Time per request:       1.525 [ms] (mean)
Time per request:       0.152 [ms] (mean, across all concurrent requests)
Transfer rate:          203763.10 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       9
Processing:     1    1   0.4      1      11
Waiting:        0    0   0.1      0       9
Total:          1    1   0.5      1      12

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      2
  90%      2
  95%      2
  98%      3
  99%      3
 100%     12 (longest request)
</pre>
<p>nginx <i>screams</i>. At roughly 6,560 requests/sec, about two and a half times the Apache baseline, it&#8217;s blisteringly fast.</p>
<h2>GridFS read through nginx-gridfs</h2>
<pre class="brush: ruby;">[chris@polaris conf]# ab -n 5000 -c 10 http://advice:81/images/gfs/uploads/user/avatar/4b7b2c0e98db7475fc000003/normal_alliance-60.png

Server Software:        nginx/0.8.33
Server Hostname:        advice
Server Port:            81

Document Path:          /images/gfs/uploads/user/avatar/4b7b2c0e98db7475fc000003/normal_alliance-60.png
Document Length:        31596 bytes

Concurrency Level:      10
Time taken for tests:   4.613 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      158580000 bytes
HTML transferred:       157980000 bytes
Requests per second:    1083.88 [#/sec] (mean)
Time per request:       9.226 [ms] (mean)
Time per request:       0.923 [ms] (mean, across all concurrent requests)
Transfer rate:          33570.65 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     1    9   4.7      9     103
Waiting:        1    9   4.7      9     102
Total:          2    9   4.7      9     103

Percentage of the requests served within a certain time (ms)
  50%      9
  66%      9
  75%      9
  80%      9
  90%      9
  95%      9
  98%      9
  99%     11
 100%    103 (longest request)
</pre>
<p>Definitely a lot slower, but still very respectable. Roughly 1,084 requests/sec is going to be more than adequate for most purposes, particularly if fronted with a CDN.</p>
<p>And finally&#8230;</p>
<h2>Rails Metal handler</h2>
<p>The nice thing about the Rails metal handler solution is that it&#8217;s easy. No recompiling, just drop the handler into your project and you&#8217;re off to the races. That said&#8230;</p>
<pre class="brush: ruby;">[chris@polaris nginx-gridfs]$ ab -n 250 -c 4  http://advice/images/gfs/uploads/user/avatar/4b7b2c0e98db7475fc000003/normal_alliance-60.png

Server Software:        Apache/2.2.13
Server Hostname:        advice
Server Port:            80

Document Path:          /images/gfs/uploads/user/avatar/4b7b2c0e98db7475fc000003/normal_alliance-60.png
Document Length:        31596 bytes

Concurrency Level:      4
Time taken for tests:   4.646 seconds
Complete requests:      250
Failed requests:        0
Write errors:           0
Total transferred:      7960000 bytes
HTML transferred:       7899000 bytes
Requests per second:    53.81 [#/sec] (mean)
Time per request:       74.338 [ms] (mean)
Time per request:       18.585 [ms] (mean, across all concurrent requests)
Transfer rate:          1673.10 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    15   74  75.6     34     287
Waiting:        0   72  75.8     30     276
Total:         15   74  75.6     34     288

Percentage of the requests served within a certain time (ms)
  50%     34
  66%     39
  75%    139
  80%    192
  90%    201
  95%    210
  98%    239
  99%    245
 100%    288 (longest request)
</pre>
<p>I ran far fewer requests this go-round, and the reason is pretty obvious &#8211; running 5000 requests through the Ruby stack would have taken approximately <em>forever</em>. At just under 54 requests per second, this is not an attractive solution, particularly if you consider the CPU overhead that it&#8217;s incurring.</p>
<h2>Conclusions</h2>
<table class='data' border='1'>
<tr>
<th>Solution</th>
<th>Requests/second</th>
<th>% of Apache FS</th>
<th>% of Nginx FS</th>
<th>% of Nginx GridFS</th>
<th>% of Apache Ruby</th>
</tr>
<tr>
<td>Filesystem via Apache</td>
<td>2625.37</td>
<td>-</td>
<td>40.03%</td>
<td>242.22%</td>
<td>4,878.96%</td>
</tr>
<tr>
<td>Filesystem via Nginx</td>
<td>6559.31</td>
<td>249.84%</td>
<td>-</td>
<td>605.17%</td>
<td>12,189.76%</td>
</tr>
<tr>
<td>GridFS via nginx module</td>
<td>1083.88</td>
<td>41.28%</td>
<td>16.52%</td>
<td>-</td>
<td>2,014.27%</td>
</tr>
<tr>
<td>Rails metal handler via Passenger</td>
<td>53.81</td>
<td>2.05%</td>
<td>0.82%</td>
<td>4.96%</td>
<td>-</td>
</tr>
</table>
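<p>Each percentage is the row&#8217;s requests/second divided by the column&#8217;s. A short sketch that reproduces the table from the four measured rates (the hash keys are just labels I picked for this example):</p>

```ruby
# Measured requests/second from the four ab runs above.
RATES = {
  "Apache FS"    => 2625.37,
  "Nginx FS"     => 6559.31,
  "Nginx GridFS" => 1083.88,
  "Apache Ruby"  => 53.81
}

# Throughput of `row` expressed as a percentage of `column`.
def percent_of(row, column)
  (RATES[row] / RATES[column] * 100).round(2)
end

# Print one line per solution, with "-" on the diagonal.
RATES.each_key do |row|
  cells = RATES.keys.map do |column|
    row == column ? "-" : format("%.2f%%", percent_of(row, column))
  end
  puts "#{row.ljust(14)} #{cells.join('  ')}"
end
```

<p>Reading across a row tells you how that solution stacks up against each of the others; anything over 100% means the row is the faster of the two.</p>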
<p>If you&#8217;re looking to abstract away from storing files on a filesystem, GridFS is a feasible solution. It can really crank out some mean numbers. It&#8217;s not up to par with a raw filesystem read, but consider that in many production environments that read would be happening via an NFS or GFS share, which is going to massively degrade the performance of the request. Given the no-hassle, store-and-forget-about-it solution that GridFS offers, even when faced with the challenge of multi-server replication, it seems that you can get enough performance out of it to justify it as a solution.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2010/02/17/serving-files-out-of-gridfs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
