<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Coffee Powered &#187; optimization</title>
	<atom:link href="http://www.coffeepowered.net/tag/optimization/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.coffeepowered.net</link>
	<description>code and content</description>
	<lastBuildDate>Mon, 09 Jan 2012 18:32:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Re: Simple RoR+MySQL optimization</title>
		<link>http://www.coffeepowered.net/2008/09/30/re-simple-rormysql-optimization/</link>
		<comments>http://www.coffeepowered.net/2008/09/30/re-simple-rormysql-optimization/#comments</comments>
		<pubDate>Tue, 30 Sep 2008 21:42:44 +0000</pubDate>
		<dc:creator>Chris Heald</dc:creator>
				<category><![CDATA[Rails]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[garbage collector]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[orm]]></category>

		<guid isPermaLink="false">http://www.coffeepowered.net/?p=58</guid>
		<description><![CDATA[I recently ran across a rather bare post espousing some generic &#8220;optimization&#8221; techniques for Rails apps. It offered no education, no explanation, no benchmarks. So, I thought, why not put those claims to the test? find_by_sql versus find_by_x First, Konstantin claims that Model#find_by_field is slower than Model#find_by_sql. This one is hard to dispute; the first [...]]]></description>
			<content:encoded><![CDATA[<p>I recently ran across a <a href="http://guruonrails.com/blog/simple-ror-mysql-optimization">rather bare post</a> espousing some generic &#8220;optimization&#8221; techniques for Rails apps. It offered no education, no explanation, no benchmarks. So, I thought, why not put those claims to the test?<br />
<span id="more-58"></span></p>
<h2>find_by_sql versus find_by_x</h2>
<p>First, Konstantin claims that <code>Model#find_by_field</code> is slower than <code>Model#find_by_sql</code>. This one is hard to dispute; the first will invoke method_missing and spend time generating SQL, while the latter simply executes a statement. Is cutting the knees out from under your ORM worth the time saved? Let&#8217;s see!</p>
<pre class="brush: ruby; title: ; notranslate">
require 'benchmark'

def measure_find_by_sql_vs_orm(num = 1000)
  puts &quot;find_by_sql (#{num}x)&quot;
  puts Benchmark.measure {
    num.times { User.find_by_sql &quot;select * from users where id = 123&quot; }
  }

  puts &quot;find_by_id (#{num}x)&quot;
  puts Benchmark.measure {
    num.times { User.find_by_id 123 }
  }
end

measure_find_by_sql_vs_orm(10000)
</pre>
<p>Let&#8217;s run this a few times.</p>
<pre><code>
[chris@polaris benchmarks]$ script/runner benchmark.rb
find_by_sql (10000x)
  2.290000   0.540000   2.830000 (  4.452150)
find_by_id (10000x)
  4.660000   0.400000   5.060000 (  6.766629)

[chris@polaris benchmarks]$ script/runner benchmark.rb
find_by_sql (10000x)
  2.300000   0.480000   2.780000 (  4.473950)
find_by_id (10000x)
  4.520000   0.560000   5.080000 (  6.837272)

[chris@polaris benchmarks]$ script/runner benchmark.rb
find_by_sql (10000x)
  2.170000   0.540000   2.710000 (  4.419207)
find_by_id (10000x)
  4.580000   0.540000   5.120000 (  6.881676)

find_by_sql: Averages 4.45 sec for 10,000 queries
find_by_id: Averages 6.83 sec for 10,000 queries
</code></pre>
<p>Conclusion the first: Using the ORM to build SQL adds some overhead; in my tests, about 2.4 sec/10,000 queries, or roughly 0.00024 seconds per query. Is this worth optimizing out? Yeah, probably not. In fact, the productivity lost by hand-writing <code>find_by_sql</code> calls is likely going to end up costing the project more.</p>
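<p>For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the per-query overhead from the wall-clock column (the figure in parentheses) of the three runs above. The variable and method names are mine, purely for illustration.</p>
<pre class="brush: ruby; title: ; notranslate">
# Wall-clock seconds taken from the three benchmark runs above.
sql_times = [4.452150, 4.473950, 4.419207]
orm_times = [6.766629, 6.837272, 6.881676]
queries   = 10_000

# Arithmetic mean of a list of floats.
def mean(values)
  values.inject(:+) / values.size
end

# Extra time the ORM spends across the whole run, then per query.
overhead  = mean(orm_times) - mean(sql_times)
per_query = overhead / queries

puts format('ORM overhead: %.2f sec per %d queries', overhead, queries)
puts format('Per query:    %.6f sec', per_query)
</pre>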
<h2>IDs and numbers in quotes</h2>
<p>Second, the post claims that quoting numeric values in your SQL statements slows down your queries. This one struck me as just a <em>little</em> out there. Let&#8217;s see what the benchmarks say.</p>
<pre class="brush: ruby; title: ; notranslate">
require 'benchmark'

def measure_select_with_quotes(num = 1000)
  puts &quot;Without quotes (#{num}x):&quot;
  db = ActiveRecord::Base.connection.instance_variable_get :@connection
  puts Benchmark.measure {
    num.times { db.query(&quot;select * from users where id = 123&quot;) {} }
  }

  puts &quot;With quotes (#{num}x):&quot;
  puts Benchmark.measure {
    num.times { db.query(&quot;select * from users where id = \&quot;123\&quot;&quot;) {} }
  }
end

measure_select_with_quotes(10000)
</pre>
<p>And the results:</p>
<pre><code>
[chris@polaris benchmarks]$ script/runner benchmark.rb
Without quotes (10000x):
  0.690000   0.340000   1.030000 (  2.639554)
With quotes (10000x):
  0.670000   0.290000   0.960000 (  2.655049)

[chris@polaris benchmarks]$ script/runner benchmark.rb
Without quotes (10000x):
  0.570000   0.320000   0.890000 (  2.654003)
With quotes (10000x):
  0.550000   0.400000   0.950000 (  2.617369)
</code></pre>
<p>Well, that&#8217;s certainly interesting. Over 10,000 queries, the two variants differ by only a few hundredths of a second, and the faster one isn&#8217;t even the same from run to run. Certainly not worth combing through your codebase as an optimization point.</p>
<p>Conclusion the second: The performance difference between quoted and unquoted field values is so small as to be inconsequential.</p>
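<p>Differences this small are easy to lose in run-to-run noise. As a sketch, Ruby&#8217;s standard <code>Benchmark.bmbm</code> runs a rehearsal pass before the measured pass, which helps even out warm-up and GC effects. The <code>run_query</code> lambda below is a hypothetical placeholder that does no database work at all; swap in the real <code>db.query</code> call to measure anything meaningful.</p>
<pre class="brush: ruby; title: ; notranslate">
require 'benchmark'

# Hypothetical placeholder for db.query(sql) {}; it only touches the string.
run_query = lambda { |sql| sql.length }

unquoted = 'select * from users where id = 123'
quoted   = 'select * from users where id = "123"'
num      = 10_000

# bmbm runs every report twice: a rehearsal, then the measured pass.
results = Benchmark.bmbm do |bench|
  bench.report('without quotes') { num.times { run_query.call(unquoted) } }
  bench.report('with quotes')    { num.times { run_query.call(quoted) } }
end

# bmbm returns one Benchmark::Tms per report, from the measured pass.
results.each { |tms| puts format('real: %.6f sec', tms.real) }
</pre>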
<p>On a side note, there is a <b>very</b> interesting subtlety here. Observe the difference between</p>
<pre class="brush: ruby; title: ; notranslate">
num.times { db.query(&quot;select * from users where id = 123&quot;) {} }
</pre>
<p>and </p>
<pre class="brush: ruby; title: ; notranslate">
num.times { db.query(&quot;select * from users where id = 123&quot;) }
</pre>
<p>The former passes the <code>Mysql::Result</code> object to a block, and frees it after the block terminates. The latter does not, and the returned <code>Mysql::Result</code> object remains in scope for the entire pass of the benchmark. This subtlety makes a massive difference.</p>
<pre class="brush: ruby; title: ; notranslate">
require 'benchmark'

def measure_select_with_free(num = 1000)
  # Grab the raw Mysql connection that ActiveRecord wraps
  db = ActiveRecord::Base.connection.instance_variable_get :@connection

  puts &quot;Query with block, result immediately freed&quot;
  puts Benchmark.measure {
    num.times { db.query(&quot;select * from users where id = 123&quot;) {} }
  }

  puts &quot;Query without block, result remains in scope&quot;
  puts Benchmark.measure {
    num.times { db.query(&quot;select * from users where id = 123&quot;) }
  }
end

measure_select_with_free
</pre>
<pre><code>
[chris@polaris benchmarks]$ script/runner benchmark.rb
Query with block, result immediately freed
  0.060000   0.040000   0.100000 (  0.267983)
Query without block, result remains in scope
  5.040000   0.050000   5.090000 (  5.266476)
</code></pre>
<p>Whoa damn. Ruby&#8217;s GC is <i>slaughtering</i> performance there. Just adding a pair of curly braces makes the benchmark run <i>20 times faster</i>.</p>
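<p>To make the mechanism concrete, here is a rough pure-Ruby sketch of the &#8220;yield the result, then free it&#8221; pattern. <code>FakeResult</code> and <code>FakeConnection</code> are hypothetical stand-ins written for this post, not the mysql gem&#8217;s actual classes; they only mimic the behavior described above.</p>
<pre class="brush: ruby; title: ; notranslate">
# Hypothetical stand-in for a result set that holds native memory.
class FakeResult
  attr_reader :freed

  def initialize
    @freed = false
  end

  # Release the underlying buffer right away instead of waiting for the GC.
  def free
    @freed = true
  end
end

# Hypothetical stand-in for a connection whose query method takes an optional block.
class FakeConnection
  attr_reader :results

  def initialize
    @results = []
  end

  def query(sql)
    result = FakeResult.new
    @results.push(result)
    # No block: hand the result back and leave cleanup to the caller (or the GC).
    return result unless block_given?

    begin
      yield result
    ensure
      # Block form: the result is freed as soon as the block returns.
      result.free
    end
    nil
  end
end

db = FakeConnection.new
3.times { db.query('select * from users where id = 123') {} }
3.times { db.query('select * from users where id = 123') }

freed, pending = db.results.partition { |result| result.freed }
puts format('freed immediately: %d, left for the GC: %d', freed.size, pending.size)
</pre>
<p>If you really do need the result outside of a block, calling <code>free</code> on the <code>Mysql::Result</code> yourself once you&#8217;re done with it should have much the same effect.</p>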
<h2>It&#8217;s better to request only specific columns</h2>
<p>Finally, Konstantin mentions that selecting only specific fields from a table is faster. This is true in both MySQL and the ActiveRecord ORM, for a number of reasons. However, he says that</p>
<blockquote><p>Person.find_by_name(&#8220;Name&#8221;).phone_number. It would be much faster if you use: Person.find_by_sql(&#8220;SELECT persons.phone_number WHERE persons.name = &#8216;Name&#8217;&#8221;) </p></blockquote>
<p>Why not just use the :select option that ActiveRecord provides?</p>
<pre class="brush: ruby; title: ; notranslate">
Person.find_by_name(&quot;Name&quot;, :select =&gt; &quot;phone_number&quot;)
</pre>
<p>Let&#8217;s test those assumptions.</p>
<pre class="brush: ruby; title: ; notranslate">
require 'benchmark'

def measure_single_field_select(num = 1000)
  puts &quot;Find with all fields&quot;
  puts Benchmark.measure {
    num.times { User.find_by_id(123) }
  }

  # :select limits the columns fetched and instantiated
  puts &quot;Find with one field, with :select&quot;
  puts Benchmark.measure {
    num.times { User.find_by_id(123, :select =&gt; &quot;email&quot;) }
  }
end

measure_single_field_select
</pre>
<pre><code>
[chris@polaris benchmarks]$ script/runner benchmark.rb
Find with all fields
  0.720000   0.060000   0.780000 (  0.963273)
Find with one field, with :select
  0.310000   0.010000   0.320000 (  0.364554)

[chris@polaris benchmarks]$ script/runner benchmark.rb
Find with all fields
  0.710000   0.110000   0.820000 (  1.014548)
Find with one field, with :select
  0.260000   0.020000   0.280000 (  0.351761)
</code></pre>
<p>A very significant difference there, close to three times faster&#8230;and we didn&#8217;t have to bypass the ORM to get it, either.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.coffeepowered.net/2008/09/30/re-simple-rormysql-optimization/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
	</channel>
</rss>

