Rails How Did Tenderlove and Others Speed Up Rails?

larrylv · October 08, 2014 · Last reply by larrylv on October 12, 2014 · 6156 hits
This topic has been selected as an excellent topic by the admin.

Blog Link: http://blog.larrylv.com/how-did-tenderlove-and-others-speed-up-rails/

Rails 4.2.0 beta1 was released on August 20, 2014. According to dhh's release post:

a lot of common queries are now no less than twice as fast in Rails 4.2!

So what did the Rails team, or more specifically tenderlove (Aaron Patterson), do to improve Rails/ActiveRecord so much? Let's find out by looking at some commits.

Performance Tools

Aaron covered the tools he uses for measuring performance in his Cascadia Ruby 2014 talk.

You should definitely check out these tools. They can be very useful in your daily Ruby/Rails development.

GitHub Commit: drastically reduce object allocations

Inside the tag_option method, the value variable was HTML-escaped first and then interpolated into a string.

def tag_option(key, value, escape)
    value = value.join(" ") if value.is_a?(Array)
    value = ERB::Util.h(value) if escape # html escaped here
    %(#{key}="#{value}")                 # interpolated into a string here.
end

If we dig into the html_escape method (aliased as h), we see that it allocates an AS::SafeBuffer object:

def html_escape(s)
    s = s.to_s
    if s.html_safe?
        s
    else
        s.gsub(HTML_ESCAPE_REGEXP, HTML_ESCAPE).html_safe # a String allocated first(by String#gsub),
                                                           # then a SafeBuffer object allocated.
    end
end

So for every escaped value, tag_option allocated an AS::SafeBuffer that was never needed, because the escaped value is immediately interpolated into a new string. The fix is to add another escape method that does not wrap the result in an AS::SafeBuffer, and have tag_option call that method instead of the old one.
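The idea can be sketched without Rails. This is only an illustration under stated assumptions, not the actual commit: SafeString is a hypothetical stand-in for AS::SafeBuffer, wrapped_escape and unwrapped_escape are hypothetical names, and the escaping itself is delegated to the standard library's ERB::Util.html_escape, which returns a plain String when ActiveSupport is not loaded.

```ruby
require "erb"

# Hypothetical stand-in for AS::SafeBuffer, for illustration only.
class SafeString < String
  def html_safe?
    true
  end
end

# Escapes and then wraps the result, like the original html_escape:
# one String from the escape, plus one wrapper object.
def wrapped_escape(value)
  SafeString.new(ERB::Util.html_escape(value.to_s))
end

# Escapes without wrapping: only the escaped String is allocated.
def unwrapped_escape(value)
  ERB::Util.html_escape(value.to_s)
end

def tag_option(key, value, escape)
  value = value.join(" ") if value.is_a?(Array)
  value = unwrapped_escape(value) if escape # no wrapper object needed here
  %(#{key}="#{value}")                      # the result is a new String anyway
end

puts tag_option("class", ["a", "b<c"], true) # prints class="a b&lt;c"
```

The wrapper only matters when the escaped value is returned to a caller that checks html_safe?. Inside tag_option it is discarded right after interpolation, so skipping it changes nothing about the output.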

This tiny change reduced the number of AS::SafeBuffer objects from 1527 per request to about 500 per request, according to Aaron's benchmark. That is truly drastic. Awesome!

GitHub Commit: No need for another hash allocation / merge!

This commit is very simple, but it deserves our attention when writing Ruby code.

Calling Hash#merge! with an argument like e => e allocates a temporary hash for that argument on every call, whereas Hash#[]= does not. According to my benchmark (I wrote a blog post about Performance Differences in Ruby that you may want to check out), it really matters.

def slow
  (1..10).inject({}) { |h, e| h.merge!(e => e) }
end

def fast
  (1..10).inject({}) { |h, e| h[e] = e; h }
end

slow    72613.7 (±9.9%) i/s -     364662 in   5.082934s
fast   158245.6 (±7.1%) i/s -     796005 in   5.056857s
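To see where the difference comes from, the allocations can be counted directly. The following is a minimal sketch, assuming Ruby 2.2 or newer (where the GC.stat key is :total_allocated_objects); count_allocations, with_merge and with_assign are names made up for this example.

```ruby
# Counts the objects allocated while the block runs.
# Assumes Ruby 2.2+, where the key is :total_allocated_objects.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Builds the hash with merge!, passing a fresh one-pair hash each time.
def with_merge(n)
  (1..n).inject({}) { |h, e| h.merge!(e => e) }
end

# Builds the same hash by assigning each key directly.
def with_assign(n)
  (1..n).inject({}) { |h, e| h[e] = e; h }
end

merge_allocs  = count_allocations { with_merge(1_000) }
assign_allocs = count_allocations { with_assign(1_000) }

puts "merge!: #{merge_allocs} objects allocated"
puts "[]=:    #{assign_allocs} objects allocated"
```

Both methods return the same hash. The exact counts depend on the Ruby version, but the merge! variant should allocate roughly one extra hash per iteration.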

There have been a large number of commits like this in the Rails repo recently (because performance really matters, right?).

  • Fewer hash allocations when calling url_for by Aaron.
  • This commit by @sferik, changing Hash#keys.each to Hash#each_key. Hash#keys.each allocates an array of keys, while Hash#each_key iterates through the keys without allocating a new array. I also benchmarked this in the article mentioned above.
  • This commit by Aaron does the same thing, using Hash#each_key to avoid some object allocations.
  • Or this one.
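The Hash#keys.each versus Hash#each_key difference mentioned in the list above can be checked with a small allocation counter. This is a rough sketch, assuming Ruby 2.2 or newer for the :total_allocated_objects key; count_allocations and the sample hash are made up for this example.

```ruby
# Counts the objects allocated while the block runs.
# Assumes Ruby 2.2+, where the key is :total_allocated_objects.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

options = { controller: "posts", action: "show", id: 1 }

# Both forms visit the same keys in the same order.
via_keys = []
options.keys.each { |k| via_keys << k }
via_each_key = []
options.each_key { |k| via_each_key << k }

# keys.each builds an intermediate Array on every call; each_key does not.
keys_allocs     = count_allocations { 1_000.times { options.keys.each { |k| k } } }
each_key_allocs = count_allocations { 1_000.times { options.each_key { |k| k } } }

puts "keys.each: #{keys_allocs} objects allocated"
puts "each_key:  #{each_key_allocs} objects allocated"
```

The exact numbers vary by Ruby version, so none are shown here; the point is only that the keys.each loop pays for one extra array per call.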

@sferik gave a talk about these techniques at Baruco 2014, and he is also the one who made me want to blog about and benchmark them in my article. The video has not been released yet, but you should definitely check out his slides, Writing Fast Ruby.

GitHub Commit: reduce object allocations

This commit is basically the same as the previous one. It's about the performance differences between ways of building a hash with Array#zip. Aaron's commit message explains it all.

x = [1,2,3,4]
y = [3,2,1]

def test x, y
  hash = {}
  x.zip(y) { |k,v| hash[k] = v }
  hash
end

def test2 x, y
  Hash[x.zip(y)]
end

def test3 x, y
  x.zip(y).each_with_object({}) { |(k,v),hash| hash[k] = v }
end

def stat num
  # Note: Ruby 2.2 and later renamed this key to :total_allocated_objects.
  start = GC.stat(:total_allocated_object)
  num.times { yield }
  total_obj_count = GC.stat(:total_allocated_object) - start
  puts "#{total_obj_count / num} allocations per call"
end

stat(100) { test(x,y) }
stat(100) { test2(x,y) }
stat(100) { test3(x,y) }

__END__
2 allocations per call
7 allocations per call
8 allocations per call

Sum Up

When someone outside the Ruby/Rails community talks about Rails, performance always comes up. It is a real concern for developers choosing tools to build their apps. With so much attention being given to improving it, we can say that Rails is getting faster, and it will be much faster in the future.

So thanks to everyone who has contributed to Rails performance improvements. You make this community better and better.

❤

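Nice. I wonder how it would do in a real-world benchmark...

@48hour The Discourse team has run benchmarks across different Rails versions, and they can also be run locally. I'll test it when I have time.

Pure substance, impressive :plus1:

Learned a lot! Write fast Ruby/Rails.

All substance! 👏

Solid stuff! 👏

Looking forward to the 4.2 stable release 👏 👏 👏 👏 👏 👏 👏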

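The principle is to reduce the number of object creations and memory allocations. It's a pretty standard playbook.

Another commit worth studying: https://github.com/rails/rails/pull/17173

#14 @jasl Yes, I've been following that thread.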