A few things I got done today:
1. Added last_reply_user_login and node_name fields to the Topic model, written at creation time, to cut down the dynamic queries on the list page;
2. Fixed the performance problem caused by the pagination bug;
3. Added the Mongoid (MongoDB) cache to User, Node, and Section;
4. Moved MongoDB over to MongoIC;
5. Rebuilt Ruby on the server with the falcon patch and tried tuning the GC parameters (it looks better than before).
As the data keeps growing, I've gradually felt ruby-china getting slower and slower. The logs show the forum home page sometimes taking 600ms to load... unacceptable, and the paginated forum pages are even worse at 1300ms+ ...

So I started looking into why things had slowed down.

The first idea was to denormalize the last reply user and the node directly into Topic:
class Topic
  include Mongoid::Document

  belongs_to :user
  belongs_to :last_reply_user, :class_name => "User"
  # denormalized copy of the last reply user's login
  field :last_reply_user_login
  belongs_to :node
  # denormalized copy of the node's name
  field :node_name
end
The list page currently has to pull in three associations: user, last_reply_user, and node. The node data is rarely modified, and the list only needs its name field, so storing node_name directly when the Topic is saved spares a lot of work at query time; the same goes for last_reply_user.
After this change, the forum home page is down to only two dynamic queries.
The downside is that I had to disable the account-renaming feature.
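A rough sketch of how those fields could be written at creation time, re-opening the model above (the callback and method name here are illustrative, not necessarily the exact code in ruby-china; before_save and try are standard Rails/Mongoid):

# Sketch only: fill the denormalized fields whenever a topic is saved,
# so the list page never has to load the Node or the last reply User.
class Topic
  before_save :cache_denormalized_fields

  private

  def cache_denormalized_fields
    self.node_name = node.try(:name)
    self.last_reply_user_login = last_reply_user.try(:login)
  end
end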
The home page is fast now, but pagination is still slow. Then I happened to notice that the includes user query in the log contained a huge list of ids, as many as the total number of Topics...
Only later did I learn that will_paginate doesn't support Mongoid's way of paginating. The pagination I had was an illusion: it pulls all the records first and then paginates the resulting Array...

The fix is to additionally install the will_paginate_mongoid gem; the details are explained on its GitHub page: https://github.com/lucasas/will_paginate_mongoid

After that the speed naturally came back. This bug is invisible while the data set is small...
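For reference, a minimal sketch of the fix (assuming will_paginate is already in the Gemfile; the per_page value is only an example):

# Gemfile
gem 'will_paginate'
gem 'will_paginate_mongoid'

# will_paginate_mongoid paginates with skip/limit on the Mongoid criteria,
# so only the requested page is actually fetched from MongoDB:
@topics = Topic.all.paginate(:page => params[:page], :per_page => 25)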
Making a note here to compare against after future tuning. 1 CPU, 1 GB RAM, 4 unicorn processes, 1000 topics, 1000 users; current response times:
Started GET "/topics/last"
Completed 200 OK in 265ms (Views: 201.0ms | Mongo: 52.8ms | Solr: 0.0ms)
Started GET "/topics"
Completed 200 OK in 120ms (Views: 88.4ms | Mongo: 30.7ms | Solr: 0.0ms)
Started GET "/topics/node2?page=2"
Completed 200 OK in 198ms (Views: 152.2ms | Mongo: 36.3ms | Solr: 0.0ms)
Started GET "/topics"
Completed 200 OK in 223ms (Views: 90.3ms | Mongo: 130.8ms | Solr: 0.0ms)
The topic show page, taking topic 777 as an example (58 replies at the moment):
Started GET "/topics/777" at 2012-02-04 20:41:05 +0800
Processing by TopicsController#show as HTML
Parameters: {"id"=>"777"}
Rendered topics/_base.html.erb (1.9ms)
Rendered app/cells/topics/sidebar_for_new_topic_node.html.erb (0.9ms)
Rendered topics/_topic_info.html.erb (3.7ms)
Rendered topics/translation/_topic_reply_statics.zh-CN.html.erb (0.3ms)
Rendered replies/_reply.html.erb (5.1ms)
Rendered replies/_reply.html.erb (3.5ms)
Rendered replies/_reply.html.erb (4.0ms)
Rendered replies/_reply.html.erb (4.0ms)
Rendered replies/_reply.html.erb (5.3ms)
Rendered replies/_reply.html.erb (3.9ms)
Rendered replies/_reply.html.erb (3.2ms)
Rendered replies/_reply.html.erb (3.8ms)
Rendered replies/_reply.html.erb (3.3ms)
Rendered replies/_reply.html.erb (3.5ms)
Rendered replies/_reply.html.erb (3.6ms)
Rendered replies/_reply.html.erb (3.6ms)
Rendered replies/_reply.html.erb (3.3ms)
Rendered replies/_reply.html.erb (3.7ms)
Rendered replies/_reply.html.erb (4.4ms)
Rendered replies/_reply.html.erb (4.7ms)
Rendered replies/_reply.html.erb (3.8ms)
Rendered replies/_reply.html.erb (4.3ms)
Rendered replies/_reply.html.erb (3.7ms)
Rendered replies/_reply.html.erb (3.1ms)
Rendered replies/_reply.html.erb (3.7ms)
Rendered replies/_reply.html.erb (3.2ms)
Rendered replies/_reply.html.erb (3.0ms)
Rendered replies/_reply.html.erb (3.6ms)
Rendered replies/_reply.html.erb (3.2ms)
Rendered replies/_reply.html.erb (3.3ms)
Rendered replies/_reply.html.erb (3.3ms)
Rendered replies/_reply.html.erb (3.2ms)
Rendered replies/_reply.html.erb (3.5ms)
Rendered replies/_reply.html.erb (4.1ms)
Rendered replies/_reply.html.erb (6.9ms)
Rendered replies/_reply.html.erb (6.0ms)
Rendered replies/_reply.html.erb (3.9ms)
Rendered replies/_reply.html.erb (3.1ms)
Rendered replies/_reply.html.erb (4.3ms)
Rendered replies/_reply.html.erb (4.0ms)
Rendered replies/_reply.html.erb (4.3ms)
Rendered replies/_reply.html.erb (3.3ms)
Rendered replies/_reply.html.erb (3.8ms)
Rendered replies/_reply.html.erb (7.3ms)
Rendered replies/_reply.html.erb (5.3ms)
Rendered replies/_reply.html.erb (3.8ms)
Rendered replies/_reply.html.erb (4.1ms)
Rendered replies/_reply.html.erb (4.1ms)
Rendered replies/_reply.html.erb (3.2ms)
Rendered replies/_reply.html.erb (3.6ms)
Rendered replies/_reply.html.erb (3.6ms)
Rendered replies/_reply.html.erb (4.0ms)
Rendered replies/_reply.html.erb (3.2ms)
Rendered replies/_reply.html.erb (3.3ms)
Rendered replies/_reply.html.erb (3.9ms)
Rendered replies/_reply.html.erb (10.3ms)
Rendered replies/_reply.html.erb (8.5ms)
Rendered replies/_reply.html.erb (6.0ms)
Rendered replies/_reply.html.erb (3.7ms)
Rendered replies/_reply.html.erb (3.6ms)
Rendered replies/_reply.html.erb (3.6ms)
Rendered replies/_reply.html.erb (3.7ms)
Rendered topics/_replies.html.erb (252.6ms)
Rendered topics/_editor_toolbar.html.erb (0.4ms)
Rendered app/cells/topics/reply_help_block.zh-CN.html.erb (0.0ms)
Rendered topics/show.html.erb within layouts/application (279.3ms)
Rendered common/_search_form.html.erb (0.2ms)
Rendered common/_user_nav.html.erb (4.9ms)
Rendered common/_feeds.html.erb (0.7ms)
Rendered common/_google_analytics.html.erb (0.1ms)
Completed 200 OK in "500ms" (Views: "288.6ms" | Mongo: "49.6ms" | Solr: 0.0ms)
It seems MongoIC still isn't pulling its weight; a few requests from just now:
Completed 200 OK in 842ms (Views: 162.0ms | Mongo: 677.9ms | Solr: 0.0ms)
Completed 200 OK in 948ms (Views: 95.1ms | Mongo: 851.6ms | Solr: 0.0ms)
This bit here:
<% replies.each_with_index do |reply, i| %>
  <%= render :partial => "replies/reply", :locals => { :topic => topic, :item => reply, :i => i } %>
<% end %>
can be changed to:
<%= render :partial => "replies/reply", :collection => replies %>
(the variables inside the partial may need adjusting)

Rendering partials is one of the classic performance bottlenecks in Rails; calling render inside a loop like this is the template equivalent of an N+1 query against the database.
#6 @hooopo So Rails 3 has had this all along, only just learned about it. http://rails.rubyonrails.org/classes/ActionView/Partials.html

<%= render :partial => "replies/reply", :collection => replies %>

replies/reply.html.erb:

Floor <%= reply_counter + 1 %>, created at <%= reply.created_at %>
Just ran a test: right now the most expensive part of the topic show page is the format_topic_body conversion. For topic 777 with its 58 replies, skipping format_topic_body brings the Rendered topics/_replies.html.erb time down from 252.6ms to about 60ms, which is a substantial difference.

What needs to happen is to move the format_topic_body step to when the content is submitted, so it's already formatted by the time it's queried.
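A rough sketch of that idea, rendering the body once at save time (the Reply model, the body/body_html field names, and the callback are assumptions for illustration; format_topic_body is the helper mentioned above and would need to be callable outside the views, e.g. extracted into a plain module):

class Reply
  include Mongoid::Document

  field :body       # raw text entered by the user
  field :body_html  # hypothetical field holding the pre-rendered HTML

  before_save :render_body_html

  private

  # Do the expensive conversion once, when the reply is submitted,
  # instead of on every render of replies/_reply.html.erb.
  def render_body_html
    self.body_html = format_topic_body(body) if body_changed?
  end
end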
@fsword I ran a simple test:
<% unless params[:collection] %>
  <% @topic.posts.each do |post| %>
    <%= render :partial => "post", :locals => { :post => post } %>
  <% end %>
<% else %>
  <%= render :partial => "post", :collection => @topic.posts %>
<% end %>
With the :collection option, 100 requests took 1.73s:
hooopo@hooopo:~/demo$ ab -c1 -n100 http://localhost:3000/topics/1?collection=1
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient).....done
Server Software: thin
Server Hostname: localhost
Server Port: 3000
Document Path: /topics/1?collection=1
Document Length: 26220 bytes
Concurrency Level: 1
Time taken for tests: 1.731 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 2688900 bytes
HTML transferred: 2622000 bytes
Requests per second: 57.78 [#/sec] (mean)
Time per request: 17.307 [ms] (mean)
Time per request: 17.307 [ms] (mean, across all concurrent requests)
Transfer rate: 1517.20 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 11 17 10.7 11 47
Waiting: 11 17 10.7 11 47
Total: 11 17 10.7 11 47
Percentage of the requests served within a certain time (ms)
50% 11
66% 13
75% 14
80% 36
90% 37
95% 38
98% 43
99% 47
100% 47 (longest request)
Looping over render took 2.24s:
hooopo@hooopo:~/demo$ ab -c1 -n100 http://localhost:3000/topics/1
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient).....done
Server Software: thin
Server Hostname: localhost
Server Port: 3000
Document Path: /topics/1
Document Length: 26467 bytes
Concurrency Level: 1
Time taken for tests: 2.249 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 2713600 bytes
HTML transferred: 2646700 bytes
Requests per second: 44.46 [#/sec] (mean)
Time per request: 22.492 [ms] (mean)
Time per request: 22.492 [ms] (mean, across all concurrent requests)
Transfer rate: 1178.20 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 14 22 12.3 16 61
Waiting: 14 22 12.3 16 61
Total: 15 22 12.3 16 61
Percentage of the requests served within a certain time (ms)
50% 16
66% 18
75% 22
80% 42
90% 43
95% 48
98% 52
99% 61
100% 61 (longest request)
Ruby 1.9.3-p0, Rails 3.2.1, 100 comments; _post.html.erb just reads attributes, with no complex logic. The logs also show that render in a loop is clearly slower than render with :collection.
One remaining problem, though: when a user changes their email, the avatar can't be updated because the cache can't be cleared, which is still a nuisance... but that case should be rare.

If user avatars used the username as a fixed URL, e.g. //l.ruby-china.com/avatar/huacnlee.jpg, with new uploads always overwriting that file, the problem would go away.
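As a sketch of that fixed-URL idea (the helper name is made up; the host and path come from the example above):

# Hypothetical helper: the avatar URL depends only on the login, so it
# never needs cache invalidation when the email changes; a new upload
# simply overwrites the file at the same path.
def fixed_avatar_url(user)
  "//l.ruby-china.com/avatar/#{user.login}.jpg"
end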
#17 @fsword https://gist.github.com/1739013 The first 100 requests in the gist use the :collection option; the later ones don't.
Now, after adding a cache for the reply content:
Started GET "/topics/777"
Completed 200 OK in 124ms (Views: 42.1ms | Mongo: 25.6ms | Solr: 0.0ms)
@huacnlee Could you say a bit about Mongoid#cache? The documentation says:
Out of the box, Mongoid wraps the MongoDB Ruby Driver's cursor in order to be memory efficient for large queries and data sets. However if you want the query to load all matching documents in memory and return them on n iterations without hitting the database again you may cache a class or criteria:
Is this cache per request or global? What's the general mechanism? Thanks.
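For context, a minimal sketch of what the quoted docs describe, assuming Mongoid 2.x as used here (Node is one of the models mentioned at the top of the thread):

class Node
  include Mongoid::Document
  field :name

  # Criteria built from Node keep the loaded documents in memory,
  # so iterating the same criteria again does not hit MongoDB twice.
  cache
end

# The same thing can be turned on for a single criteria instead:
nodes = Node.all.cache
nodes.each { |n| n.name }   # first pass loads from MongoDB
nodes.each { |n| n.name }   # second pass is served from memory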