How Ruby China Uses Rails Cache

huacnlee for Ruby China · May 21, 2014 · last reply by tdeng on June 13, 2016 · 22,735 views

This post has been marked as a featured post by an administrator.

After reading @quakewang's recent post "A summary of the caches commonly used in web applications", I'd like to piggyback on it and share how we do caching at Ruby China.

First, let's look at the New Relic reports.

Average response time over the last 24 hours

The highest-traffic pages (actions)

Details for the most-visited actions:

TopicsController#show
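UsersController#show (pretty bad, mostly slowed down by GitHub API requests) PS: shortly before publishing this post I changed it so that the GitHub requests are handled in a background queue; the new result looks like this: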
TopicsController#index
HomeController#index

Judging from the reports above, Ruby China's backend requests, apart from the user profile page, currently respond within 100 ms, often less.

How did we do it?

Markdown caching
Fragment Cache
Data caching
ETag
Static asset caching (JS, CSS, images)

Markdown caching

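The Markdown is rendered when the content is modified and the result is stored in the database, so it isn't recomputed on every view.

This is also deliberately kept in the database rather than in the cache:

for persistence, so that we don't lose it en masse if Memcached goes down;
to avoid taking up too much cache memory.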

class Topic
  field :body # raw content, used for editing
  field :body_html # precomputed result, used for display

  before_save :markdown_body
  def markdown_body
    self.body_html = MarkdownTopicConverter.format(self.body) if self.body_changed?
  end
end
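
To make the "render once on save, read many times" idea concrete outside of Rails, here is a minimal plain-Ruby sketch. `FakeRenderer` and this `Topic#save` are hypothetical stand-ins I made up for illustration (they are not Ruby China's code); the only point is that the converter runs when the body changes, not on every view.

```ruby
# Plain-Ruby sketch of precomputing rendered HTML at save time.
# FakeRenderer and this Topic class are hypothetical stand-ins.
class FakeRenderer
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Pretend Markdown conversion; counts how often it runs.
  def format(text)
    @calls += 1
    "<p>#{text}</p>"
  end
end

class Topic
  attr_reader :body, :body_html

  def initialize(renderer)
    @renderer = renderer
    @body_changed = false
  end

  def body=(text)
    @body_changed = true if text != @body
    @body = text
  end

  # Mirrors the before_save hook: convert only when body changed.
  def save
    if @body_changed
      @body_html = @renderer.format(@body)
      @body_changed = false
    end
    true
  end
end

renderer = FakeRenderer.new
topic = Topic.new(renderer)
topic.body = "hello"
topic.save
3.times { topic.body_html } # page views only read the stored HTML
topic.save                  # saving without changes does not re-render
puts renderer.calls
```

However many times the page is viewed, the render count stays tied to the number of edits.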

Fragment Cache

This is the caching approach used most in Ruby China, and it is the main reason for the speedup.

app/views/topics/_topic.html.erb

<% cache([topic, suggest]) do %>
<div class="topic topic_line topic_<%= topic.id %>">
   <%= link_to(topic.replies_count,"#{topic_path(topic)}#reply#{topic.replies_count}",
          :class => "count state_false") %>
  ... content omitted

</div>
<% end %>

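The topic's cache_key is used as the cache key: views/topics/{id}-{updated_at}/{suggest parameter}/{MD5 of the template content} -> views/topics/19105-20140508153844/false/bc178d556ecaee49971b0e80b3566f12
Where the display depends on the user account, we render the complete HTML up front and toggle the state with JS, for example the current "like" feature.
Because the cache arguments include topic.cache_key, this cache block gets an LRU-style expiry for free: when the topic is updated, ActiveRecord sets updated_at to the time of the update, so the next call to cache_key returns a new value. The cache then sees a miss and renders fresh content, which has the effect of expiring the stale entry. The old entry is simply never read again and is left for the cache store to evict.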

<script type="text/javascript">
  var readed_topic_ids = <%= current_user.filter_readed_topics(@topics) %>;
  for (var i = 0; i < readed_topic_ids.length; i++) {
    var topic_id = readed_topic_ids[i];
    $(".topic_"+ topic_id + " .right_info .count").addClass("state_true");
  }
</script>
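
The key layout described above (record key, extra arguments, template digest) can be approximated in a few lines of plain Ruby. This is only a sketch of the shape of the key, not Rails' actual implementation; `TopicStub` and `fragment_key` are hypothetical names I introduced for illustration.

```ruby
require "digest/md5"

# Hypothetical stand-in for a model; only the fields the key needs.
TopicStub = Struct.new(:id, :updated_at) do
  # Same shape as the cache_key shown above: "topics/<id>-<timestamp>"
  def cache_key
    "topics/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end
end

# Approximates how the fragment key is assembled:
# views/<record key>/<extra parts>/<MD5 of the template source>
def fragment_key(parts, template_source)
  expanded = parts.map { |p| p.respond_to?(:cache_key) ? p.cache_key : p.to_s }
  (["views"] + expanded + [Digest::MD5.hexdigest(template_source)]).join("/")
end

topic = TopicStub.new(19105, Time.utc(2014, 5, 8, 15, 38, 44))
template = "<div class=\"topic\">...</div>"
key = fragment_key([topic, false], template)
puts key
```

Changing the record's updated_at, the extra argument, or the template text each produces a different key, which is why editing a view also invalidates its old fragments.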

Another example, app/views/topics/_reply.html.erb:

<% cache([reply,"raw:#{@show_raw}"]) do %>
<div class="reply">
  <div class="pull-left face"><%= user_avatar_tag(reply.user, :normal) %></div>
  <div class="infos">
    <div class="info">
      <span class="name">
        <%= user_name_tag(reply.user) %>
      </span>
      <span class="opts">
        <%= likeable_tag(reply, :cache => true) %>
        <%= link_to("", edit_topic_reply_path(@topic,reply), :class => "edit icon small_edit", 'data-uid' => reply.user_id, :title => "修改回帖")%>
        <%= link_to("", "#", 'data-floor' => floor, 'data-login' => reply.user_login,
            :title => t("topics.reply_this_floor"), :class => "icon small_reply" )
        %>
      </span>
    </div>
    <div class="body">
      <%= sanitize_reply reply.body_html %>
    </div>
  </div>
</div>
<% end %>

Likewise, this is cached under the reply's cache_key: views/replies/202695-20140508081517/raw:false/d91dddbcb269f3e0172bf5d0d27e9088.

There is also fairly complex per-user permission control here, implemented in JS:

<script type="text/javascript">
  $(document).ready(function(){
    <% if admin? %>
      $("#replies .reply a.edit").css('display','inline-block');
    <% elsif current_user %>
      $("#replies .reply a.edit[data-uid='<%= current_user.id %>']").css('display','inline-block');
    <% end %>
    <% if current_user && !@user_liked_reply_ids.blank? %>
      Topics.checkRepliesLikeStatus([<%= @user_liked_reply_ids.join(",") %>]);
    <% end %>
  })
</script>

A brief introduction to using the `updated_at` field to implement the LRU mechanism

Suppose we have a caching scenario like this (it isn't limited to Fragment Cache; it works in other places too):

<% cache(@post) do %>
  <h1><%= @post.title %></h1>
  <div class="markdown">
    <%= markdown @post.body %>
  </div>
<% end %>

The caching flow looks like this:

@post.cache_key => posts/10-20140508081517

        < cache(@post) >
              |
    read_cache(@post.cache_key) -> <hit> -> return
              |
            <miss>
              |
        [Execute ERB code] -> write_cache(@post.cache_key, data)
              |
            return

When we update the data:

@post.update(title: 'New title', body: 'New body')
# cache_key has changed, because updated_at is refreshed when `update` is called
@post.cache_key => posts/10-20140508101213

So the next time cache(@post) is hit, the flow looks like this:

    < cache(@post) >
          |
read_cache(@post.cache_key)
          |
        <miss>
          |
    [Execute ERB code] -> write_cache(@post.cache_key, data)
          |
        return

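If you get comfortable with this approach, you almost never need to call Rails.cache.delete by hand. You can cache effectively in all sorts of scenarios, and stale entries are invalidated as soon as the data changes.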

TIP: Memcached cannot clear cache entries by pattern matching. I've seen plenty of people try something like delete('posts/*').
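
The two flows above can be simulated without Rails. In this sketch `Post`, `STORE`, `RENDERS`, and `cached_render` are all hypothetical stand-ins (a Struct and a Hash instead of a real model and cache store); it only demonstrates that bumping updated_at changes the key and forces a miss, with no explicit delete.

```ruby
# Hypothetical in-memory stand-ins for the model and the cache store.
Post = Struct.new(:id, :title, :updated_at) do
  def cache_key
    "posts/#{id}-#{updated_at.strftime('%Y%m%d%H%M%S')}"
  end

  # Like update on a timestamped model: change data and bump updated_at.
  def update(title:, now:)
    self.title = title
    self.updated_at = now
  end
end

STORE = {}   # cache_key => rendered HTML
RENDERS = [] # records every time the "template" actually runs

# read -> hit: return; miss: render, write, return
def cached_render(post)
  STORE.fetch(post.cache_key) do
    RENDERS << post.cache_key
    STORE[post.cache_key] = "<h1>#{post.title}</h1>"
  end
end

post = Post.new(10, "Old title", Time.utc(2014, 5, 8, 8, 15, 17))
cached_render(post) # miss, renders
cached_render(post) # hit, no render
post.update(title: "New title", now: Time.utc(2014, 5, 8, 10, 12, 13))
html = cached_render(post) # new key => miss, renders again
puts html
puts STORE.keys
```

Note that the entry under the old key is still in the store after the update. Nothing deletes it; it is just never read again, and a real store such as Memcached eventually evicts it.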

Data caching

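Most Model queries in Ruby China actually aren't cached, because in practice MongoDB queries respond quickly: in most cases within 5 ms, often less.

We do cache some expensive data lookups, for example fetching a user's GitHub repos: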

def github_repos(user_id)
  cache_key = "user:#{user_id}:github_repos"
  items = Rails.cache.read(cache_key)
  if items.blank?
    items = real_fetch_from_github()
    Rails.cache.write(cache_key, items, expires_in: 15.days)
  end
  return items
end
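
The read, miss, write sequence above is a read-through cache with an expiry. As a rough illustration of the pattern on its own (Rails.cache.fetch with expires_in and a block wraps the same sequence), here is a plain-Ruby sketch. `TinyCache` and its injectable clock are hypothetical, made up so the example can run and "fast-forward" time without Rails or Memcached.

```ruby
# Hypothetical in-memory cache with per-entry expiry; a stand-in for a real store.
class TinyCache
  def initialize(clock: -> { Time.now })
    @clock = clock
    @data = {}
  end

  # Return the cached value if present and fresh; otherwise run the block,
  # store its result for expires_in seconds, and return it.
  def fetch(key, expires_in:)
    entry = @data[key]
    return entry[:value] if entry && entry[:expires_at] > @clock.call

    value = yield
    @data[key] = { value: value, expires_at: @clock.call + expires_in }
    value
  end
end

now = Time.utc(2014, 5, 21)
cache = TinyCache.new(clock: -> { now })
api_calls = 0
fetch_repos = lambda do
  cache.fetch("user:1:github_repos", expires_in: 15 * 24 * 3600) do
    api_calls += 1 # stands in for the slow GitHub request
    ["repo-a", "repo-b"]
  end
end

fetch_repos.call       # miss: calls the API
fetch_repos.call       # hit: served from cache
now += 16 * 24 * 3600  # 16 days later the entry has expired
fetch_repos.call       # miss again
puts api_calls
```

One difference from the snippet above: this sketch caches whatever the block returns, while the original re-fetches whenever the cached value is blank.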

ETag

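ETag is a value carried on HTTP requests and responses to detect whether the content has changed, which cuts network overhead. The server sends it in the ETag response header, and the browser sends it back in If-None-Match on the next request.

The flow is roughly as follows.

First request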

      [Browser]                 Browser receives the response and stores it in its local cache
         |                         |
         |  [GET /index.html]      | [HTTP status 200]
         |                         | [ETag: abc]
         |                         |
  [Rails Controller]               |
         |                         |
      [Views]                      |
         |-------------------------|-

Second request to /index.html

      [Browser]                 Browser receives the response and updates its local cache
         |                         |                          |
         |  [GET /index.html]      | [HTTP status 304]        | [HTTP status 200]
         |  [If-None-Match: abc]   | [ETag: abc]              | [ETag: efg]
         |                         |                          |
  [Rails Controller] --------------|                          |
         |                    ETag matches                    |
         |                                                    |
      [Views] ------------------------------------------------|-
                              ETag differs

Rails' fresh_when method generates the ETag from the records you queried, and responds with 304 when it matches what the browser sent:

def show
  @topic = Topic.find(params[:id])

  fresh_when(etag: [@topic])
end
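
To show what the controller is saving in the 304 case, here is a minimal plain-Ruby sketch of the conditional GET flow from the diagrams. It is not how Rails computes its ETag internally; `etag_for` and `conditional_get` are hypothetical helpers, and the MD5-of-cache_key validator is just an assumption for illustration.

```ruby
require "digest/md5"

# Derive a validator from a record's cache key (hypothetical helper).
def etag_for(cache_key)
  %("#{Digest::MD5.hexdigest(cache_key)}")
end

# Returns [status, headers, body]. The block (the "view") runs only when
# the client's If-None-Match value does not match the current ETag.
def conditional_get(cache_key, if_none_match)
  etag = etag_for(cache_key)
  return [304, { "ETag" => etag }, ""] if if_none_match == etag

  [200, { "ETag" => etag }, yield]
end

renders = 0
render = lambda do
  renders += 1
  "<html>topic</html>"
end

# First request: no validator yet, so the view is rendered.
status1, headers1, _body1 = conditional_get("topics/1-20140508153844", nil, &render)
# Second request, topic unchanged: 304, the view is skipped.
status2 = conditional_get("topics/1-20140508153844", headers1["ETag"], &render).first
# Topic updated (new cache key): the old validator no longer matches.
status3 = conditional_get("topics/1-20140508160000", headers1["ETag"], &render).first
puts [status1, status2, status3].inspect
```

The 304 path still loads the record to compute the validator, but it skips rendering the view and sending the body.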

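Static asset caching

Don't underestimate this. However fast the backend is, these can still slow things down (as seen in the browser)!

1. Make good use of the Rails Asset Pipeline. Be sure to turn it on!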

# config/environments/production.rb
config.assets.digest = true

2. In Nginx, set the cache expiry for CSS, JS, and images to max:

location ~ (/assets|/favicon\.ico|/.*\.txt) {
  access_log        off;
  expires           max;
  gzip_static       on;
}

3. Keep the number of JS, CSS, and image files on a page as low as possible. The simple way is to merge them, which reduces HTTP request overhead:

<head>
  ... 
  <!-- only two files -->
  <link href="//l.ruby-china.com/assets/front-1a909fc4f255c12c1b613b3fe373e527.css" rel="stylesheet" />
  <script src="//l.ruby-china.com/assets/app-24d4280cc6fda926e73419c126c71206.js"></script>
  ...
</head>
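
The reason a max expiry is safe once digests are on is that the file name itself changes whenever the content does, as in the hashed names in the snippet above. A minimal sketch of that idea, assuming an MD5 fingerprint; `digest_name` is a hypothetical helper for illustration, not the Asset Pipeline's API.

```ruby
require "digest/md5"

# Hypothetical helper: build a fingerprinted file name from the content.
def digest_name(logical_name, content)
  ext = File.extname(logical_name)
  base = logical_name.chomp(ext)
  "#{base}-#{Digest::MD5.hexdigest(content)}#{ext}"
end

v1 = digest_name("app.js", "console.log(1);")
v2 = digest_name("app.js", "console.log(2);")
puts v1
puts v2
```

Since a new deploy references a new URL, browsers can keep the old file cached forever without ever serving stale code.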