Performance problem when using the Redis gem from multiple threads

holysoros · 2015-01-14 · last reply by holysoros on 2015-01-18 · 12462 views

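I use the redis.rb gem to write heavily to Redis, several thousand requests per second, and I've run into a fairly serious performance problem in a multithreaded environment.

For example, the method process_item performs 1000 SETs. Run on a single thread, the method takes 1s; when 5 threads run it in parallel, each call takes 10~15s!

I took a look inside the redis.rb gem: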

class Redis
  ...
  include MonitorMixin

  def synchronize
    mon_synchronize { yield(@client) }
  end

  def incr(key)
    synchronize do |client|
      client.call([:incr, key])
    end
  end

  # the other commands are wrapped in synchronize the same way
end

Every method is wrapped in synchronize. According to the MonitorMixin documentation:

at each point in time, at most one thread may be executing any of its methods.

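I suspect the threads are waiting on each other to release the lock, which multiplies the time each thread spends in process_item.

I tried the connection_pool library, but it didn't solve the problem. I think this gem does nothing at all for Redis, since it can't get around the global lock on every method.

Is there a good way to handle concurrent writes to Redis from multiple threads like this?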
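The suspicion above can be checked without a Redis server. The following is a minimal sketch, assuming a hypothetical FakeClient class that wraps every command in the object's monitor the way the quoted synchronize does, with a short sleep standing in for the server round trip. It records how many threads are ever inside a command at once on one shared client.

```ruby
require "monitor"

# Hypothetical stand-in for a client whose every command takes the object's
# monitor, mirroring the `synchronize` wrapper quoted above. No real Redis here.
class FakeClient
  include MonitorMixin

  attr_reader :max_inside

  def initialize
    super()          # lets MonitorMixin set up its lock
    @inside = 0      # threads currently inside a command
    @max_inside = 0  # highest value @inside ever reached
  end

  def set(_key, _value)
    mon_synchronize do
      @inside += 1
      @max_inside = [@max_inside, @inside].max
      sleep 0.001    # assumed stand-in for waiting on the server's reply
      @inside -= 1
    end
  end
end

# Runs `threads` threads that each issue `calls` commands on ONE shared client.
# Returns [max threads seen inside a command, elapsed seconds].
def run_shared(threads: 5, calls: 20)
  client = FakeClient.new
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  workers = Array.new(threads) do |t|
    Thread.new { calls.times { |i| client.set("k#{t}:#{i}", i) } }
  end
  workers.each(&:join)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [client.max_inside, elapsed]
end

max_inside, elapsed = run_shared
puts "max threads inside a command: #{max_inside}, elapsed: #{elapsed.round(3)}s"
```

With one shared client the waits are served one after another, so the total time grows with the number of threads. Note that the monitor belongs to the object, so this only models threads that share a single client instance.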

hz_qiuyuanxin #0 2015-01-14

Redis is a single-threaded server. It is not designed to benefit from multiple CPU cores. People are supposed to launch several Redis instances to scale out on several cores if needed. It is not really fair to compare one single Redis instance to a multi-threaded data store.

As stated above, Redis runs on a single-threaded model, so you don't need to think about concurrent reads and writes.

xxqfamous #1 2015-01-14

1000 SETs taking 1s: I don't know whether you're using hiredis with it. As the poster above said, concurrency isn't something you need to worry about.

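holysoros #2 2015-01-14

Re #1 @hz_qiuyuanxin The packets have to be unpacked first, and then the data in them is written to Redis. Unpacking burns a lot of CPU, which is why I used multiple threads: a worker thread unpacks, writes to Redis, and is then released back to the thread pool.

Given these two conditions: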

Redis is a single-threaded server;
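redis.rb has a global lock on every method;

So I plan to have the worker threads only do the unpacking and push the unpacked data onto a queue, with one dedicated thread reading from the queue and writing to Redis. That avoids multiple threads competing for the lock.

I'm not sure whether this idea is workable.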
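The plan described above could be sketched roughly as follows, using only Ruby's built-in Queue. The names unpack, process, and MemorySink are hypothetical: unpack stands in for the CPU-heavy decoding step, and MemorySink stands in for the object that would issue the Redis writes, so nothing here talks to a real server.

```ruby
require "thread"

# Hypothetical stand-in for the CPU-heavy unpacking step:
# turns "a=1,b=2" into [["a", "1"], ["b", "2"]].
def unpack(packet)
  packet.split(",").map { |pair| pair.split("=", 2) }
end

# Workers only unpack; a single writer thread is the only one touching `sink`.
def process(packets, sink, workers: 4)
  input  = Queue.new  # raw packets waiting to be unpacked
  output = Queue.new  # unpacked [key, value] pairs waiting to be written

  packets.each { |packet| input << packet }
  input.close         # pop returns nil once a closed queue is drained

  pool = Array.new(workers) do
    Thread.new do
      while (packet = input.pop)
        unpack(packet).each { |pair| output << pair }
      end
    end
  end

  # Only this thread calls the sink, so its lock is never contended.
  writer = Thread.new do
    while (pair = output.pop)
      sink.set(*pair)
    end
  end

  pool.each(&:join)
  output.close        # all workers are done; let the writer drain and exit
  writer.join
  sink
end

# In-memory sink used in place of a real Redis connection (an assumption).
class MemorySink
  attr_reader :data

  def initialize
    @data = {}
  end

  def set(key, value)
    @data[key] = value
  end
end

sink = process(["a=1,b=2", "c=3"], MemorySink.new)
p sink.data
```

Because one thread owns the connection, the writer is also a natural place to batch several queued pairs into one round trip. Whether this helps overall still depends on how much of the time goes to unpacking versus writing.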

hooopo #3 2015-01-14

No measurement, no optimization. Thanks.

hz_qiuyuanxin #4 2015-01-14

Re #3 @holysoros

As of Ruby 1.9, Ruby uses native threads. Native threads means that each thread created by Ruby is directly mapped to a thread generated at the Operating System level. Every modern programming language implements native threads, so it makes more sense to use native threads. Here are some pros of native threads:

Pros

Run on multiple processors

Scheduled by the OS

Blocking I/O operations don’t block other threads.

Even though we have native threads in Ruby 1.9, only one thread will be executing at any given time, even if we have multiple cores in our processor. This is because of the GIL (Global Interpreter Lock) or GVL (Global VM Lock) that MRI Ruby uses (JRuby and Rubinius do not have a GIL and, as such, have "real" threads). This prevents other threads from being executed if one thread is already being executed by Ruby. But Ruby is smart enough to switch control to other waiting threads if one thread is waiting for some I/O operation to complete.

As the quote says, from Ruby 1.9 onward multithreading works for I/O-bound work, but it doesn't help CPU-bound tasks. So I think multiple processes are the better approach for CPU-bound tasks. Resque appears to use multiple processes, while sidekiq uses multiple threads. Whether to use a queue, write directly, or do something else is a strategy to choose based on your application's actual situation.

As for the redis.rb source you pointed to: because the redis server runs on a single-threaded model, the redis client can end up blocked. And for the multithreading on your side, MRI itself has to schedule the threads, so in your situation multiple threads have no advantage over a single thread.
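The I/O half of that claim is easy to observe. Here is a small sketch, assuming sleep as a stand-in for a blocking I/O wait (it is not the original workload); timings is a hypothetical helper that measures the same waits run serially and then in threads.

```ruby
require "benchmark"

# `sleep` is an assumed stand-in for a blocking I/O call; while a thread
# waits in it, MRI lets other threads run.
def blocking_call(wait)
  sleep wait
end

# Returns [serial_seconds, threaded_seconds] for n blocking calls.
def timings(n: 4, wait: 0.1)
  serial = Benchmark.realtime { n.times { blocking_call(wait) } }
  threaded = Benchmark.realtime do
    Array.new(n) { Thread.new { blocking_call(wait) } }.each(&:join)
  end
  [serial, threaded]
end

serial, threaded = timings
puts format("serial: %.2fs, threaded: %.2fs", serial, threaded)
```

The threaded run should finish in roughly the time of one wait, since the waits overlap. If blocking_call were replaced by pure Ruby computation, the quoted description of the GVL suggests the two numbers would come out much closer on MRI.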

hz_qiuyuanxin #5 2015-01-14

Re #3 @holysoros Also, can't your 1000 SETs be compressed? For example, 1000 calls to incr key can be replaced with a single incrby key 1000.
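One way to apply that suggestion is to count increments locally and emit one INCRBY per key. The helper below is a hypothetical sketch: coalesce_increments is not part of redis.rb, and it only builds command arrays in the [:incrby, key, amount] shape rather than sending anything.

```ruby
# Collapse repeated increments of the same key into one INCRBY per key.
# Returns command arrays; sending them to Redis is left out of this sketch.
def coalesce_increments(keys)
  counts = keys.each_with_object(Hash.new(0)) { |key, acc| acc[key] += 1 }
  counts.map { |key, count| [:incrby, key, count] }
end

commands = coalesce_increments(%w[hits hits errors hits])
commands.each { |cmd| p cmd }
```

This only works for commands that combine, such as counters. SETs on distinct keys can't be merged this way, though they could still be sent together in a batch.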

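holysoros #6 2015-01-18

Re #6 @hz_qiuyuanxin @hooooopo The "unpacking" operation is a CPU-bound task. I tested and measured it, and it makes no difference at all whether one thread or several do it. I use the BinData gem for the unpacking; it's flexible, but its performance really isn't great. I'm not using components like Resque or sidekiq.

After the following optimizations, the performance of the Redis writes improved substantially: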

Use the hiredis driver
Use pipeline
Use a unix socket

Each of these gave roughly a 2~3x improvement.
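For readers wondering why pipelining helps, the sketch below shows what it amounts to on the wire: several commands encoded in the Redis protocol (RESP) and written back to back, so the replies can be read afterwards instead of waiting on each command in turn. encode_command and pipeline_payload are hypothetical helpers written for illustration. In redis.rb itself the feature is exposed through the pipelined method, and its exact block form varies between gem versions.

```ruby
# Encode one command in RESP: an array header, then one bulk string
# ("$<byte length>\r\n<bytes>\r\n") per argument.
def encode_command(*args)
  parts = args.map do |arg|
    text = arg.to_s
    "$#{text.bytesize}\r\n#{text}\r\n"
  end
  "*#{args.size}\r\n#{parts.join}"
end

# A pipeline is just the encoded commands concatenated, so the client can
# send them in a single write and collect the replies afterwards.
def pipeline_payload(commands)
  commands.map { |cmd| encode_command(*cmd) }.join
end

payload = pipeline_payload([[:set, "k1", "v1"], [:incr, "hits"]])
puts payload.inspect
```

The saving comes from paying the network round trip once per batch rather than once per command, which fits the per-step improvements reported above without explaining their exact size.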
