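Chanyouji (蝉游记, http://chanyouji.com) used to be deployed with Nginx + Passenger and a homemade script. As the user base grew, API calls from the mobile app increased, more servers were added, and we needed seamless restarts on deploy, we moved to Nginx + Unicorn + Capistrano. This post records the details and the things to watch out for.
Nginx configuration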
gzip on;
# Enable gzip, including for the JSON responses returned to API requests
gzip_types application/json;
# Every machine runs nginx + unicorn; the local unicorn is reached over a domain socket, which makes switching easy
upstream ruby_backend {
server unix:/tmp/unicorn.sock fail_timeout=0;
server 10.4.8.34:4096 fail_timeout=0;
server 10.4.3.8:4096 fail_timeout=0;
}
# Use try_files plus a proxy to hand dynamic requests to Rails
server {
listen 80;
server_name chanyouji.com;
root /www/youji_deploy/current/public;
try_files $uri/index.html $uri.html $uri @user1;
location @user1 {
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Server $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_buffering on;
proxy_pass http://ruby_backend;
}
}
# Serve static assets from separate domains, so requests avoid the main domain's cookies and the hosts can act as a CDN origin
server {
listen 80;
server_name cdn.chanyouji.cn cdnsource.chanyouji.cn;
root /www/youji_deploy/current/public;
location ~ ^/(assets)/ {
root /www/youji_deploy/current/public;
gzip_static on; # to serve pre-gzipped version
expires max;
add_header Cache-Control public;
}
}
unicorn.rb configuration
worker_processes 6
app_root = File.expand_path("../..", __FILE__)
working_directory app_root
# Listen on fs socket for better performance
listen "/tmp/unicorn.sock", :backlog => 64
listen 4096, :tcp_nopush => false
# Nuke workers after 30 seconds instead of 60 seconds (the default)
timeout 30
# App PID
pid "#{app_root}/tmp/pids/unicorn.pid"
# By default, the Unicorn logger will write to stderr.
# Additionally, some applications/frameworks log to stderr or stdout,
# so prevent them from going to /dev/null when daemonized here:
stderr_path "#{app_root}/log/unicorn.stderr.log"
stdout_path "#{app_root}/log/unicorn.stdout.log"
# To save some memory and improve performance
preload_app true
GC.respond_to?(:copy_on_write_friendly=) and
GC.copy_on_write_friendly = true
# Force the bundler gemfile environment variable to
# reference the Capistrano "current" symlink
before_exec do |_|
ENV["BUNDLE_GEMFILE"] = File.join(app_root, 'Gemfile')
end
before_fork do |server, worker|
# See http://unicorn.bogomips.org/SIGNALS.html
# Use the USR2 signal, then QUIT the old master once the new one has started, for a seamless restart
old_pid = app_root + '/tmp/pids/unicorn.pid.oldbin'
if File.exist?(old_pid) && server.pid != old_pid
begin
Process.kill("QUIT", File.read(old_pid).to_i)
rescue Errno::ENOENT, Errno::ESRCH
# someone else did our job for us
end
end
# the following is highly recommended for Rails + "preload_app true"
# as there's no need for the master process to hold a connection
defined?(ActiveRecord::Base) and
ActiveRecord::Base.connection.disconnect!
end
after_fork do |server, worker|
# Disable GC and rely on the out-of-band GC (OOB) described below to cut request execution time
GC.disable
# the following is *required* for Rails + "preload_app true",
defined?(ActiveRecord::Base) and
ActiveRecord::Base.establish_connection
end
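The hand-off inside before_fork is easier to follow when pulled out into a small helper. The following is only a sketch under the pid-file layout used above; quit_old_master and its return symbols are hypothetical names for illustration, not part of Unicorn.

```ruby
# Hypothetical helper (not part of Unicorn): ask the previous master to shut
# down gracefully once the new master has started forking workers.
def quit_old_master(old_pid_file)
  # After USR2, Unicorn renames the old master's pid file to "<pid>.oldbin".
  return :no_old_master unless File.exist?(old_pid_file)

  pid = File.read(old_pid_file).to_i
  # Guard against an empty or garbled file: kill(…, 0) would hit our own process group.
  return :invalid_pid unless pid > 0

  # QUIT lets the old master finish in-flight requests before exiting.
  Process.kill("QUIT", pid)
  :quit_sent
rescue Errno::ENOENT, Errno::ESRCH
  # The file or the process disappeared in the meantime: nothing left to do.
  :already_gone
end
```

Because every newly forked worker runs before_fork, the QUIT may be attempted several times; the rescue clause is what makes the repeated attempts harmless.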
GC OOB
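This New Relic article explains it clearly: http://blog.newrelic.com/2013/05/28/unicorn-rawk-kick-gc-out-of-the-band/ The idea is to defer GC until after the user's request has completed, which shortens response time. Combined with the off-the-shelf gem unicorn-worker-killer, there is no need to worry about memory blowing up.
Configure it in config.ru: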
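The idea can be illustrated without Unicorn at all: count finished requests and run a collection only between requests, every Nth time. OutOfBandGC below is a hypothetical class written for this post; the real Unicorn::OobGC hooks into the worker's request loop instead of being called by hand.

```ruby
# Hypothetical illustration of out-of-band GC, not the Unicorn implementation.
class OutOfBandGC
  attr_reader :collections

  def initialize(interval)
    @interval = interval   # run GC once per this many requests
    @requests = 0
    @collections = 0
  end

  # Call this after a response has been sent, never while a request is running.
  def after_request
    @requests += 1
    return false unless (@requests % @interval).zero?

    was_disabled = GC.enable   # GC.enable returns true if GC had been disabled
    GC.start                   # collect while no user is waiting
    GC.disable if was_disabled # restore the "GC off during requests" state
    @collections += 1
    true
  end
end

oob = OutOfBandGC.new(10)
25.times { oob.after_request }
puts oob.collections   # 2 (after the 10th and 20th request)
```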
require 'unicorn/oob_gc'
require 'unicorn/worker_killer'
# Run GC only once every 10 requests
use Unicorn::OobGC, 10
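# Make a worker exit after a maximum number of requests, to guard against the memory growth caused by disabling GC (the limit is randomized between 3072 and 4096 so that workers do not all exit at once; use either this or the memory-based setting below)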
use Unicorn::WorkerKiller::MaxRequests, 3072, 4096
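# Make a worker exit once it reaches a memory limit, to guard against the memory growth caused by disabling GC (the limit is randomized between 192 and 256 MB so that workers do not all exit at once)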
use Unicorn::WorkerKiller::Oom, (192*(1024**2)), (256*(1024**2))
require ::File.expand_path('../config/environment', __FILE__)
run Youji::Application
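The pairs of numbers passed to the worker killer define a range, and each worker ends up with its own limit somewhere inside it. A sketch of that idea follows; pick_limit is a hypothetical helper for illustration and is not claimed to be the gem's actual code.

```ruby
# Hypothetical helper: choose a per-worker limit uniformly from min..max so
# that workers started together do not all hit their limit on the same request.
def pick_limit(min, max, rng = Random.new)
  min + rng.rand(max - min + 1)   # rng.rand(n) returns an integer in 0...n
end

# Six workers, matching worker_processes 6 above; each gets its own limit.
limits = Array.new(6) { pick_limit(3072, 4096) }
puts limits.inspect
```

With a single fixed limit, workers that receive similar traffic would tend to restart at nearly the same moment and briefly leave the machine short of capacity.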
Capistrano deployment script
set :unicorn_config, "#{current_path}/config/unicorn.rb"
set :unicorn_pid, "#{current_path}/tmp/pids/unicorn.pid"
namespace :deploy do
task :start, :roles => :app, :except => { :no_release => true } do
run "cd #{current_path} && RAILS_ENV=production bundle exec unicorn_rails -c #{unicorn_config} -D"
end
task :stop, :roles => :app, :except => { :no_release => true } do
run "if [ -f #{unicorn_pid} ]; then kill -QUIT `cat #{unicorn_pid}`; fi"
end
task :restart, :roles => :app, :except => { :no_release => true } do
# Send USR2 for a seamless restart on deploy
run "if [ -f #{unicorn_pid} ]; then kill -s USR2 `cat #{unicorn_pid}`; fi"
end
end
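The stop and restart tasks differ only in the signal they send, so the shell command could be built in one place. This is a sketch only; unicorn_signal_command is a hypothetical helper name, not something Capistrano or Unicorn provides.

```ruby
# Hypothetical helper: build the shell command that sends a signal to the
# Unicorn master, but only if its pid file exists.
def unicorn_signal_command(pid_file, signal)
  "if [ -f #{pid_file} ]; then kill -s #{signal} `cat #{pid_file}`; fi"
end

# QUIT stops the master gracefully; USR2 starts a new master alongside the old one.
puts unicorn_signal_command("/www/youji_deploy/current/tmp/pids/unicorn.pid", "USR2")
```

Inside the tasks above this would be used as run unicorn_signal_command(unicorn_pid, "USR2"), assuming the helper is defined in the same deploy file.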
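With these improvements in place, deploying a new version of Chanyouji only takes typing cap production deploy, and then you can go have a cup of tea, without worrying about users hitting a brief hang during the restart :)
Two charts to round this off. The New Relic monitoring chart: compared with before OOB was enabled, average response time dropped from about 100 ms to about 90 ms:
Server memory and CPU usage: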