quakewang · Replies · Ruby China

Senior member

Member No. 162 / 2011-11-22

Shanghai

26 posts / 752 replies

212 followers

4 following

22 favorites

GitHub Public Repos

linux 1

Linux kernel source tree
fiber-scripts 0
fiber 0
async 0

async programming library for MoonBit
ckb 0

CKB is a public/permissionless blockchain, the layer 1 of Nervos network.
molecule 0

Another serialization system: minimalist and canonicalization.
ckb-sdk-rust 0

Rust SDK for Nervos CKB
ckb-testtool 0

A helper library for writing CKB script test cases. It is migrated from capsule: https://github.c...
ckb-cli 0

CKB command line interface
ckb-vm-contrib 0

Community-contributed tools, extensions, testing and experimental features for the CKB-VM.


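Some odd Ruby behavior at 2022-05-23

Ruby has no truly nested (locally scoped) method definitions: an inner def simply defines a method on the enclosing class every time the outer method runs. With warnings enabled you will see the following: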

irb -w
Foo.new.method_one
Foo.new.method_one
(irb):3: warning: method redefined; discarding old method_two
(irb):3: warning: previous definition of method_two was here
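
The snippet under discussion is not quoted in this reply, so here is a minimal sketch that assumes a class shaped like the one in the warning (Foo, method_one, method_two; the method bodies are made up). It shows that the inner def is not scoped to the outer method: it defines method_two on Foo itself each time method_one runs, which is why the second call triggers the redefinition warning.

```ruby
# Assumed reconstruction of the class from the thread; bodies are illustrative.
class Foo
  def method_one
    # Executed on every call to method_one: (re)defines Foo#method_two.
    def method_two
      :two
    end
    :one
  end
end

defined_before = Foo.method_defined?(:method_two) # false: nothing has defined it yet
Foo.new.method_one
defined_after = Foo.method_defined?(:method_two)  # true: now an ordinary method on Foo
Foo.new.method_one                                # second call redefines it (warns under -w)
```

Because method_two ends up on the class, any other Foo instance can call it once method_one has run anywhere.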

Is there a more concise way to write Ruby map? at 2021-02-05
You can monkey-patch Symbol:
```
class Symbol
  def call(*args, &block)
    ->(caller, *rest) { caller.send(self, *rest, *args, &block) }
  end
end
```
With that, any proc shorthand can take arguments, for example:
```
arr.map(&:[].(:a))
```
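But this is actually less readable than the non-shorthand form. I recall Ruby discussing a new syntax for shorthand arguments, something like the following, where &0, &1, &N stand for the Nth argument of the proc; it has not been accepted yet: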
```
arr.map(&0[:a])
```
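A self-contained run of the patch, as a sketch; the sample data (ROWS) is made up for illustration. Note that Ruby 2.7 and later already ship numbered block parameters (_1, _2, ...), which cover much of what the proposed &0 syntax was after.

```ruby
# Same idea as the patch above: Symbol#call returns a lambda that sends the
# symbol to its first argument, appending the pre-bound args.
class Symbol
  def call(*args, &block)
    ->(receiver, *rest) { receiver.send(self, *rest, *args, &block) }
  end
end

ROWS = [{ a: 1, b: 2 }, { a: 3, b: 4 }].freeze # illustrative data

patched  = ROWS.map(&:[].(:a))    # shorthand via the patch
explicit = ROWS.map { |h| h[:a] } # plain block
numbered = ROWS.map { _1[:a] }    # numbered parameter, Ruby 2.7+
```

All three expressions produce the same array; the last two need no monkey patch.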
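Any pitfalls with the AWS Aurora database? at 2021-01-14

I once helped someone test a migration from RDS to Aurora. Aurora came out roughly 30% more expensive; I forget which line item was responsible. They were running a single RDS instance with no read replica. They migrated in the end anyway and have not run into any pitfalls.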

ruby quiz at 2021-01-11

one line style

'idP111~nm~~nm xxx ~~~~~~~~ ~~id~~br~~~qt10'.scan(/(id|nm|qt|pr)(.+?[^~](?:~~)*(?=(?:~[^~])|$))/).to_h.transform_values{|s| s.gsub('~~', '~')}

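111 at 2020-10-31

It boosts GDP, which is a good thing 😂😂😂
Ruby seems to lack a way to explicitly create references; could some expert suggest adding it to the core developers? at 2020-08-16
Ruby operates on references by default. Define a class of your own and test it: array indexing and slicing also hand back references, so the performance cost you describe does not exist: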
```
class Foo
    attr_accessor :bar
end

a = [Foo.new, Foo.new, Foo.new]
b = a[1..-1]

b[0].bar = 42
a[1].bar == b[0].bar
# => true
```
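
To make the distinction explicit: slicing allocates a new Array, but its slots point at the very same objects, so nothing is deep-copied. A small check using equal? (object identity) follows; Item and slice_shares_elements? are illustrative names, not from the thread.

```ruby
Item = Struct.new(:bar) # stand-in for any user-defined class

# True when array[1..-1] is a new container whose elements are the
# identical objects held by the original array.
def slice_shares_elements?(array)
  tail = array[1..-1]
  !tail.equal?(array) &&
    tail.each_with_index.all? { |item, i| item.equal?(array[i + 1]) }
end

ITEMS = [Item.new, Item.new, Item.new].freeze
slice_shares_elements?(ITEMS) # => true
```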

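[Sensitive] A DFA-based sensitive-word filtering gem for Ruby at 2020-06-26

For this use case your DFA structure does not need is_end / value; checking whether the hash is empty is enough to tell whether a node is the last one. The code also calls clone a lot, which costs a lot of memory; you can use inject instead: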

def add_word(word)
  word = word.strip.gsub(%r{[^\p{Han}+/ua-zA-Z0-9]}, '')
  word.chars.inject(self.words) do |words, char|
    if !words.key?(char)
      words[char] = {}
    end
    words[char]
  end
end

def filter(word)
  sensitive_word = ''
  word = word.strip.gsub(%r{[^\p{Han}+/ua-zA-Z0-9]}, '')
  word.chars.each_with_index.inject(self.words) do |words, (char, index)|
    if words.key?(char)
      sensitive_word += char
      break if words[char].empty?
      # If the text being checked has ended but the keyword has not, return an empty string
      return '' if index == word.size - 1
      words[char]
    else
      # If the previous step was inside a keyword and this one is not, restart the match from the root
      if !sensitive_word.empty?
        sensitive_word = ''
        words = self.words and redo
      else
        words
      end
    end
  end
  sensitive_word
end

The code above was typed out by hand; the filter part can probably be simplified further.
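
As a rough, self-contained sketch of the same idea (not the gem's actual code): nested hashes form the trie, inject walks and builds it without any clone, and an empty hash marks the end of a word. The class and method names (WordTrie, first_match) are made up for illustration. One consequence of dropping is_end is worth stating: if one stored word is a prefix of another (say "ab" and "abc"), the shorter one can no longer be detected on its own.

```ruby
# Illustrative trie built from nested hashes; an empty hash means "end of word".
class WordTrie
  def initialize
    @root = {}
  end

  # Walks the trie one character at a time, creating missing nodes on the way.
  def add_word(word)
    word.chars.inject(@root) { |node, char| node[char] ||= {} }
    self
  end

  # Returns the first stored word that occurs in text, or nil if none does.
  def first_match(text)
    chars = text.chars
    chars.each_index do |start|
      node = @root
      matched = +''
      chars[start..-1].each do |char|
        node = node[char]
        break if node.nil?          # no stored word continues with this character
        matched << char
        return matched if node.empty? # reached a leaf: a full word matched
      end
    end
    nil
  end
end

trie = WordTrie.new.add_word('bad').add_word('evil')
trie.first_match('so evil!') # => "evil"
trie.first_match('fine')     # => nil
```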

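A different perspective: ranking of the languages you should least learn, 2019 at 2019-04-19

PHP, for example
Question about too many TIME_WAIT connections on a machine at 2019-03-19
Try adding these two directives inside the location block; see the nginx documentation for details: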
```
proxy_http_version 1.1;
proxy_set_header Connection "";
```
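Question about too many TIME_WAIT connections on a machine at 2019-03-19

Could you post your Nginx config?
Question about too many TIME_WAIT connections on a machine at 2019-03-19

These TIME_WAIT sockets look like connections between nginx and puma. Try setting keepalive in the nginx upstream block; that should solve the problem.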

Feed design (2): pull model vs push model at 2019-01-19

You can create one more table that records the individual events a user has chosen to ignore:

ignored_events
------------------------
user_id,  event_id
3      ,  10

Then add a not exists or not in subquery:

select e.* from events e, event_subscribers es
    where e.user_id = es.subscribed_user_id
      and es.user_id = 3
      and e.id not in (
        select ie.event_id from ignored_events ie
          where ie.user_id = 3
      )
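
The same anti-join logic can be checked in plain Ruby, as a sketch with in-memory rows standing in for the three tables. The event ids other than 10 and the feed_with_ignores name are made up for illustration.

```ruby
require 'set'

Event = Struct.new(:id, :user_id)

EVENTS = [Event.new(9, 4), Event.new(10, 4), Event.new(11, 8)].freeze
EVENT_SUBSCRIBERS = [[3, 4], [3, 8], [1, 4]].freeze # [user_id, subscribed_user_id]
IGNORED_EVENTS = [[3, 10]].freeze                   # [user_id, event_id]

# Events from subscribed authors, minus the ones this user ignored.
def feed_with_ignores(user_id)
  followed = EVENT_SUBSCRIBERS.select { |uid, _| uid == user_id }.map(&:last).to_set
  ignored  = IGNORED_EVENTS.select { |uid, _| uid == user_id }.map(&:last).to_set
  EVENTS.select { |e| followed.include?(e.user_id) && !ignored.include?(e.id) }
end

feed_with_ignores(3).map(&:id) # => [9, 11]  (event 10 is hidden for user 3 only)
feed_with_ignores(1).map(&:id) # => [9, 10]
```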

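Building a general-purpose tagging system with PostgreSQL at 2019-01-12

Very detailed article, nice work.

If you are worried about performance at large data volumes, another option is a PostgreSQL array column with a GIN index, the same inverted-index type that backs full-text search. Rails has a gem for this as well: https://github.com/tmiyamon/acts-as-taggable-array-on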

Merry Christmas! How can Ruby do several loops in a single loop? at 2018-12-25

[*0..5, *2..8, *3..12, 2, 2, 2, 2].each{|i| print' '*(40-2*i-i/2)+'*'*(4*i+1+i)+"\n"}

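Advice wanted: how do you all handle hot/cold data separation in Rails? at 2018-11-11

Take a look at ProxySQL; it can meet the requirements you mention: https://github.com/sysown/proxysql

Another option is Vitess, though I am not familiar with it. I heard from people at GitHub that they are evaluating it, so it may be worth a look too: https://github.com/vitessio/vitess
With Rails' default Active Job, how do I run the same method sequentially instead of concurrently? at 2018-11-11
ActiveJob exists for concurrent/asynchronous execution, so shrinking it to a queue of length 1 is cutting the foot to fit the shoe. A better fit is flock: take an exclusive, non-blocking lock so that each file is handled by only one JobWorker at a time, while different files can still be processed concurrently.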
```
f = File.open(...)
if f.flock(File::LOCK_EX | File::LOCK_NB)
  ...
end
```
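
A fuller sketch of that pattern, runnable on its own. The helper name with_exclusive_lock, the :busy return value, and the file name are assumptions made for illustration; the behavior shown is for Linux/macOS, where flock locks belong to the open file handle, so a second handle on the same file is refused even inside one process.

```ruby
require 'tmpdir'

# Hypothetical helper (not from the thread): runs the block only if no other
# open handle currently holds the lock on path; otherwise returns :busy.
def with_exclusive_lock(path)
  File.open(path, File::RDWR | File::CREAT) do |file|
    # With LOCK_NB, flock returns false instead of waiting when the lock is taken.
    return :busy unless file.flock(File::LOCK_EX | File::LOCK_NB)

    begin
      yield file
    ensure
      file.flock(File::LOCK_UN)
    end
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'import.csv') # illustrative file name
  # While the outer call holds the lock, a second attempt is turned away.
  nested = with_exclusive_lock(path) { with_exclusive_lock(path) { :second_worker_ran } }
  puts nested.inspect
  # Once released, the same file can be locked again.
  later = with_exclusive_lock(path) { :processed }
  puts later.inspect
end
```

A worker that gets :busy can simply skip the file or re-enqueue the job, which keeps per-file processing serialized without throttling the whole queue.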

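Please explain: [a,b].max is faster than calling a max function at 2018-10-18

My intuition says that cannot be right. I ran it on my own machine and got the opposite result. Which Ruby version are you on?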

2.3.2 :018 > Benchmark.measure { maxa(1000)}
 => #<Benchmark::Tms:0x007fcd7a884b78 @label="", @real=3.692405005916953, @cstime=0.0, @cutime=0.0, @stime=0.040000000000000036, @utime=3.6, @total=3.64> 
2.3.2 :019 > Benchmark.measure { maxf(1000)}
 => #<Benchmark::Tms:0x007fcd7b027b00 @label="", @real=0.8660073862411082, @cstime=0.0, @cutime=0.0, @stime=0.009999999999999898, @utime=0.8300000000000001, @total=0.84>

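It looks like the result flips when tested on 2.4. Is that because of the updated Array#max implementation in 2.4? I'll take another look later...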
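
The maxa / maxf definitions are not shown in this reply, so the following is only a sketch of how such a comparison could be rerun on any Ruby version; max_via_array, max_via_compare and time_it are hypothetical stand-ins. Ruby 2.4's release notes mention an optimization that avoids allocating the temporary array for [x, y].max, which would be consistent with the flip observed above. No timings are claimed here, since they depend on the machine and interpreter.

```ruby
require 'benchmark'

# Hypothetical stand-ins for the two variants benchmarked in the thread.
def max_via_array(a, b)
  [a, b].max
end

def max_via_compare(a, b)
  a > b ? a : b
end

# Runs the given block `iterations` times with two varying integers
# and returns the elapsed wall-clock seconds.
def time_it(iterations)
  Benchmark.realtime { iterations.times { |i| yield i, iterations - i } }
end

ITERATIONS = 200_000
array_secs   = time_it(ITERATIONS) { |a, b| max_via_array(a, b) }
compare_secs = time_it(ITERATIONS) { |a, b| max_via_compare(a, b) }
puts format('[a, b].max: %.4fs  a > b ? a : b: %.4fs  (Ruby %s)',
            array_secs, compare_secs, RUBY_VERSION)
```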

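Feed design (2): pull model vs push model at 2018-09-28

The problem with the push model is that any action affecting the subscription relationship, such as muting or unmuting, also requires modifying events that have already been pushed.
Feed design (2): pull model vs push model at 2018-09-27

It is not the same as push. The push model has to write a large amount of data whenever an event is created, whereas this design also avoids the incorrect data that muting/unmuting causes.
Feed design (2): pull model vs push model at 2018-09-26
I think the drawbacks listed for the pull model are not accurate. Coupling event tightly to friendship is the main cause of that drawback, not the pull model itself.

A proper design should have an intermediate table, for example event_subscribers, structured like this: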
```
user_id, subscribed_user_id
3      , 4
3      , 8
1      , 4
1      , 10
```
The SQL query stays fixed:
```
select events.* from events, event_subscribers
    where events.user_id = event_subscribers.subscribed_user_id and event_subscribers.user_id = 3
```
All muting and filtering is done by operating on the rows of the event_subscribers intermediate table.
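
A minimal in-memory sketch of that design, assuming arrays as stand-ins for the two tables. The PullFeed class, its method names, and the event ids are made up for illustration. The point it demonstrates is that muting and unmuting only touch subscription rows, while the read path and the events themselves never change.

```ruby
# In-memory stand-in for the events / event_subscribers tables (illustrative only).
class PullFeed
  def initialize(events, subscriptions)
    @events = events               # [[event_id, author_id], ...]
    @subscriptions = subscriptions # [[user_id, subscribed_user_id], ...]
  end

  # Equivalent of the fixed join: ids of events whose author the user subscribes to.
  def feed_for(user_id)
    authors = @subscriptions.select { |uid, _| uid == user_id }.map(&:last)
    @events.select { |_, author_id| authors.include?(author_id) }.map(&:first)
  end

  # Muting deletes one subscription row; no event rows are touched.
  def mute(user_id, author_id)
    @subscriptions.delete([user_id, author_id])
  end

  # Unmuting puts the row back.
  def unmute(user_id, author_id)
    row = [user_id, author_id]
    @subscriptions << row unless @subscriptions.include?(row)
  end
end

feed = PullFeed.new([[100, 4], [101, 8], [102, 10]],
                    [[3, 4], [3, 8], [1, 4], [1, 10]])
feed.feed_for(3) # => [100, 101]
feed.mute(3, 8)
feed.feed_for(3) # => [100]
feed.unmute(3, 8)
feed.feed_for(3) # => [100, 101]
```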

How to do something like merging cells in a CSV file at 2018-09-13

You can group_by the first element of each pair, then use map! to replace each pair in place with its second element:

data = [['A', '123'], ['A', '1223'], ['A', '12343'], ['A', '122XX33'], ['B', '678'], ['B', '612378'], ['B', '67XX8'], ['C', '100'], ['C', '1000']]

data.group_by(&:first).each{|_, v| v.map!(&:last)}

# => {"A"=>["123", "1223", "12343", "122XX33"], "B"=>["678", "612378", "67XX8"], "C"=>["100", "1000"]}

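Our system uses Devise for login, and there is a requirement: [when the same account signs in on another machine, this machine is forced to sign out]. Any pointers? at 2018-09-06
Besides adding an extra field to compare against the session, you can apply the same idea using sign_in_count, a field from Devise's built-in Trackable module (I usually recommend enabling this module in any project that uses Devise), and write a hook. A few lines of code cover your requirement: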
```
Warden::Manager.after_authentication do |record, warden, options|
  warden.session(options[:scope])[:sign_in_count] = record.sign_in_count
end

Warden::Manager.after_fetch do |record, warden, options|
  if record.sign_in_count != warden.session(options[:scope])[:sign_in_count]
    warden.logout(options[:scope])
    throw :warden, :scope => options[:scope], :message => "Signin from another IP address #{record.last_sign_in_ip}"
  end
end
```
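Devise extensions are basically all done through hooks, which is very clean and convenient.
Array -> Hash: summing specified elements at 2018-05-21
Here is a one-liner that does not need a Hash with a default value of 0: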
```
a.inject({}){|m, (k, v)| m.merge(k => v.to_i) {|k, old, new| old + new} }
```
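The input `a` is not shown in this reply; the sketch below assumes it is an array of [key, numeric-string] pairs, which is what the to_i call suggests. The method name sum_pairs and the sample data are illustrative. The block given to merge is only invoked when a key already exists, which is what replaces the default-0 hash.

```ruby
# Sums the values of repeated keys; merge's block resolves key collisions.
def sum_pairs(pairs)
  pairs.inject({}) do |memo, (key, value)|
    memo.merge(key => value.to_i) { |_key, old, new| old + new }
  end
end

totals = sum_pairs([['a', '1'], ['b', '2'], ['a', '3']])
totals['a'] # => 4
totals['b'] # => 2
```

Each merge allocates a new hash, so for large inputs each_with_object with a mutable hash would be cheaper; the one-liner trades that for brevity.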
There are a lot of contrarian trolls on the forum; does anyone else feel the same? at 2018-05-10

This trivial line-break question at 2017-12-11

params.values_at(:begin_integral, :end_integral, ...).all?(&:present?)

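Performance optimization case study 2: optimizing time-range queries at 2017-12-01

Your query only filters on a single created_at column, so an RTree index has no advantage over a BTree here.
Performance optimization case study 2: optimizing time-range queries at 2017-11-14

Right, ActiveRecord's PG support includes the range type. MySQL lags behind here and you have to implement it yourself, which is the extension question 3 that was mentioned: http://edgeguides.rubyonrails.org/active_record_postgresql.html#range-types
Ruby crawler framework at 2017-11-07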

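The current trend in crawling is to use headless Chrome running on Amazon Lambda. I wrote two scripts this way before: it handles ajax, and combined with CSS selectors/XPath it is very convenient for scraping data. Large-scale crawling and proxy IPs are both easy to set up.

If you are building a crawler framework, please support this mode.
Which header controls caching for images served from a CDN? at 2017-10-21

There are etag and last-modified.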