Ruby: How to Understand the Difference Between Symbols and Strings

wikimo · April 22, 2014 · last reply by pynix on April 23, 2014 · 3798 views

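Many newcomers have trouble understanding the difference between Symbols and Strings in Ruby. Thanks to @chenzhong for sharing an article on the topic, offered here for reference. The original article is linked here.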

The article's conclusion:

to use strings when:

you want to use any of string methods like #upcase, #split, #downcase etc.

you want to change/mutate the string

to use Symbols when:

you need something that absolutely, positively must remain constant, since symbols can't be changed at runtime

the same string is going to be used repeatedly. Hash keys are a pretty good candidate for symbols: instead of string keys, hash["key"] = value, you should use symbols like this: hash[:key] = value

To conclude, strings and symbols in Ruby are similar, with the differences given above. The main difference is that symbols are immutable and slightly more performant than strings, so you should use them in places where you know the same string is likely to be repeated again and again.
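The points in that conclusion can be checked with a few lines of Ruby. This is only a minimal sketch: the variable names are made up for illustration, and `.dup` is used so the strings are fresh, unfrozen objects even if frozen string literals happen to be enabled.

```ruby
# Two strings with the same content are still two distinct objects.
first  = "key".dup
second = "key".dup
strings_share_object = first.equal?(second)   # false

# Strings can be mutated in place.
first << "_suffix"

# A symbol with a given name is always the same object, and it is frozen.
symbols_share_object = :key.equal?(:key)      # true
symbol_is_frozen     = :key.frozen?           # true

# Symbol does not offer every String method (#split, for example);
# convert with to_s first.
symbol_can_split = :key.respond_to?(:split)   # false
parts = :a_b.to_s.split("_")

# Symbols as hash keys, as the article recommends.
settings = { key: "value" }
settings[:other] = "more"
```

`equal?` compares object identity, which is why it shows the "same object every time" property that makes symbols cheap as repeated hash keys.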

Why doesn't Ruby extend symbols into pointers...

Nice. The original author's blog also has a post about a one-liner that starts a web server, which is even cooler.

ruby -run -e httpd . -p 3000

@springwq Yes, that really is cool...

I don't see what is hard to understand. A Symbol is just a String that never changes.

#5 @Rei By the way, are symbols excluded from GC?

#6 @cassiuschen Yes, they are never freed.

#7 @Rei So atoms in Erlang and Elixir are really the same idea...

There are two things in this blog post that I don't understand. The result of the first script: no. of object in memory at START: 17622 number of objects in memory at END: 21624

TOTAL: number of new objects at got created during this program => 4002!!! He creates 1000 objects here, so why is the difference 4002?

The result of the second script: TOTAL: number of new objects at got created during this program => 4002!!! Why is the result the same as the first script when only one object is created?

Could someone explain?

#9 @beyondyuqifeng

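GC.disable
# The GC.stat key is :total_allocated_objects on Ruby 2.2+; it was :total_allocated_object on 2.1.
alloc_key = GC.stat.key?(:total_allocated_objects) ? :total_allocated_objects : :total_allocated_object
start_count = ObjectSpace.count_objects
start = GC.stat[alloc_key]

# Each iteration evaluates the string literal again, creating a new String.
(1..1000).each do |i|
  name = "gaurish"
  puts "created a new string => #{name} in memory with object id => #{name.object_id}"
end

finish = GC.stat[alloc_key]
end_count = ObjectSpace.count_objects
puts "no. of object in memory at START: #{start}"
puts "number of objects in memory at END: #{finish}"
puts "---------------------"
puts "TOTAL: number of new objects at got created during this program => #{finish - start}!!!"
# Per-type object counts before and after, for comparison.
p start_count
p end_count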

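You can run it yourself and see. In the first script, the string passed to puts is also a newly created object, so that makes 2000 (1000 + 1000).

Another 1000 are the strings converted from object_id (3000).

The remaining 1000 are probably temporary objects created while concatenating the string (4000).

The second script still comes to 4000 because the extra 1000 are the strings converted from the symbol.

Of the remaining two, one is probably the range.

I don't know about the other one... With the puts removed from the loop the difference is 1001, so it is probably related to the internal implementation of string interpolation.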
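To take puts and string interpolation out of the picture, the two cases can be compared by counting allocations around a block. This is a rough sketch, not the blog author's script: `allocations_during` is a hypothetical helper, it assumes Ruby 2.2 or later (where the key is `:total_allocated_objects`), and the exact counts vary between Ruby versions.

```ruby
# Returns how many objects Ruby allocated while the block ran.
# GC.stat(key) with a symbol argument returns a single integer.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# String.new guarantees a fresh String per iteration,
# even when frozen string literals are enabled.
string_allocations = allocations_during do
  1000.times { String.new("gaurish") }
end

# The symbol literal refers to the same object on every iteration.
symbol_allocations = allocations_during do
  1000.times { :gaurish }
end

puts "string loop allocated: #{string_allocations}"
puts "symbol loop allocated: #{symbol_allocations}"
```

The string loop should report at least 1000 allocations, while the symbol loop should stay close to zero. That is the difference the article wants to show, once the extra objects from the output line are removed.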

struct RString {
   struct RBasic basic;
   union {
      struct {
         long len;
         char *ptr;
         union {
            long capa;
            VALUE shared;
         } aux;
      } heap;
      char ary[RSTRING_EMBED_LEN_MAX + 1];
   } as;
};


#define SYMBOL_P(x) (((VALUE)(x)&~(~(VALUE)0<<RUBY_SPECIAL_SHIFT))==SYMBOL_FLAG)
RUBY_SPECIAL_SHIFT  = 8
#define SYMBOL_FLAG RUBY_SYMBOL_FLAG
RUBY_SYMBOL_FLAG    = 0x0e
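
The SYMBOL_P macro above only tests whether the low 8 bits of a VALUE equal the symbol flag, so a symbol is a tagged integer while a String is a heap struct like RString. The bit arithmetic can be imitated in Ruby. This is only an illustration using the two constants quoted above; `symbol_value` and `tag_as_symbol` are hypothetical names, not part of Ruby's API.

```ruby
special_shift = 8      # RUBY_SPECIAL_SHIFT as quoted above
symbol_flag   = 0x0e   # RUBY_SYMBOL_FLAG as quoted above

# ~(~0 << 8) keeps only the low 8 bits, i.e. 0xff.
low_bits_mask = ~(~0 << special_shift)

# Imitation of SYMBOL_P: true when the low byte equals the symbol tag.
symbol_value = ->(value) { (value & low_bits_mask) == symbol_flag }

# Hypothetical encoder for illustration: put an integer ID above the tag bits.
tag_as_symbol = ->(id) { (id << special_shift) | symbol_flag }

tagged = tag_as_symbol.call(5)                  # 0x50e
puts format("mask = 0x%x, tagged = 0x%x", low_bits_mask, tagged)
puts symbol_value.call(tagged)                  # true
puts symbol_value.call(5 << special_shift)      # false, low byte is 0
```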

It should be the equivalent of interned strings in Python.

In Python, though, the VM decides whether a string gets interned, and of course you can also intern one manually.
