Ruby's object_id and Fixnum

jun1st · November 16, 2013 · last reply by jun1st on November 17, 2013 · 5150 views

While working through the Learn Ruby Koans, I ran into a rather interesting exercise:

def test_small_integers_have_fixed_ids
  assert_equal 1, 0.object_id
  assert_equal 3, 1.object_id
  assert_equal 5, 2.object_id
  assert_equal 201, 100.object_id

  # THINK ABOUT IT:
  # What pattern do the object IDs for small integers follow?
end

Judging from how the values grow, one can guess that object_id = value * 2 + 1. A guess is not reliable, though; it is better to understand why it works this way.
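The guess is easy to check empirically before digging into the reason. The following is only a minimal sketch, assuming MRI (other Ruby implementations may assign ids differently); the helper name `follows_pattern?` is made up for illustration and is not part of the Koans.

```ruby
# Hypothetical helper: does n's object_id match the guessed formula?
def follows_pattern?(n)
  n.object_id == n * 2 + 1
end

# Check a range of small integers, negative ones included.
all_match = (-1000..1000).all? { |n| follows_pattern?(n) }
puts "pattern holds for -1000..1000: #{all_match}"
```

A passing check is still only evidence for the range tested, which is why the explanation below matters.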

First, I came across this answer on Stack Overflow:

in MRI the object_id of an object is the same as the VALUE that represents the object on the C level. For most kinds of objects this VALUE is a pointer to a location in memory where the actual object data is stored. Obviously this will be different during multiple runs because it only depends on where the system decided to allocate the memory, not on any property of the object itself.

However for performance reasons true, false, nil and Fixnums are handled specially. For these objects there isn't actually a struct with the object's data in memory. All of the object's data is encoded in the VALUE itself. As you already figured out the values for false, true, nil and any Fixnum i, are 0, 2, 4 and i*2+1 respectively.

The reason that this works is that on any systems that MRI runs on, 0, 2, 4 and i*2+1 are never valid addresses for an object on the heap, so there's no overlap with pointers to object data.

That covers the "what"; next I went looking for the "why".

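Then I found this article, The Ruby VALUE. A VALUE is the equivalent of a pointer in C: its value is the address of the object in memory. For true, false, nil, and Fixnum, however, a different encoding is used for performance reasons.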

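Object addresses are aligned to 4-byte boundaries (8-byte boundaries on 64-bit machines). So if the current address is 0x0000F000, the next one is 0x0000F004. In binary, the low eight bits of these are 00000000 and 00000100: the lowest two bits are always 0.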
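The alignment argument can be reproduced with plain integer arithmetic. This sketch uses only the two example addresses from the text; `low_two_bits` is a hypothetical helper name, and nothing here inspects real memory.

```ruby
# Hypothetical helper: keep only the lowest two bits of an address.
def low_two_bits(addr)
  addr & 0b11
end

# The two example addresses from the text, 4 bytes apart.
[0x0000F000, 0x0000F004].each do |addr|
  low_byte = format("%08b", addr & 0xFF)   # low eight bits as a binary string
  puts "#{format('0x%08X', addr)}: low byte #{low_byte}, low two bits #{low_two_bits(addr)}"
end
```

Any multiple of 4 behaves the same way, which is what leaves those low bits free to carry a tag.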

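Ruby takes advantage of this: it reserves the lowest bit as a tag and uses the remaining 31 bits (63 bits on 64-bit machines) to store the Fixnum, one of which is the sign bit. On my machine: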

irb(main):001:0> (2 ** 62).class
=> Bignum
irb(main):002:0> (2 ** 62 - 1).class
=> Fixnum
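The boundary at 2 ** 62 follows from the bit budget: a 64-bit word minus one tag bit minus one sign bit leaves 62 bits of magnitude. Here is a small sketch of that arithmetic; `fixnum_max` is a hypothetical helper, not an MRI function. Note that since Ruby 2.4 Fixnum and Bignum are unified into Integer, so on a newer Ruby `.class` reports Integer on both sides of the boundary.

```ruby
# Hypothetical helper: largest integer that fits in a tagged machine word.
# One bit is the tag, one bit is the sign, the rest hold the magnitude.
def fixnum_max(word_bits)
  (1 << (word_bits - 2)) - 1
end

puts fixnum_max(64) == 2**62 - 1   # the boundary seen in the irb session
puts fixnum_max(32) == 2**30 - 1   # the same rule applied to a 32-bit word

# The tagged form of the maximum still fits in a signed 64-bit word.
puts fixnum_max(64) * 2 + 1 == 2**63 - 1
```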

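This explains why object_id = value * 2 + 1. Take 5 as an example: its object_id is 11. In binary, 5 is 0b0101 (showing only the lowest four bits). Because Ruby reserves the lowest bit and sets it to 1 for a Fixnum, the corresponding VALUE is 0b01011 (leading zeros omitted), and 0b01011 is 11.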
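The same encoding can be written out as a shift and a bitwise OR. This is a sketch in Ruby of the scheme described here, not MRI's actual C code; `tag_fixnum` and `untag_fixnum` are hypothetical names chosen for illustration.

```ruby
# Hypothetical helper: encode an integer the way the text describes.
def tag_fixnum(n)
  (n << 1) | 1      # shift left one bit, then set the lowest (tag) bit
end

# Hypothetical helper: recover the integer from its tagged form.
def untag_fixnum(value)
  value >> 1        # drop the tag bit; the shift preserves the sign
end

puts tag_fixnum(5).to_s(2)   # binary form of the tagged value
puts tag_fixnum(5)           # the same value in decimal
puts untag_fixnum(11)        # back to the original integer
```

Shifting left by one is multiplying by 2, and setting the low bit adds 1, which is exactly value * 2 + 1.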

It feels great to know why.

8 likes

xstmjh #0 November 16, 2013

GJ. When all is said and done, a programmer's first language should still be C.

1 like

jimrokliu #1 November 16, 2013

Indeed, otherwise too much is wasted.

zgm #2 November 16, 2013

Nice!

jjym #3 November 16, 2013

Too advanced for me.. I can't follow..

ZombieCoder #4 November 16, 2013

So it looks like a Fixnum's object_id is not unique after all

35047277182079.object_id
=>70094554364159
123456789098765432133333333333333333333333333333333333.object_id 
=> 70094554364159

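jun1st #5 November 17, 2013

#5 @ZombieCoder 123456789098765432133333333333333333333333333333333333 is already a Bignum. Every run of 123456789098765432133333333333333333333333333333333333.object_id gives a different result. It is interesting that your two results are identical; if that is a coincidence, it is an incredible one.

Rei mentioned this topic in "Some Questions About Encoding". October 24, 16:31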
