While working through the Ruby Koans (Learn Ruby), I ran into a rather interesting exercise:
def test_small_integers_have_fixed_ids
  assert_equal 1, 0.object_id
  assert_equal 3, 1.object_id
  assert_equal 5, 2.object_id
  assert_equal 201, 100.object_id

  # THINK ABOUT IT:
  # What pattern do the object IDs for small integers follow?
end
From the way the values grow, it is easy to guess that object_id = value * 2 + 1. But a guess is not reliable; I wanted to understand why it works this way.
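Before digging into the reason, the guess can at least be checked against more values than the koan uses. The snippet below is a minimal sketch that assumes it runs on MRI (other Ruby implementations may assign IDs differently); `guessed_object_id` is a hypothetical helper name introduced only for this check.

```ruby
# Hypothetical helper: the object_id we would expect if the guess is right.
def guessed_object_id(value)
  value * 2 + 1
end

# Compare the guess with the real object_id for a range of small integers,
# including negative ones. Assumes MRI.
mismatches = (-1000..1000).reject { |i| i.object_id == guessed_object_id(i) }

if mismatches.empty?
  puts "guess holds for -1000..1000"
else
  puts "guess fails for: #{mismatches.inspect}"
end
```

This only shows that the pattern holds for the values tested; it says nothing yet about why.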
First, I came across this answer on Stack Overflow:
in MRI the object_id of an object is the same as the VALUE that represents the object on the C level. For most kinds of objects this VALUE is a pointer to a location in memory where the actual object data is stored. Obviously this will be different during multiple runs because it only depends on where the system decided to allocate the memory, not on any property of the object itself.
However for performance reasons true, false, nil and Fixnums are handled specially. For these objects there isn't actually a struct with the object's data in memory. All of the object's data is encoded in the VALUE itself. As you already figured out the values for false, true, nil and any Fixnum i, are 0, 2, 4 and i*2+1 respectively.
The reason that this works is that on any systems that MRI runs on, 0, 2, 4 and i*2+1 are never valid addresses for an object on the heap, so there's no overlap with pointers to object data.
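That covers the "what"; I kept searching for the "why".
Then I found this article, The Ruby VALUE. A VALUE is roughly a pointer in C: its value is the address of the object in memory. For true, false, nil and Fixnum, however, a different scheme is used for performance reasons.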
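Objects are allocated at addresses aligned to 4 bytes (8 bytes on a 64-bit machine). So if one object sits at 0x0000F000, the next aligned address is 0x0000F004. In binary, the low eight bits of these addresses are 00000000 and 00000100, so the lowest two bits are always 0 (with 8-byte alignment, the lowest three bits are always 0).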
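The alignment argument is easy to see with a little bit arithmetic. The sketch below uses the example address from the text and the alignment sizes it mentions; the constant and variable names are made up for illustration, and the addresses are plain integers, not real pointers.

```ruby
# Alignment sizes taken from the text: 4 bytes on 32-bit, 8 bytes on 64-bit.
ALIGNMENT_32 = 4
ALIGNMENT_64 = 8

# A few consecutive 4-byte-aligned addresses starting at 0x0000F000.
addresses = (0...4).map { |n| 0x0000F000 + n * ALIGNMENT_32 }

addresses.each do |addr|
  # Show the address in hex, its low byte in binary, and its low two bits.
  puts format("0x%08X  low byte: %08b  low two bits: %02b",
              addr, addr & 0xFF, addr & 0b11)
end

# Collect the distinct low-bit patterns for both alignments.
low_bits_32 = addresses.map { |addr| addr & 0b11 }.uniq
low_bits_64 = (0...4).map { |n| (0x0000F000 + n * ALIGNMENT_64) & 0b111 }.uniq

puts "4-byte alignment, low two bits seen:   #{low_bits_32.inspect}"
puts "8-byte alignment, low three bits seen: #{low_bits_64.inspect}"
```

Because those low bits are never set in a real object address, any VALUE with the lowest bit set to 1 cannot be mistaken for a pointer.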
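Ruby takes advantage of this: it reserves the lowest bit as a tag (set to 1 for a Fixnum) and uses the remaining 31 bits (63 bits on a 64-bit machine) to store the Fixnum, one of which serves as the sign bit. On my machine: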
irb(main):001:0> (2 ** 62).class
=> Bignum
irb(main):002:0> (2 ** 62 - 1).class
=> Fixnum
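The boundary seen in this irb session follows from the bit counts alone. Here is a small sketch of that arithmetic, assuming one tag bit and a two's-complement payload as described above; `fixnum_bounds` is a hypothetical helper, not something Ruby provides.

```ruby
# Hypothetical helper: smallest and largest Fixnum for a machine word of
# `word_bits` bits, assuming one tag bit and a two's-complement payload.
def fixnum_bounds(word_bits)
  payload_bits = word_bits - 1        # one bit is reserved as the tag
  max = 2**(payload_bits - 1) - 1     # one payload bit acts as the sign bit
  min = -(2**(payload_bits - 1))
  [min, max]
end

min64, max64 = fixnum_bounds(64)
min32, max32 = fixnum_bounds(32)

puts "64-bit: #{min64} .. #{max64}"
puts "32-bit: #{min32} .. #{max32}"
puts max64 == 2**62 - 1   # the largest value reported as Fixnum in the session
```

So on a 64-bit machine 2 ** 62 - 1 is the largest value that still fits, and 2 ** 62 is the first one that needs a Bignum.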
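This explains why object_id = value * 2 + 1. Take 5 as an example: its object_id is 11. In binary, 5 is 0b0101 (showing only the lowest four bits). Because Ruby reserves the lowest bit, the number is shifted left by one bit (which multiplies it by 2) and the tag bit is set to 1 (which adds 1). The resulting VALUE is 0b01011 (leading zeros omitted), and 0b01011 is 11.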
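The shift-and-tag step can be written out directly. This is only a sketch of the encoding described above, done with ordinary Ruby integers; `fixnum_to_value` and `value_to_fixnum` are hypothetical names and are not MRI's actual C macros.

```ruby
# Hypothetical helper: encode an integer the way the text describes.
def fixnum_to_value(i)
  (i << 1) | 1   # shift the payload left by one bit, then set the tag bit
end

# Hypothetical helper: recover the integer from the tagged value.
def value_to_fixnum(v)
  v >> 1         # arithmetic right shift drops the tag bit and keeps the sign
end

value = fixnum_to_value(5)
puts format("5 = 0b%04b, VALUE = 0b%05b = %d", 5, value, value)
puts value_to_fixnum(value)
```

The right shift is arithmetic, so negative numbers round-trip as well: -7 encodes to -13 and decodes back to -7.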
Knowing why it works this way feels especially good.