Dec 10, 2014

Ruby Object_id Assignment

Michał Brodecki
Most of you have heard of the Ruby Koans. If not, I strongly recommend taking a look. The koans walk through basically all of Ruby's language features. I decided to revisit them not long ago (my last visit was about 3 years earlier), and I came across a rather cool detail: how Ruby’s object_id works and how it has changed recently. I promise it will be interesting ;)

What’s the problem?

You already know that everything in Ruby is an object, so every object has its own id. When you create an object, Ruby assigns an id to it. Logic tells us that an object_id should be unique.

a='Some string' #=> 'Some string'
a.object_id #=> 70355155146360
'Some string'.object_id #=> 70355155094260
b = a.clone #=> 'Some string'
b.object_id #=> 70355155065800
a.object_id #=> 70355155146360
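To make the distinction explicit, here is a minimal sketch that compares value equality with object identity. The helper name identity_report is my own invention for illustration, and the actual id numbers will differ between Ruby versions and runs, so the sketch only compares them.

```ruby
# Hypothetical helper (not part of Ruby): reports whether two references
# hold equal values, and whether they point at the very same object.
def identity_report(x, y)
  {
    same_value: x == y,                      # content comparison
    same_object: x.equal?(y),                # identity comparison
    same_id: x.object_id == y.object_id      # should agree with equal?
  }
end

original  = 'Some string'
copy      = original.clone   # new object with the same content
alias_ref = original         # second reference to the same object

p identity_report(original, copy)
p identity_report(original, alias_ref)
```

A clone has the same value but its own object_id, while a second reference to the same object shares the id.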

So far so good: we’ve just confirmed what we said above. Where it gets really interesting is how Ruby assigns ‘static’ ids to objects like nil, true and false.

In ruby 1.9.3-p286:

nil.object_id #=> 4
false.object_id #=> 0
true.object_id #=> 2

In ruby 2.0.0 and above:

nil.object_id #=> 8
false.object_id #=> 0
true.object_id #=> 20

Hmm, I think there may be some tiny changes between those versions ;) (The difference can be spotted only on the x64 architecture.) I tried to figure out where those values come from, so I found the function that calculates object_id in the Ruby source, and I came across this comment (ruby 2.0.0):

rb_obj_id(VALUE obj)
     *                32-bit VALUE space
     *          MSB ------------------------ LSB
     *  false   00000000000000000000000000000000
     *  true    00000000000000000000000000000010
     *  nil     00000000000000000000000000000100
     *  undef   00000000000000000000000000000110
     *  symbol  ssssssssssssssssssssssss00001110
     *  object  oooooooooooooooooooooooooooooo00        = 0 (mod sizeof(RVALUE))
     *  fixnum  fffffffffffffffffffffffffffffff1
     *                    object_id space
     *                                       LSB
     *  false   00000000000000000000000000000000
     *  true    00000000000000000000000000000010
     *  nil     00000000000000000000000000000100
     *  undef   00000000000000000000000000000110
     *  symbol   000SSSSSSSSSSSSSSSSSSSSSSSSSSS0        S...S % A = 4 (S...S = s...s * A + 4)
     *  object   oooooooooooooooooooooooooooooo0        o...o % A = 0
     *  fixnum  fffffffffffffffffffffffffffffff1        bignum if required
     *  where A = sizeof(RVALUE)/4
     *  sizeof(RVALUE) is
     *  20 if 32-bit, double is 4-byte aligned
     *  24 if 32-bit, double is 8-byte aligned
     *  40 if 64-bit

So according to the comment, true should be 2, not 20, and nil should be 4, not 8. What’s going on there? In ruby 1.9.3-p286 the values matched, and when we switch to 2.0.0 we get completely different ones. Well, there is an answer for that: the Flonum.

What is flonum?

Before we get into that, we need some background on how a Fixnum's object_id is calculated. Let’s see an example:

0.object_id #=> 1
1.object_id #=> 3
4.object_id #=> 9
100.object_id #=> 201
-30.object_id #=> -59

See the pattern? To calculate the object_id of a Fixnum we just have to do a simple calculation:

(x*2)+1 #where x is the desired number.

But to achieve this, Ruby (in fact, the C code behind the Ruby method) does some bit-operation magic. Ruby stores an integer as either a Fixnum or a Bignum, depending on the size of the number and on the architecture (32 or 64 bits). The example below shows how it’s done.

The following is based on the 64-bit architecture. That means we have bits from 0 to 63. The least significant bit is reserved to mark the value as a Fixnum. It’s called the FIXNUM_FLAG; when it’s set to 1, we have a Fixnum. Another reserved bit is the most significant one, which tells whether the number is positive or negative. Based on this information we know what the biggest Fixnum is on x64. We have 64 bits at our disposal, but two of them are reserved, so the largest Fixnum is 2**62 - 1 = 4611686018427387903, and 4611686018427387904 is already a Bignum.

(2**62-1) #=> 4611686018427387903
(2**62-1).class #=> Fixnum
(2**62) #=> 4611686018427387904
(2**62).class #=> Bignum
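The same limit can be derived from the bit budget described above. This is only a sketch under the stated assumption of a 64-bit value with one flag bit and one sign bit; the constant and method names are made up for illustration and are not Ruby internals.

```ruby
# Assumption: 64-bit value, 1 bit for FIXNUM_FLAG, 1 bit for the sign.
VALUE_BITS   = 64
PAYLOAD_BITS = VALUE_BITS - 2           # 62 bits left for the magnitude

# Hypothetical names for the resulting range.
FIXNUM_MAX_64 = (1 << PAYLOAD_BITS) - 1 # 2**62 - 1
FIXNUM_MIN_64 = -(1 << PAYLOAD_BITS)    # -(2**62)

# True when n would fit into a tagged 64-bit Fixnum.
def fits_fixnum_64?(n)
  n >= FIXNUM_MIN_64 && n <= FIXNUM_MAX_64
end

puts FIXNUM_MAX_64
puts fits_fixnum_64?(2**62 - 1)
puts fits_fixnum_64?(2**62)
```

Note that since Ruby 2.4 Fixnum and Bignum are unified into Integer, so on newer versions .class no longer shows the difference, even though the internal representation still changes at this boundary.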

Let’s take the number 4 as an example. Binary 4 is: 100

But remember that the lowest bit is the Fixnum flag, so we shift left by one bit and set the flag: 1001

The object_id of 4 is binary 1001, which equals 9.

4.object_id #=> 9
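The shift-and-flag step can be sketched in plain Ruby. This mimics the tagging with ordinary integer operations; it is not the interpreter's C code, and encode_fixnum and decode_fixnum are hypothetical names.

```ruby
FIXNUM_FLAG = 0x01 # lowest bit marks a Fixnum, as described above

# Shift the number left by one bit and set the flag: same as (n * 2) + 1.
def encode_fixnum(n)
  (n << 1) | FIXNUM_FLAG
end

# Reverse the operation: verify the flag, then shift it away.
def decode_fixnum(tagged)
  raise ArgumentError, 'flag bit not set' if (tagged & FIXNUM_FLAG).zero?
  tagged >> 1
end

[0, 1, 4, 100, -30].each do |n|
  tagged = encode_fixnum(n)
  puts format('%4d -> binary %s (decimal %d)', n, tagged.to_s(2), tagged)
end
```

For the numbers from the earlier example, encode_fixnum gives the same values that object_id returned there, e.g. 9 for 4 and -59 for -30.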

Ok, one mystery solved. Sorry for the intermission, but it was needed to better understand flonums. Let’s get back to the main event. The answer lies in the flonum commit made by Koichi Sasada. The change is only available on 64-bit machines. The issue was described by Sasada here. A Fixnum's object_id is just calculated as we showed before; no new object is created. Floating-point numbers, on the other hand, were allocated as a new object every time, just like the strings I mentioned before.

In ruby 1.9.3-p286:

a=2 #=> 2
b=2 #=> 2
a.object_id==b.object_id #=> true
a = 1.1 + 1.2 #=> 2.3
b = 1.1 + 1.2 #=> 2.3
a.object_id==b.object_id #=> false
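For contrast, here is a small sketch you can run on your own Ruby to see which behaviour you get. The helper name same_object_id? is made up. On a 64-bit Ruby 2.0.0 or later I would expect the Float comparison to print true, since these values are small enough to be stored as immediates, but treat that as an expectation to check rather than a guarantee.

```ruby
# Hypothetical helper: do two references report the same object_id?
def same_object_id?(x, y)
  x.object_id == y.object_id
end

int_a = 2
int_b = 2
float_a = 1.1 + 1.2
float_b = 1.1 + 1.2

puts "Ruby #{RUBY_VERSION}"
puts "integers share an id: #{same_object_id?(int_a, int_b)}"
puts "floats share an id:   #{same_object_id?(float_a, float_b)}"
```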

So the idea was to introduce a technique that represents Floats as immediate values, just like Fixnums. I won’t explain how it works right now, but the point is that with this commit the representations of ‘nil’ and ‘true’ were also changed. The first table below shows the old tag bits, the second one the new tag bits:

...xxxx xxx1 Fixnum
...0000 1110 Symbol
...0000 0000 Qfalse
...0000 0010 Qtrue
...0000 0100 Qnil
...0000 0110 Qundef

...xxxx xxx1 Fixnum
...xxxx xx10 Flonum
...0000 1100 Symbol
...0000 0000 Qfalse  0x00 =  0
...0000 1000  Qnil   0x08 =  8
...0001 0100 Qtrue   0x14 = 20
...0011 0100 Qundef  0x34 = 52

And here it is. That’s why true and nil have different object_ids in ruby 1.9.3 and ruby 2.0.0.
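To tie the second table together, here is a rough sketch of a classifier that reads the low bits of a raw value the way that table describes. It works on plain integers and only illustrates the bit patterns; the constants are copied from the table, while classify_value and the order of checks are my own assumptions, not the interpreter's code.

```ruby
# Special constants from the second (flonum) table.
QFALSE = 0x00
QNIL   = 0x08
QTRUE  = 0x14
QUNDEF = 0x34

# Hypothetical classifier for a raw 64-bit value given as an Integer.
def classify_value(v)
  return :fixnum if (v & 0x01) == 0x01   # ...xxxx xxx1
  return :flonum if (v & 0x03) == 0x02   # ...xxxx xx10
  return :symbol if (v & 0xff) == 0x0c   # ...0000 1100
  case v
  when QFALSE then :false
  when QNIL   then :nil
  when QTRUE  then :true
  when QUNDEF then :undef
  else :object                           # anything else: a heap pointer
  end
end

[0x00, 0x08, 0x14, 0x34, 0x09, 0x0c].each do |raw|
  puts format('0x%02x -> %s', raw, classify_value(raw))
end
```

The values 8 and 20 that nil.object_id and true.object_id returned on 2.0.0 are exactly QNIL and QTRUE here.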


It was a great adventure (yes, I think we can use this word) to dig into the core Ruby code and read forums and commits to understand what’s going on. I came a long way from confusion to a big feeling of satisfaction that I had learnt something. There are more things to cover, like how a symbol's object_id is calculated and the difference between a symbol and a string. But not this time. I recommend everyone dig into the code and find out for themselves! Put your Indiana Jones hats on!