You may not know that Ruby’s ‘puts’ method does this

If you ran this code snippet, what would you expect the output to be? Take a moment to think about the answer before reading the following paragraph.
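A minimal sketch of the snippet in question, assuming a class Foo whose to_s returns a hash. The exact class body is an assumption, reconstructed from the output discussed in this post:

```ruby
# Assumed shape of the example: to_s returns a Hash, not a String.
class Foo
  def to_s
    { bar: "car" }
  end
end

# What does this print?
puts Foo.new
```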

If you had asked me a few months ago, I likely would have answered with '{:bar=>"car"}', the rationale being something like: “When we pass an object to puts, to_s gets called to do string conversion, and '{:bar=>"car"}' is the string representation of the value returned by to_s.” Seems reasonable, right?

When we actually run the code, we see the following output:

#<Foo:0x00007f7f7f160bb0>

Counterintuitive, right? We typically see the object’s class name and address (i.e. #<Foo:0x00007f7f7f160bb0>) in an interpolated string like that when we haven’t explicitly defined to_s, but we did define that method.

Why are we printing out a reference to the parent Foo, rather than something related to our {:bar=>"car"} hash?

Let’s dig in further and look at a slightly different code example:
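A sketch of what this second example might look like, assuming the same Foo class as before and the foo_instance name used later in this post. The exact code is a reconstruction, not the original snippet:

```ruby
# Same assumed class: to_s returns a Hash, not a String.
class Foo
  def to_s
    { bar: "car" }
  end
end

foo_instance = Foo.new

# Implicit conversion: puts does the string conversion itself.
puts foo_instance

# Explicit conversion: puts receives the Hash returned by our to_s.
puts foo_instance.to_s
```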

If we run the code snippet above, our terminal prints this:

#<Foo:0x00007f843b0f1268>
{:bar=>"car"}

Counterintuitive indeed. When we pass in an instance of Foo to puts, aren’t we expecting to_s to be called under the hood? Why are we getting a different result when we explicitly call to_s?

puts is a Ruby function that’s implemented purely in C, so we can’t just step through it with a debugger like pry or byebug to find out more; puts doesn’t have Ruby code to step into! But we can read through the Ruby source code on GitHub: the io.c file sounds like a promising place to read about puts, and we find this definition of rb_io_puts there:

C can be difficult to read compared to Ruby code, but this line of code looks promising: line = rb_obj_as_string(argv[i]);. So, let’s read the definition of rb_obj_as_string, which is found in the string.c file:

str = rb_funcall(obj, idTo_s, 0) calls our object’s to_s method (this idTo_type naming pattern is also found elsewhere in Ruby’s C source code for other built-in Ruby types, such as Array and Symbol). We then pass the result of to_s into rb_obj_as_string_result. How is rb_obj_as_string_result defined in string.c?

And this explains it! In the underlying implementation of puts, rb_obj_as_string_result explicitly checks whether to_s has returned a string. If it hasn’t, that value is discarded, and we use the return value of rb_any_to_s instead (i.e. the function that returns a class name / address string like #<Foo:0x00007f7f7f160bb0>).
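To make the logic concrete, here is a rough Ruby-level model of the two C functions. It is only an approximation of the behavior described above, not the real implementation: the *_model method names are hypothetical, and calling Kernel#to_s stands in for rb_any_to_s.

```ruby
# Models rb_obj_as_string_result: keep the to_s result only if it is a String.
def obj_as_string_result_model(str, obj)
  return str if str.is_a?(String)

  # Otherwise fall back to the default "#<ClassName:0x...>" form
  # (stand-in for rb_any_to_s).
  Kernel.instance_method(:to_s).bind(obj).call
end

# Models rb_obj_as_string: call to_s, then validate the result.
def obj_as_string_model(obj)
  return obj if obj.is_a?(String)

  obj_as_string_result_model(obj.to_s, obj)
end

# Assumed example class: to_s returns a Hash, not a String.
class Foo
  def to_s
    { bar: "car" }
  end
end

foo_instance = Foo.new
puts obj_as_string_model(foo_instance)      # default class name / address form
puts obj_as_string_model(foo_instance.to_s) # string form of the hash
```

Because the check happens only once, on the immediate result of to_s, the model never looks inside the hash that Foo#to_s returned.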

This is why we’re printing a reference to Foo, and not anything to do with the actual hash: the value returned by to_s is discarded because it’s not a string! This also explains the discrepancy between puts foo_instance and puts foo_instance.to_s. In the second case we pass a hash into rb_obj_as_string, and Hash does have a definition of to_s that returns a string, so rb_obj_as_string_result receives the string '{:bar=>"car"}' and returns it unchanged.

The way Ruby’s puts overrides a value we explicitly return from to_s can be unexpected if you haven’t seen it before, but upon reflection, I do think that this is sensible language design. The alternative would be for Ruby to recursively call the underlying rb_obj_as_string on the value returned by to_s until we get a string, but this introduces additional complexity for little benefit. At the end of the day, if we want to write clean code, any to_s methods that we write should, well, return a string 🙂
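One possible fix, sketched under the assumption that we want puts to show the hash: have to_s convert the hash to a string itself before returning it.

```ruby
# Assumed fix: to_s now returns a String built from the hash.
class Foo
  def to_s
    { bar: "car" }.to_s
  end
end

# puts now prints the hash's string form instead of the class name / address.
puts Foo.new
```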
