Today, a co-worker was reviewing some code of mine similar to this:
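A minimal sketch of the kind of call in question, assuming a method `foo` that receives a hash (the name is used later in this post). The keys, values, and variable names are hypothetical placeholders, not the code that was actually reviewed:

```ruby
# Hypothetical stand-in for the reviewed code: `foo` does nothing here,
# and the keys/values are invented for illustration.
def foo(hash)
end

defaults = { a: 1 }

# merge returns a brand-new hash and leaves the receiver untouched.
merged = defaults.merge(b: 2)
foo(merged)

merged.equal?(defaults) # => false (a different object)
defaults.key?(:b)       # => false (the receiver is unchanged)

# merge! (also available as update) modifies the receiver
# and returns that same object.
result = defaults.merge!(b: 2)
result.equal?(defaults) # => true
defaults.key?(:b)       # => true
```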
He suggested that using `merge!` would be faster, as it would save instantiating a new hash. I was skeptical but decided to put it to the test using benchmark-ips. If you are unfamiliar with benchmark-ips, it is a really handy gem that measures how many times something can be run in a given timeframe (iterations per second), as opposed to how long it takes to run something once. This is a particularly useful measurement when looking at things that take a variable amount of time to execute or, in this case, things that are very quick.
I set up the script to compare these methods as follows:
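A sketch of how such a comparison could be set up with benchmark-ips. The no-op `foo` follows the description in this post; the hash contents and the helper names `with_merge`, `with_merge_bang`, and `run_comparison` are assumptions made for illustration:

```ruby
# No-op method, so both variants allocate the same objects around the call.
def foo(hash)
end

# Hypothetical helpers; the hash contents are placeholders.
def with_merge
  foo({ a: 1 }.merge(b: 2))  # allocates an extra hash for the result
end

def with_merge_bang
  foo({ a: 1 }.merge!(b: 2)) # updates the freshly built literal in place
end

def run_comparison
  # Provided by the benchmark-ips gem, not the standard library.
  require 'benchmark/ips'

  Benchmark.ips do |x|
    x.report('merge')  { with_merge }
    x.report('merge!') { with_merge_bang }
    x.compare! # prints a summary comparing the two reports
  end
end
```

Calling `run_comparison` with the gem installed runs both reports. Absolute numbers will vary with Ruby version and hardware, so treat any single run as indicative only.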
This simply replaces `merge` with `merge!` and runs each repeatedly for 5 seconds (the default from benchmark-ips). I made `foo` do nothing just so that all the same objects would be instantiated, without adding any overhead to each run. The results were surprising!
Using `merge!` is almost 2 times as fast! That’s really great. Out of curiosity, I wanted to check the number of objects that each one creates as well. I know that the difference in the way `merge` and `merge!` work should mean that with `merge!` we have half as many objects created, but I wanted to measure it to be sure. For that, we can use ObjectSpace. If you are unfamiliar with ObjectSpace, or need a refresher, our very own Aaron Quint has covered it a few times. To count the number of hash objects we make in a given time period, I ran a script like this:
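One way to take such a measurement is sketched below, using `ObjectSpace.count_objects`. The helper name `hashes_allocated`, the hash contents, and the iteration count are assumptions; the post does not state how many iterations were used:

```ruby
# Counts how many Hash objects are allocated while the block runs.
# GC is disabled during the measurement so that no hashes are reclaimed
# and the before/after difference reflects allocations only.
def hashes_allocated
  GC.start   # finish any pending collection before measuring
  GC.disable
  before = ObjectSpace.count_objects[:T_HASH]
  yield
  ObjectSpace.count_objects[:T_HASH] - before
ensure
  GC.enable
end

ITERATIONS = 1_000 # assumed value, chosen for illustration

# No-op method, as in the benchmark described in this post.
def foo(hash)
end

merge_count = hashes_allocated { ITERATIONS.times { foo({ a: 1 }.merge(b: 2)) } }
bang_count  = hashes_allocated { ITERATIONS.times { foo({ a: 1 }.merge!(b: 2)) } }

puts "merge:  #{merge_count} hashes"
puts "merge!: #{bang_count} hashes"
```

The exact counts depend on the Ruby version and may include a small constant overhead from the measurement itself, so the interesting part is the gap between the two numbers rather than either value alone.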
Using `merge`, we created 4039 hash objects. With `merge!`, we made only 2039, just as I expected.
It is important to note, however, that using `merge!` can have some side effects in certain instances. Because it modifies the original hash, you won’t have a copy of that original object. This is especially relevant when using a method argument. For example, take the following code:
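A small sketch of that situation, using the names `hash` and `hash_arg` from this post. The method body and the values are a guess at the shape of the example, not the original code:

```ruby
# Assumed shape of the example: the method merges into its argument.
def foo(hash_arg)
  hash_arg.merge!(a: 2) # mutates the very object the caller passed in
end

hash = { a: 1 }
foo(hash)

hash[:a] # => 2, the caller's hash was changed in place
```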
This overwrites the value stored under `:a` in the original object. In this instance, using `merge` would be preferable if you want to retain the original state of `hash`. You could also call `dup` on `hash_arg` and apply `merge!` to the copy. This is particularly useful when doing a number of merges:
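As an illustration of the `dup` approach, here is a minimal sketch. The keys, values, and the number of merges are placeholders chosen for the example:

```ruby
def foo(hash_arg)
  hash = hash_arg.dup # shallow copy, so the caller's hash is left alone
  hash.merge!(a: 2)
  hash.merge!(b: 3)
  hash.merge!(c: 4)
  hash
end

original = { a: 1 }
updated  = foo(original)

original[:a]  # => 1, unchanged
updated[:a]   # => 2
updated.size  # => 3
```

Note that `dup` makes a shallow copy: the new hash has its own set of keys, but any nested objects stored as values are still shared with the original.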
In case you’re curious, using `merge!` here is still faster than the equivalent with `merge` (we have to reassign the hash to actually modify it):
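The two variants being compared might look roughly like this. The helper names, keys, and number of merges are assumptions, and the benchmark is wrapped in a method so it only runs when called:

```ruby
# Copy once, then update the copy in place.
def with_dup_and_merge_bang(hash_arg)
  hash = hash_arg.dup
  hash.merge!(a: 2)
  hash.merge!(b: 3)
  hash.merge!(c: 4)
  hash
end

# Each merge returns a new hash, so the local has to be reassigned.
def with_merge_and_reassign(hash_arg)
  hash = hash_arg
  hash = hash.merge(a: 2)
  hash = hash.merge(b: 3)
  hash = hash.merge(c: 4)
  hash
end

def run_comparison
  # Provided by the benchmark-ips gem, not the standard library.
  require 'benchmark/ips'

  input = { a: 1 }
  Benchmark.ips do |x|
    x.report('dup + merge!') { with_dup_and_merge_bang(input) }
    x.report('merge')        { with_merge_and_reassign(input) }
    x.compare!
  end
end
```

Both helpers return the same result and leave the argument untouched; the difference is that the `merge` version allocates a new hash on every step, while the `dup` version allocates one copy up front.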
All in all, this was a pretty fun dive into some minor performance stuff. While it might not make a huge difference at a small scale, as you start to run a method more and more, the time and object space saved can add up! It’s often worth it to grab a few tools and take a look.
UPDATE: Tieg posed the question below of whether `Hash[]` (the class method `Hash.[]`, as in `Hash[hash]`) would be faster than using `dup`. I took a swing at it and it appears that he is correct! Here are my findings:
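A sketch of how the two copying approaches could be compared. The helper names and hash contents are hypothetical, and the benchmark only runs when `run_comparison` is called:

```ruby
def copy_with_dup(hash_arg)
  hash_arg.dup
end

def copy_with_hash_brackets(hash_arg)
  Hash[hash_arg] # builds a new hash from the entries of hash_arg
end

def run_comparison
  # Provided by the benchmark-ips gem, not the standard library.
  require 'benchmark/ips'

  input = { a: 1 }
  Benchmark.ips do |x|
    x.report('dup')    { copy_with_dup(input).merge!(a: 2) }
    x.report('Hash[]') { copy_with_hash_brackets(input).merge!(a: 2) }
    x.compare!
  end
end
```

Which one comes out ahead may differ between Ruby versions, so it is worth re-running on the version you deploy. Also keep in mind that `dup` carries over the hash's default value or default proc, which matters if your code relies on one.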
Thanks to Chris Belsole, Mary Cutrali, Dan Condomitti, Aaron Quint, Ari Russo, and Ivan Tse for their help on this post.