Today, a co-worker was reviewing some code of mine similar to this:
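The original snippet is not preserved here, so the following is only a guess at its general shape, based on the benchmark described later in the post: a hash literal merged with a second hash and handed to a method. The name `foo` comes from the post; the method body, keys, and values are placeholders.

```ruby
# Placeholder method: the real one presumably did something with the hash.
def foo(options)
  options
end

# merge builds and returns a new hash; the receiver is left untouched.
result = foo({ a: 'a' }.merge(b: 'b'))
puts result[:b] # prints "b"
```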
He suggested that using merge! would be faster, as it would save instantiating a new hash. I was skeptical but decided to put it to the test using benchmark-ips. If you are unfamiliar with benchmark-ips, it is a really awesome gem that measures how many times something can be run in a given timeframe, as opposed to how long it takes to run something. This is a particularly useful measurement when looking at things that take a variable amount of time to execute or, in this case, things that are very quick.
I set up the script to compare these methods as follows:
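A minimal sketch of what such a script might look like, assuming the benchmark-ips gem is installed. The helper names `with_merge` and `with_merge_bang` are hypothetical, and the `LoadError` guard is only there so the file still runs without the gem.

```ruby
ips_available = begin
  require 'benchmark/ips' # provided by the benchmark-ips gem
  true
rescue LoadError
  warn 'benchmark-ips is not installed; skipping the benchmark'
  false
end

# No-op stand-in, so both variants build the same objects and nothing else.
def foo(_hash); end

def with_merge
  foo({ a: 'a' }.merge(b: 'b'))
end

def with_merge_bang
  foo({ a: 'a' }.merge!(b: 'b'))
end

if ips_available
  Benchmark.ips do |x|
    x.report('merge')  { with_merge }
    x.report('merge!') { with_merge_bang }
    x.compare! # prints how the reports compare to each other
  end
end
```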
This simply replaces merge with merge! and runs each repeatedly for 5 seconds (the default from benchmark-ips). I made foo do nothing just so that all the same objects would be instantiated, without adding any overhead to each run. The results were surprising!
Using merge! is almost 2 times as fast! That’s really great. Out of curiosity, I wanted to check the number of objects that each makes as well. I know that the difference in the way merge and merge! work should mean that with merge! we have half as many objects created, but I wanted to measure it to be sure. For that, we can use ObjectSpace. If you are unfamiliar with ObjectSpace, or need a refresher, our very own Aaron Quint has covered it a few times. To count the number of hash objects we make in a given time period, I run a script like this:
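One way to take that measurement, sketched here under a few assumptions: the helper `hashes_created` is hypothetical, the 1000 iterations are an arbitrary choice, and GC is switched off during the count so that short-lived hashes are not collected before they are counted. Exact totals depend on the Ruby version, so none are shown.

```ruby
# No-op stand-in, as in the benchmark.
def foo(_hash); end

# Returns how many T_HASH slots were added to the heap while the block ran.
def hashes_created
  GC.start   # clear out existing garbage first
  GC.disable # keep new hashes from being collected mid-count
  before = ObjectSpace.count_objects[:T_HASH]
  yield
  ObjectSpace.count_objects[:T_HASH] - before
ensure
  GC.enable
end

merge_count      = hashes_created { 1000.times { foo({ a: 'a' }.merge(b: 'b')) } }
merge_bang_count = hashes_created { 1000.times { foo({ a: 'a' }.merge!(b: 'b')) } }

puts "merge:  #{merge_count} hashes"
puts "merge!: #{merge_bang_count} hashes"
```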
Using merge, we created 4039 hash objects. With merge!, we made only 2039, just as I expected.
It is important to note, however, that using merge! can have some side effects in certain instances. Because it modifies the original hash, you won’t have a copy of that original object. This is especially relevant when using a method argument. For example, take the following code:
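A small illustration of that side effect. The names `hash` and `hash_arg` follow the post; the method body and the values are made up for the example.

```ruby
def foo(hash_arg)
  # merge! mutates the very object the caller passed in.
  hash_arg.merge!(a: 'changed')
end

hash = { a: 'a', b: 'b' }
foo(hash)

puts hash[:a] # prints "changed"
```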
This overwrites the value of :a in the original object. In this instance, using merge would be preferable if you want to retain the original state of hash. You could also call dup on hash_arg and use merge! on the copy. This is particularly useful when doing a number of merges:
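A sketch of the dup approach, with illustrative keys and a hypothetical local name `options`: the argument is copied once, and every later merge! touches only the copy.

```ruby
def foo(hash_arg)
  options = hash_arg.dup       # one copy up front
  options.merge!(a: 'changed') # each merge! mutates only the copy
  options.merge!(c: 'c')
  options.merge!(d: 'd')
  options
end

hash = { a: 'a', b: 'b' }
result = foo(hash)

puts hash[:a]   # prints "a"
puts result[:a] # prints "changed"
```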
In case you’re curious, using merge! here is still faster than the equivalent with merge (we have to reassign the hash to actually modify it):
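The comparison could be set up roughly like this. It is a sketch, not the original script: the method names and keys are hypothetical, benchmark-ips is assumed to be installed (the guard skips the run otherwise), and the run time is shortened with `x.config` so it finishes quickly.

```ruby
ips_available = begin
  require 'benchmark/ips' # provided by the benchmark-ips gem
  true
rescue LoadError
  warn 'benchmark-ips is not installed; skipping the benchmark'
  false
end

# One copy, then in-place merges on that copy.
def with_dup_and_merge_bang(hash_arg)
  options = hash_arg.dup
  options.merge!(a: 'changed')
  options.merge!(c: 'c')
  options.merge!(d: 'd')
  options
end

# merge returns a new hash each time, so the result has to be reassigned.
def with_merge_reassigned(hash_arg)
  options = hash_arg.merge(a: 'changed')
  options = options.merge(c: 'c')
  options = options.merge(d: 'd')
  options
end

# Frozen, so any accidental mutation of the input would raise.
base = { a: 'a', b: 'b' }.freeze

if ips_available
  Benchmark.ips do |x|
    x.config(time: 2, warmup: 1) # shortened run; remove to use the defaults
    x.report('dup + merge!')     { with_dup_and_merge_bang(base) }
    x.report('merge + reassign') { with_merge_reassigned(base) }
    x.compare!
  end
end
```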
All in all, this was a pretty fun dive into some minor performance stuff. While it might not make a huge difference at a small scale, as you start to run a method more and more, the time and object space saved can add up! It’s often worth it to grab a few tools and take a look.
UPDATE: Tieg posed the question below of whether copying the hash with Hash[] would be faster than using dup. I took a swing at it and it appears that he is correct! Here are my findings:
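A sketch of how the two ways of copying might be compared. The method names are hypothetical, benchmark-ips is assumed to be installed (the guard skips the run otherwise), and the run time is shortened with `x.config`. Both variants leave the original hash alone, since `Hash[hash_arg]` builds a new hash from the entries of `hash_arg`.

```ruby
ips_available = begin
  require 'benchmark/ips' # provided by the benchmark-ips gem
  true
rescue LoadError
  warn 'benchmark-ips is not installed; skipping the benchmark'
  false
end

def with_dup(hash_arg)
  hash_arg.dup.merge!(a: 'changed')
end

def with_hash_brackets(hash_arg)
  # Hash[] given a single hash returns a new hash with the same entries.
  Hash[hash_arg].merge!(a: 'changed')
end

# Frozen, so any accidental mutation of the input would raise.
base = { a: 'a', b: 'b' }.freeze

if ips_available
  Benchmark.ips do |x|
    x.config(time: 2, warmup: 1) # shortened run; remove to use the defaults
    x.report('dup')    { with_dup(base) }
    x.report('Hash[]') { with_hash_brackets(base) }
    x.compare!
  end
end
```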
Thanks to Chris Belsole, Mary Cutrali, Dan Condomitti, Aaron Quint, Ari Russo, and Ivan Tse for their help on this post.