I started working on a little test utility to go along with the verbosely-named “Let’s Make Byte Buffers Easier” project. There will be a post about that soon, because it’s been an adventure and I think I’ve produced something that is at least moderately useful. I quickly realized that determining whether or not two things are equal to each other is a fascinating problem and something I had to look deeper into to get a better sense of how to write a workable set of assertions.
In the end, this started to become a project all of its own, and I think there are exciting opportunities for expansion to be found.
I started out with a very naive approach, one that was going to be good enough for my limited purposes within the scope of “Let’s Make Byte Buffers Easier.” Very specifically, the rule was that two things were equal if comparing them with strict equality (“===”) produced true. I figured I could get plenty of mileage out of that, but as I stared at that line in my editor, I had a gradually strengthening feeling that I had missed something important. A bunch of somethings, in fact.
This called for research and experimentation. Some poking around led me to a helpful source, the ECMAScript Language Specification: specifically, the section on the Strict Equality Comparison Algorithm and the explanation of SameValueNonNumeric. This gave me a great starting point to figure this out for myself. As this was an exercise in “Can I do this?” I intentionally stayed away from other, more specific resources that might end up providing the solution for me. It’s interesting to see how far you can get on your own, after all.
Immediately, number five from SameValueNonNumeric illustrated why my naive approach had been working for me. I was only comparing strings, and I only cared that the strings that resulted were the strings I expected. Trying to use this to compare objects would give me trouble, because two objects are only strictly equal when they are the very same object.
After reading that, the first thing I tried was comparing two empty arrays. Sometimes I need to convince myself that described behavior is observed behavior. Sure enough, the fact that they are pretty much the same doesn’t matter: the comparison evaluates to false. This is fine, because it’s what I expected after reading the spec. That didn’t take long. We (I’m bringing you with me, now) uncovered something that my naive approach wouldn’t handle properly. I just happen to have a difference of opinion with the computer. For our purposes, [] and [] are the same thing even though they are technically different collections. We’ll need to be able to specify something along the lines of “Hey, I’m expecting an empty array” and not end up with problems from the strict equality comparison.
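Here is a quick check along those lines. The variable names are only for illustration:

```javascript
// Strings are primitives, so strict equality compares their contents.
const sameStrings = "buffer" === "buffer";

// Each array literal creates a new object, so strict equality compares identity.
const first = [];
const second = [];
const differentArrays = first === second;

// The only way an array is strictly equal to something is if it is that same array.
const sameReference = first === first;

console.log(sameStrings, differentArrays, sameReference); // true false true
```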
So this means we have some thinking to do. For my purposes, I figured I’d want to differentiate between checking to see if two objects are the same object, and checking to see if two objects possess exactly the same set of values. That is, if we have the following arrays:
const x = [1, 2, 3];
const y = [1, 2, 3];
Then I want my equality check to return true. So for arrays, the rule is that two arrays are considered equal by our assertion if they are the same length and have exactly the same values. The “same length” bit is important because if we only checked for the same values, we’d have to run both against each other. If we had this:
const x = [1, 2, 3];
const y = [1, 2, 3, 4];
Then if we checked to see if all values in x were present in y, we’d be pretty happy because they are! But if we didn’t check the other direction, all values in y present in x, we’d have troubles. We could do that, but running through both arrays twice isn’t a great time. So instead, we can make sure that they’re the same length. In fact, we can go a bit further and say if they’re both empty, then they’re the same. That way, we may not even have to spin anything up that will try iterating on the array.
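The array rule could be sketched roughly like this. The name arraysEqual is hypothetical, and comparing values position by position is my assumption about what “exactly the same values” means:

```javascript
// Hypothetical helper: compares two arrays of primitive values.
function arraysEqual(a, b) {
  // Different lengths can never be equal, so bail out before iterating.
  if (a.length !== b.length) return false;
  // Both are empty at this point if either is, so there is nothing to iterate.
  if (a.length === 0) return true;
  // Assumption: values must match at the same index.
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

console.log(arraysEqual([1, 2, 3], [1, 2, 3])); // true
console.log(arraysEqual([1, 2, 3], [1, 2, 3, 4])); // false
console.log(arraysEqual([], [])); // true
```

Because the lengths are checked first, a single pass over one array is enough; there is no need to check the other direction.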
But that’s just an array. An array holding primitive values, no less. What if we get up to non-array objects? Or collections of objects? Or collections of objects that have objects and collections as properties? Or objects with all of the above, but also functions because some objects have behavior!? We’ve gotten to the summit of one mountain only to find several more to climb! Luckily, we like climbing (hooray, a metaphor).
We’ll have to run through the properties of the object. If they’re booleans, or strings, or numbers, we should be able to use strict equality without trouble. We also figured out how to do array comparison earlier, so we can use that if we encounter an array. If it’s an object, presumably we can just subject it to the same rules we’re putting the parent object through. Done!
So how do we do that?
Consulting MDN provides a potential answer: the for…in statement, which “…iterates over all enumerable properties of an object that are keyed by strings… including inherited enumerable properties.”
This sounds like what we’d like to be using. As we walk through them, we’ll check on what kind of thing we’re dealing with. If it’s an array, we’ll boldly subject it to our array equality check outlined earlier. If it’s an object, we’ll subject it to the function we’re currently describing (with equal boldness). Other kinds of things will be subjected to The Strict Equality Comparison Algorithm and that will be exciting until something I overlooked causes problems.
We’ll start off by subjecting an array to for…in to see whether it works for arrays as well as for objects. Sure enough, it seems to. We should be able to treat arrays and objects the same, now, because we’ll get either the index or the property name, and we can use that to get the value associated with it. We’re in business! We can submit both arrays and objects to this function with all of the boldness I mentioned before.
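A small experiment along these lines shows what for…in hands back in each case. The sample values and names here are made up for illustration:

```javascript
const list = ["a", "b", "c"];
const record = { name: "buffer", size: 3 };

// For an array, for...in yields the indexes, as strings.
const listKeys = [];
for (const key in list) {
  listKeys.push(key);
}

// For a plain object, it yields the property names.
const recordKeys = [];
for (const key in record) {
  recordKeys.push(key);
}

console.log(listKeys); // [ '0', '1', '2' ]
console.log(recordKeys); // [ 'name', 'size' ]
console.log(list[listKeys[1]], record[recordKeys[0]]); // b buffer
```

Either way, the key can be used to look up the value, which is what lets one function handle both shapes.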
So far, we’ve got this:
We’re using for…in and checking two objects against each other. For simplicity’s sake, the default case is true and we only return early if something is false. I’m feeling pretty good about this, so far.
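A minimal sketch of a function fitting that description follows. The name valuesEqual, the null handling, the array-versus-object check, and the key-count check are all assumptions made for the sketch, and it does not guard against objects that reference themselves:

```javascript
// Hypothetical sketch: compares two values by walking enumerable properties.
function valuesEqual(a, b) {
  // Primitives and identical references are settled by strict equality.
  if (a === b) return true;
  // Anything that is not a non-null object has already failed the check above.
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  // Assumption: an array is never equal to a non-array object.
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  // Assumption: compare key counts so extra properties on b are not missed.
  if (Object.keys(a).length !== Object.keys(b).length) return false;
  for (const key in a) {
    // Return early as soon as anything disagrees.
    if (!(key in b)) return false;
    if (!valuesEqual(a[key], b[key])) return false;
  }
  // Default case: nothing disagreed.
  return true;
}

console.log(valuesEqual([1, 2, 3], [1, 2, 3])); // true
console.log(valuesEqual({ a: [1, { b: 2 }] }, { a: [1, { b: 2 }] })); // true
console.log(valuesEqual({ a: 1 }, { a: 1, b: 2 })); // false
```

In this sketch, functions only count as equal when they are the same function object, since they fall through to strict equality.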
Also, if you’re particularly upset about the use of for…in to work with an array, well, usually you’d be correct. In this case, though, I truly want to treat it like an arbitrary object. Further, the order of indexes really isn’t important to me, as long as it will hit all of them. I’m perfectly open to correction on this point, however, if anyone has a compelling argument!
But wait, what about functions? Objects can have behaviors, and maybe we want to check on those! Well, this is where things get tricky. I feel like we can reasonably claim that two objects are equal, for our purposes, if all of their properties and values are identical. Functions put us into a strange spot. Up until now, we hadn’t really considered what might make two functions equal in our world. Perhaps the best thing we could do would be to construct abstract syntax trees and compare functions that way. However, I feel that is currently beyond the scope of this project.
So how about a “good enough for my purposes” implementation instead? Let’s suggest, right now, that we only want to compare functions that are added to objects by some sort of object creating function. That way whitespace will be the same across objects, as will the syntax. The whitespace issue is an important consideration for the naive comparison I’m about to suggest, so keep that in mind. Let’s further suggest that maybe we’d prefer to be able to ignore functions where appropriate, so we’ll provide two comparison functions instead of one catch-all. Further, let’s just encourage testing functions based on other kinds of things, like expected outputs for given inputs, and keeping them free of side effects. So I submit that the best we can do, within the current scope of this project, is to check the functions that get made by something else and thus will have consistent whitespace and syntax and all manner of things.
With all that in mind, let’s take yet another naive approach and just fire those functions off to toString() and compare them that way. If you recoiled in horror at reading that, go back and read how I specified I was interested in comparing them, and maybe sip some tea. Feel less horrified. With that said, we currently have this for comparing objects that have functions:
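As a rough sketch of the toString() idea, assuming a hypothetical factory named makeCounter and a hypothetical helper named functionsEqual:

```javascript
// Hypothetical factory: every object gets a function from the same source text.
function makeCounter(start) {
  return {
    count: start,
    increment: function () {
      return this.count + 1;
    },
  };
}

// Hypothetical helper: two functions are "equal" if their source text matches.
function functionsEqual(f, g) {
  return (
    typeof f === "function" &&
    typeof g === "function" &&
    f.toString() === g.toString()
  );
}

const one = makeCounter(1);
const two = makeCounter(1);

console.log(one.increment === two.increment); // false: distinct function objects
console.log(functionsEqual(one.increment, two.increment)); // true: same source text
console.log(functionsEqual((x) => x + 1, (x) => x+1)); // false: whitespace differs
```

Keep in mind that matching source text says nothing about what the functions have closed over, so two functions can pass this check and still behave differently.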
A future project that would be terrifically exciting would be getting away from toString() and actually using abstract syntax trees. Or, if you know a better way of doing this operation, feel free to let me know! I’m perfectly open to suggestions of how to achieve my end goals here, as well.
Finally, we haven’t accounted for Symbols, another primitive type. According to the MDN documentation, the for…in loop will pass right over any properties keyed by Symbols. I believe that this is appropriate for now, because I am unlikely to be using Symbols in the near future. However, I certainly would like to handle them in the next iteration of this.
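To see the gap for yourself, here is a short check with made-up names. Object.getOwnPropertySymbols is the standard way to ask for Symbol keys, and could be a starting point for that next iteration:

```javascript
const tag = Symbol("tag");
const labelled = { name: "buffer", [tag]: "invisible to for...in" };

// for...in only reports the string-keyed property.
const seen = [];
for (const key in labelled) {
  seen.push(key);
}
console.log(seen); // [ 'name' ]

// Symbol-keyed properties have to be requested explicitly.
const symbolKeys = Object.getOwnPropertySymbols(labelled);
console.log(symbolKeys.length); // 1
console.log(labelled[symbolKeys[0]]); // invisible to for...in
```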
Anyway, the next thing I want to do is build a good (small) automated testing framework around this. That’s going to be another article, full of more code and less consideration around the ECMAScript spec. Also, I will make this code available on GitHub and give all of you a link, so you don’t have to rely on the inline images in this post!