If you’ve been developing for any length of time, you’ve likely run across some variation on the phrase “Code isn’t for machines to run, it’s for humans to read.” Yet we’re also often caught discussing how code is a mess, difficult to read, hard to maintain, and so on. If code were really meant for humans to read, as an end goal, we wouldn’t have so many complaints about how difficult it is to read.
Let’s consider an alternative way of thinking about code. What if code wasn’t for humans to read, and also wasn’t for machines to run? It’s right there in the shorthand name – it’s for encoding information.
Code is incredibly information dense. If you want to prove this to yourself, have a look at anything you’ve ever written, and try to describe even the tiniest part of it in any language actually used for communicating with humans (in my case, English). This is important, though: don’t describe it like you’re talking to someone else fluent in your chosen programming language. Describe the action it’s taking, every hidden assumption, everything you think is obvious. Allow me to give you an example, from C#:
var x = 33;
Wow, that’s a pretty short example. What if I described that as fully as I can? (Fair warning: this may be incomplete and if you’ve got a compelling, different interpretation, please let me know! I’ll include any corrections or expansions, because that’s just good for everyone.)
“Create a variable named ‘x.’ Allow the compiler to determine the type of data. Set the variable equal to an integer value of 33.”
That’s three whole sentences to minimally (and superficially) describe a laughably simple assignment statement. Now consider that any program that does anything useful has thousands and thousands of statements, and most are decidedly more complex than the one I outlined above. Once you start working with custom data types, layers of indirection, functions-as-data, anonymous functions, and so on, you end up with a staggering number of hidden assumptions. You add even more when you begin working with frameworks and libraries.
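To make a few of the assumptions hidden in that one statement visible, here is a minimal sketch. The class name Program and the second variable y are mine, added only for illustration; the sketch also assumes a plain console program, since var is only allowed on local variables.

```csharp
using System;

class Program
{
    static void Main()
    {
        // The original statement: the compiler infers the type from the literal.
        var x = 33;

        // The same declaration with the inferred type written out.
        // An integer literal with no suffix that fits in 32 bits is an int.
        int y = 33;

        // Both variables have the same runtime type, System.Int32.
        Console.WriteLine(x.GetType() == y.GetType()); // prints "True"
        Console.WriteLine(x.GetType().FullName);       // prints "System.Int32"

        // "int" also implies a size: 4 bytes, signed.
        Console.WriteLine(sizeof(int));                // prints "4"
    }
}
```

Even this expansion leaves plenty unsaid (scope, lifetime, mutability), which is rather the point.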
Code is really encoding (hah) all kinds of rules about how the system you’re developing is going to work. Because a machine ultimately is going to have to run this stuff, you end up having to specify all kinds of things that another human would be able to fill in from context. This is where we usually end up saying something like “code is for humans to read.” But we should’ve taken it one step further.
When you code with the idea that another human is eventually going to read it, you’re making a completely new set of assumptions centered around the idea that someone is going to be familiar with your problem domain, framework choice, and language choice.
That’s quite a bit of hidden information, isn’t it? All of these factors end up blending together quite nicely to make code, in general, difficult to read. What can we do about it, as developers that care about our craft?
I’d suggest approaching it from two directions, one focused on writing and one focused on reading. Truthfully, I don’t have terribly much to say about how to handle this while writing, given that there are other confounding factors that might make something “unreadable” and it’s really subjective anyway.
When writing code, do your best to at least keep in the back of your mind the knowledge that much of this information is hidden and potentially unclear. Sometimes there really isn’t much that you are able to do to mitigate this, though. It may be that you’re solving difficult problems that require difficult solutions, or you might have some complicated data relationship that was tough to model and there’s simply no escaping the complexity and hidden information. Whatever the case, hopefully knowing the size of the stack of assumptions you’re making will help inform how you write. In turn, hopefully that makes the project easier to maintain in the future.
Essentially, do your best to minimize the number of hidden assumptions. Readability will always be subjective, but you can find ways to pull back the curtain and show off what your system is doing.
I do have a decent strategy for reading and understanding, though. Your best shot at this is to find another developer and convince them to read through it with you. Bribe them somehow if you have to. At least, this is a sound strategy when starting out (having someone else read with you, not the bribery, though that may also be a sound strategy…). Adding another person into the mix allows you to argue over interpretation as you work through the code, and each of you will catch things the other misses. It is much easier to parse the volume of information housed in code if you’re not going it alone.
If you’re a lone developer, as far as I know, there isn’t a shortcut for developing an understanding of a bunch of code. You simply have to read a lot of it, and do your best to understand it as you go. It will start off being a hard thing to do, but as you familiarize yourself with the language, the idioms used, the problem domain, and whatever frameworks or libraries are present, you’ll find that it eventually goes much more smoothly. The strategy here is just “read a lot of it, and expect difficulty.” The best thing to do is to keep at it until the hard thing becomes easy. Each time you break into a new problem domain or check out a new repository, expect a bunch of difficulty. It’s perfectly fine to have trouble when trying to unpack something as information-dense as useful code. It would be weird if you didn’t have a bunch of trouble.
Weird, and lucrative, so good for you! The rest of us will have to continue puzzling through all of the information. It’s a good thing we all enjoy this, right?