Principles Gathered from Clean Code: A Handbook of Agile Software Craftsmanship

It's worth mentioning (as the book does) that simply memorising these principles is not enough; these rules are heuristics that have worked for Robert C. Martin and others - you will need to use your judgement on where and how to use them.
I would recommend using this article as an overview of the topics covered and diving into the book itself to see the details where you feel you will benefit the most.

1. The cost of bad code and the essence of clean code

Martin starts the book by warning us of the cost of owning a mess. In short, systems that are a mess are expensive and painful to maintain.
Software systems can become so unmaintainable that the developers rebel out of frustration, calling for an overhaul of the legacy codebase. If managers relent, a new team is formed and a race to build the new system begins.
Productivity briefly spikes for the new team, but as the new codebase grows (with the same old dirty code habits), productivity quickly slows. It takes years for the “new” system to catch up to the old; by this time the original team that started on the new system have gone and everyone is already calling for another overhaul.
The cycle continues.
Martin goes on to ask many well-known programmers to define what "clean" code is. They converge on similar themes. The essence is this:
💡
“Clean” code is simple and conveys clearly its intent to the reader.
The practices Martin promotes for the rest of the book are simply ways of making the intent of your code clearer.
Martin advocates for the Boy Scout Rule:
Leave the campground (codebase) cleaner than you found it.

2. Meaningful names

Naming variables, functions, and classes sensibly goes a very long way to making your code more understandable to a reader. Obscure, uninformative, or misinformative names require the reader to become a detective, sleuthing through your code. Five minutes of refactoring can easily turn into thirty. Choosing a good name may take some thinking, but this is time well-invested in a cleaner codebase.
Choose names that:
  • Reveal the intent of the variable/ function/ class (can I guess what everything does without needing to read the logic?)
  • Avoid disinformation (don't call something accountList unless it really is a list!)
  • Avoid non-information (don't call a variable i unless indexing in a loop)
  • Make meaningful distinctions - every named "thing" should be something distinct, where the difference between these distinct things should be clear. For example, having car and carInfo is not a helpful distinction (do you really need two variables here if you can't distinctly name them?).
  • Are pronounceable - make life easy for yourself!
  • Are searchable - make it easy to grep search the entire codebase for the variable you're looking for (unique names trump common ones).
  • Are not encoded - don't use prefixes; if your variables are suitably named these shouldn't be required and will, at best, be ignored by readers.
  • Avoid “mental mapping” - do not require the reader to translate your name into something else in their head to make sense of it.
  • Follow a convention - use one name for one concept and stick to it.
  • Refer to the problem domain - developers must understand the business domain they are writing software for (rather than mindlessly typing away about things they know nothing about).
  • Refer to the solution domain - it will be programmers reading your code, so don't be afraid to use CS terms. I find referring to the problem domain preferable where possible.
  • Add meaningful context - add the context that will be required by the reader at the level of abstraction that the reader will be using the variable (the variable state makes sense next to addressLineOne, addressLineTwo but is extremely misleading used out of this context).
  • Don't add gratuitous context - if you prefix all variables in the Customer class with customer then every time you type c into your IDE, it will take nine keystrokes before your autocomplete starts helping you out - don't make your IDE work against you!
  • Use verbs for functions - verb/ noun pairs are very descriptive (write(name) shows that name is being written - even better would be writeField(name), as the function name then also describes what name is). Also, try to use the same verbs everywhere (don't mix fetch and get).
Don't underestimate the power that the choice of your names can have on the cleanliness of your code! This is something I often find myself considering when reviewing code.
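As a rough illustration of several of the points above (intent-revealing, searchable, verb-based names), here is a minimal sketch. The data shape and every name in it are hypothetical and chosen only for the example; both functions behave identically, only the names differ.

```javascript
// Hypothetical data: each account has an owner and a balance in pence.
const accounts = [
  { owner: 'Ada', balanceInPence: 12500 },
  { owner: 'Grace', balanceInPence: -300 },
  { owner: 'Linus', balanceInPence: 0 },
];

// Before: the reader has to work out what `proc`, `d`, `l` and `x` stand for.
function proc(d) {
  const l = [];
  for (const x of d) {
    if (x.balanceInPence < 0) l.push(x.owner);
  }
  return l;
}

// After: same behaviour, but the names reveal the intent.
function getOverdrawnAccountOwners(accounts) {
  return accounts
    .filter((account) => account.balanceInPence < 0)
    .map((account) => account.owner);
}

console.log(getOverdrawnAccountOwners(accounts)); // [ 'Grace' ]
```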

3. Functions

Small!

Functions should hardly ever be longer than 20 lines - any longer and you need to break your logic down into smaller steps. This reduces the mental strain on the reader and makes it easier to spot any bugs in your own code. Breaking down your code into smaller functions means that you naturally begin to label (by giving each new function a descriptive name) and sort the logic of your code into related steps.
Naturally, small functions also avoid excessive indentation, which increases readability.

Do one thing!

Functions should do one thing. They should do it well. They should do it only.
If you can extract another function from within your function, with a name which is not merely a restatement of the outer function's name, then you should do so. For example:
    function login(username, password) {
      if (areValidUserCredentials(username, password)) {
        return initialiseSession(username);
      }
      throw new Error('Invalid user credentials');
    }

    function areValidUserCredentials(username, password) {
      const user = getUserByUsername(username);
      return password === decrypt(user.passwordHash);
    }

is better than:

    function login(username, password) {
      const user = getUserByUsername(username);
      if (password === decrypt(user.passwordHash)) {
        return initialiseSession(username);
      }
      throw new Error('Invalid user credentials');
    }

Perhaps a trivial example, but here we see the first implementation splits out the non-trivial extra step of verifying the credentials into its own function. Understanding login in the first implementation is simple - irrelevant details are abstracted away into areValidUserCredentials.
If I want to understand the login function in the second implementation, I now also need to figure out what the instructions const user = getUserByUsername(username); and password === decrypt(user.passwordHash) are actually doing, because they are mixed in with the other login logic. To understand a single function I need to understand the workings of two functions.
The obvious problem arises: what is "one" thing? It's difficult to give a one-size-fits-all rule, but considering levels of abstraction can help us determine this...

Functions should SLAP! (Single Level of Abstraction Principle)

An interesting and important concept that Martin articulated for me was ensuring that we only have one level of abstraction per function. Examples of different levels of abstraction would be:

    renderHtml()                                // high-level
    const pageTitle = getPageSectionTitle()     // mid-level
    shoppingListToRender.append(shoppingItem)   // low-level

Keeping these levels of concepts separated cleans up your code and only shows the reader the level of detail they would expect to see in that function.
Martin describes the step-down rule to check if your abstractions are of the right level; you should be able to read a program as if it were a series of TO paragraphs. Here is an example in JavaScript:
    function registerUser() {
      const contactDetails = collectUserContactDetails();
      const paymentDetails = collectUserPaymentDetails();
      const user = { ...contactDetails, ...paymentDetails };
      createNewUser(user);
    }

To register a user, we collect the user's contact details, then we collect the user's payment details, then we combine the two and create a new user from the result.
This reads like a simple set of instructions which is understandable to a non-technical reader. We can then move down one level of abstraction and do something similar for collectUserContactDetails to describe how this action is performed.
This way of organising concepts can be very difficult and requires practice. It comes hand-in-hand with making sure that your functions do only one thing.

Fewer arguments

The ideal number of arguments to pass to a function is zero, but you should never need to use more than three. Every argument demands conceptual effort from the reader, and passing arguments around can expose a lot of the inner workings of a function.
In the case that we are tempted to use more than three arguments, notice that some of these variables can be grouped into an object or list of related values which express meaning as a single concept. If we then need access to a new piece of data from one function in another, we don’t need to change the function signatures of all the intermediate function calls - we just pack up the new piece of data in one place and unpack it in another.
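A minimal sketch of grouping related arguments into one object. The address fields and function names are hypothetical, picked only to illustrate the idea.

```javascript
// Before: four positional arguments; call sites are easy to get wrong.
function createAddressLabelFromParts(lineOne, lineTwo, city, postcode) {
  return [lineOne, lineTwo, city, postcode].filter(Boolean).join('\n');
}

// After: the related values travel together as one `address` concept.
function createAddressLabel(address) {
  const { lineOne, lineTwo, city, postcode } = address;
  return [lineOne, lineTwo, city, postcode].filter(Boolean).join('\n');
}

// An intermediate function only passes `address` along, so adding a new
// field (e.g. `country`) later would not change its signature.
function printShippingLabel(order) {
  return `Ship to:\n${createAddressLabel(order.address)}`;
}

const order = {
  id: 42,
  address: { lineOne: '1 Main Street', lineTwo: '', city: 'Leeds', postcode: 'LS1 1AA' },
};
console.log(printShippingLabel(order));
```

Only createAddressLabel needs to know which fields an address contains; everything in between treats it as a single value.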
💡
Don’t use boolean “flag” arguments which change the behaviour of your function - I look out for this one when reviewing code - it’s a sign that your function has too much responsibility!
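To make the flag-argument smell concrete, here is a small sketch. The price-formatting functions are hypothetical and not taken from the book.

```javascript
// Before: the boolean flag switches between two behaviours, and a call like
// formatPrice(1999, true) tells the reader nothing about what `true` means.
function formatPrice(amountInPence, includeCurrencySymbol) {
  const pounds = (amountInPence / 100).toFixed(2);
  return includeCurrencySymbol ? `£${pounds}` : pounds;
}

// After: two small functions, each doing one thing, with the intent in the name.
function formatAmount(amountInPence) {
  return (amountInPence / 100).toFixed(2);
}

function formatAmountWithCurrencySymbol(amountInPence) {
  return `£${formatAmount(amountInPence)}`;
}
```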

Side effects

A function that contains side effects is one that creates a change outside of its own scope.
Depending on when you call it, the state in your upper scope can be affected, which is confusing and creates a temporal coupling in your code. Side effects are a way of lying to the reader about what your function is doing.
Consider:
    function checkPassword(userId, passwordToCheck) {
      const user = getUserById(userId);
      if (user) {
        const encodedPassword = user.passwordHash;
        const userPassword = decrypt(encodedPassword);
        if (userPassword === passwordToCheck) {
          initializeSession(); // hidden side effect
          return true;
        }
      }
      return false;
    }

checkPassword will return the correct output. However, it also calls initializeSession if our password is correct (which changes some state outside the scope of the function). This code is just asking for bugs to arise.
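One possible way to remove the hidden side effect is to keep the check and the session start in separately named functions. This is only a sketch: the in-memory users map, the toy decrypt and the session object are stand-ins invented so the example runs on its own.

```javascript
// Hypothetical in-memory stand-ins so the sketch is self-contained.
const users = new Map([[1, { passwordHash: 'terces' }]]);
const getUserById = (userId) => users.get(userId);
const decrypt = (encoded) => encoded.split('').reverse().join(''); // toy only, not real crypto
const session = { isActive: false };
const initializeSession = () => { session.isActive = true; };

// Answers a question and changes nothing.
function isPasswordCorrect(userId, passwordToCheck) {
  const user = getUserById(userId);
  if (!user) return false;
  return decrypt(user.passwordHash) === passwordToCheck;
}

// The name says that a session will be started, so the effect is no surprise.
function logIn(userId, password) {
  if (!isPasswordCorrect(userId, password)) {
    throw new Error('Invalid user credentials');
  }
  initializeSession();
}
```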
Another way to give another developer a nasty surprise is to mutate a data structure passed into your function. They will have hours of fun trying to figure out why their code is doing something strange. Avoid mutating data - create a new object and return it!
Here is an example of how this can go wrong:
    // (in the body of some outer function, assuming id is 4)
    // ...
    console.log({ allIds }) // output: { allIds: [1, 2, 3] }
    if (!isIdRegistered(id, allIds)) {
      allIds.push(id)
    }
    console.log({ allIds }) // output: { allIds: [1, 4] } - who stole my ids!?
    // ...

    function isIdRegistered(id, allIds) {
      for (let i = 0; i < allIds.length; i++) {
        const registeredId = allIds.pop(); // removes the last element of allIds
        if (id === registeredId) return true;
      }
      return false;
    }

Our function, isIdRegistered, mutates the allIds list with the .pop method (which mutates the original data structure). The outer body logic reads as: check if the id is already registered, and if not, add it to the list of allIds. A reader would not expect isIdRegistered to alter allIds.
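A non-mutating alternative could look like the sketch below. registerId is a hypothetical helper added for illustration; Array.prototype.includes only reads from the array.

```javascript
// Non-mutating check: `includes` does not change the array.
function isIdRegistered(id, allIds) {
  return allIds.includes(id);
}

// Never changes the array passed in: returns it untouched if the id is
// already there, otherwise returns a new array with the id added.
function registerId(id, allIds) {
  return isIdRegistered(id, allIds) ? allIds : [...allIds, id];
}

const allIds = [1, 2, 3];
const updatedIds = registerId(4, allIds);

console.log({ allIds });     // { allIds: [ 1, 2, 3 ] }
console.log({ updatedIds }); // { updatedIds: [ 1, 2, 3, 4 ] }
```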
💡
Pure functions (without side effects) should be the only ones you use - don’t lie to your readers!

Command-query separation

A function should either do something (perform some action on data) or answer something (retrieve data), not both. If it does both, it is doing more than one thing - split it up!
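Here is a small sketch of that split, using a hypothetical settings store (none of these names come from the book).

```javascript
// Hypothetical settings store used only for this illustration.
const settings = new Map();

// Before: sets a value AND reports whether the key already existed.
// A call such as `if (setAndCheck('theme', 'dark'))` is hard to read.
function setAndCheck(key, value) {
  const existed = settings.has(key);
  settings.set(key, value);
  return existed;
}

// After: a query that answers a question...
function hasSetting(key) {
  return settings.has(key);
}

// ...and a command that performs an action and returns nothing.
function setSetting(key, value) {
  settings.set(key, value);
}

if (!hasSetting('theme')) {
  setSetting('theme', 'dark');
}
```

The call site now reads as a question followed by an action, so the reader does not have to guess what a returned boolean means.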

Extract try/catch blocks

Error handling is one thing - a function that handles errors should do nothing else. Try/catch blocks are confusing and should ideally be extracted to their own functions so the intent of the function is clearer to the reader. For example:

    // (named removePage because delete is a reserved word in JavaScript)
    function removePage(page) {
      try {
        deletePageAndReferences(page);
      } catch (e) {
        logError(e);
      }
    }

    function deletePageAndReferences(page) {
      deletePage(page);
      registry.deleteReference(page.name);
      configKeys.deleteKey(page.name.makeKey());
    }

is clearer than:

    function removePage(page) {
      try {
        deletePage(page);
        registry.deleteReference(page.name);
        configKeys.deleteKey(page.name.makeKey());
      } catch (e) {
        console.error(e.message);
        throw e;
      }
    }

Extracting the exception handling makes it easy to understand deletePageAndReferences without being distracted by the error handling. This removes unnecessary complexity for the reader.

DRY (don't repeat yourself)

It goes without saying that creating reusable abstractions reduces duplication in your code. This means you only need to change your code in one place rather than many when the business logic changes.

Addendum: DRY (do repeat yourself)

It’s also important to note when not to use DRY! After learning about DRY, junior developers often create abstractions everywhere in the code, which can be just as bad as repeating the code.
Consider abstractions to be like a see-saw: on one end we have code duplication (bad) and on the other, we have code coupling (very bad). Every time we create a new abstraction and use it in two places, those two places are coupled together.
If you are using the same function in many places, and want to change its behaviour for only some of the places you call it, changing the function may incur unintended side effects.
💡
Only apply DRY for things that change at the same time for the same reason, otherwise duplication is preferable.
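A sketch of duplication that is probably worth keeping. The two validation rules are hypothetical; the point is that they look identical today but would change for different reasons.

```javascript
// Both rules happen to be "at least 8 characters" today, but they belong to
// different parts of the business and can change independently.
function isValidPassword(password) {
  return password.length >= 8;
}

function isValidVoucherCode(code) {
  return code.length >= 8;
}

// A shared helper such as `isAtLeastEightCharacters` would remove the
// duplication, but would couple the two rules together: tightening the
// password policy could then change voucher validation by accident.
```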
[Figure: graph of developer priorities through their career (take a 15-year shortcut; start valuing readability above DRY now)]

4. Comments

In general, don't use comments. If you need to add extra explanation to your code, it's a sign that you're not writing maintainable code. Even with the best of intentions, when the code changes (and it will), comments are often not updated by the person who changes the code, which clutters your codebase with redundant comments that the dev team are too nervous to remove.
It is easier to list the times a comment might be justified:
  1. Legal comments (e.g. copyright).
  2. Revealing intent behind a technical decision which would otherwise be obscure.
  3. Clarifying obscure methods of a standard (external) library that you can't refactor (you could also write a unit test for the external library to document it).
  4. Warnings to other programmers of the consequences of running some code.
  5. TODO comments - these should be immediately documented with a plan to address the tech debt (e.g. a link to a ticket where you plan to do the work - don’t create tickets solely to address tech debt, do it alongside a feature).
If you do leave a comment, make sure it is concise and serves a specific purpose.
Please, please, please, don't commit commented-out code!
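To illustrate the opening point of this section, here is a sketch of an explanatory comment being replaced by a name. The employee record and the eligibility rule are hypothetical.

```javascript
// Hypothetical employee record for illustration.
const employee = { isHourly: true, age: 67 };

// Before: the comment has to explain the condition.
// Check whether the employee is eligible for full benefits
const eligibleBefore = employee.isHourly && employee.age > 65;

// After: the function name carries the explanation, so no comment is needed.
function isEligibleForFullBenefits(employee) {
  return employee.isHourly && employee.age > 65;
}

const eligibleAfter = isEligibleForFullBenefits(employee);
```

If the rule changes, the function body changes with it, whereas a comment can silently drift out of date.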

5. Formatting

Formatting is important - it helps to make your code more readable.
Like smaller functions, smaller files are easier to understand - aim for fewer than 150 lines if possible.
Aim to have the highest level of abstraction closer to the top of the file to give the reader the broadest overview first. As you move down the page, the functions give more detail as the reader requires - think of the order like a newspaper (the headline and summary come at the top with the details coming lower down the page).
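A minimal sketch of this newspaper ordering in a single file. The report functions are hypothetical; the ordering works because JavaScript function declarations are hoisted, so the high-level function can appear before the helpers it calls.

```javascript
// Headline: the highest level of abstraction comes first.
function buildWeeklyReport(orders) {
  const total = sumOrderTotals(orders);
  return formatReport(orders.length, total);
}

// Details: lower-level helpers follow, in the order they are used above.
function sumOrderTotals(orders) {
  return orders.reduce((sum, order) => sum + order.total, 0);
}

function formatReport(orderCount, total) {
  return `${orderCount} orders, total ${total}`;
}

console.log(buildWeeklyReport([{ total: 10 }, { total: 32 }])); // 2 orders, total 42
```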
Let a shared set of linting rules do the hard work for you. Your IDE should auto-format your code on save to keep consistency within the development team.