DRY vs WET - is Everyone Wrong About The Biggest Question in Software Engineering?

In 1999, Andrew Hunt and David Thomas co-authored The Pragmatic Programmer, and the book has appeared on lists of the best and most influential Software Engineering books ever since. In publishing it, the duo popularised several now-ubiquitous industry terms, including the acronym DRY (“don’t repeat yourself”).
Although I’ve heard lots of people using the phrase “DRY code”, I’m not so sure many developers have heard the long version: "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system". My perception is that this more nuanced version has been lost in translation, and only the acronym remains.
As DRY is such a catchy turn of phrase, it has infiltrated Software Engineering vocabulary around the world - one Engineer can write the single word, “DRY”, on a pull request and another will understand the suggestion.
Engineers are encouraged to extract functions and methods so the same procedure can be called from multiple places. I’ve seen codebases that don’t use DRY at all: script files thousands of lines long which instantiate some data, act on it, and then repeat the same procedures on new data by copy-pasting the code for the rest of the file. It’s not pretty. Following DRY seems like a sensible starting point!
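As a minimal sketch of that kind of extraction (the `Order` type, `totalWithShipping` and the figures are hypothetical, invented purely for illustration), the copy-pasted procedure becomes one function that is called for each new piece of data:

```typescript
// Hypothetical example: names and figures are illustrative, not from a real codebase.
type Order = { prices: number[]; shipping: number };

// One definition of the procedure, instead of one pasted copy per data set.
const totalWithShipping = (order: Order): number =>
  order.prices.reduce((sum, price) => sum + price, 0) + order.shipping;

const januaryOrder: Order = { prices: [10, 20], shipping: 5 };
const februaryOrder: Order = { prices: [7, 3, 15], shipping: 5 };

// The same procedure runs on new data without duplicating its body.
const januaryTotal = totalWithShipping(januaryOrder);
const februaryTotal = totalWithShipping(februaryOrder);
```

If the way a total is computed changes, there is one place to edit rather than one per pasted copy.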
However, as time passes, something starts to go wrong: Engineers following the law of DRY start to run into nuances with the rule that make their software harder to change. It doesn’t happen overnight but creeps up on the system over time - using shared function calls everywhere to avoid duplication creates a rigid, difficult-to-change system when the problem space changes.
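One way this rigidity can show up, sketched under the assumption of a hypothetical `formatCustomerName` helper shared by an invoice screen and a mobile screen (neither comes from a real codebase), is a shared function that collects a new option every time one caller's needs drift:

```typescript
// Hypothetical sketch: a helper shared by two callers purely to avoid duplication.
type FormatOptions = {
  uppercase?: boolean; // assumed to be added later for the invoice header
  maxLength?: number; // assumed to be added later for the mobile view
};

const formatCustomerName = (
  first: string,
  last: string,
  options: FormatOptions = {},
): string => {
  let name = `${first} ${last}`;
  // Each caller-specific requirement becomes another branch in the shared code.
  if (options.uppercase) name = name.toUpperCase();
  if (options.maxLength !== undefined) name = name.slice(0, options.maxLength);
  return name;
};

const invoiceHeader = formatCustomerName("Ada", "Lovelace", { uppercase: true });
const mobileLabel = formatCustomerName("Ada", "Lovelace", { maxLength: 5 });
```

Every change requested by one caller now has to be checked against all the others, which is roughly what “rigid” means here.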
People started to realise that although DRY was a good starting point, there were still some kinks to work out - this has led to the backronym “WET” (“write everything twice”) as a rebuttal to DRY. WET aims to intentionally introduce some duplication into your codebase to account for the long-term problems incurred by DRY. Maybe it only becomes important to avoid duplication after a certain number of repetitions - “You can ask yourself "Haven't I written this before?" two times, but never three”.
On my first project as Tech Lead, my co-TL and I tried softly enforcing WET as a principle for developers to follow on our codebase. Frankly, it was a stupid idea. We didn’t see any negatives from using WET over 9 months; however, dictating that two repetitions are acceptable but a third is not seems a very arbitrary rule in retrospect. WET is a good tool to get the team thinking about when to duplicate code and when not to, but it shouldn’t be followed as law.
On the subject of duplication of code, it seems:
  • always repeating yourself is wrong - don’t solve the same problem hundreds of times!
  • never repeating yourself is wrong - you will end up with rigid, painful software!
  • repeating yourself an arbitrary number of times before removing duplication is imprecise although it might solve some issues with DRY by random chance.

What Does “Duplication” Really Mean?

It’s not so clear to developers what duplication means - I recently polled our work #devs channel to ask: is there code duplication in the following code?
export const calculateSalesTaxReturnAmount = (cost: number, taxAmount: number) =>
  cost * taxAmount * 0.2;

export const calculateInvoiceReturnAmount = (charge: number, amount: number) =>
  charge * amount * 0.2;
Let me give you some space to read the code and answer for yourself…
Got your answer?
What I’m really asking when I talk about code duplication is, “Is any code within the snippet interchangeable with another equivalent piece of code within the same snippet?”.
The question of whether or not the above contains duplication depends on how you describe the two functions. Either you think:
  1. “I have two functions that take two numbers and multiply them together and then by 0.2, therefore one is interchangeable with the other”, or you think,
  2. “I have one function that calculates the sales tax return amount and one function that calculates the invoice return amount, so these are not interchangeable”.
I tend to think about it in terms of the latter - I wouldn’t write these two functions as a single one.
[Technicality] What happens if you see both levels of abstraction as valid?
OK, you might be being difficult here:
You could say that there is a function to extract here so the code looks like this:
const multiplyTogtherAndThenByPointTwo = (num1: number, num2: number) =>
  num1 * num2 * 0.2;

export const calculateSalesTaxReturnAmount = (cost: number, taxAmount: number) =>
  multiplyTogtherAndThenByPointTwo(cost, taxAmount);

export const calculateInvoiceReturnAmount = (charge: number, amount: number) =>
  multiplyTogtherAndThenByPointTwo(charge, amount);
If the way you calculate one changes later, you just change the implementation at the definition and you don’t need to change anything where it’s used within the codebase. 
However, there’s also a nice rule of thumb from Robert C. Martin in Clean Code: "So, another way to know that a function is doing more than "one thing" is if you can extract another function from it with a name that is not merely a restatement of its implementation." He also advocates that “functions should only do one thing”. Together, these mean extracting functions until going any further would produce a function whose name merely restates its implementation. In the case above, multiplyTogtherAndThenByPointTwo is just a restatement of the implementation, so making this abstraction seems superfluous.
If you haven’t heard about the SLAP rule of functions (the Single Level of Abstraction Principle), it’s also quite relevant here.
The results of the poll confirmed my perception of the prevailing wisdom on code duplication: you should write mostly DRY code, but some duplication is healthy, and how much shouldn’t be dictated by a fixed number (as in WET). It also showed that duplication is not so clear-cut - when does one “thing” become interchangeable with another? It reminds me of the long definition of DRY - when does something become a “piece of knowledge” in the system?

Is Code Duplication a Big Distraction?

One word in particular is deeply interlinked with code duplication: abstraction.
In their daily applications, DRY and WET are both suggestions for how to deal with code duplication, but by proxy, they are actually trying to advise us on how to make abstractions - something that requires an understanding of the specifics of the system you’re working with. I assert that we broadly shouldn’t be concerned with code duplication, but instead, we should be learning how to make the correct abstractions.
In practice, a significant portion of Software Engineering is organising ideas and concepts into smaller composable chunks that describe a real-world problem or domain. There are lots of ways to model a problem, which makes it quite difficult to decide how to organise those concepts (this is called factoring an application).
I’m not the only one to have noticed this - out of the dilemma of DRY vs WET, a new acronym has risen to rule all other acronyms: it’s called AHA (Aha!) programming and I first read about it from Kent C. Dodds. AHA stands for “avoid hasty abstractions” and tells us to “prefer duplication over the wrong abstraction”.
I like this - it brings the focus away from duplication and towards the important conversation of when to make abstractions. It simply points out that removing duplication is cheaper and safer than refactoring a poorly abstracted system. The thing is, it doesn’t give us any wisdom on when the “right” time is to make abstractions - it essentially warns the reader to be wary of abstractions and leaves the rest up to them to decide.
A codebase without abstractions is a bad idea, so then what does warrant an abstraction?

What Does “Hasty” Mean? What is “True” Duplication?

On Feb 18, 2024, I read this from Matt Ranney of DoorDash: “A good abstraction or interface is one that allows either side to change something without requiring coordination or changes on the other side.” It’s a good principle, but it is more concerned with technical abstractions - it’s a necessary but insufficient measure of a good abstraction (worth noting here).
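To make that principle concrete, here is a small sketch. The `RateProvider` interface, its two implementations and `taxOn` are all hypothetical names invented for this illustration, not anything from Ranney or DoorDash:

```typescript
// Hypothetical sketch: every name here is invented for illustration.
interface RateProvider {
  rateFor(region: string): number;
}

// One side of the abstraction: implementations can be rewritten or swapped freely.
const fixedRates: RateProvider = {
  rateFor: () => 0.2,
};

const regionalRates: RateProvider = {
  rateFor: (region) => (region === "UK" ? 0.2 : 0.1),
};

// The other side: depends only on the interface, so it can change independently too.
const taxOn = (cost: number, region: string, rates: RateProvider): number =>
  cost * rates.rateFor(region);
```

Swapping `fixedRates` for `regionalRates` needs no change to `taxOn`, and `taxOn` can be reworked without touching either provider, as long as the interface holds.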
Robert C. Martin says something in Clean Code to the effect of:
I like this adage and I repeat it at work - it introduces a time-dependent viewpoint of software. An important property of software is its malleability - it changes over time (which is why it’s “soft”ware and not “hard”ware) because the reality of the system you’re modelling will change over time (hopefully also in response to the solution you’re building).
Although it doesn’t have a catchy acronym, this advice on making abstractions touches on what Martin describes as “true” duplication (as opposed to “false” duplication). Two functions might happen to have the same implementation today; however, if (from a business perspective) they represent two distinct concepts, it makes sense to expose two distinct interfaces in the codebase so that when the implementation of one changes, the impact is minimised.
“There are different kinds of duplication. There is true duplication, in which every change to one instance necessitates the same change to every duplicate of that instance. Then there is false or accidental duplication.”
I find this to be a more extensive expression of DRY - those people who struggled with over-abstraction in their code were removing “false” duplication which hurt their code hygiene, introducing harmful coupling into the system. The aim is to remove all “true” duplication and leave alone “false” duplication to avoid that harmful coupling.
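Applied to the two functions from the poll, the distinction might look like the sketch below. The 0.25 rate and the `CURRENCY_DECIMALS` constant are hypothetical, assumed only to show one concept changing on its own while another must change everywhere at once (the `export` keywords are dropped for brevity):

```typescript
// False duplication: the same shape today, but two distinct business concepts.
// The 0.25 sales tax rate is a hypothetical later change, assumed for illustration.
const calculateSalesTaxReturnAmount = (cost: number, taxAmount: number) =>
  cost * taxAmount * 0.25; // changed on its own; invoices are untouched

const calculateInvoiceReturnAmount = (charge: number, amount: number) =>
  charge * amount * 0.2;

// True duplication: one piece of knowledge that must change everywhere together,
// so it gets a single representation. `CURRENCY_DECIMALS` is a hypothetical example.
const CURRENCY_DECIMALS = 2;

const formatSalesTaxReturn = (value: number) => value.toFixed(CURRENCY_DECIMALS);
const formatInvoiceReturn = (value: number) => value.toFixed(CURRENCY_DECIMALS);
```

Had both calculations been routed through one shared helper, the sales tax change would have needed either a new parameter or a split of the helper before it could be made safely.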
Maybe we need a competing acronym to raise awareness? How about CTKT (“changes-together, keep-together”)? Or perhaps we just need to spread awareness of “true” vs “false” duplication.
Let me know what you think!