What do we mean when we say that something is complex? A system, a module, a class - they can all have that quality. When we talk about software architecture we say that certain practices raise complexity. This argument is usually given as a warning that a decision needs to be reconsidered.
Developers take pride in solving obscure problems and challenging puzzles. Yet, keeping a project simple is considered a virtue. But faced with a complicated business problem, is this even an option? Is there a difference between complicated and complex?
The Nature of Complexity
In “A Philosophy of Software Design”, John Ousterhout defines complexity as “anything related to the structure of a software system that makes it hard to understand and modify the system”. It is an unwanted but sometimes unavoidable quality of software.
When classes are tightly coupled, changing one without affecting the other is hard. If the code is not easy to read or relies on unclear conventions, keeping track of the flow is no longer trivial. Badly written or missing documentation can hide important information.
Put simply - in a complex system changes are hard. In a simple one, they are easy.
Complex and Complicated
The word “complexity” is usually used in technical conversations. In a regular chat with a friend you’d normally use “complicated”. The difference between the two is very subtle.
Complex things are composed of many components and moving parts. They are hard to understand because of the way they are built. It was a decision to make something complex.
When we say that something is complicated, we mean that it’s hard to understand by nature. It wasn’t an internal decision to make it that way. It’s a result of external forces.
Usually when something is complicated, it will be complex as well. But software can be complex without the problem it’s solving being complicated.
Causes of Complexity
Ousterhout defines multiple causes of complexity. To me, the two that stand out the most are dependencies and obscurity. Besides them, bad documentation and over-engineering are at the top of my personal list.
The first thing that comes to mind when we say the word dependency is external libraries: third-party functionality that our code literally depends on to work properly.
External code increases complexity because it introduces outside rules that we need to conform to. If our application is structured around a library or a framework, every developer needs to understand the third-party APIs before they make changes.
This increases the cognitive load a developer has to take on before they can start working. Utility libraries don’t have that big of an impact, but unless abstracted with wrapper code, they can still influence the data flow and other APIs.
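To make the idea of wrapper code concrete, here is a minimal sketch. It assumes a hypothetical pair of functions, encode_payload and decode_payload, and uses Python's standard json module as a stand-in for a third-party package. The rest of the code base would call the wrapper and never import the library directly.

```python
import json
from typing import Any


def encode_payload(payload: dict[str, Any]) -> str:
    """Serialize a payload to text.

    Callers depend on this function, not on the library behind it,
    so replacing the library means editing this module only.
    """
    # `json` stands in for a third-party package in this sketch.
    return json.dumps(payload, sort_keys=True)


def decode_payload(raw: str) -> dict[str, Any]:
    """Turn text produced by encode_payload back into a dict."""
    return json.loads(raw)
```

The library's options, such as sort_keys here, stay in one place instead of leaking into every caller.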
A bigger problem than third-party packages is when we introduce tight dependencies between our own classes and modules.
Each component, no matter how big it is, has two parts - an interface and an implementation. The interface is what we expose to the world: the methods and properties that other components use to reach the underlying functionality. The implementation is everything behind it.
There is a contract between the two components. If you call the interface with the required arguments, you get a response of a certain type.
You don’t want components to care about how the others do their job, as long as they respect the contract. If changing the implementation of one component forces you to modify multiple others, then the boundaries are not properly set.
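The split between interface and implementation can be sketched in a few lines. The names below (RateProvider, FixedRates, convert) are hypothetical and only serve as an illustration. The caller is written against the contract, so the way rates are stored can change without touching it.

```python
from typing import Protocol


class RateProvider(Protocol):
    """The interface: the only thing callers are allowed to rely on."""

    def rate(self, currency: str) -> float:
        """Return how many units of `currency` one base unit buys."""
        ...


class FixedRates:
    """One possible implementation: rates kept in an in-memory dict."""

    def __init__(self, rates: dict[str, float]) -> None:
        self._rates = rates

    def rate(self, currency: str) -> float:
        return self._rates[currency]


def convert(amount: float, currency: str, provider: RateProvider) -> float:
    # Depends only on the contract, not on how the rates are stored.
    return amount * provider.rate(currency)
```

A second implementation that reads rates from a database or a remote service could be passed to convert without any change to the function itself.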
If you are not sure of the broader impact of a change in a class then you’d have to spend more time testing and validating. The more system specific knowledge you need to have, the higher the complexity.
The lack of clarity is another cause of higher complexity. More specifically - bad naming. Imagine a class whose name ends in Helper, Manager or Utility. Can you guess what it does? Me neither.
Names that are too generic don’t give enough information and lead to confusion. By being clearer we leave less room for interpretation. We should aim to use descriptive identifiers or ones that are widely accepted for a specific purpose.
Naming is important on a micro level as well. Being unclear about the purpose of variables is a leading cause of confusion. In a loop, a variable named i usually carries the current index. Outside of that use case it’s better to be more verbose and spell out the full word - index.
Magic constants are even worse. If you find a number being multiplied by 0.9384528 - it probably carries some significance. If those constants are not given descriptive names then your guess about them is as good as any.
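As a small illustration, the same number can be given a name. The meaning used here, a currency conversion rate, is purely an assumption for the sake of the example, and both USD_TO_EUR_RATE and usd_to_eur are hypothetical names.

```python
# Assumed meaning for illustration only: a USD to EUR conversion rate.
USD_TO_EUR_RATE = 0.9384528


def usd_to_eur(amount_usd: float) -> float:
    """Convert an amount in US dollars to euros."""
    return amount_usd * USD_TO_EUR_RATE
```

A reader who meets amount * 0.9384528 has to guess. A reader who meets amount * USD_TO_EUR_RATE does not, and the value lives in exactly one place when it needs to change.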
Documentation in the code takes the form of comments. They are widely appreciated when done properly. Sadly, often they are not. A comment should be part of the abstraction - it should provide a high level explanation of what the module or the method is doing.
People say that good code is self-documenting but not everything can be expressed through code. Comments serve that purpose. They should be used to explain the things that are missing in the API - why, what, edge cases, unexpected behavior. They capture information that the code’s author couldn’t express without using a non-programming language.
Comments should be used to explain why we are doing something. In order to work on a product you need to know more than the technical details. You need to understand the domain. Business logic expressed in code can leave you with a lot of questions. A few helpful comments that explain why one operation has to happen before another are incredibly useful.
You shouldn’t need to read a method to know what it does. Looking at the API and the documentation should give you enough information. Having to read every function that you need to call is an unproductive way to spend your time.
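Here is a minimal sketch of what that can look like, using a hypothetical apply_discount function. The docstring covers the what and the edge cases, so a caller never has to read the body. The inline comment records a why that the code alone cannot express.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, in cents.

    The result is rounded down to a whole cent.
    Raises ValueError if `percent` is outside the range 0 to 100.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Multiply before dividing: integer division discards the remainder,
    # so dividing first would throw away part of the price.
    return price_cents * (100 - percent) // 100
```

Without the comment, the order of the operations looks arbitrary, and a well-meaning refactoring could quietly change the result.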
It’s possible to overdo comments as well. If you find yourself repeating your code with comments, that’s a red flag. Too many explanations lead to too much noise. There’s no benefit in explaining that you’re looping through a collection of database records or reading from a file.
Dependencies, obscurity and bad documentation are well-known causes of complexity. Many projects have been hurt by them so we know how to tackle them. But we mentioned that software can become complex even when the problem it’s solving is not complicated.
This is known as over-engineering - when we design something in a more robust and complex way than it needs to be. But why do we do this? Since simplicity in software development is widely appreciated, shouldn’t our problem be under-engineering?
Often we try to look into the future and make guesses about how the project is going to evolve. We make decisions now that would help us to make modifications in the future. By putting certain abstractions in place early we can easily change the underlying implementation if we need to.
The truth is that we are horrible at predicting the future. Most of the time the precautions we put in place turn out to be unnecessary or wrong. This increases the level of complexity that we need to deal with. It makes development slower and harder.
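A deliberately exaggerated sketch shows the difference. All names here are hypothetical. The first version puts an abstraction and a factory in place for output formats that nobody has asked for yet. The second does the same job for the only case that exists today.

```python
from abc import ABC, abstractmethod


# Speculative version: layers built for requirements that may never arrive.
class GreetingFormatter(ABC):
    @abstractmethod
    def format(self, name: str) -> str: ...


class PlainGreetingFormatter(GreetingFormatter):
    def format(self, name: str) -> str:
        return f"Hello, {name}!"


class GreetingFormatterFactory:
    def create(self, kind: str) -> GreetingFormatter:
        if kind == "plain":
            return PlainGreetingFormatter()
        raise ValueError(f"unknown formatter: {kind}")


# Direct version: covers the one case we actually have.
def greet(name: str) -> str:
    return f"Hello, {name}!"
```

Both produce the same greeting, but the first one asks every reader to understand three classes before they can change a single string.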
We create complex architectures that can launch a space shuttle even though we’re still riding a bike. More projects fail from over-engineering than from under-engineering.
- When we say that something is complex it is because it is composed of many parts and connections between them.
- Even though we cherish simplicity, we can end up with complex projects that are hard to understand and work on.
- Complexity is unwanted but sometimes unavoidable.
- There are four major causes of complexity - dependencies, obscurity, bad documentation and over-engineering.