Complexity is everywhere in our world: in our ever-growing canon of laws, in the volatile & unpredictable nature of the stock markets (esp. now, with the abundance of autonomous trading systems), in our tax code, and of course in the second law of thermodynamics. It's little wonder, then, as programmers the world over know, that complexity is also present in our software. Of all the long-term threats to applications, complexity is perhaps the second most critical (the first being no longer meeting user needs :-).
Adopt this maxim: "Leave each class a little better than when you found it". Even if it's a small change - adding a comment, reformatting a few lines of code - taken in aggregate these changes really add up over time.
Remove features. I heard one of Instagram's founders state that they spend a good deal of time removing features as well as adding them. That was probably a slight exaggeration, but removing features can be very powerful in fighting complexity: both directly (fewer lines of code == lower complexity, except with Perl ;-), and indirectly as a signal to the team and your customers.
(Complexity can be beautiful too. Source)
Unfortunately, complexity goes hand in hand with success: the more popular an application, the more demands are placed on it, and the longer its "working life". Left to their own devices, both factors will increase complexity significantly. Life for a mildly successful app is even worse: the low demand usually results in a never-ending "maintenance mode" where poor code stewardship often abounds.
Without ways to tame complexity, any evolving piece of software, no matter how successful, will eventually collapse under the load of simply maintaining it.
How is software complexity defined? Many techniques have been proposed, from simple approaches such as lines per method, statement counts, and methods per class, to more esoteric-sounding metrics like efferent coupling (the number of other types a given type depends on). One of the most prevalent metrics in use today is cyclomatic complexity, usually calculated at the method level and defined as the number of linearly independent paths through each method. Many tools exist to calculate it; at RelayHealth we've had good success with NDepend.
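To make the metric concrete, here is a rough sketch in Python using the standard `ast` module. It approximates cyclomatic complexity as 1 plus the number of decision points in a function. Which node types count as a decision point is my own assumption for this sketch, and real tools such as NDepend have their own rules, so expect the numbers to differ.

```python
import ast

# Node types that each add one decision point. This list is an assumption
# of the sketch; real tools differ in exactly what they count.
_DECISION_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def cyclomatic_complexity(func_node):
    """Approximate McCabe complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(func_node):
        if isinstance(node, _DECISION_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands, so two extra branches.
            score += len(node.values) - 1
    return score


def complexity_by_function(source):
    """Map each function name in `source` to its approximate complexity.

    Nested functions are also counted towards their enclosing function.
    """
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

print(complexity_by_function(SAMPLE))  # {'classify': 5}
```

The sample scores 5: one for the method itself, plus two `if` statements, one `for` loop, and one `and`.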
Identifying areas of complexity in the code base is easy. The hard part is deciding what to do about them. Options abound...
Big Ball of Mud
The "Do Nothing" approach is always worth exploring, and it typically results in what Brian Foote and Joseph Yoder call a Big Ball of Mud. They wrote the paper because, however noble the aspirations of their developers, many systems eventually devolve into a big, complex mass. They also note that such systems can sometimes have appealing properties, as long as their working life is expected to be mercifully short. Fate often intervenes, though, and woe betide the programmers stuck maintaining a big ball of mud.
Creating a big ball of mud is easy: just add code and Mountain Dew :-)
Let's assume that you'd like to stay out of the mud. What other options are there?
Process
Some simple process changes can help fight complexity:
- Analyze code metrics upon check-in and reject the new code if the changed files don't meet complexity targets (this will initially slow down development if you impose it mid-flight, but it will improve your code quality).
- Allocate bandwidth for complexity bashing: reserve capacity such as 1 sprint every release, or a percentage of total story points (e.g. 20% of all completed story points every month).
- Temporal taming: Focus on different parts of the architecture over time, say a new area every month.
- Something I've been wondering about: Are there processes that promote complexity? Or are some so time-consuming that they prevent developers from addressing complexity?
- Automation is a powerful tool. You can easily add exceptions to a manual process ("Oh, well if it's an app server in cluster B, then we need to run this additional program") but an automated process is a lot harder to complexify, and if it needs additional steps at least you'll know their execution will be consistent.
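The check-in gate from the first bullet can be sketched in a few lines of Python. This is only an illustration: the `metrics` layout (file path to function name to complexity score), the `MAX_COMPLEXITY` limit of 10, and the function names are all hypothetical, and I'm assuming some other tool has already produced the scores for the changed files.

```python
# Hypothetical target; pick whatever limit your team agrees on.
MAX_COMPLEXITY = 10


def find_violations(metrics, limit=MAX_COMPLEXITY):
    """Return (path, function, score) for every function over the limit.

    `metrics` maps a changed file's path to {function name: complexity}.
    """
    violations = []
    for path in sorted(metrics):
        for func, score in sorted(metrics[path].items()):
            if score > limit:
                violations.append((path, func, score))
    return violations


def gate(metrics, limit=MAX_COMPLEXITY):
    """Return an exit code: 0 accepts the check-in, 1 rejects it."""
    violations = find_violations(metrics, limit)
    for path, func, score in violations:
        print(f"{path}: {func} has complexity {score} (limit {limit})")
    return 1 if violations else 0


changed = {
    "billing.py": {"post": 14, "void": 3},
    "util.py": {"slug": 2},
}
exit_code = gate(changed)  # prints one line for billing.py: post, returns 1
```

Because the gate only looks at the files in the change set, imposing it mid-flight doesn't force anyone to fix the whole code base at once.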
Architecture
Complexity has spawned many solutions at the architecture / software engineering levels, though even something as basic as ensuring developers all have a common understanding of the architecture and documenting its basic idioms can go far. Other solutions are very well covered in our industry:
- Design patterns. Tried and true approaches to common problems.
- Aspect Oriented Programming. AOP's focus on abstracting common behaviors from the code base can reduce its complexity.
- Service Orientation. Ruthlessly breaking up your applications into disparate, independent services reduces the overall complexity of the system. This is an SOA approach but without the burdensome standards and machinery that armchair architects are prone to impose. One of my favorite examples of this approach, Amazon.com, has been using SOA since before anyone thought up the acronym. By creating loosely coupled services with standard interfaces, it's much easier to update or completely replace a service than to do the same work in an inevitably intertwined monolithic application.
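The "standard interface" point is easiest to see in miniature. The sketch below keeps everything in one Python process, so it is only an analogy: in a real service-oriented system the interface would be a network contract. All the names (`PriceService`, `TablePriceService`, `DiscountedPriceService`, `order_total_cents`) are hypothetical and mine.

```python
from typing import Dict, Iterable, Protocol


class PriceService(Protocol):
    """The only thing callers are allowed to know about pricing."""

    def price_cents(self, sku: str) -> int: ...


class TablePriceService:
    """Original implementation: a simple lookup table."""

    def __init__(self, table: Dict[str, int]) -> None:
        self._table = dict(table)

    def price_cents(self, sku: str) -> int:
        return self._table[sku]


class DiscountedPriceService:
    """Replacement that wraps another service; callers are unchanged."""

    def __init__(self, inner: PriceService, percent_off: int) -> None:
        self._inner = inner
        self._percent_off = percent_off

    def price_cents(self, sku: str) -> int:
        full = self._inner.price_cents(sku)
        return full * (100 - self._percent_off) // 100


def order_total_cents(prices: PriceService, skus: Iterable[str]) -> int:
    # Depends only on the interface, never on a concrete service.
    return sum(prices.price_cents(sku) for sku in skus)


base = TablePriceService({"a": 1000, "b": 250})
print(order_total_cents(base, ["a", "b"]))                              # 1250
print(order_total_cents(DiscountedPriceService(base, 10), ["a", "b"]))  # 1125
```

Swapping the pricing implementation required no change to `order_total_cents`, which is the property that makes a service cheap to update or replace.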
Culture
The most powerful weapon against the encroachment of complexity is culture: the shared conviction among developers that everyone needs to pitch in to reduce it.
- Refactoring: developers should feel empowered to refactor code that's overly complex, not in line with the evolution of the architecture, or simply way too ugly. Two key enablers are required and both need a strong cultural buy-in:
  - A solid set of unit and other tests so developers know if they've broken something.
  - A fast build & test cycle. Most developers like to work in small increments: make a small change, test it. If a build & test cycle takes 15 minutes, very few developers are going to refactor anything that isn't directly in their path. I really like the work Etsy has done in this area, as well as on culture in general, by focusing on developer happiness.
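Here is a small Python sketch of why tests are the first enabler. The shipping rates and function names are made up for illustration. The nested version is the "before", the table-driven version is the "after", and a test that compares the two across the boundary values is what makes the refactoring safe to attempt.

```python
def shipping_cents_old(weight_kg, express):
    """Hypothetical 'before' code: correct, but needlessly nested."""
    if express:
        if weight_kg <= 1:
            return 1500
        else:
            if weight_kg <= 5:
                return 2500
            else:
                return 4000
    else:
        if weight_kg <= 1:
            return 500
        else:
            if weight_kg <= 5:
                return 900
            else:
                return 1600


# Rates per weight tier (up to 1 kg, up to 5 kg, heavier), keyed by express.
_RATES = {False: (500, 900, 1600), True: (1500, 2500, 4000)}


def shipping_cents(weight_kg, express):
    """Refactored version: pick a tier, then look the rate up."""
    tier = 0 if weight_kg <= 1 else 1 if weight_kg <= 5 else 2
    return _RATES[bool(express)][tier]


def test_refactor_preserves_behavior():
    # Probe both sides of every tier boundary for both service levels.
    for express in (False, True):
        for weight in (0.1, 1, 1.01, 5, 5.01, 40):
            assert shipping_cents(weight, express) == shipping_cents_old(weight, express)


test_refactor_preserves_behavior()
```

A test like this runs in milliseconds, which is the second enabler: the faster the feedback, the more willing people are to make the small change.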
What have I missed? I haven't written about complexity at the database level. While we're on the topic, though, I suspect that however much I like NoSQL databases, their rise will increase data complexity in the long term. The leeway many of them give developers in storing information will make that information very hard to manage: data elements will be needlessly duplicated, inconsistencies within elements will abound, etc. Error recovery will be critical, as will a strong business layer to help provide consistency.
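As a rough idea of what a "strong business layer" might look like in front of a schemaless store, here is a Python sketch. The in-memory dict stands in for a document database, and the `UserStore` class, the `normalize_user` function, and the two required fields are all hypothetical. The point is only that every write passes through one place that validates and canonicalizes it.

```python
# Hypothetical schema: the store itself would accept anything.
REQUIRED_FIELDS = {"email": str, "name": str}


def normalize_user(raw):
    """Validate a raw document and return its canonical form."""
    missing = sorted(f for f in REQUIRED_FIELDS if f not in raw)
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    for field, kind in REQUIRED_FIELDS.items():
        if not isinstance(raw[field], kind):
            raise ValueError(f"{field} must be {kind.__name__}")
    return {
        "email": raw["email"].strip().lower(),  # one spelling per address
        "name": raw["name"].strip(),
    }


class UserStore:
    """Business layer in front of a schemaless store (a dict here)."""

    def __init__(self):
        self._docs = {}

    def save(self, raw):
        doc = normalize_user(raw)
        # Keyed by canonical email, so the same user can't be stored twice.
        self._docs[doc["email"]] = doc
        return doc

    def count(self):
        return len(self._docs)


store = UserStore()
store.save({"email": " Ann@Example.com ", "name": "Ann"})
store.save({"email": "ann@example.com", "name": "Ann B."})
print(store.count())  # 1
```

Without the normalization step, the two writes above would have produced two documents for the same person, which is exactly the needless duplication I'm worried about.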
(Another source of complexity! Source :-)
Happy simplifying!