Overview of Software Complexity
What is Software Complexity?
Complexity is anything related to the structure of a software system that makes it hard to understand and modify.
Complexity is shaped by the activities developers perform most often. If a system contains a few highly intricate parts that are rarely touched, they contribute little to the overall complexity. This intuition can be expressed in a crude mathematical form:
\[ C = \sum_{p} c_{p} t_{p} \]
The overall complexity \(C\) of a system is the sum, over every part \(p\), of that part's complexity \(c_{p}\) weighted by the fraction of time \(t_{p}\) developers spend working on it. Isolating complexity in a place where no one ever has to see it is nearly as good as eliminating it altogether.
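The weighted sum can be made concrete with a small calculation. The sketch below is only an illustration: the part names, the complexity scores, and the time fractions are all made-up values, and scoring complexity on a numeric scale is itself an assumption, since the formula is meant as intuition rather than a measurement procedure.

```python
# Hypothetical parts of a system, each mapped to (c_p, t_p):
#   c_p = an invented complexity score for the part
#   t_p = the fraction of developer time spent on the part (fractions sum to 1)
parts = {
    "legacy_parser": (9.0, 0.02),   # intricate, but almost never touched
    "request_router": (4.0, 0.58),  # moderately complex, touched constantly
    "config_loader": (2.0, 0.40),   # simple, touched often
}


def overall_complexity(parts):
    """Return C, the sum over all parts p of c_p * t_p."""
    return sum(c * t for c, t in parts.values())


def contributions(parts):
    """Return each part's individual term c_p * t_p."""
    return {name: c * t for name, (c, t) in parts.items()}


total = overall_complexity(parts)
shares = contributions(parts)
```

With these invented numbers, the most intricate part (`legacy_parser`) contributes the smallest term to the total, because developers spend almost no time in it.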
Human Limitations in Software Development
The greatest limitation in software development is our ability to understand the systems we build. As a program evolves and accumulates features, it grows complicated, with subtle dependencies forming between its components. Over time, this complexity compounds, and it becomes increasingly difficult for developers to hold all the relevant details in mind while modifying the system.
General Approaches to Fight Complexity
- Making code simpler, clearer, and more consistent.
- Decomposing the system into modules that remain relatively independent of one another.
Software Design Methods
Waterfall: The project proceeds through a sequence of discrete phases — requirements, design, coding, testing, and maintenance — with each phase completing before the next begins. The entire system is designed up front, in one pass.
Unfortunately, the waterfall model rarely works well for software. Software systems are intrinsically more complex than physical ones; it is simply not possible to visualize the design of a large software system thoroughly enough to anticipate all of its implications before any code is written.
Agile: The project starts with a simple design covering only a few core features, and then iteratively evolves — refining what exists, adding new features, and fixing problems as they surface.
Symptoms of Complexity
- Change amplification: a simple change requires modifications across many parts of the system.
- Cognitive load: the amount of information a developer must keep in mind to complete a task. Sometimes an approach that uses more lines of code is actually simpler, because it lowers cognitive load.
- Unknown unknowns: it is not obvious which pieces of the system must be modified to complete a given task, or what information is required to do it correctly.
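Change amplification is the easiest of these symptoms to show in code. The sketch below is a hypothetical example: the page-rendering functions and the `BANNER_COLOR` constant are invented for illustration. In the first version, changing the banner color means editing every function that repeats the literal; in the second, the same change is a single edit.

```python
# Amplified: the color literal is repeated in every page function,
# so one design decision requires one edit per page.
def render_home_amplified():
    return '<div style="background:#336699">Home</div>'


def render_about_amplified():
    return '<div style="background:#336699">About</div>'


# Centralized: the decision lives in one place (hypothetical constant).
BANNER_COLOR = "#336699"


def render_banner(title, color=BANNER_COLOR):
    """Build the banner markup for a page title."""
    return f'<div style="background:{color}">{title}</div>'


def render_home():
    return render_banner("Home")


def render_about():
    return render_banner("About")
```

Both versions produce identical output today; they differ only in how many places must change when the color does.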
Causes of Complexity
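- Dependencies: a dependency exists when a piece of code cannot be understood and modified in isolation. Dependencies are a fundamental part of software and cannot be entirely eliminated, only managed.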
- Obscurity: important information is not readily apparent to developers reading the code.
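A common form of obscurity is a value whose unit or meaning is not visible where it is used. The following sketch is a hypothetical illustration (both function names are invented): the first function hides the fact that its argument is in seconds, while the second carries the unit in its parameter type and its name.

```python
from datetime import timedelta


# Obscure: nothing at a call site such as wait_obscure(30) says whether
# 30 means seconds, milliseconds, or something else entirely.
def wait_obscure(t):
    return t * 1000  # silently treats t as seconds, returns milliseconds


# Clearer: the parameter type states the unit of the input,
# and the function name states the unit of the result.
def timeout_in_milliseconds(timeout: timedelta) -> int:
    return int(timeout.total_seconds() * 1000)
```

A call such as `timeout_in_milliseconds(timedelta(seconds=30))` makes the important information apparent to the reader without consulting the implementation.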
Complexity is Incremental
Complexity is not caused by a single catastrophic mistake; it accumulates in many small increments. It is easy to convince yourself that the small bit of complexity introduced by your current change is harmless. Yet once complexity has built up, it is hard to remove — fixing a single dependency or one piece of obscurity rarely makes a noticeable difference on its own.
Open Questions
In the post-LLM era, could we develop a heuristic algorithm that previews the architectural design of a software system by learning from similar or related systems, thereby making waterfall development more viable?
Does the agile approach reliably converge on a globally optimal design over the lifecycle of a system, or does it tend to settle into a local optimum at some intermediate stage?