A while ago I read "Out of the Tar Pit", an excellent paper by Ben Moseley and Peter Marks. In it, the authors discuss the different types of complexity in software and their causes. The second part of the paper deals with Functional Relational Programming (FRP). In contrast with the OO approach, FRP emphasizes a clear separation of state and behavior. I highly encourage anyone to download and read this paper.
While reading this paper, some paragraphs made me question practices I've seen, and sometimes missed, during my career as a professional software developer.
In section 3, Approaches to Understanding, two approaches are mentioned for understanding software systems:
Testing
This is attempting to understand a system from the outside — as a “black box”. Conclusions about the system are drawn on the basis of observations about how it behaves in certain specific situations. Testing may be performed either by human or by machine. The former is more common for whole-system testing, the latter more common for individual component testing.
Informal Reasoning
This is attempting to understand the system by examining it from the inside. The hope is that by using the extra information available, a more accurate understanding can be gained.
The authors argue that while both of these approaches have their limitations, informal reasoning (e.g. code reviews) is the more important of the two, whereas testing is more limited.
Quoting Edsger Dijkstra:
Testing is hopelessly inadequate… (it) can be used very effectively to show the presence of bugs but never to show their absence.
This made me think about how most software teams I've been a part of don't have code reviews as an integral part of their development process, while there always seem to be testing activities in one form or another. At my current employer, code reviews are a visible part of our process. Looking back, I can't believe we ever managed without them. Take a look at how open-source software teams work: evaluating code submissions is one of the main reasons many open-source projects are sustainable and of high quality.
I wholeheartedly agree that testing and informal reasoning are definitely NOT mutually exclusive and that both have their place. But my personal experience over the last couple of years tells me that code reviews preceding testing catch many bugs before the testing phase ever starts.
This is not to say that testing has no use. The bottom line is that all ways of attempting to understand a system have their limitations (and this includes both informal reasoning — which is limited in scope, imprecise and hence prone to error — as well as formal reasoning — which is dependent upon the accuracy of a specification). Because of these limitations it may often be prudent to employ both testing and reasoning together.
It is precisely because of the limitations of all these approaches that simplicity is vital. When considered next to testing and reasoning, simplicity is more important than either.
In section 4, Causes of Complexity, the three most important causes of complexity in large systems are discussed: State, Control and Code Volume. Section 4.4 adds some additional causes, of which "Power corrupts" made me think about the current state of programming languages in general.
What we mean by this is that, in the absence of language enforced guarantees (i.e. restrictions on the power of the language) mistakes (and abuses) will happen. This is the reason that garbage collection is good — the power of manual memory management is removed. Exactly the same principle applies to state — another kind of power. In this case it means that we need to be very wary of any language that even permits state, regardless of how much it discourages its use (obvious examples are ML and Scheme). The bottom line is that the more powerful a language (i.e. the more that is possible within the language), the harder it is to understand systems constructed in it.
Maybe, twenty years from now, our kids will look back at this age of computing and wonder why on earth we dabbled in programming languages that encourage mutable state. Maybe by then a programming language like Clojure will be seen as the first light at the end of a long and very dark tunnel. A very interesting thought, if you ask me.
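The paper's point about restricting power can be made concrete with a small sketch (Python here purely for illustration; the class names are my own invention): with an immutable value, every "change" produces a new value, so code holding an older reference can never be surprised by it changing underneath them.

```python
from dataclasses import dataclass, replace

# The "powerful" version: any code holding a reference can mutate it,
# which is exactly the kind of freedom the paper warns about.
class MutableAccount:
    def __init__(self, balance):
        self.balance = balance

# The restricted version: the value can never change; "depositing"
# yields a brand-new Account and leaves the original untouched.
@dataclass(frozen=True)
class Account:
    balance: int

    def deposit(self, amount):
        return replace(self, balance=self.balance + amount)

a = Account(balance=100)
b = a.deposit(50)
print(a.balance, b.balance)  # prints: 100 150
```

The mutable class permits more, which is precisely why systems built on it are harder to reason about: you have to trace every place that could have touched `balance`.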
The last quote I want to share is a footnote on page 37, in section 8, which discusses the relational model:
Unfortunately most contemporary DBMSs are somewhat limited in the degree of flexibility permitted by the physical/logical mapping. This has the unhappy result that physical performance concerns can invade the logical design even though avoiding exactly this was one of Codd’s most important original goals.
After reading this I instantly went into 'evil laugh' mode. Past traumas with certain DBA folks might have something to do with that.
Go read this paper. It’s time well spent.
Until next time.
I actually started reading it three days ago, love it!
I think the trade-off between power and safety in a language is fascinating. I work a lot in C#, and I once did a post about the trade-off between using its high-level features like LINQ vs its fast features, and how they're quite distinct things (http://coder-mike.com/2013/09/09/expressiveness-vs-efficiency-in-c/). This seems to be the case in general. Imperative, stateful code often corresponds directly and predictably to machine instructions, which means that performance is easier to "see" through the code, and the programmer will be more aware of it. For example, is this call a direct, statically-resolved call, or an indirect call through a pointer? In all functional languages I know of, there is no way to tell. Hiding these details is only good if you don't care about the performance cost of doing things one way or another.
On another note, there may also be a bias toward thinking that immutability is the way of the future, because more programming language research is done in functional languages than in imperative ones, since they're easier to deal with mathematically. Perhaps the answer is not to make everything immutable, but to define better models for mutability that make it safer than it is currently, without sacrificing the performance benefits of staying close to the natural way the hardware works. This may not be true, but it's something to think about.
I do believe immutability simplifies things compared to the concept of blocking threads. But an application that doesn't change anything isn't very useful. The paper describes an approach where all mutable code is placed together, separated from the rest of the application. To my understanding, this seems to be the direction that languages like Clojure are already heading with constructs like STM, atoms, agents, etc.
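Clojure's atom is a good example of that direction: mutation is permitted, but only by atomically applying a pure function to the current value. A minimal sketch of the same idea (Python used here only as an illustration; the `Atom` class and its `swap`/`deref` names are my own, loosely mirroring Clojure's `swap!` and `deref`) might look like:

```python
import threading

class Atom:
    """A minimal sketch of a Clojure-style atom: the only way to
    change the value is to apply a pure function to it, atomically."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def deref(self):
        # Read the current value.
        return self._value

    def swap(self, fn, *args):
        # All mutation in the program funnels through this one place:
        # the new value is computed by a pure function of the old one.
        with self._lock:
            self._value = fn(self._value, *args)
            return self._value

counter = Atom(0)
counter.swap(lambda v, n: v + n, 5)
print(counter.deref())  # prints: 5
```

The rest of the application can then be written as pure functions, with the few atoms acting as the clearly-marked, fenced-off mutable core the paper advocates.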