Squaring the Circle, or how Rich Hickey explains why Clojure is a prime solution for developing in the multi-core world, Oct 08 — Disclojure, Public Disclosure of All Things Clojure

Coincidentally, I recently read the very relevant ”The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software" by Herb Sutter. In this article the author puts forward the arguments for a necessary paradigm shift in software for small and medium sized computers (as this shift already happened in large computing systems): the need to design our software to leverage multiple CPU cores as the only way to make that software run faster. CPUs are not getting faster anymore, but they are getting more powerful, and to leverage the new power one must write software in a way that all the cores in the same CPU can be fully utilized, that is, programs must be able to leverage multiple concurrent threads of execution. The problem? Most of the programming languages out there, and especially the most commonly used, are not a good fit for concurrent execution on multiple CPUs or CPU cores. Concurrency in those languages is mostly an afterthought, and writing multi-threaded software on those languages is cumbersome and error prone.

What makes concurrent programming hard today is the difficulty of having each thread of execution obtain a coherent view of the state of the system while other threads are changing this very same state.

In this talk Rich Hickey explores the issues with concurrent software development when done with the tools that most current computer languages provide. At around minute 19, Rich uses a genius metaphor of race-walker foul detection to illustrate what is the problem with multiple threads concurrently updating the system state, and how a solution would look like. To detect fouls in a race-walker race the judges use snapshots of the race to see if the runners have both feet above ground. The judges can take their sweet time to look at the snapshot while the race continues, and without worrying that the fact that the race is still on will not change the contents of their snapshot. The point that Rich makes is that a similar system is needed in concurrent software development, in which an execution thread can get a snapshot of the system state and work with it independently of the other threads, without stopping any of the other threads and without worry that the snapshot will change either. In a sense, each thread can ‘live’ in different points of the timeline, the same way the judges are inspecting a snapshot in the past of an ongoing race.

Once the problem is described Rich proposes that a functional language with persistent data structures and strong concurrency semantincs would provide the features described in the above paragraph. Persistent data structures are like any other data structure in the sense that the data they hold can change over time. What makes them unique is that the changes in the structure don’t affect past values of it, that is, some thread can be looking at the structure at time t-1 and some other thread might have changed it at time t. Both threads would be looking at the same data structure, but the contents would be different. The values of the structure at any time before now remain untouched no matter what happens in the future, and thus all threads have a coherent (albeit out of date) view of the system state stored in those structures. One important fact to note is that persistent data structures are almost as fast as non-persistent or mutable ones.

The strong concurrency semantics, provided by the Software Transactional Memory (STM) in the case of Clojure, allow the developer define how change will be affected to the persistent data structures. The concurrency primitives that provide these semantics describe how change is to be performed on a single or on multiple data structures in a way that the system state is always correct and coherent, and so that not two threads can, so to speak, step on each others toes.

So we have a functional language in which functions have no side effects and so every time they are called with the same parameter values they yield the same result, and thus it doesn’t really matter when they are called. Then we have data structures that represent value over time, so that it doesn’t matter when a thread is using them. Finally, we have a STM system that isolates the state changes between threads so that it doesn’t matter when threads are making those changes. With all these features what you end up with is with a programming environment in which you can use many threads of execution that will work together without correctness issues and minimal interaction between them, thus being able to fully utilize the power of the new multi-core CPUs.

Watch the presentation here. Enjoy!