Ever since I finished reading Joe Armstrong’s “Programming Erlang”, several details have been nagging me but I haven’t been able to put my finger on them… until now.

First of all, an erratum: I misinterpreted the Wikipedia page when I said that Erlang was created in 1998. Erlang was open sourced in 1998, but it’s actually much older than that. In a way, this explains why the language and tools feel so old, but I also learned something else that sheds an interesting light on why the language is completely devoid of any OO feature.

As it turns out, Joe Armstrong, the author of the book, is actually one of the main designers, if not the creator, of Erlang. He worked at Ericsson back then and created the language to address the fault-tolerance aspects of the products that Ericsson was trying to create. After doing some research in this area, I came across an article titled “Why OO sucks” written by… Joe Armstrong himself.

I found the article quite emotional and very light in substance, and this text would probably be inconsequential if it didn’t explain in large part why Erlang doesn’t support any OO feature. When somebody creates a language after writing a “sucks” article, you know the yet-to-be-born language is already starting at a disadvantage because its creator is obviously not going to follow a “take what’s best in all languages” approach while designing his new language. And unfortunately, Erlang fell right into this trap. This makes me very sad because, as I said in my previous post, I quite enjoyed the other aspects of the language and the decent blend of unification and functional programming that it offers. Maybe if some OO had been thrown into Erlang from the start, we would indeed be facing a language that has the potential to become meaningful. As it stands, I still believe that Erlang is a non-starter for this reason, and for a bunch of others that I’ll get to in the following paragraphs.

Scalability

Let’s now turn our attention to Erlang’s claim to fame: that it produces infinitely scalable and fault-tolerant programs more easily than other languages.

I have always had problems with this claim, and even after reading Armstrong’s book, my skepticism has not abated. I can definitely see that the syntax of the language makes it easier to write message-based components which, I agree, are probably easier to scale than code written in lock-intensive languages such as Java, but syntax itself is not enough. Just because you write a program in Erlang doesn’t mean your code will scale better on multiple processors than if you had written it in Java.

Armstrong doesn’t overlook this point, and he points out that you also need to change the way you think. He gives a few examples in the book (such as an implementation of Map Reduce) and he clearly shows that the initial approach needs to be completely reimplemented before it can be distributed to multiple processes.

Now, if we have to think in terms of distribution from the very beginning of our design phase, Erlang’s edge fades considerably. It seems pretty easy to me to design similar programs with Java using messaging APIs, and with very little additional effort. Message passing might end up being more expensive than in Erlang, which has been optimized to make this as lightweight as possible, but without any credible figures, I am skeptical that an Erlang program will come out ahead in the long run.
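To make the Java side of this comparison concrete, here is a minimal sketch of message passing between two threads using the standard java.util.concurrent queues. The class name MailboxSketch, the method sumViaMessages and the STOP sentinel are made up for illustration; this is only one of many ways to do it, and it says nothing about how the cost compares with Erlang’s.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class MailboxSketch {
    // Hypothetical convention: a negative message tells the worker to stop.
    private static final int STOP = -1;

    // Sends the numbers 1..n to a worker thread through an inbox queue and
    // returns the sum that the worker sends back on a reply queue.
    static long sumViaMessages(int n) {
        BlockingQueue<Integer> inbox = new LinkedBlockingQueue<>();
        BlockingQueue<Long> reply = new LinkedBlockingQueue<>();

        // The worker shares no state with the sender: it only reads messages.
        Thread worker = new Thread(() -> {
            long total = 0;
            try {
                while (true) {
                    int msg = inbox.take();
                    if (msg == STOP) {
                        break;
                    }
                    total += msg;
                }
                reply.put(total);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            for (int i = 1; i <= n; i++) {
                inbox.put(i);
            }
            inbox.put(STOP);
            long result = reply.take();
            worker.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumViaMessages(100)); // prints 5050
    }
}
```

The worker owns its running total and never exposes it, which is the same discipline an Erlang process imposes, except that here nothing in the language enforces it.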

And that is before considering that Erlang itself, the language and the virtual machine, already runs orders of magnitude slower than Java (and even than Ruby), as Tim Bray’s latest forays into the domain have shown so far (quote from Tim’s series of articles: “How does it run? Well, like dogshit, more or less.”). Interestingly, Tim has received the assistance of a few distinguished Erlang experts to help him speed up his program (which is basically a simple log parsing routine), but the result still runs an order of magnitude slower than Ruby, which is telling since Ruby is already considered one of the slowest dynamic languages today.

The myth of lock-free programming

I am also beginning to question a few claims that are often tossed around when discussing Erlang, such as the idea that since Erlang doesn’t have mutable variables or side effects, it is contention free, and therefore doesn’t suffer from the lock issues that traditional languages have.

First of all, contention exists as soon as you use the network, the file system, a database or even ETS and DETS (Erlang’s version of databases). But more importantly, there is one crucial bottleneck that’s never even mentioned in the entire book: the message inbox.

The only way a process can modify the state of, or talk to, another process is by sending a message to that process. Regardless of how lightweight the implementation of this message passing is, there has to be a lock on all these inboxes to guarantee that when dozens of processes send messages to your own process, these messages are delivered and treated in the order in which they are received.

In effect, Erlang seems to be moving locks scattered in various places throughout your code (e.g. synchronized blocks in Java) into one single bottleneck. No matter how efficient this bottleneck is, I’m having a hard time imagining that this approach scales as miraculously as Erlang advocates claim. At least Java gives me complete control over the coarseness of the locks that I use, and also absolute freedom over where I want to place these locks. If an object can be accessed by only two processes, only these two processes will compete for this lock. In Erlang, these two processes will have to pass messages to the inbox of my process, thereby competing with *all* the processes of my application. And since Erlang programmers want you to create as many processes as you possibly can, this can represent a lot of contention and wait times.
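The single-inbox pattern I’m describing can be modeled in Java in a few lines. This is only a sketch of the pattern, not of how the Erlang virtual machine actually implements its inboxes, and the names SharedInboxSketch and drainSharedInbox are hypothetical. Every sender thread funnels its messages through one queue, so they all synchronize on that queue’s internal lock, whatever they are otherwise doing.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class SharedInboxSketch {
    // Starts `senders` threads that each add `perSender` messages to one
    // shared inbox, drains the inbox on the calling thread, and returns
    // the number of messages handled.
    static int drainSharedInbox(int senders, int perSender) {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        Thread[] threads = new Thread[senders];

        for (int s = 0; s < senders; s++) {
            final int id = s;
            threads[s] = new Thread(() -> {
                for (int m = 0; m < perSender; m++) {
                    // All senders contend for the queue's internal put lock here.
                    inbox.add("sender-" + id + ":msg-" + m);
                }
            });
            threads[s].start();
        }

        int expected = senders * perSender;
        int handled = 0;
        try {
            // The single receiver handles messages one at a time, in arrival order.
            while (handled < expected) {
                inbox.take();
                handled++;
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining the inbox", e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(drainSharedInbox(8, 1000)); // prints 8000
    }
}
```

Timing this with a growing number of senders would be one way to get the kind of figures I keep asking for, on the Java side at least.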

Five nines

Another Erlang claim to fame is that it makes it easy to create fault-tolerant systems. This is usually ascribed to the fact that Erlang processes don’t throw exceptions: they die, and it’s up to a supervisor to detect this crash and act on it (usually by restarting that process). Again, I find the claim that this approach is more robust than traditional ones hard to believe. No matter how much Erlang supports you from a syntactic standpoint, you still need to implement the supervisors and, in short, to plan for failure. This doesn’t strike me as very different from any other language, regardless of the failure mechanism it supports (typically, exceptions).
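As a rough illustration of what “plan for failure” looks like with exceptions, here is a minimal restart loop in Java. It is a sketch under simplifying assumptions (the worker runs on the calling thread, and a crash is any RuntimeException), and SupervisorSketch, supervise and flakyWorker are hypothetical names, not part of any library. It doesn’t attempt to reproduce Erlang’s supervision trees or process isolation.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SupervisorSketch {
    // Runs the worker and restarts it after each crash, up to maxRestarts
    // times. Returns the number of restarts that were needed.
    static int supervise(Runnable worker, int maxRestarts) {
        int restarts = 0;
        while (true) {
            try {
                worker.run();
                return restarts; // the worker finished normally
            } catch (RuntimeException crash) {
                if (restarts == maxRestarts) {
                    throw new IllegalStateException(
                            "giving up after " + restarts + " restarts", crash);
                }
                restarts++; // restart the worker on the next loop iteration
            }
        }
    }

    // Test helper: a worker that crashes on its first `failures` runs
    // and succeeds on every run after that.
    static Runnable flakyWorker(int failures) {
        AtomicInteger remaining = new AtomicInteger(failures);
        return () -> {
            if (remaining.getAndDecrement() > 0) {
                throw new IllegalStateException("simulated crash");
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(supervise(flakyWorker(2), 5)); // prints 2
    }
}
```

Either way, somebody has to decide what gets restarted, how many times, and what happens when the restarts run out.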

Conclusion

After spending some time trying to find supporting evidence for these various Erlang claims, I still haven’t been able to find anything meaningful in any of the areas described above.

I know there are Erlang programs with millions of lines of code that have been running flawlessly for years, but there are even more running with mashups of languages that were not even designed to be that resilient. The truth is that it’s fairly easy these days to achieve this level of reliability with redundant machines (ask the Googles, Yahoos, Facebooks and eBays of the world) and also quite possible to achieve extremely high volumes of transactions with APIs built on top of traditional languages (e.g. Coherence). As I said in my previous post, the great lesson in scalability we learned this past decade is that it’s much easier to distribute data than code, and the current crop of modern languages (Java and C#) proves this admirably.

The claims put forth by Erlang advocates are quite provocative, and sometimes borderline outrageous, so I’d love to see some concrete and objective evidence to back these up. Failing that, Erlang will remain the oddity that it has always been ever since it was created.