June 24, 2008

Are Scala's case classes a failed experiment?

A long time ago, Bertrand Meyer, the creator of Eiffel, made a very strong statement against the use of switch and case. He argued that these constructs were not object-oriented and made future maintenance difficult. And to make his point crystal clear, he decided that Eiffel would not support these statements.

This decision didn't go well with Eiffel users and Meyer eventually gave in and added switch> and case to his language.

Meyer's initial position was certainly extreme but I think the foundation of his argument still stands, and I was reminded of this age old debate just today while attending Martin Odersky's Jazoon keynote, and more specifically, when he described Scala's "case classes".

Ever since I first read about case classes, I have been confused about their utility, and my puzzlement has not abated. Either I am missing something big or this feature is something that has been vastly overhyped and that should be avoided as much as possible

Let's see why.

Here is the example used in the official Scala documentation:

def printTerm(term: Term) {
  term match {
    case Var(n) =>
      print(n)
    case Fun(x, b) =>
      print("^" + x + ".")
      printTerm(b)
    case App(f, v) =>
      Console.print("(")
      printTerm(f)
      print(" ")
      printTerm(v)
      print(")")
  }
}
This code is part of a hypothetical parser: it does different things depending on whether we just encountered a variable, a function or an application.

I have two strong objections to this kind of approach:

  • It ties this class very tightly to all the classes described in the various cases.
  • It makes evolution problematic.
I think the first argument is pretty obvious: whatever class this code belongs to, that class knows an awful lot about all the classes it is trying to use. This wouldn't be too much of a problem if the alternative didn't turn out to be free of this problem, as I'll show below.

The second claim is a bit more subtle but is actually the most important.

Obviously, this parser is quite incomplete: what happens if I want support for import clauses?

First, I am probably going to create at least one new class Import and second, I need to remember to add it to this code. Worse: all this code does is print out the type of Term it just parsed. Obviously, we will be needing more logic to actually do the work, such as a function to verify that the Term is syntactically correct (say verify(), and also a function to generate the code or the abstract information related to that term (generateCode()). Each of these methods will also feature a match structure, which will need to be updated for each new class.

In summary: every time that I add a new class, I need to remember to update three functions located somewhere in my code base.

Now we can see why Meyer dislikes switch statements so much...

The preferred way to solve these two problems is to make sure that each class encapsulates its own logic for all these operations:

interface Term {
  void print();
  void verify();
  void generateCode();
}

public Var implements Term {
  public void print() {
    print(n)
  }

  public void verify() { ... }

  public void generateCode() { ... }
}

// same for Fun and Expr
Here is how we print a term with case classes (from the Scala documentation, again):
Term t = ...;
printTerm(t)
And with proper encapsulation:
Term t = ...;
t.print();
With the OO approach, the code shown at the very top is simply no longer needed! If you need to print the content of a Term, you just invoke print() on it. If you add a new class to your parser, all you need to implement is conveniently summarized in the Term interface: no need to go hunting through your entire code base for switch statements. Not only do we have clean encapsulation, but we also have locality.

When I exposed the crux of this argument to a few Scala experts, they overall agreed about the point and responded by pointing out that case classes were more useful as an alternative to the Visitor pattern.

Before turning our attention to this specific case, it's important to note that at this point, we are now looking at solving a niche problem. And quite a rare one, in my experience. If this is really the reason why case classes were invented, I am really left scratching my head about the decision to include such a big feature inside a language to solve such a small class of problems.

Anyway, let's take a look at the Visitor angle.

In a nutshell, Visitors are used to emulate multiple dispatch in languages that don't support it natively. In some way, you are extending virtual invocation to apply to the runtime type of parameters passed to your functions.

I'm not going into a deep discussion of the pros and cons of this approach, but I'll just observe that even in the few cases where I have had such a need, an approach similar to the encapsulation explained above usually turns out to be preferable to a switch statement. In other words, if you need to have different operations depending on whether your function is being invoked with parameters of types A or B is better implemented by hiding this particular detail inside the class handling this dispatch, and not in a global switch.

From my understanding of Scala's history, case classes were added in an attempt to support pattern matching, but after thinking about the consequences of the points I just gave, it's hard for me to see case classes as anything but a failure. Not only do they fail to capture the powerful pattern matching mechanisms that Prolog and Haskell have made popular, but they are actually a step backward from an OO standpoint, something that I know Martin feels very strongly about and that is a full part of Scala's mission statement.

To be fair, Scala navigates in very rough waters: finding a good compromise between running seamlessly on an existing OO platform and inventing a language that will not only accomodate fundamental functional programming paradigms but also leave the door open for advanced dynamic concepts such as open classes is no easy feat, and overall, Scala has done a very good job at finding the right balance. But unfortunately, not with case classes.

Or am I missing something?

Posted by cedric at June 24, 2008 02:13 AM

Comments

I'd say you're missing something - this is the implementation of a Visitor pattern in Scala, which (case classes aside) are going to need modification if the syntax changes whatever happens.

Secondly, they're not just about pattern matching; they're about being able to unpack the structure in an elegant way with the minimum of syntactic sugar. Compare and contrast the approach you'd take with Java - apply(app.getFunction(), app.getValue()). The case clases are more liked named tuples of values, and you can have 'missing' values bound to _ (both values *and* types).

Lastly, it's used heavily for distinguishing None and non-None, which are both case classes. Even the built-in List() construct is nothing more than a linked list, where each element in the linked list is its own case class (which is why you can pattern-match against List(a,b,c) if you want to).

Alex

Posted by: Alex Blewitt at June 24, 2008 02:49 AM

Hi Cedric,

I have the feeling that your critique is not so much directed to case classes/pattern matching.

It looks like it's more related to the so-called "Expression problem" here: how to extend data structures and operations on these structures independently.

Actually there is a good paper on the subject and how this is approached using Scala:
www.scala-lang.org/docu/files/IC_TECH_REPORT_200433.pdf.

Does this bring some light to the debate?

Cheers,

Eric.

Posted by: Eric at June 24, 2008 03:05 AM

RE: In summary: every time that I add a new class, I need to remember to update three functions located somewhere in my code base.

If you use a vistor for this, and you make your visitor methods abstract, then your compiler will remedy this problem.

Basically a visitor is no more than a type-safe typeswitch with state.

Posted by: Wouter Lievens at June 24, 2008 04:08 AM

Hi Cedric,

You bring up some very good points.

As Eric said, you may find "expression problem" debate interesting. Also for discussion of pattern matching and Scala "extractors" that improve the mechanism, see lamp.epfl.ch/~emir/written/MatchingObjectsWithPatterns-TR.pdf It is my understanding that latest Scala version automatically converts case classes to extractors.

Posted by: Yardena at June 24, 2008 04:23 AM

It's not quite that simple. There is a tension between traditional object-oriented programming and layering. If you don't care about layering, sure, you can always add methods to the original type hierarchy. But what if you do? Or what if the type hierarchy is provided by a library (which you can't change) and you want to extend it? That's where visitor patterns tend to come from.

Also, language parsing is a bit weird because you have a type hierarchy with no real encapsulation. As a user, I can see and use the entire language. Multiple tools can parse this language and they all should handle the entire language; the idea that a compiler or lint tool should be protected from knowing about the language it's parsing is a bit weird. So this is an area where the usual arguments for encapsulation make less sense than usual and having convenient ways to iterate over trees and expose all their parts makes more sense.

Posted by: Brian Slesinsky at June 24, 2008 08:37 AM

If you make a table, and write down all the possible functions (methods in OO) as columns and data variants (classes in OO) as rows, then you need to decide how to divide the table up into modules. In OO, you divide by rows. Each row is a module, called a class. All the functions pertaining to that data variant are grouped together. This is a reasonable way of organizing things, and it's very common. The advantage is that's easy to add a data variant (your example of of an import statement). However the disadvantage is that it's hard to add new functions that vary by data type. You have to go through every class to add a new method.

The other organization is by columns. Each function is a module, and all the data variants are grouped together in those functions. This is the case/switch approach, poorly handled in Java and C, but very nicely done in ML and some other languages. The advantages and disadvantages are the opposite of the OO organization: it's easy to add new functions, but it's hard to add new data variants.

They're not exactly duals, because once you dig into the details you find places where there are a few things in one that don't have a correspondence in the other organization, and vice versa. But to a first approximation they're similar.

The thing is, just about everyone wants there to be only ONE way to organize ALL things. I think the right answer is that you need to look at the costs and benefits *for that type of data*. If you expect to add new data variants a lot more than you add new functions, then use OO; if you expect to add new functions a lot more than data variants, then use case with union types (not C's union, but properly typed unions).

For a parser type example, consider a compiler optimizer. You'd like to write lots and lots of optimization passes, each of which goes over the parse tree (or IL) and changes it. In the union+case approach, it's very easy to add new optimization passes. Each optimization knows how to work on all the types of nodes. You get locality, but it's a different kind of locality than the OO approach. If you had used OO here, then you end up spreading parts of your optimization pass as small methods across all of your classes. You keep everything about “import statements” together, but you lose the locality of “everything about loop hoisting”. Or if you want to get the loop hoisting locality, you end up writing a visitor, which is an adapter taking OO data and giving you the union+case approach.

The debate continues because neither form of organization is the best in all cases. You have to look at the kinds of extensibility you want, and make the design match that.

Posted by: Amit Patel at June 24, 2008 08:42 AM

Hi Brian,

You're not the first one to point out that the parser-based Scala examples are very poorly chosen, but I have still to see a domain where case classes really shine compared to the polymorphic approach.

Do you have any to share?

--
Cedric

Posted by: Cedric at June 24, 2008 08:48 AM

As Amit points out, Scala gives you the option of using traditional OO paradigms and traditional FP paradigms.

Personally, I find that pattern matching is very helpful when implementing complex logic (if I've got more than 1 "else", it's time for pattern matching). Case classes are useful as stand-alone simple containers as well as for pattern matching.

In lift and apps I've built on top of lift, I've found that case classes and pattern matching work best for immutable, event-stream kind of work. I find that traditional OO works best when I've got data structures that have mutable state.

Posted by: David Pollak at June 24, 2008 09:37 AM

To me, Scala's "case classes" look just like algebraic data types or tagged unions. This concept has been around for ages and lots of programmers love it. It's pretty ignorant to call this a failed experiment of Scala.

en.wikipedia.org/wiki/Algebraic_data_type
en.wikipedia.org/wiki/Tagged_union

>> every time that I add a new class, I need to remember to update three functions

No, you don't need to remember. Since you've created a closed collection, the compiler can tell you where you need to update your code. You don't ever have to worry about leaving out a certain case anywhere.

Posted by: James Justin Harrell at June 24, 2008 10:18 AM

The parse tree example is actually a lot better than you realize, once you scale it to reasonable real-world complexities. A parse tree implementation for an actual production language can easily have over a hundred different types of nodes (one for each language construct), and applications that use the parse tree can require many hundreds of different polymorphic actions over them (think compiler optimizations, or static code analyses like in IDEA or FindBugs). At those scales, implementing all of the polymorphic actions as methods is obviously too cumbersome, as it would require each node class to have a potentially unlimited set of methods, most of which have nothing to do with each other.

To make this work in Java, you need to punt to the Visitor pattern. That works, but is a lot cruder than pattern matching. Visitors can only differentiate on the class of the node visited, while pattern matching can differentiate on both the class and any components ("Only fire this rule on nodes of form BinaryOperation('-', x, 0)") , and to bind the components without a bunch of unnecessary boilerplate getter calls.

Type safety is a non-issue, at this scales. Either your visitors will inherit from an abstract base visitor with null visit method implementations, or your pattern matches will all end silently in default matches. In either case, the compiler won't be able to tell you about missing functionality (because it's caught by the default case), but that's the tradeoff you make for avoiding the enormous code bloat of having a thousand polymorphic actions over a hundred node classes, most of them empty.

Posted by: Dave at June 24, 2008 11:39 AM

BTW, how exactly are Scala's pattern matching capabilities any weaker than Haskell's. They are certainly weaker than Prolog's (no unification), but I was under the impression that they were basically identical to Haskell's, or possibly a little stronger.

Posted by: Dave at June 24, 2008 11:42 AM

Amit's exactly right about the column versus row organization. The non-OO approach isn't loss of locality: it's trading one kind of locality for another.

In a compiler that I work on we used to use the OO approach: every node had a bunch of methods, one for each phase of the compilation. The compiler was later rewritten to use the visitor pattern. The decision to do this had to do with code organization, not multi-dispatch. It's much easier to understand what's going on when all of the code for each phase is in the same place. It also makes it easier to add new phases or new code generators.

There are some downside: it does make it more difficult to add new node types. Also, as Dave points out, the type-safety you normally get with visitor pattern doesn't really hold at this scale. (It took me a long time to convince myself of this, and I'm still not entirely happy about it. :-/ ) Visitor pattern in Java is also horrendously verbose. Finally, most Java programmers have at best heard of the visitor pattern, and never really used it at a large scale. Despite all of this, the general consensus seems to be that the positives outweigh the negatives in our case.

Outside of the domain of compilers, the vistor/pattern-matching approach is frequently useful for cases where you need to be able to convert objects into multiple different output formats. For example, suppose you have complex objects that you want to be able to render in HTML, JSON, PDF, and possibly other formats in the future. You could add "toHtml", "toJson" and "toPdf" methods everywhere, or you could have a single place (eg: a visitor) that knows how to generate HTML, another that knows how to generate JSON, etc. As Brian mentioned, this is also nice if you care about layering. The visitor/pattern-matching approach helps separate out the presentation code.

Posted by: Laurence Gonsalves at June 24, 2008 12:59 PM

One other case where pattern matching is a win is in event handling code. Often you will have a component which needs to be able to handle streams of events, most of which it will discard. If it does handle an event, the event object itself is uninteresting (it's just a typed envelope) but the components of the event are needed. Events themselves have no interesting business logic. In this case, making events to be case classes allows event-aware classes to easily filter and decompose events as needed, allowing communications handling code to be very concise and clear. Erlang leverages this construct a lot, albeit in a dynamically-typed context.

Posted by: Dave at June 24, 2008 01:17 PM

Cedric,

I've been using case classes since November or so and today, while preparing an answer to your blog, is the first time I've used them with pattern matching (perhaps I've done that with some from Scala's library, I forget, but never my own).

So I'd say that case classes are a great replacement for the usual boilerplate Java for creating a class with a few public variables initialised by the constructor.

While trying out pattern matching on case classes just now, I noticed that the compiler tells you if you write non-exhaustive patterns, which means that if you 'evolve' the class hierarchy, the compiler will notify you of all places that you need to update code.

Finally, you seem to have a problem with the idea of code that depends on separate implementations of a type. I can see that point, but I strongly disagree. Avoiding that makes you end up with large classes and causes you problems when you can't feasibly change their code.

A strong use case for it in Scala is that of Either. It makes sense to pattern match on Left and Right. This is not unnecessary binding to details, but actually programming to the datatype.
Though Either actually provides a fold method, so you never need to use pattern matching on it.

Take a look at Either's fold. Does it violate encapsulation? Do the same for Option's map method. Do either really differ from pattern matching in any meaningful way?

Posted by: Ricky Clarkson at June 24, 2008 01:35 PM

You're missing something Cedric; a lot in fact.

Look up Algebraic Data Types and why they are important. The switch/case analogy is a straw man. There is *much* more to it than you seem to realise.

Posted by: Tony Morris at June 24, 2008 02:39 PM

JJH and RC have got it right. Case classes really shine with Algebraic Data Types (ADTs) like Option and Either. In combination with the "sealed" keyword, the compiler can warn when patterns are not exhaustive.

But remember that case classes just tell the compiler to do extra work (create public accessors for the class's constructor args, implement hashCode equals and toString, and add apply/unapply methods in a companion object). Most of the time, writing a case class is just a useful shortcut for those things that makes sense for a lot of common situations (e.g. value objects - c2.com/cgi/wiki?ValueObject ).

Posted by: Chris Hansen at June 24, 2008 02:57 PM

To add to Chris' point, you might also want to look at *Abstract* ADTs, which do not expose constructors for the type. This is achieved in Scala by declaring a sealed type, then private case classes from that type. In Haskell, ADTs are abstract by default (constructors must be exported).

Scala also allows custom extractors for pattern matching. However, a more disciplined programmer generally avoids pattern matching in favour of HOFs.

I am the author of scala.Either (the disjoint union algebra or disjunction under the Curry-Howard Isomorphism) - which is *not* an abstract ADT - but if it were, it would look like this:

sealed abstract class Either[+A, +B]
private final case class Left[+A, +B](a: A) extends Either[A, B]
private final case class Right[+A, +B](b: B) extends Either[A, B]

object Either {
// extractors?
}

There is much to learn! Keep it up :)

Posted by: Tony Morris at June 24, 2008 03:24 PM

Case classes (and pattern matching) are great for avoiding NPEs too:

val result = findRecord(input)
result match {
case Some(x) => println(x)
case None => println("Not found")
}

Posted by: Thomas Lee at June 24, 2008 05:42 PM

To make my earlier point perhaps clearer, prefer HOFs:

val result = findRecord(input)
println(result getOrElse "Not Found")

Posted by: Tony Morris at June 24, 2008 08:12 PM

Amit said it accurately. With OO/Java the best solution is the Visitor pattern to organize & localize the 'print' functionality under one hood - rather than the approach you have taken to spread the functions (which you have argued against in your past blogs - especially regarding 'print'). Your approach will not be extensible to adding new print formats.

Posted by: Sony Mathew at June 25, 2008 12:25 PM

Reading a post on Cedric's blog about any programming language other than Java is like watching a story on Fox News about any politician who is not a Republican :-)

Posted by: Michael at June 26, 2008 07:54 AM

martin odersky's response:

"
Here's the comment I just sent to the original blog.

Cedric, I think you are misrepresenting the various comments, interpreting them in a more negative light than they were intended.

Let me relate my experience to you: As you know I wrote the javac compiler with visitors and various ad-hoc solutions and I wrote most parts of the scalac compiler with case classes everywhere. The difference needs to be experienced to be believed. I would never, ever go back to a language that did not have case classes (or some equivalent) and pattern matching. Not if I wanted to write a compiler, or any other software that analyzes and transforms symbolic i_nformation.

Your argument concerning evolution is valid, but it has already been addressed in Scala with extractors. Read the ECOOP 2007 paper by Emir, Odersky and Williams.

-- Martin "


Posted by: pok at June 28, 2008 05:01 AM

Scala's case classes are a native implementation of the Visitor Pattern, and hence, what you are criticizing is the Visitor Pattern. You are partially right, the Visitor Pattern has disadvantages, but it also has advantages. It is the same on the other side, NOT using this pattern has both advantages and disadvantages. One has to carefully consider whether to use this pattern, sometimes it is useful and sometimes it is not.

Posted by: Fatih Coskun at July 14, 2008 03:54 AM
Post a comment






Remember personal info?