Nice TWiki > Doc > LanguageComparisons > NiceVersusPizza > AlgebraicDatatype TWiki webs:
Dev | Doc | Main | TWiki | Sandbox
Doc . { Changes | Index | Search | Go }
I think that it would be very convenient to have access to good old algebraic datatypes in Nice, perhaps modified as described in this article: (Extensible Algebraic Datatypes with Defaults)

It should be allowed to do a switch on an algebraic datatype (otherwise there is no reason to include the feature ;-).

I would say that it is easy to implement this feature - one might be able to get away with a simple translation to existing language features: just introduce a new multifunction for each switch statement. But I might be missing something....

-- TorbenHoffmann? - 08 Jul 2002

A problem with swich statements is that they are closed: if you add a new case to your datatype, you have to modify all switches operating on this datatype. It seems the paper presents well this problem, and tries to solve it with default cases (I have only skipped through the first pages so far).

I think multi-methods nicely solve this problem. Let's try to see if something could be improved. Taking their example:

package expressions;

abstract class Expression {}
class Variable extends Expression { String name; }
class Lambda extends Expression { String var; Expression body; }
class Apply extends Expression { Expression function; Expression arg; }

Expression eval(Expression);
eval(Variable v) = v;
eval(Lambda l) = l;
eval(Apply a) = apply(eval(a.body), eval(a.arg));

Expression apply(Expression, Expression);
apply(Lambda l, v) = substitute(l.body, l.var, v);
apply(e1, e2) { throw new Error("Not a function: " + e1);

package numeric;

import expressions;

class Number extends Expression { int value; }
class Opposite extends Expression {}

eval(Number n) = n;
apply(Opposite o, Number n) = new Number(value: - n.value);

package print;

import expressions;

String display(Expression);

package main;
import expressions;
import numeric;
import print;

display(Number n) = n.value.toString();
display(Opposite o) = "-";

(I choose Opposite instead of Plus for simplicity of implementation, but the idea is the same)

What do you object to this solution?

A small change could to to allow a shortcut for the definition of an abstract class and several simple subclasses:

class Expression = 
    Variable { String name; }
  | Lambda   { String var; Expression body; }
  | Apply    { Expression function; Expression arg; }

This would be equivalent to the first version, but somewhat easier to read. As for switch, I think method implementation is as elegant, and more general because of the possibility to add new cases. Objections?

-- DanielBonniot? - 10 Jul 2002

Thanks - you have enlarged my understanding of multimethods.

I can see how method implementation is more general due to the possibility of adding new cases, but I am missing one good thing about algebraic datatypes in connection with case statements: the compiler warns you if you have not covered all the types in the union. That forces you to update all over the place when you add a case to an algebraic datatype which has it merits and its demerits, admittedly.

In Nice, if you add a new case (a new subclass of the root class), the compiler will indeed warn you if this case is not covered by an existing implementation for each method. -- DanielBonniot?

As I was pondering over your answer I thought of something that might be of practical use: let the compiler generate information about how the functions handle the different sub-types of a specific super class. Eg, if you take the Person example from the tutorial you the compiler should generate the following information:

   worker -> worker
   person -> person

and if you add another person sub-class, say Manager, but do not create a display(Manger m) function the compiler should output something along the lines:

   worker -> worker
   person -> person
   manager -> person WARNING: super function used.

Then you still have the flexibility of multimethods but regain some of the extra checks that is inherent in ML-style algebraic datatypes.

Keep up the good work. -- TorbenHoffmann?

Are you suggesting to display this information systematically when an implementation from a super-class is used? This situation might be new to functional programming people, but I think it is normal and desirable often with OO languages. However there are cases where you want implementation to be done for sub-classes. It is possible using the # keyword:

String display(Person);
display(#Person p) = ...;

Then if you declare a sub-class of Person, it is not matched by the implementation #Person, so the compiler enforces that a new one be given.

-- DanielBonniot? - 28 Aug 2002

I don't think the closed nature of switch statements and algebraic data types is necessarily bad. Aren't some classes naturally closed?

Take the standard AST visitor, for example. I think you'd want the compiler to complain about all your switch statements when you add a new member to the class. If that results in too many error messages, you could add a "default" case during development to make the compiler shut up. But after you've implemented all the cases, the compiler can also warn you that the "default" case is unreachable and that you can safely remove them.

There seems to be something distinctly different about closed classes that would makes it undesirable to mimic them with regular OO classes. To me, it seems like most uses of "instanceof" (or even multi-methods) are not designed for open classes. Open classes should be dealt with using their public interface only, relying only on virtual methods for polymorphism. Using multi-methods with open classes seems like an unnecessary weakening of static type safety.

ML-style types shouldn't be adopted directly, of course. They're lacking many of the fundamental OO features that would be required for them to "fit" in an OO language. Once the features are added, I think they'll fit in quite well with the Nice language.

On advantage of introducing closed classes is that enums are now just a degenerate case of closed classes, adding uniformity to the type system.

-- KannanGoundan - 15 Oct 2004

For the AST example, if you implement it with multi-methods you also get the error messages if you add a new class without an implementation for it.

Could you give an example of why you think multi-methods lead to a weakening of static type safety? (Keep in mind that a multi-method defined in another package as a class will not have access to the private members of the class, so encapsulation is not broken).

Actually, now with enums with have a form of closed classes. I'm not sure why they are "degenerate". What difference do you make between enums and closed classes?

-- DanielBonniot - 16 Oct 2004

Degenerate Enums

From what I can tell, Nice enumerations are implemented using object instances (which is also what Java 1.5 does, I believe). If you had the following enum:

enum Color = Red | Green | Blue

It would be translated to something like:

class Color {
   private Color() {}
   public static final Red = new Color();
   public static final Blue = new Color();
   public static final Green = new Color();

That translation only works because enums are not treated like a case of general union types. If they were, I'd expect the translation to be something like:

class Color {}
  private Color() {}
  public static class Red extends Color {
     private Red() {}
     private static final INSTANCE = new Red();
     public static Red make() { return INSTANCE; }
  ... Green ...
  ... Blue ...

// Constructors are private to prohibit external code from subclassing us.
// Use the 'make()' wrapper to construct instances.

The advantage of the second translation technique is that it can also be used to encode more complex union types:

type Expr = Abst(String param, Expr e) | App(Expr e1, Expr e2) | Ident(String s) | Zero | Succ

// Compiles to:
class Expr {
   private Expr() {}
   public static class Add extends Expr {
      public final Expr e1, e2;
      private App(Expr e1, Expr e2) {
         this.e1 = e1;
         this.e2 = e2;
      public static Add make(Expr e1, Expr e2) { return new Add(e1, e2); }
   ... Sub ...
   ... Ident ...

   // These two are just like enum options
   ... Zero ...
   ... Succ ...

Enums are actually just very simple union types. They never contain any data values. All the information is in the type label (which means, for example, that you can't have two distinct Red objects).

Type safety

Even though Nice will check to make sure you have covered all the cases, this guarantee is only valid at the time of compilation. After you've compiled and deployed your function, somebody might extend the AST class and pass in an unexpected type. You can protect against this using access control tricks and wrappers around the constructor (like the enum translation) but I would prefer a declarative technique. The "closedness" of a class is an important property and seems like it should be mentioned as part of a class' public interface.

-- KannanGoundan - 17 Oct 2004

About enums, I agree that they should be extended to allow arbitrary fields. That would them make exactly like closed classes.

About type safety: there are indeed issues when you use dynamic loading, or using together code that hasn't been compiled together. But to be fair problems happen equally without multimethods. Taking your example of extending the AST class, if you needed a multi-method in a language that does not support them, you would have resorted to implementing a method with instanceof tests and casts. Now with the extension this code becomes incorrect, and it will fail at runtime. With multi-methods, you could at least properly detect this condition. Better, a clever linker could even gather the declarative information in all the extensions (the method implementations) and put them together to regenerate correct dispatch code. So multi-methods actually make possible a higher degree of type safety than mono-methods.

-- DanielBonniot - 20 Oct 2004

I agree that Java is lacking features to handle the AST Visitor. However, I think multimethods are are the less-safe solution (as opposed to union types). Nice tries to make things safer (since the compiler warns you about unhandled cases), any time OO-style dynamic dispatch crosses compilation unit boundaries, you're in trouble. This can even happen with single-dispatch:

abstract class Vehicle {}
class Car : Vehicle {}
class Boat : Vehicle {}

And in another compilation unit:

void handleVehicle(Vehicle v);
handleVehicle(Car c) { print("Car"); }
handleVehicle(Boat b) { print("Boat"); }

If you add another Vehicle type, the handleVehicle could die with a compiler-inserted runtime error. Though you could provide a default, sometimes a default doesn't make sense.

The reason I'm so stuck on union types is that they're safe. Though different-compilation-unit-dynamic-dispatch feels a little more flexible, I think union types will handle most situations just fine. Can you think of particular situations where union types aren't good enough?

-- KannanGoundan - 30 Nov 2004

Topic AlgebraicDatatype . { Edit | Attach | Ref-By | Printable | Diffs | r1.13 | > | r1.12 | > | r1.11 | More }
Revision r1.13 - 25 Feb 2005 - 10:23 GMT - DanielBonniot
Parents: WebHome > LanguageComparisons > NiceVersusPizza
Copyright © 1999-2003 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback.