TWiki . Dev . ConstructorSyntax

I'm trying to summarize my ideas on constructors, based on the discussions we had.

I think we should separate two aspects: creation of new instances, and initialization of new instances. The creation aspect is what interests clients of a class: they want to be able to construct instances that have a certain property, which is described by the arguments they pass. For the author of a class, there are cases where it is possible to return an existing object instead of creating a new one: if the class is immutable, and an object with the correct property already exists. For instance, consider:

class Point {
  final double x;
  final double y;
}

For a client, a natural way to construct a point would be new Point(x: ..., y: ...). The author of the class might anticipate, or learn by analysis, that many instances of Point actually have the same coordinates, so he could decide to improve efficiency by sharing their representation. He should be able to do so without changing the API, so that clients can still use new Point(x: ..., y: ...). So he should be able to write a "creation method", for instance:

let Point origin = new Point(x: 0, y: 0);

Point new Point(double x, double y) {
  if (x == 0 && y == 0)
    return origin;
  return new Point(x: x, y: y);
}

The problem with this is the inside call to new Point(...). It will be a recursive call, and so will never finish (similarly, the value for origin would execute the creation method, which would try to read origin, which is not set yet). One solution would be to treat specially new inside a "creation method". However, this could get messy, and I would prefer a clean solution without such hacks. One idea is to give a special syntax for calling a "real constructor", ignoring creation methods with the same name. For instance:

let Point origin = Point.make(x: 0, y: 0);

Point new Point(double x, double y) {
  if (x == 0 && y == 0)
    return origin;
  return Point.make(x: x, y: y);
}
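
For comparison, here is a rough Java analogue of the same split, offered only as a sketch. Java has no creation methods, so a private constructor stands in for Point.make and a static method stands in for "new Point"; the names of, ORIGIN and PointDemo are made up for this illustration and are not part of the proposal.

```java
// Java analogue of the proposal: a private "real constructor" plus a
// public creation method. All names here are illustrative assumptions.
final class Point {
    final double x;
    final double y;

    // Shared instance. The initializer calls the real constructor directly,
    // so it cannot recurse into the creation method below.
    private static final Point ORIGIN = new Point(0, 0);

    // Plays the role of Point.make: always allocates a new object.
    private Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Plays the role of the creation method "new Point(x, y)".
    static Point of(double x, double y) {
        if (x == 0 && y == 0) {
            return ORIGIN;
        }
        return new Point(x, y);
    }
}

class PointDemo {
    public static void main(String[] args) {
        System.out.println(Point.of(0, 0) == Point.of(0, 0)); // true: shared
        System.out.println(Point.of(1, 2) == Point.of(1, 2)); // false: two objects
    }
}
```

Because the creation method and the real constructor have different names, neither the method body nor the initializer of the shared instance can loop back into the creation method.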

With this, we can treat creation methods as normal methods (except for parsing their name). "new Point" is a normal method, so we can define it in a Nicer way:

let Point origin = Point.make(x: 0, y: 0);

Point new Point(double x, double y) = Point.make(x: x, y: y);
new Point(0,0) = origin;

One could be worried that allowing Point.make exposes a detail about a class: if a client uses it and the class changes so that there is no such "real constructor", then the client will break. However, I think this is fixable. First, the author of the class can provide a CustomConstructor, which is also reachable with the Point.make syntax. Second, it should be possible to use visibility. Should Point.make be only package visible, not public? This part needs some more thought.

Another good aspect of the make syntax is that it suggests a syntax for CustomConstructors that differentiates them from creation methods:

Point.make(double angle, double distance) = new Point(x: ..., y: ...);

It should even be possible to define Point.make as a normal method:

// Optimization for angle = 0
Point.make(0, distance) = new Point(x: distance, y: 0);

One possible improvement is to allow creation method implementations without a declaration. In that case, the declaration would be taken by looking at matching constructors (custom or not). This would allow the following:

let Point origin = Point.make(x: 0, y: 0);

new Point(0,0) = origin;

This makes some sense because, from the client's point of view, new Point exists even without a creation method declaration, and it defaults to the constructor. Would there be any drawback with this additional feature?

-- DanielBonniot - 18 Dec 2003

It's just struck me that what we're talking about here is very similar to Dylan's way of handling this problem. I'll include a reference here for comparison. Nice is very similar to Dylan in many ways, so it's not a bad idea to see what Dylan does when we have questions about what Nice should do. Reference: "Instance Creation and Initialization".

-- BrynKeller - 18 Dec 2003

A different idea is to use super inside a creation method to refer to the real constructor. This would give the following version for the example:

class Point { double x; double y; }

let Point origin = new Point(x: 0, y: 0);

new Point(0,0) = origin || super;

The syntax new Point(0,0) = origin || super is not intuitive.

Not sure if you are familiar with ||, but this is not specific to constructors. The equivalent code in a more traditional notation would be:

new Point(0,0) {
  if (origin != null)
    return origin;
  return super;
}

I didn't know this because it isn't documented. Now I understand what you meant.

From what I understand you want to allow the user to overload the new operator on a class-by-class basis. I think that is very confusing. When the expression new Point(...) is encountered we expect to have a new object allocated. In particular, it is always the case that

   new Point(0,0) != new Point(0,0)
The idea of object identity becomes unclear because it can't be described in terms of new. Imagine the craziness that would ensue for something like IdentityHashMap<Point,T>.

Yes, you get less control over when a new object is created or not. But that's the idea: most of the time, you don't need to know, and giving you this control causes lots of problems, which is why it is advised to use factory methods. When you call a factory method, you don't know if a new object is created or not either.

Could you give an example of craziness that would ensue with IdentityHashMap<Point,T>?

I will concede that it will not cause problems as long as the documentation is clear that the new operator does not guarantee uniqueness w.r.t. object identity (System.identityHashCode(Object)).
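
To make the identity question concrete, here is a small Java sketch of what a client of an identity-based map would observe. It is only an analogue: SharedPoint, of, fresh and IdentityDemo are hypothetical names, with of standing in for a creation method that may share instances and fresh for a construct that always allocates.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical class: "of" may return a shared instance, "fresh" never does.
final class SharedPoint {
    final double x;
    final double y;

    private static final SharedPoint ORIGIN = new SharedPoint(0, 0);

    private SharedPoint(double x, double y) {
        this.x = x;
        this.y = y;
    }

    static SharedPoint of(double x, double y) {
        return (x == 0 && y == 0) ? ORIGIN : new SharedPoint(x, y);
    }

    static SharedPoint fresh(double x, double y) {
        return new SharedPoint(x, y);
    }
}

class IdentityDemo {
    // Both keys are the same object, so the second put replaces the first.
    static int entriesWithSharing() {
        Map<SharedPoint, String> map = new IdentityHashMap<>();
        map.put(SharedPoint.of(0, 0), "first");
        map.put(SharedPoint.of(0, 0), "second");
        return map.size();
    }

    // Two distinct objects with equal coordinates give two entries.
    static int entriesWithoutSharing() {
        Map<SharedPoint, String> map = new IdentityHashMap<>();
        map.put(SharedPoint.fresh(0, 0), "first");
        map.put(SharedPoint.fresh(0, 0), "second");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesWithSharing());    // 1
        System.out.println(entriesWithoutSharing()); // 2
    }
}
```

The map itself behaves consistently in both cases; what changes is only whether the client can assume that two creation expressions yield two keys.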

The example above can be written using a factory method, so I don't see what kind of expressivity you are gaining:

    let Point origin = new Point(x: 0, y: 0);
    Point makePoint(double x, double y) = new Point(x: x, y: y);
          makePoint(0.0, 0.0) = origin;

What we gain is that you don't need to defensively write all the code of a factory, just in case it turns out to be needed later. Either you do it in all cases, and a large proportion will be wasted effort (even with a good IDE, it's still cluttering the code), or you don't, and then you are stuck since clients started using new YourClass, and you cannot make the changes that you need without breaking this API. I think these are the same benefits as for properties, where there is a known workaround (getters and setters), but it's a pain to have to do it by hand when the compiler could do it for you.

I disagree. If you start out writing code with the default constructors and then change the fields in the class, you have to remember to add an explicit custom constructor with the old interface to maintain binary compatibility. This is easy to forget to do and the compiler won't be able to help you find this error. For this reason, I think that implicit constructors are bad when used across package boundaries and that explicit constructors should always be used across package boundaries; if VisibilityModifiers were implemented then I would propose that implicit constructors should have package private access. (A more general rule would be that I think that a package's public interface should be explicitly defined in the code.) Once you have written an explicit constructor it doesn't really matter so much which syntax you choose (factory method or constructor), as far as I can tell.
-- BrianSmith - 29 Jan 2004

Also, currently we have new Point(...) as equivalent to Point.class.getConstructor(...).newInstance(...). But with the new system, we don't have this symmetry.

True, but this is a rather low-level property, isn't it? How often would it be problematic? How often would the new system save you work or let you improve your code without breaking the API?

I am just saying that it makes the semantics hard to explain because you can't piggyback on the JLS/JVM specs anymore in these areas. So, you have to write new documentation for the semantics of object creation/identity and for part of the reflection API.
-- BrianSmith - 29 Jan 2004
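
The lost symmetry can be seen in plain Java today whenever a class offers a sharing factory next to its constructor. The sketch below is only an illustration under that assumption: ReflectPoint, of and ReflectionDemo are made-up names, and of plays the role that "new Point" would play under the proposal. Only standard java.lang.reflect calls are used.

```java
import java.lang.reflect.Constructor;

// Hypothetical class with a sharing factory and a package-visible constructor.
final class ReflectPoint {
    final double x;
    final double y;

    static final ReflectPoint ORIGIN = new ReflectPoint(0, 0);

    ReflectPoint(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Stands in for the proposed creation method.
    static ReflectPoint of(double x, double y) {
        return (x == 0 && y == 0) ? ORIGIN : new ReflectPoint(x, y);
    }
}

class ReflectionDemo {
    // Reflection reaches the real constructor and therefore always allocates.
    static ReflectPoint viaReflection(double x, double y) {
        try {
            Constructor<ReflectPoint> ctor =
                ReflectPoint.class.getDeclaredConstructor(double.class, double.class);
            return ctor.newInstance(x, y);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ReflectPoint.of(0, 0) == ReflectPoint.ORIGIN); // true
        System.out.println(viaReflection(0, 0) == ReflectPoint.ORIGIN);   // false
    }
}
```

So the creation path and the reflective path can disagree about identity, which is the part that would need its own documentation.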

It would still be possible to offer another construct for surely creating a new instance, provided it's allowed by the class. That can be discussed. Basically, this would make new ... the equivalent of Java factory methods, and the other one the equivalent of Java's new. Since the latter is much rarer in my opinion, we would gain a lot by making it easier.

    abstract class A      { int getValue();
                            // more methods
                          }
    class Zero extends A  { int getValue() = 0;
                            // specialize other methods for ZERO
                          }
    class Other extends A { final int value;
                            int getValue() = value;
                            // generic implementations for non-ZERO values
                          }
    let Zero ZERO = new Zero();

Then we could have either:

    new A(int value) = new Other(value: value);
    new A(0) = ZERO;

or:

    A makeA(int value) = new Other(value: value);
      makeA(0)         = ZERO;

I find the second version to be much easier to understand and it can already be done without any language changes. When VisibilityModifiers are implemented then the package author can mark all constructors private and then provide public factory methods like makePoint and makeA. I think this is a good practice anyway.
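
For readers more used to Java, the factory-method version might look roughly as follows. This is a sketch under assumptions: the nesting of Zero and Other inside A and the name ADemo are choices made for this illustration, not something taken from the discussion.

```java
// Java sketch of the factory-method version of the A / Zero / Other example.
abstract class A {
    abstract int getValue();

    // Single shared instance for the value 0.
    private static final A ZERO = new Zero();

    // The factory picks the implementation class; a Java constructor
    // could not return a subclass or an existing object.
    static A makeA(int value) {
        return value == 0 ? ZERO : new Other(value);
    }

    // Specialized implementation for zero.
    private static final class Zero extends A {
        @Override
        int getValue() {
            return 0;
        }
    }

    // Generic implementation for non-zero values.
    private static final class Other extends A {
        private final int value;

        Other(int value) {
            this.value = value;
        }

        @Override
        int getValue() {
            return value;
        }
    }
}

class ADemo {
    public static void main(String[] args) {
        System.out.println(A.makeA(0) == A.makeA(0)); // true: shared ZERO
        System.out.println(A.makeA(7).getValue());    // 7
    }
}
```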

It's good practice in a language that provides no such feature as we are discussing. The whole point is to make this extra work unnecessary.

Isn't the second easier to understand because you are used to it, while the first is a new proposal? Doesn't makeA look like a hack?

More and more Java APIs are designed so that the API is (almost) purely defined by interfaces, so that factories must be used anyway. That is the style I prefer, so makeA doesn't look like a hack to me. Anyway, I think the argument could be turned around as "Isn't using constructors easier to understand because you are used to it?"
-- BrianSmith - 29 Jan 2004

Finally, imagine:

    class ColoredPoint extends Point { int color; }
    new ColoredPoint(int color) = new ColoredPoint(color: color, x: 0.0, y: 0.0);
The coder has specialized Point(0.0,0.0) but this specialization obviously cannot be used by the subclass. So, then you have two sets of rules for deciding what constructor implementation gets chosen (one for direct invocation, one for subclass invocation).

-- BrianSmith - 26 Jan 2004

Yes, this is the distinction between CustomConstructors and OverloadedConstructors. new Point(0.0, 0.0) is an overloaded constructor, and you cannot use it to construct subclasses, in the same way that in Java, in a constructor you can use a parent constructor but not a parent factory method. Maybe we should say "factory method" instead of OverloadedConstructor?

-- DanielBonniot - 27 Jan 2004
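
The Java rule mentioned above can be sketched as follows. The names BasePoint, of and ColoredPointDemo are assumptions for this illustration, with of playing the part of the overloaded constructor (factory method) and the protected constructor playing the part of the real one.

```java
// Java sketch: a subclass constructor can chain to the parent constructor,
// but it cannot build itself through the parent's factory method.
class BasePoint {
    final double x;
    final double y;

    private static final BasePoint ORIGIN = new BasePoint(0, 0);

    // Real constructor, available to subclasses.
    protected BasePoint(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Factory with the origin specialization; only useful for BasePoint itself.
    static BasePoint of(double x, double y) {
        return (x == 0 && y == 0) ? ORIGIN : new BasePoint(x, y);
    }
}

class ColoredPoint extends BasePoint {
    final int color;

    ColoredPoint(int color) {
        // Must call the real constructor; BasePoint.of(0, 0) would return
        // an existing BasePoint, not a part of this new ColoredPoint.
        super(0.0, 0.0);
        this.color = color;
    }
}

class ColoredPointDemo {
    public static void main(String[] args) {
        ColoredPoint p = new ColoredPoint(3);
        System.out.println(p != BasePoint.of(0, 0)); // true: the sharing does not apply
        System.out.println(p.color);                 // 3
    }
}
```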

I think that there should be a syntactical distinction between OverloadedConstructors and CustomConstructors. They are different concepts and I think users will get confused by this:
    class Foo { ... }

    new Foo(Number n, Number m) { ... }

    new Foo(Double d, Double e) { ... }

Is the second constructor above an overloaded constructor, a custom constructor, or a specialization (dispatch-wise) of the first constructor? Maybe you can tell just by reading the code. But imagine that there are 50 lines of code where the ... lines are. Or imagine that all three elements are defined in separate packages. Now how can you tell?

-- BrianSmith - 29 Jan 2004

Agreed. If I remember, the difference was supposed to come from the presence or absence of a return type (so your first constructor should have one, if it's supposed to be an overloaded constructor, no?). But it's true that if you can specialize an overloaded constructor (which makes sense), then there is syntactic ambiguity.

Any proposal?

-- DanielBonniot - 30 Jan 2004

----- Revision r1.8 - 30 Jan 2004 - 16:23 GMT - DanielBonniot