Nice TWiki > Dev > CurrentDiscussions > StandardLibrary TWiki webs:
Dev | Doc | Main | TWiki | Sandbox
Dev . { Changes | Index | Search | Go }

Documentation for the standard library

We need a documentation for the standard library. The best way would be to write a tool NiceDoc, similar to javadoc. There are already documentation comments in the library. Writing nicedoc is a project in search of a volunteer to start it!

Unsupported operations in Java collections

Another thing I've been struggling with, is that the current collection hierarchy is somewhat imprecise - Collection specifies a number of operations which commonly appear together, but which might not always be appropriate (e.g., it's debatable whether LazyVector? should have a size method), and the same is true of sequence. I'm toying with a much more granular hierarchy at the moment, which I'll detail later. It also has two hierarchies, one for collections which allow in-place modification, and one for collections which do 'functional update', that is, return copies of themselves with the changed values. More later. -- BrynKeller - 05 Aug 2002

I also have concerns with this UnsupportedOperationException. It seems that the interface hierarchy was not well designed. Anybody interested to propose a fixed hierarchy?

-- DanielBonniot - 09 Sep 2002

Here a page discussing this problem: http://jinx.swiki.net/87

see also FlagInterfaces for an experimental approach to solve this problem -- ArjanB

Additional collection methods

see also StandardLibraryMethods

Are the methods in the StdLib? a complete set of additions to the collections api or do we need more?

A few names of the methods could be could changed to fit better the java parts of the api. has->contains, keep->retain and "find" shouldn't throw an exception. -- ArjanB

I added contains and retain in CVS, and made has and keep deprecated. We will remove them in the future, preferably waiting some time after the compiler starts emiting warning when using deprecated methods. I also renamed findIndex into indexOf, do we agree on that too?

About find (and findLast), there are two versions. One works only on collections of non-null values, and it returns null if no element was found. The other one works on any collection, but throws java.util.NoSuchElementException is no element was found. So you can choose your style: either checked the returned value, or handle an exception. The advantage is that there is never confusions between a null element in the list passing the test and no element being found. What is the problem with this approach?

It might be practical to give the two versions different names, so one can really choose which one to use in every case. I thought about find and search. The idea is that find is expected to always return a value. If it doesn't, it is exceptional, so it throws and exception. On the other hand, search is merely searching for something, which might not be there (in which case it returns null). Comments?

-- DanielBonniot - 10 Jun 2003

Are the methods in the StdLib? a complete set of additions to the collections api or do we need more?

I think there's room for a lot more collections methods in the standard library - in fact I'm working on some now, hopefully I'll be ready to circulate them for comments some time in the next two weeks.

-- BrynKeller - 11 Jun 2003

Can you give a preview of what you have? If you are short in time, I could help implementing the additions.

Have you considered the addition/overloading of operators for common operations on collections? I think of an operator for creating ranges.

-- ArjanB - 29 Jun 2003

About "throws Exception" or "return null", I choose both in a similar problem without change the name of the method. My suggestion is to provide 2 methods find(<K> key) (throws NoSuchElementException? if not found) and find(<K> key, <V> defaultValue) (return defaultValue if not found), like this no method name confusion (and always need to remember wich one throw the Exception), and you could store/find 'null' value and use an other defaultValue. An other variant is to define a property of the Collection to tell if you want to throws Exception or return a defaultValue.

-- DavidBernard? - 02 Jul 2005

Support for 'break'ing in foreach

I've noticed that there's a need for another basic method on Collection, which is a method that's like foreach, but provides the client code with some way to stop the iteration. I'll call this method forbreak, because it's foreach with a break. It looks something like this:

<Any T> void forbreak(Collection<T> coll, (()->void)->T->void func);

forbreak(s@Sequence, func) {
  boolean breakFlag = false;
  T->void each = func(()->void esc =>{breakFlag = true;});
  for(int i=0;!breakFlag; i++) {
    each(s[i]);
  }
}
So the function passed to forbreak takes an 'escape continuation', so-to-speak, and returns a function to iterate with. Then the iterator function can use the escape continuation to break off the iteration when desired. Foreach can be implemented trivially on top of forbreak:
foreach(c@Collection, func) = c.forbreak(()->void esc=>func);
and it seems like a more generally useful building block for functions on collections which may or may not be sequences, or support size(), or what have you. For instance, find() is currently defined on Sequence, but with forbreak installed, it could be promoted up to Collection. If we added a method mkEmpty, which would produce an empty collection (i.e., much like a no-args constructor) to Container, then we could define foldLeft, map, filter, etc. top of forbreak. So for collections to get all that great functionality, they only have to define mkEmpty and forbreak. Actually, mkEmpty plus either forbreak or fold would do. Maybe we should make fold the base operation, and forbreak can easily be built on that as well.

-- BrynKeller - 05 Aug 2002

And alternate design uses exceptions. You make the iterating function catch an exception called Break. In the anonymous function, it is possible to break by just throwing that exception. One advantage is that we can add that functionality to foreach itself, and clients don't need to do anything if they don't need breaking.

foreach(c@java.util.Collection, f) 
{
  try {
    for (let i = c.iterator(); i.hasNext();)
      f(i.next());
  }
  catch(Break ex) {}
}

Usage:

List<String> l = [ "A", "B", "C" ];
l.foreach(String s => { println(s); if (s.equals("B")) throw new Break(); });

Does someone know (or can make experiments to find out) if the try block would cause a performance penalty, in case the client does not need to break?

Would foreach with Break exception support a good solution? Is forbreak still needed, because it is more general?

-- DanielBonniot - 06 Jun 2003

I think it would be a perfectly acceptable solution. The try/catch overhead probably isn't worth worrying about, since it's only getting set up once. It also has the advantage that if we implement map, filter, foldLeft, etc. in terms of this new foreach, then you can throw new Break() out of any of those as well, which could be handy.

-- BrynKeller - 11 Jun 2003

Unlike foreach, those functions return a value. What would be the semantics of breaking inside those? For instance, what does [1,2,3].foldLeft((int i, int j) => throw new Break(), 0) return? -- DanielBonniot - 11 Jun 2003

Zero, I'd say. Actually, the current definition of foldLeft would already work:

<Any T, Any U> U foldLeft(java.util.List<T> l, (U, T)->U f, U init)
{
  U res = init;
  l.foreach(T elem => { res = f(res, elem); });
  return res;
}

assuming that foreach looks something like this:

<T> foreach(c@java.util.Collection, f) 
{
  try 
    {
      for (java.util.Iterator<T> i = c.iterator(); i.hasNext();)
      f(i.next());
    } 
  catch (Break b)
    {
      //Ignore
    }
}

Oh, it seems everything's already implemented with foreach, so we should be all set:

<C,T,U> map(c, f)
{
  C<U> res = c.similarEmptyCollection();
  c.foreach(T elem => { res.add(f(elem)); });
  return res;
}

<C,T> filter(c, test)
{
  C<T> res = c.similarEmptyCollection();
  c.foreach(T elem => { if(test(elem)) res.add(elem); });
  return res;
}

-- BrynKeller - 11 Jun 2003

That would at least break the implementation of map on arrays (file arrays.nice):

<C,T,U> map(a@Array, f) = fill(new U[a.length], int i => f(a[i]));
If f is allowed to throw Break, then we cannot assume that the result will have the same size as the argument. We could of course adapt the implementation. My question is: is it useful, and clean, to allow to break during map, filter, ...

-- DanielBonniot - 12 Jun 2003

Hmm, that's a good point. I would say it's definitely useful - I seem to recall often wanting to break out of maps and filters in the past. Clean is another question - as you pointed out, handling Break in some cases could make things less efficient - for instance map for arrays would have to be written something like this:

<C,T,U> map(a@Array, f)
{
  List<U> list = new ArrayList(); //or maybe LinkedList
  list.foreach(T elem => { res.add(f(elem)); });  
  return list.toArray();
}

which is clearly going to be slower than the current implementation. Is it worth it? I'm not sure. What if we kept foreach the way it is now, but added forbreak as a foreach where you could throw a Break , and then added new methods like mapbreak, filterbreak and so on? Another approach that would be a little odd but would clutter the namespace less would be to define foreach like this:

<T> foreach(c@java.util.Collection, f, allowBreak = false) 
{
  if (allowBreak)
    try 
      {
        for (java.util.Iterator<T> i = c.iterator(); i.hasNext();)
          f(i.next());
      } 
    catch (Break b)
      {
        //Ignore
      }
  else
    for (java.util.Iterator<T> i = c.iterator(); i.hasNext();)
      f(i.next());
   
}

and then we could write other methods with allowBreak parameters too:

<C,T,U> map(a@Array, f, allowBreak)
{
  if (allowBreak)
    List<U> list = new ArrayList(); //or maybe LinkedList
    list.foreach(T elem => { res.add(f(elem)); }, allowBreak = true);  
    return list.toArray();
  else
    return fill(new U[a.length], int i => f(a[i]));
}
but maybe this is all getting too far-fetched.

-- BrynKeller - 12 Jun 2003

These days, if you want this sort of thing, you just use the generator library in nice.functional. If you want to break out of a map/filter/etc. you just call stop() and that's equivalent to breaking. So:

import nice.functional;

let foo = naturals().filter(
              int i => {if (i == 53) { stop(); } return odd(i);
          ).map(int i => i.toString);

gives you a generator which starts with all the naturals, filtered down to only the odd ones, converted to strings. Except the method passed to filter calls stop() when it hits i == 53, so the whole chain ends at that point. It turned out to be a lot simpler than rewriting the entire collection hierarchy and so on.

-- BrynKeller - 02 Feb 2004

There are still reasons why we would want to modify the collection hierarchy (not necessarily the implementation, but make it have a different hierarchy structure) so that it would have no unsupported operations. That's a different project, of course.

-- DanielBonniot - 03 Feb 2004

Topic StandardLibrary . { Edit | Attach | Ref-By | Printable | Diffs | r1.17 | > | r1.16 | > | r1.15 | More }
Revision r1.17 - 02 Jul 2005 - 10:04 GMT - TWikiGuest
Parents: WebHome > CurrentDiscussions
Copyright © 1999-2003 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback.

Dev.StandardLibrary moved from Dev.StdLib on 06 Jun 2003 - 19:17 by DanielBonniot - put it back