[PATCH] D48136: [Polly] Implement an iterator for isl maps, basic_maps, sets, basic_sets

Fri Jun 22 04:52:42 PDT 2018

bollu added inline comments.

================
Comment at: include/polly/Support/ISLTools.h:50
+protected:
+  ListT *List;
+  int Position = 0;
----------------
philip.pfaffe wrote:
> Meinersbur wrote:
> > philip.pfaffe wrote:
> > > Meinersbur wrote:
> > > > philip.pfaffe wrote:
> > > > > Meinersbur wrote:
> > > > > > philip.pfaffe wrote:
> > > > > > > Meinersbur wrote:
> > > > > > > >  Should this hold a reference to the list? That is, the underlying object is not freed until all iterators are destroyed.
> > > > > > > It needs a reference to do the iteration.
> > > > > > What I mean is that the iterator could increase isl's ref counter to ensure that it is not free'd as long as one iterator exists. That is, is the following code valid?
> > > > > > ```
> > > > > > basic_set_iterator I, E;
> > > > > > {
> > > > > >   auto List = S.get_basic_set_list();
> > > > > >   I = begin(List); E = end(List);
> > > > > > } // List out-of-scope here
> > > > > > for (; I != E; ++I) { ... }
> > > > > > ```
> > > > > Well, if you destroy the container you obtained iterators for and then use the iterators, that's on you ;)
> > > > Using reference counting would allow us to not invalidate the iterators.
> > > > 
> > > > It's not just destroying the container, but also modifying it while iterating. Isl objects are reference counted, so the underlying object might be shared by other references as well.
> > > > 
> > > > At least it should be documented when iterators are invalidated.
> > > If I capture the List by value, how do I compare iterators? I'll have to compare the lists by value then, and that's quite expensive.
> > > 
> > > I can certainly document iterator invalidation, but it behaves just as you'd expect: destruction or modification of the list invalidate.
> > Comparison of iterators from different containers is undefined (except: value-initialized iterators). You can also compare their pointers: `List.get() == O.get()`. However, since isl list objects use copy-on-write as an implementation detail, it pointer-equality of two list references (as in: two variables of lists that may or may not contain the same elements) was never well-defined.
> > 
> > IMHO the interesting question is whether the additional definedness/safety (accidentally invalidating an iterator is one of my least favorite kinds of bugs) is worth the potential overhead of reference counter increment/decrement.
> I'll buy the comparison point, but I'm not comfortable relying on copy semantics of isl lists:
> - the iterator is a template. I can instantiate it for std::vector.
> - it'd mean depending on internal implementation details of isl lists.
> - iterator use-after-invalidation is a bug everywhere in the code, and is easily checkable. Valgrind will detect this immediately.
So, I now understand what you mean when you said that the copy-on-write is part of the implementation, not the interface.

If I understood correctly, you mean that:

1. If we use `const ListT List`, and then ISL decides to use `std::vector` (at which point, ISL's interface changes, since it has abandoned COW semantics), our code will need to change.

2. However, if we use `const ListT* List`, then if ISL decides to use `std::vector`, out code will not need to change.

Hence, `2` is better than `1` since it makes less assumptions about ISL's semantics, and leads to less code churn.

Please correct me if I am mis-representing your position.

##### Is this implementation morally correct _today_?

While I agree with this position from a software engineering perspective, I disagree with it when it comes to the semantics of your code.

By holding a `const ListT *List`, with respect to ISL's _current_ interface, you are holding a pointer-to-pointer (so, something like a weak reference). Hence, I feel that this implementation violates ISL's _current_ interface, to prevent code churn in the _future_ (which is unlikely, BTW) that ISL changes its COW semantics. 

Keeping a `const ListT List` is in line with ISL's _current_ interface of COW semantics, and is thus the correct choice.

##### UB with respect to iterators if ISL decides to clone objects

Another argument put forth was that if ISL changes the implementation of COW to actually clone the underlying object, then something like:

```lang=cpp
isl_list l1 = ...;
auto begin1 = l1.begin();

isl_list l2 = l1;
auto end2 = l2.end();

// UB, since we are comparing iterators across different objects,
// since ISL cloned L2 into L1.
if (begin1 != end2) {
    ...
}
```

I don't believe this triggers UB according to iterator semantics.

1. [Definition of input iterator](http://eel.is/c++draft/iterator.requirements#input.iterators-1)

An input iterator is one that satisfies the iterator laws (de-referncible, existence of `++`, along with a `==` operator.

By this definition, yes the example given would be UB, since I'm performing `==` between two iterators in different domains. However, there are two important caveats that make this argument incorrect:

a. [EqualityComparable `==` is an equivalence relation](http://eel.is/c++draft/utility.arg.requirements#tab:equalitycomparable) 

> == is an equivalence relation, that is, it has the following properties:
> ...

b. [Domain of an iterator can change over time](http://eel.is/c++draft/iterator.requirements#input.iterators-2)

> term the domain of == is used in the ordinary mathematical sense to denote the set of values over which == is (required to be) defined. This set can change over time.

So, _as the iterator implementor_, I can say that when 
```lang=cpp
l2 = l1;
```

occurs, then __the domain of `==` on  `begin1` expands to contain all valid iterators of `l2`, and vice versa for `end2.

This way, `begin1 == end2` is perfectly well defined and the compiler cannot optimise this away.

##### Practically, is `const ListT List` nicer or `const ListT* List`

>From a usability perspective, assuming that ISL does not change its COW semantics (and therefore we pay no code churn), I much prefer `const ListT List`, since not only does it respect the current interface semantics, but it also makes debugging easier as Michael pointed out.

Hence, I vote for `const ListT List` :)

Repository:
  rPLO Polly

https://reviews.llvm.org/D48136