[cfe-commits] [patch] TransformIterator (motivating use case in Clang)

David Blaikie dblaikie at gmail.com
Tue May 15 11:57:35 PDT 2012

[-llvm-commits since it's not too relevant to that list at this point]

On Tue, May 15, 2012 at 11:24 AM, Douglas Gregor <dgregor at apple.com> wrote:
> On May 15, 2012, at 9:31 AM, David Blaikie <dblaikie at gmail.com> wrote:
>> On Tue, May 15, 2012 at 9:19 AM, Douglas Gregor <dgregor at apple.com> wrote:
>>> On May 6, 2012, at 7:02 PM, David Blaikie wrote:
>>>> This patch adds an iterator much like boost's transform_iterator
>>>> (without some extra bells & whistles) for use in some changes I'd like
>>>> to make to Clang to correct some misimplemented iterators there.
>>>> A few gotchas that could be improved/changed depending on opinions:
>>>> * I may be playing a little loose with the return type of the functor
>>>> (see the example/motivating functor, to_pointer) - the return type
>>>> actually must be a reference, though the result_type provides the
>>>> underlying value type, not the reference type. If this is violated
>>>> Clang will emit a warning, but I could make it more robust with a
>>>> compile time assertion in the TransformIterator that the result_type
>>>> is actually a reference type, and strip that off to provide the
>>>> value_type of the TransformIterator.
>>> It's much more correct for the value and reference types of the iterator type to be, e.g.,
>>>  typedef typename Functor::result_type reference;
>>>  typedef typename remove_reference<reference>::type value_type;
>>>> * I realize adding pointer-yness back onto an iterator over references
>>>> when the underlying data structure (in the Clang patch you can see
>>>> this situation) is a sequence of pointers may be a bit overkill -
>>>> though the alternatives are writing different algorithms in the two
>>>> cases where I've used TransformIterator, or having the decl_iterator
>>>> iterate over pointers (in which case I should probably change the
>>>> other iterators I've modified to iterate over pointers for symmetry).
>>> diff --git include/clang/AST/DeclBase.h include/clang/AST/DeclBase.h
>>> index 6aef681..0a16ea5 100644
>>> --- include/clang/AST/DeclBase.h
>>> +++ include/clang/AST/DeclBase.h
>>> @@ -1175,17 +1175,18 @@ public:
>>>     Decl *Current;
>>>   public:
>>> -    typedef Decl*                     value_type;
>>> -    typedef Decl*                     reference;
>>> -    typedef Decl*                     pointer;
>>> +    typedef Decl                     value_type;
>>> +    typedef value_type&              reference;
>>> +    typedef value_type*              pointer;
>>> Since we tend to traffic in declaration pointers, not references, it's really beneficial to have value_type be Decl*.
>> Fair enough - I was hoping to reduce the amount of pointers handed
>> around, but I realize in this case especially (where the underlying
>> sequence is a sequence of pointers) it's really an awkward fit.
> Why is reducing the number of pointers a specific goal?

Evidently not something everyone finds to be a useful goal - and on
that basis perhaps I should just leave it alone.

But for myself I find functions taking references (& locals that are
references) easier to read/reason about because it's clear that
they're not null. Trying to decide which functions are accepting
optional values and which ones aren't makes me pause a little - though
perhaps more out of habit than real curiosity about whether something
is null. If switching things to references causes pointer-habitual
people to have the same kind of pause then that's clearly not an
outright improvement.

> We tend use pointers  (not references) for AST nodes throughout AST, Sema, and a number of other places in the compiler.

Indeed - a simple search for "Decl &" turns up mostly VarDecl
references in CodeGen more than anything/anywhere else.

>> This means killing off the op-> for these iterators though, right? Or
>> do you want to keep that as an excusable convenience?
> Honestly, I preferred the world back in its conveniently-inconsistent state where operator* returned a pointer and operator-> returned that same pointer, because being able to refer to Iter->getDeclName() rather than (*Iter)->getDeclName() is *really* convenient.

It is convenient, certainly. Though I find op* returning T& to be
convenient/useful as well. In many cases we assign *i to a local
pointer (I'm not sure why - perhaps we should just have a better name
for the iterator and use that directly) & don't get the convenience of
op-> (I assume this code was written before the op-> was added,
actually) let alone the convenience of just using '.'.

But if that's what you want, I can do that.

> The inclusion of operator-> in the iterator concepts was a mistake. Nothing in the standard library actually depends on it, no sane algorithms need it, and it's a major PITA to implement for iterator adaptors (as you've noticed <g>).

That's fine - though, since I want to replace
specific/filtered_decl_iterators with some generic adapters, those
adapters will need to provide the op-> convenience you want, whether
or not it was specified in the concept/standard.

>> In which case my
>> change boils down to just changing the pointer typedefs to be Decl**
>> in these various iterator types. (& might make my planned work,
>> generalizing the filtering iterators, a bit weird - possibly providing
>> an op-> to the generic filtering iterators whenever the value type is
>> a pointer to match/support this oddity)
> Decl** won't always work, though. You'll almost certainly need a proxy.

For the basic decl_iterators they do (because the underlying sequence
is Decl*) but for the specific/filtered, where it'd be SpecificDecl**
but we don't have a real SpecificDecl* to point to, it wouldn't and
I'd need a proxy. Are there other cases/reasons for needing a proxy
that you're referring to?

>> Would you prefer the other iterators I changed
>> (specific_decl_iterator, filtered_decl_iterator) be switched to
>> pointer value_type too?
> I did prefer the pointer value_type; perhaps others want to weigh in.

I'm fairly sure all 3 of these should be consistent, and that the
specific/filtered should be refactored into a common adapter
(typedef'd in place so clients of Decl wouldn't need to change, of
course). Generally I prefer references, but I realize they're not a
perfect fit.

Yeah - I'd like to be able to get more discussion on this & other
stylistic (micro design?) aspects - I kind of feel like I'm just
throwing stuff up & seeing what's stuck after a few weeks.

>>> This isn't going to work; m_func returns a value_type or reference. Dealing with -> is a real pain, because in the general case you need a proxy class.
>>> +// a convenient utility for a common adaptation
>>> +template<typename T>
>>> +struct to_pointer {
>>> +  typedef T * result_type;
>>> +  typedef T argument_type;
>>> +  result_type &operator()(T &t) const {
>>> +    ptr = &t;
>>> +    return ptr;
>>> +  }
>>> +private:
>>> +  mutable T *ptr;
>>> +};
>>> 'to_pointer' is a very non-LLVM-y name.
>>> More generally, this particular function object seems like a bad idea. There's nothing wrong with having an iterator type whose reference type is not a true reference, which means that there is no reason to do this little dance with the mutable 'ptr'.
>> Yeah, a proxy would be nicer.
>> In any case all this transformation stuff isn't needed for this change
>> anymore if we're sticking with a pointer value_type.
> Right.

Thanks again for your time,
- David

More information about the cfe-commits mailing list