[cfe-commits] [patch] TransformIterator (motivating use case in Clang)

David Blaikie dblaikie at gmail.com
Tue May 15 14:33:58 PDT 2012

On Tue, May 15, 2012 at 2:23 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Tue, May 15, 2012 at 3:11 PM, Douglas Gregor <dgregor at apple.com> wrote:
>> On May 15, 2012, at 11:57 AM, David Blaikie <dblaikie at gmail.com> wrote:
>> > [-llvm-commits since it's not too relevant to that list at this point]
>> >
>> > On Tue, May 15, 2012 at 11:24 AM, Douglas Gregor <dgregor at apple.com>
>> > wrote:
>> >>
>> >> On May 15, 2012, at 9:31 AM, David Blaikie <dblaikie at gmail.com> wrote:
>> >>
>> >>> On Tue, May 15, 2012 at 9:19 AM, Douglas Gregor <dgregor at apple.com>
>> >>> wrote:
>> >>>>
>> >>>> On May 6, 2012, at 7:02 PM, David Blaikie wrote:
>> >>>>
>> >>>>> This patch adds an iterator much like boost's transform_iterator
>> >>>>> (without some extra bells & whistles) for use in some changes I'd
>> >>>>> like
>> >>>>> to make to Clang to correct some misimplemented iterators there.
>> >>>>>
>> >>>>> A few gotchas that could be improved/changed depending on opinions:
>> >>>>> * I may be playing a little loose with the return type of the
>> >>>>> functor
>> >>>>> (see the example/motivating functor, to_pointer) - the return type
>> >>>>> actually must be a reference, though the result_type provides the
>> >>>>> underlying value type, not the reference type. If this is violated
>> >>>>> Clang will emit a warning, but I could make it more robust with a
>> >>>>> compile time assertion in the TransformIterator that the result_type
>> >>>>> is actually a reference type, and strip that off to provide the
>> >>>>> value_type of the TransformIterator.
>> >>>>
>> >>>> It's much more correct for the value and reference types of the
>> >>>> iterator type to be, e.g.,
>> >>>>
>> >>>>  typedef typename Functor::result_type reference;
>> >>>>  typedef typename remove_reference<reference>::type value_type;
>> >>>>
>> >>>>> * I realize adding pointer-yness back onto an iterator over
>> >>>>> references
>> >>>>> when the underlying data structure (in the Clang patch you can see
>> >>>>> this situation) is a sequence of pointers may be a bit overkill -
>> >>>>> though the alternatives are writing different algorithms in the two
>> >>>>> cases where I've used TransformIterator, or having the decl_iterator
>> >>>>> iterate over pointers (in which case I should probably change the
>> >>>>> other iterators I've modified to iterate over pointers for
>> >>>>> symmetry).
>> >>>>
>> >>>> diff --git include/clang/AST/DeclBase.h include/clang/AST/DeclBase.h
>> >>>> index 6aef681..0a16ea5 100644
>> >>>> --- include/clang/AST/DeclBase.h
>> >>>> +++ include/clang/AST/DeclBase.h
>> >>>> @@ -1175,17 +1175,18 @@ public:
>> >>>>     Decl *Current;
>> >>>>
>> >>>>   public:
>> >>>> -    typedef Decl*                     value_type;
>> >>>> -    typedef Decl*                     reference;
>> >>>> -    typedef Decl*                     pointer;
>> >>>> +    typedef Decl                     value_type;
>> >>>> +    typedef value_type&              reference;
>> >>>> +    typedef value_type*              pointer;
>> >>>>
>> >>>> Since we tend to traffic in declaration pointers, not references,
>> >>>> it's really beneficial to have value_type be Decl*.
>> >>>
>> >>> Fair enough - I was hoping to reduce the amount of pointers handed
>> >>> around, but I realize in this case especially (where the underlying
>> >>> sequence is a sequence of pointers) it's really an awkward fit.
>> >>
>> >> Why is reducing the number of pointers a specific goal?
>> >
>> > Evidently not something everyone finds to be a useful goal - and on
>> > that basis perhaps I should just leave it alone.
>> >
>> > But for myself I find functions taking references (& locals that are
>> > references) easier to read/reason about because it's clear that
>> > they're not null. Trying to decide which functions are accepting
>> > optional values and which ones aren't makes me pause a little - though
>> > perhaps more out of habit than real curiosity about whether something
>> > is null. If switching things to references causes pointer-habitual
>> > people to have the same kind of pause then that's clearly not an
>> > outright improvement.
>> Unless we're seeing lots of problems due to the pointer-centric style, I
>> don't see a motivation to make this kind of change in the code base.
> Since there was some mention of wanting other perspectives here, I wanted to
> chime in with my two cents...


> This seems like a lot of work and a disruptive change in API patterning, and
> I'm really not clear what problem it's trying to solve. My suspicion is that
> there isn't one other than a preference for references instead of pointers.

Well the underlying problem was the inconsistency of the iterators in
Decl/DeclBase. I wanted to rationalize their value/reference/pointer
types so I could then extract the common filtering functionality out
of filtered/specific_decl_iterator as a proof-of-concept for a general
adapting/filtering iterator device which I could then re-use in my
efforts to improve the CFG fidelity and -Wunreachable-code diagnostic.
This particular path (generalizing the existing filtering iterators to
a reusable adapter before reusing that in the new use case) was
suggested by Ted Kremmenek & I was happy enough to have an excuse to
tidy up some existing code.

Then when I went to extract that filtering functionality I came across
the fact that they weren't quite meeting the iterator requirements
(specifically, value_type, reference, and pointer were all the same
type) & so I set about fixing that. It could go either way - pointer
value types or reference value types. I asked, I waited, and
eventually Richard Smith chimed in to say he was OK with the reference

Yes, my preference is towards the reference version because it seems
to match the semantics (non-null-ness) and usage (iter->x) that's

> I actually don't even agree with this movement. While yes, it is nice to use
> simpler constructs when there is no possibility of a null pointer,
> references aren't strictly superior. We can't assign to them to re-point
> them to other entities, which is a fundamental pattern in Clang and LLVM. I
> don't see soft "non-null" constraints as a sufficient reason to lose this.

I'm not suggesting losing that flexibility where it's
desired/necessary. Indeed I think that helps readability to know which
things are repointing over some algorithm & which things aren't. It's
certainly not uncommon for algorithms/types to have pointers inside
them that are reassigned to based on some ongoing algorithm or
condition & those would have to remain pointers, certainly.

> What's more, it is trivial to add non-null constraints to pointers. In fact,
> dyn_cast/cast/isa all provide such constraints. We could teach Clang about
> [[nonnull]] for parameters and get much more than a reference gives us:
> - Optimizing based on non-null-ness

Clang doesn't optimize references on the basis of non-null-ness? For
those things that need to be reassigned, but are guaranteed to never
be null (I'm not sure we have (m)any of those - mostly if we're
reassigning them, they start out/may be null for some time)

> - Checking in non-optimized builds for non-null-ness

Granted, adding this to reference usage might be a bit harder -
knowing where checks are beneficial & where they aren't, but I imagine
if we went all out with non-null annotations we might have some
similar perf problems anyway.

> - Still able to overwrite the argument within the funciton

That usually seems to be a source of errors (as we've seen with people
thinking they have an out parameter & assigning to the parameter
directly, rather than the dereferenced parameter). Are there many
deliberate uses of this in Clang?

- David

More information about the cfe-commits mailing list