[PATCH] D22161: SystemZ: Avoid implicit iterator conversions, NFC

Ulrich Weigand via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 12 00:47:38 PDT 2016


dexonsmith at apple.com wrote on 11.07.2016 23:13:17:

> See PR26753 and "ilist/iplist are broken (maybe I'll fix them?)":
>   http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html
>
> I thought I was "finished" back in November, but I'd forgotten about
> MachineInstrBundleIterator (MachineBasicBlock::iterator) and then
> didn't find time to continue until the last few weeks.
>
> Let me summarize the big picture (adding llvm-dev since I haven't
> talked about this in a while):
> - List iterators work by having a pointer to a list node and walking
> "node->next" every time operator++ is called.
> - Lists need a sentinel of some sort to represent the "list.end()"
iterator.
> - The way to do this efficiently and safely is to have the sentinel
> be some un-templated "list_node_base" class that has a next and prev
> pointer.  The next/prev pointers point to "list_node_base", and
> "list_iterator<T>" downcasts to "T" on dereference.
> - However, ilist<T> assumes the sentinel is a full-bodied "T" and
> uses "T*" for the "next" pointers.  In practice consumers override
> the sentinel traits so that the sentinel is only a "ilist_half_node"
> (to save space).  This means that operator++ invokes UB every time
> you increment to the end iterator and downcast to "T*" (badness).
> - MachineInstrBundleIterator::operator MachineInstr*() relies on the
> member of ilist_iterator<MachineInstr> being an already-downcast
> MachineInstr*, indirectly relying on the UB in operator++.  I can't
> safely refactor ilist/ilist_node/etc. until code stops (indirectly)
> relying on this UB.
>
> What's `&*I` all about?
> - An iterator<T> does not necessarily point at a valid T.
> - It could just be a sentinel object that T inherits from (in the
> case of lists), or an unallocated memory location (in the case of
arrays).
> - If you really want a T*, you should dereference iterator<T> and
> then take the address.
> - Everyone (?) knows that it's undefined behaviour to dereference an
> iterator<T> that is not pointing at a valid T.  `&*I` makes it clear
> at the call site that a pre-condition to getting a `T*` is for `I`
> to be a valid iterator.
> - I will be making the implicit operator illegal promptly when
> there's no code using it.  Just waiting on a few backends.
> - Your code will be nicer to read -- particularly after my changes
> -- if you use references (T&) instead of pointers (T*) whenever
> variables/parameters/etc. cannot be nullptr.  In many cases I've
> updated variables/parameters/etc. in this way, but sometimes I took
> the easy way out of just saying `&*I` (and sometimes that's the best
> option).  I encourage following up with more reference-based logic!

Thanks for the explanation, that does make sense.  So once the implicit
operator is made illegal, the compiler should catch any attempt to use
it at compile time, and we cannot accidentally reintroduce problematic
code in the future, right?

Bye,
Ulrich
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160712/a0fc1ea8/attachment.html>


More information about the llvm-commits mailing list