[llvm-dev] Where and how to report an optimisation issue that doesn't cause a crash

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Fri Oct 25 15:45:16 PDT 2019

Baseline: No compiler produces optimal code in all cases - it's too
expensive. They'll all have some tradeoffs, some more reasonable than
others, some more intentional than others. I don't know where this
particular situation lies (whether it's a reasonable tradeoff of compiler
complexity, or an accidental one that might still be a reasonable tradeoff
of compiler engineer time (other things more important than this particular
optimization, etc)).

‪On Fri, Oct 25, 2019 at 1:02 AM ‫אלכס לופ'‬‎ <alex_lop at walla.com> wrote:‬

> Could be... But the wierd thing here is if I change the array to be of
> size 256 and the index to be 'unsigned char', seems like there is no way to
> access the 'size'  field throught  "y->ptr[index]" (in your example) but
> clang still performs the re-read: https://godbolt.org/z/ahlGHY

Yup - I guess LLVM doesn't track/use the upper bound of the index here,
even if it uses the lower bound. (as demonstrated by the reordering
enabling the optimization)

> It looks like clang sees an alias between "int ptr[256]" and "int size"...
> maybe because they are both "int"? Because changing the type of size to
> char/short/long eliminates the re-read: https://godbolt.org/z/4kr1oo

That's somewhat interesting - hmm, I'd assumed we didn't use strict
aliasing by default, but perhaps we do. You can try compiling with
-fno-strict-aliasing to disable type based alias analysis & see that the
type change doesn't fix the issue then.

> Interesting and confusing at the same time. Can it be related to the
> standard definition of (non)aliasing as this answer suggests:
> https://stackoverflow.com/a/58550055/5218277 ?

Yeah, guess so - once you change the type, type based alias analysis proves
the pointer produced by the index arithmetic can't alias the non-int
member. (with -fno-strict-aliasing changing the type does not remove the
extra load even when the type is different)

>  On the other hand, placing the "int size" (as you mentioned) before "int
> ptr[256]" also eliminates the re-reads: https://godbolt.org/z/kpB74m
> ב אוק׳ 25, 2019 0:44, David Blaikie כתב:
> Here's a simpler version which I believe still captures the essence of the
> issue: https://godbolt.org/z/BUoLq1
> Yep, looks like GCC manages this optimization and Clang does not - the
> runtime bound on the array is stopping clang from going further here
> (perhaps that's intentional - maybe to allow the array indexed store to
> possibly exceed the bounds of the array and assign to 'size' that comes
> after it - UB, but I don't know just how far non-strict aliasing goes)
> Yeah, looks like it might be something like that ^ if you change the index
> to be unsigned, and reorder the members so 'size' comes before the array
> (so indexing into the array can't alias 'size'), you do get:
> https://godbolt.org/z/Ax62Fv which avoids reloading size after the
> assignment to the array.
> ‪On Thu, Oct 24, 2019 at 12:53 PM ‫אלכס לופ'‬‎ <alex_lop at walla.com>
> wrote:‬
> Hi David,
> Thanks for your reply. The fact that
> "queue_ptr->queue[queue_ptr->wr_idx++]" could be somewhere in the object
> pointed by "queue_ptr" came to my mind too.
> I also added this in the stackoverflow question at the bottom.
> In such case, I assumed, that if I change the struct "queue_t" to have an
> array of "event_t" instead of just pointer to "event_t", like this:
> typedef struct
> {
>     event_t                     queue[256]; // changed from pointer to array with max size
>     size_t                      size;
>     uint16_t                    num_of_items;
>     uint8_t                     rd_idx;
>     uint8_t                     wr_idx;
> } queue_t;
> there would be no valid way (without breaking the C standard definitions) that "queue_ptr->queue[queue_ptr->wr_idx++]" ends up somewhere inside "*queue_ptr", right?
> However the generated assembly code still contained the same re-reads...
> Any idea why would this happen even with the new "queue_t" struct definition?
> P.S.
> I can easily reproduce it and provide any additional data which can be gathered during the compilation process, if it helps.
> Thanks,
> Alex.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191025/78765bdf/attachment.html>

More information about the llvm-dev mailing list