[llvm-dev] Spurious peeling in simple loop unrolling

Fri Mar 12 04:59:19 PST 2021

On 10/03/2021 14:30, Thomas Preud'homme wrote:
On 10/03/2021 10:58, Florian Hahn wrote:

On Mar 9, 2021, at 13:13, Thomas Preud'homme <thomasp at graphcore.ai> wrote:

I tried the patch (thanks), but it did not remove any of the PHIs (the 2 loads are still there, and thus the bitcasts don't appear to have the same source). I'll try to look at InstCombine to see why the loads are not CSE'd.

I’m not sure I follow here. For your example (spurious_loop_peeling.cpp), it looks like there’s no peeling happening any more after the patch landed, at least when building for ARM64: https://godbolt.org/z/q6d6Kn . Is there anything else that’s going wrong?

The testcase I sent is indeed fixed by your commit. However, the code it was inspired by still shows unwanted peeling. I'm going to investigate what causes the difference.

Best regards,

Thomas

Sorry for the late reply. FYI, the difference is that the original code uses a pointer rather than a reference parameter (see attachment). As a result, LICM does not hoist the load out of the outermost loop, because isSafeToExecuteUnconditionally returns false. This happens because the base pointer of the GEP used by the load is not known to be sufficiently aligned: isDereferenceableAndAlignedPointer() from Loads.cpp calls Value::getPointerAlignment, which returns an alignment of 1, and concludes that this is less than the alignment the load requires.
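
To make the pointer versus reference distinction concrete, here is a minimal sketch. It is not the attached testcase: the Image struct and both function names are hypothetical. The point is only that the two functions are identical except for how the parameter is passed. Clang normally annotates a reference parameter with dereferenceable and align attributes, while a plain pointer parameter carries neither, which is the kind of information isSafeToExecuteUnconditionally depends on.

```cpp
#include <cstddef>

// Hypothetical stand-in type; the real code and the attachment differ.
struct Image {
  int rows;
  int cols;
  float *data;
};

// Pointer parameter: on entry nothing says `img` is dereferenceable or
// aligned, so the loads of img->cols and img->data may not be provably
// safe to execute unconditionally ahead of the loops.
void scale_ptr(Image *img, float k) {
  for (int r = 0; r < img->rows; ++r)
    for (int c = 0; c < img->cols; ++c)
      img->data[r * img->cols + c] *= k;
}

// Reference parameter: same loops, but a reference must bind to a valid,
// suitably aligned object, and the front end can pass that on to the
// optimizer.
void scale_ref(Image &img, float k) {
  for (int r = 0; r < img.rows; ++r)
    for (int c = 0; c < img.cols; ++c)
      img.data[r * img.cols + c] *= k;
}
```

Both functions compute the same result; any difference would only show up in the generated IR, for example when comparing the output of -O2 -emit-llvm for each.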

In my case, however, I know for certain that the this pointer is sufficiently aligned. Unfortunately, I could not find a way to tell the compiler. I tried using __builtin_assume_aligned on the this pointer and using the return value for all accesses, but that did not make any difference.
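
For reference, the attempt described above looks roughly like the sketch below. The Accumulator type and its members are hypothetical and are not taken from the real code. __builtin_assume_aligned is the GCC/Clang builtin, which returns its first argument and lets the compiler assume that the returned pointer has the given alignment.

```cpp
#include <cstddef>

// Hypothetical type used only for illustration.
struct alignas(16) Accumulator {
  float lanes[4];
  int rounds;

  float total() const {
    // Route every access through the pointer returned by the builtin, so
    // that the alignment assumption is attached to the pointer actually
    // used by the loads.
    const auto *self = static_cast<const Accumulator *>(
        __builtin_assume_aligned(this, alignof(Accumulator)));
    float sum = 0.0f;
    for (int r = 0; r < self->rounds; ++r)
      for (std::size_t i = 0; i < 4; ++i)
        sum += self->lanes[i];
    return sum;
  }
};
```

This compiles and behaves the same as using this directly; as noted above, in my case it did not change what LICM did.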

So to summarize:

A load whose base is a function parameter gets duplicated by loop rotation. LICM cannot hoist it out completely (it does get hoisted out of the inner loop) because of the alignment issue, so a phi remains in the inner loop when loop peeling happens. This leads to code bloat and, in our case, a lack of vectorization.

However, GVN clearly thinks the load outside the loop is the same as the one in the loop, and so the one in the loop can be removed. That seems inconsistent with the behaviour of LICM, so I'm going to look into this.

Best regards,

Thomas
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spurious_loop_peeling2.cpp
Type: text/x-c++src
Size: 656 bytes
Desc: spurious_loop_peeling2.cpp
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210312/4ce73c47/attachment.cpp>