[PATCH] D99723: [ARM] Transforming memcpy to Tail predicated Loop

Malhar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 5 01:53:36 PDT 2021


malharJ added a comment.

I've updated the transform so that it no longer generates a preHeader block, because generating
one during the transform causes the following issue:

**The issue:**

The phi-node-elimination pass introduces COPY operations (for each PHI instruction in the TP loopBody) into the preHeader.

While most of them are removed by the simple-register-coalescing pass, one copy in particular is
not removed: the one involving the memcpy transfer size/vector element count. As for why the
register coalescer cannot eliminate this particular copy/mov, I looked at the llc --debug output,
and it seems the copy cannot be removed because the live range of the element count register
overlaps the live range of the destination of the copy/mov.
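
To make the interference argument concrete, here is a toy sketch of the overlap check. It is only an illustration: it assumes each live range is a single half-open interval of instruction slots and treats any overlap as blocking the coalesce, which is far simpler than what the real coalescer does. The slot numbers and the names `overlaps` and `can_coalesce` are hypothetical.

```python
# Toy model (not LLVM's data structures): a live range is one half-open
# interval [start, end) of hypothetical instruction slot numbers.

def overlaps(a, b):
    """Return True if the two half-open intervals share at least one slot."""
    return a[0] < b[1] and b[0] < a[1]

def can_coalesce(src_range, dst_range):
    """Simplifying assumption: any overlap between source and destination
    counts as interference and blocks the coalesce."""
    return not overlaps(src_range, dst_range)

# Hypothetical numbers: the copy sits at slot 8.
elem_count = (0, 12)  # source (r2-like) is still live after the copy
copy_dest = (8, 20)   # destination (r4-like) becomes live at the copy

blocked = not can_coalesce(elem_count, copy_dest)
print("copy must stay:", blocked)
```

In this model the copy could be removed only if the source range ended at the copy itself, e.g. `(0, 8)` against `(8, 20)`.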

An example of the generated (incorrect) assembly is shown below:

  Relevant CFG for transform:
  
  TP Entry
  	...
  	lr = t2WhileLoopStartLR r4 (r4 may be holding something other than element count)
  
  TP preHeader
  	...
  	mov r4, r2 (assume r2 holds element count)
  	...
  TP body
  	...
  	VCTP r4
  	...

**Existing logic:**

This value (r4 above) feeds into the loopBody PHI nodes and is then received by the VCTP, which is fine.
But when the ARMLowOverheadLoops pass tries to use the element count operand of the VCTP as the operand of t2WhileLoopStartLR,
it supplies r4, which is incorrect because the mov happens after the t2WhileLoopStartLR.
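
The ordering problem can be reproduced with a very small register-machine sketch. This is a toy model of the three blocks from the CFG above, not real MIR: the opcode names `loop_start` and `vctp`, the `run` helper, and the initial register values are all made up for illustration, and the body is executed only once.

```python
def run(blocks, regs):
    """Execute the blocks in layout order and record which value the
    loop-start and the VCTP each read from their register operand."""
    seen = {}
    for _name, instrs in blocks:
        for op, *args in instrs:
            if op == "mov":
                dst, src = args
                regs[dst] = regs[src]
            elif op in ("loop_start", "vctp"):
                seen[op] = regs[args[0]]
    return seen

blocks = [
    ("entry", [("loop_start", "r4")]),     # reads r4 before the mov
    ("preheader", [("mov", "r4", "r2")]),  # r4 only now gets the element count
    ("body", [("vctp", "r4")]),            # reads the correct value
]

# Assumption: r2 holds the element count (64), r4 holds an unrelated value.
seen = run(blocks, {"r2": 64, "r4": 0xDEAD})
print(seen)
```

Under these assumptions the VCTP sees the element count, while the loop start sees whatever r4 held beforehand.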

I tried to see whether I could fix this in LowOverheadLoop::ValidateTailPredicate(),
as it defines the "TPNumElements" variable. There is some logic there that handles
local redefinitions of the elementCount physical register by moving it forward/backward using ReachingDefAnalysis.
But in this instance the redefinition (the mov) is in a different BasicBlock, so that code does not seem to handle this case.
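
The local versus cross-block distinction can be sketched as follows. This is a hypothetical block-local lookup written for illustration only; it does not reproduce the ReachingDefAnalysis API, and the tuple layout `(opcode, defined_reg, used_regs...)` is an assumption of the sketch.

```python
def local_def(block, reg, before_index):
    """Return the index of the last instruction in `block`, strictly before
    `before_index`, that defines `reg`; return None if there is no such
    instruction in this block."""
    for i in range(before_index - 1, -1, -1):
        _op, dst, *_uses = block[i]
        if dst == reg:
            return i
    return None

# Case handled by a block-local search: the mov precedes the loop start
# in the same block.
same_block = [("mov", "r4", "r2"), ("loop_start", "lr", "r4")]
found_local = local_def(same_block, "r4", 1)

# Case from this patch: the mov lives in the preHeader, so a search that
# stays inside the entry block finds nothing to move.
entry = [("loop_start", "lr", "r4")]
preheader = [("mov", "r4", "r2")]
found_entry = local_def(entry, "r4", 0)
found_preheader = local_def(preheader, "r4", len(preheader))

print(found_local, found_entry, found_preheader)
```

A search confined to the entry block never sees the mov, which matches the behaviour described above.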

_______

I'm not entirely certain if it's acceptable to not generate the preHeader, but unless there is a reasonably
simple fix for the above issue, I can't see another way.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99723/new/

https://reviews.llvm.org/D99723


