[PATCH] D66707: AMDGPU: Run AMDGPUCodeGenPrepare after scalar opts

Sat Aug 24 13:52:01 PDT 2019

arsenm created this revision.
arsenm added reviewers: rampitec, cfang.
Herald added subscribers: asbirlea, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl.

The mul24 matching could interfere with SLSR (straight-line strength
reduction) and the other addressing-mode-related passes. The new
placement is probably not optimal, but it is an intermediate step. The
pass should probably be moved after all the generic IR passes,
particularly LSR. Moving it after LSR seems to help in some cases and
hurt in others.

As-is in this patch, in idiv-licm, it saves 1-2 instructions inside
some of the loop bodies, but increases the number in others. Moving
the pass later helps these loops. In the new LSR tests in
mul24-pass-ordering, the intrinsic prevents introducing more
instructions in the loop preheader, so moving the pass later ends up
hurting them. This shouldn't be any worse than before the intrinsics
were introduced in r366094, and LSR should probably be smarter. I
think the cause is that LSR doesn't know the 'and' inside the loop
will be folded away.
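
To illustrate why the 'and' inside the loop can be folded away, here is
a minimal C++ sketch of the arithmetic involved. It is not code from
the patch: mulU24 and maskedMul are hypothetical helper names, and the
model assumes a 24-bit unsigned multiply that reads only the low 24
bits of each operand and returns the low 32 bits of the product. Under
that assumption, masking an operand to 24 bits before the multiply is
redundant, which is something a pass working on the plain IR multiply
would not see.

```cpp
#include <cstdint>

// Assumed model of a 24-bit unsigned multiply (hypothetical helper, not
// an LLVM API): only the low 24 bits of each operand are read, and the
// low 32 bits of the product are returned.
constexpr uint32_t mulU24(uint32_t A, uint32_t B) {
  constexpr uint32_t Mask24 = 0x00FFFFFFu;
  return static_cast<uint32_t>(static_cast<uint64_t>(A & Mask24) *
                               static_cast<uint64_t>(B & Mask24));
}

// What the IR looks like before matching: an explicit 'and' with a
// 24-bit mask on each operand, followed by an ordinary 32-bit multiply.
constexpr uint32_t maskedMul(uint32_t A, uint32_t B) {
  const uint32_t MaskedA = A & 0x00FFFFFFu; // the 'and' in the loop
  const uint32_t MaskedB = B & 0x00FFFFFFu;
  return static_cast<uint32_t>(static_cast<uint64_t>(MaskedA) *
                               static_cast<uint64_t>(MaskedB));
}

// With the 24-bit multiply, the explicit masks contribute nothing: the
// result is the same whether or not the operands are masked first.
static_assert(mulU24(0xFF123456u, 0xAB000003u) ==
                  maskedMul(0xFF123456u, 0xAB000003u),
              "masking is implied by the 24-bit multiply");
static_assert(mulU24(0x01000002u, 3u) == 6u,
              "bits above bit 23 are ignored");
```

The static_asserts only check the arithmetic identity; whether and
where the compiler actually drops the 'and' depends on the pass order
discussed above.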

https://reviews.llvm.org/D66707

Files:
  lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  test/CodeGen/AMDGPU/idiv-licm.ll
  test/CodeGen/AMDGPU/mul24-pass-ordering.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D66707.217027.patch
Type: text/x-patch
Size: 23319 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190824/e2f19123/attachment.bin>