[PATCH] D111471: [NVPTX] Add a late SROA pass which allows optimizing away more allocas.

Mon Oct 11 13:52:34 PDT 2021

tra added a comment.

In D111471#3053159 <https://reviews.llvm.org/D111471#3053159>, @lebedev.ri wrote:

> While this may make sense, i would still like to see the actual reduced reproducer for the SROA failure.

@lebedev.ri  Reproducer IR is here:
https://gist.github.com/Artem-B/c8d048ce7666f5693a8c899458829f5a

I've included the llvm-reduce test script in the comment there.

> Unless that new LICM run prevented inlining, i suspect it might be fixable in SROA itself.

The problem is that one of the peeled-off parts of the loop made it impossible for SROA to determine a fixed offset for one of the loads. One of the later GVN passes simplified things enough to let the SROA pass figure out the offsets of all the loads.
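
To illustrate the general point about fixed offsets, here is a minimal C++ sketch. It is a hypothetical example and is not taken from the reproducer; the function names `dynamic_index` and `constant_index` are made up for illustration. The expectation, stated as an assumption about typical SROA behavior rather than a guarantee, is that the array in the first function stays in memory because its access offset is unknown at compile time, while the array in the second can be split into scalars because every access is at a constant offset.

```cpp
#include <cstddef>

// Hypothetical illustration only; not the code from the reproducer.

// The index depends on a runtime value, so the offset of the load from
// `buf` is not a compile-time constant. The array is expected to remain
// a single alloca (local memory on NVPTX).
int dynamic_index(const int *in, std::size_t i) {
  int buf[4] = {in[0], in[1], in[2], in[3]};
  return buf[i & 3];
}

// Every access to `buf` uses a constant index, so each load has a fixed
// offset. SROA is expected to be able to split the array into scalars
// and eliminate the alloca.
int constant_index(const int *in) {
  int buf[4] = {in[0], in[1], in[2], in[3]};
  return buf[0] + buf[3];
}
```

Both functions compute well-defined results either way; the difference is only in whether the optimizer can prove the offsets are fixed, which is what the later GVN run made possible here.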

Specifically, here's the input to the failing SROA pass @ HEAD:
https://gist.github.com/Artem-B/a56797a9b918d4831a303ae2869dc83f

For comparison, here's what the IR looked like at the point where SROA was able to eliminate the alloca:
https://gist.github.com/Artem-B/0e8786afff6e6838b5cf5a9e21851b5c

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111471/new/

https://reviews.llvm.org/D111471