[PATCH] D128582: Enable SeparateConstOffsetFromGEPPass() at -O3 and -O2

Shubham Narlawar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 7 22:47:23 PDT 2022


gsocshubham added a comment.

In D128582#3628937 <https://reviews.llvm.org/D128582#3628937>, @dmgreen wrote:

>> Passing `-aarch64-enable-gep-opt=true -O3` would do the job - https://clang.godbolt.org/z/hcjsh1vex what this patch was proposing. WDYT?
>>
>> ..
>>
>> Sure. For #50528, enabling `EnableGEPOpt` reduces GEP instructions by half but is done by splitting GEP into ptr2int and int2ptr.
>
> Oh yeah I see it does. I must have missed the -O3 off the time I tried it. That's good. I wonder why it was never enabled in the past.
>
> From the look of it, it runs after LSR, which I think it would need to run before. Otherwise it is likely to mess up what LSR has tried to do. That would be before the call to TargetPassConfig::addIRPasses(). I'm not sure if the LICM run is necessary either, but I see it is used in other backends. We would need to gather some benchmark to see how it behaves. Like do the issues in x264 still occur, and what happens across more benchmark cases.

Will running it before LSR make any difference compared to the current location?

The issue in x264 occurred because the pass was registered very early, at the IR level. That is no longer relevant, since the pass is already enabled in AArch64 when the relevant flags are passed. Following the suggestion to move the GEP pass to just before SelectionDAG, I have moved its registration from `addIRPasses()` to the end of `AArch64PassConfig::addCodeGenPrepare()`, but the x264 benchmark result does not change.
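
To make the ordering question concrete, here is a toy model of the three placements discussed in this thread. It is only a sketch and does not use LLVM: `Pipeline`, `GepPlacement`, `buildPipeline` and `runsBefore` are hypothetical helpers, the pipeline is cut down to three labels, and it assumes that LSR is added by `TargetPassConfig::addIRPasses()` and that `addCodeGenPrepare()` runs after `addIRPasses()`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy model only: a pipeline is an ordered list of pass labels. Almost all
// real passes are omitted; this is not a dump of the AArch64 pipeline.
using Pipeline = std::vector<std::string>;

// Hypothetical names for the three placements discussed in the review.
enum class GepPlacement {
  AfterGenericIRPasses,  // after TargetPassConfig::addIRPasses(), i.e. after LSR
  BeforeGenericIRPasses, // before TargetPassConfig::addIRPasses(), i.e. before LSR
  EndOfCodeGenPrepare    // end of AArch64PassConfig::addCodeGenPrepare()
};

const std::string GepPass = "separate-const-offset-from-gep";
const std::string LsrPass = "loop-reduce";
const std::string CgpPass = "codegenprepare";

// Builds the simplified pipeline for a given placement of the GEP pass.
Pipeline buildPipeline(GepPlacement Where) {
  Pipeline P;
  // Stage modelled on addIRPasses().
  if (Where == GepPlacement::BeforeGenericIRPasses)
    P.push_back(GepPass);
  P.push_back(LsrPass); // assumed to come from the generic addIRPasses()
  if (Where == GepPlacement::AfterGenericIRPasses)
    P.push_back(GepPass);
  // Stage modelled on addCodeGenPrepare(), assumed to run after addIRPasses().
  P.push_back(CgpPass);
  if (Where == GepPlacement::EndOfCodeGenPrepare)
    P.push_back(GepPass);
  return P;
}

// Returns true if both passes are present and A is scheduled before B.
bool runsBefore(const Pipeline &P, const std::string &A, const std::string &B) {
  auto IA = std::find(P.begin(), P.end(), A);
  auto IB = std::find(P.begin(), P.end(), B);
  return IA != P.end() && IB != P.end() && IA < IB;
}
```

In this model, `runsBefore(buildPipeline(GepPlacement::EndOfCodeGenPrepare), GepPass, LsrPass)` is false, just as it is for `AfterGenericIRPasses`. Under the stated assumptions, only `BeforeGenericIRPasses` puts the GEP pass ahead of LSR.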


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128582/new/

https://reviews.llvm.org/D128582


