[PATCH] D49483: [AMDGPU] Optimize _L image intrinsic to _LZ when lod is zero

Wed Jul 18 08:30:14 PDT 2018

rtaylor added a comment.

In https://reviews.llvm.org/D49483#1166525, @arsenm wrote:

> Should this be done in an IR pass instead?

This could be done in IR, but doing it here avoids a long(er) switch statement (because of the number of intrinsic combinations) and keeps more of the image intrinsic work in one place, which, per earlier conversations, seemed the best way to go. Is there some advantage to moving this to IR (InstCombine, for example)?

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4588
+    if (auto ConstantLod = dyn_cast<ConstantFPSDNode>(VAddrs[NumVAddrs-1].getNode())) {
+      if (ConstantLod->isZero()) {
+        IntrOpcode = LZMappingInfo->LZ;  // set new opcode to _lz variant of _l
----------------
arsenm wrote:
> Does this need to check that it is positive zero?
I used isZero to cover both positive and negative zero.
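
To illustrate the distinction being asked about, here is a minimal standalone sketch in plain C++ (no LLVM headers). The helper names isAnyZero, isPositiveZero and canUseLZ are hypothetical and are not part of the patch; isAnyZero only stands in for a sign-agnostic zero test of the kind isZero is used for above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a sign-agnostic zero test:
// true for both +0.0 and -0.0.
inline bool isAnyZero(float V) { return std::fpclassify(V) == FP_ZERO; }

// Hypothetical stricter test: true only for +0.0.
inline bool isPositiveZero(float V) {
  return isAnyZero(V) && !std::signbit(V);
}

// Hypothetical predicate mirroring the check in the patch: the _l -> _lz
// rewrite is allowed whenever the lod operand is a zero of either sign.
inline bool canUseLZ(float Lod) { return isAnyZero(Lod); }

// Small usage example showing where the two tests differ.
inline void demo() {
  assert(canUseLZ(0.0f));         // +0.0 is accepted
  assert(canUseLZ(-0.0f));        // -0.0 is accepted as well
  assert(!isPositiveZero(-0.0f)); // a positive-zero-only test rejects -0.0
  assert(!canUseLZ(0.5f));        // a non-zero lod keeps the _l variant
}
```

With a positive-zero-only test, a constant lod of -0.0 would miss the optimization; the sign-agnostic test accepts both.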

Repository:
  rL LLVM

https://reviews.llvm.org/D49483