[PATCH] D49483: [AMDGPU] Optimize _L image intrinsic to _LZ when lod is zero

Fri Jul 27 05:52:51 PDT 2018

nhaehnle added a comment.

I don't think doing this as an IR pass has any advantage, so this is fine.

Please add tests for the gather intrinsics as well. Apart from that, it looks good to me.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4588
+    if (auto ConstantLod = dyn_cast<ConstantFPSDNode>(VAddrs[NumVAddrs-1].getNode())) {
+      if (ConstantLod->isZero()) {
+        IntrOpcode = LZMappingInfo->LZ;  // set new opcode to _lz variant of _l
----------------
rtaylor wrote:
> arsenm wrote:
> > Does this need to check that it is positive zero?
> I used isZero to cover both pos and neg zero.
That's an interesting question. Since LOD is clamped at 0.0 anyway, the only possible difference could come from switching between magnification and minification filters. The OpenGL spec says that the magnification filter is used when LOD is less than or equal to 0.0 (and actually even when it's <= 0.5 for nearest mipmap minification). So this test could be relaxed to check whether ConstantLod is <= 0.0.

Repository:
  rL LLVM

https://reviews.llvm.org/D49483