[PATCH] D49483: [AMDGPU] Optimize _L image intrinsic to _LZ when lod is zero
Ryan Taylor via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 18 08:30:14 PDT 2018
rtaylor added a comment.
In https://reviews.llvm.org/D49483#1166525, @arsenm wrote:
> Should this be done in an IR pass instead?
This could be done in IR, but doing it here avoids a long(er) switch statement (given the number of combinations) and keeps more of the image intrinsic work in one place, which, per earlier conversations, seemed the best way to go. Is there an advantage to moving this to an IR pass (InstCombine, for example)?
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4588
+ if (auto ConstantLod = dyn_cast<ConstantFPSDNode>(VAddrs[NumVAddrs-1].getNode())) {
+ if (ConstantLod->isZero()) {
+ IntrOpcode = LZMappingInfo->LZ; // set new opcode to _lz variant of _l
----------------
arsenm wrote:
> Does this need to check that it is positive zero?
I used isZero to cover both positive and negative zero.
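
To illustrate the distinction being asked about, here is a minimal standalone sketch in plain C++. It does not use the LLVM API: `isZeroAnySign` and `isPositiveZero` are hypothetical helper names, and `float` stands in for the constant lod operand. It assumes that an isZero-style query ignores the sign bit, which is how the reply above describes it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for an isZero()-style check: accepts +0.0 and -0.0.
// Under IEEE 754, -0.0f == 0.0f compares equal, so the sign is ignored.
bool isZeroAnySign(float Lod) {
  return Lod == 0.0f;
}

// Hypothetical stricter variant: accepts only positive zero by also
// inspecting the sign bit.
bool isPositiveZero(float Lod) {
  return Lod == 0.0f && !std::signbit(Lod);
}
```

With these helpers, `isZeroAnySign(-0.0f)` is true while `isPositiveZero(-0.0f)` is false, so the sign-agnostic check would allow the _L to _LZ rewrite for a constant lod of either sign.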
Repository:
rL LLVM
https://reviews.llvm.org/D49483