[PATCH] D81728: [InstCombine] Add target-specific inst combining

Fri Jun 12 05:53:54 PDT 2020

Flakebi added a comment.

To add more context: the problem I am facing is that AMDGPU image intrinsics are usually called with float arguments. However, some subtargets/hardware generations also accept half arguments.
When LLVM compiles for such a subtarget, it is beneficial to combine

  %s32 = fpext half %s to float
  call <4 x float> @llvm.amdgcn.image.sample.2d.v4f32.f32(…, float %s32, …)

into

  call <4 x float> @llvm.amdgcn.image.sample.2d.v4f32.f16(…, half %s, …)

This combines instructions, so I think it belongs in the InstCombine pass. On the other hand, the f16 form of the intrinsics is not available on all targets, so this combination cannot be applied unconditionally; it needs to be gated on the target.
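
To illustrate the gating, here is a minimal, self-contained C++ sketch of the idea. It is a toy model, not LLVM code: Ty, Value, SampleCall, TargetHooks, ToyGpuHooks, hasHalfCoords and runCombine are all hypothetical names invented for this example, and only the two intrinsic names are taken from the IR above. The generic combiner asks a target hook whether it wants to rewrite the call, and the target folds the fpext only when its subtarget supports half coordinates.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Toy stand-ins for IR concepts; these are NOT LLVM classes.
enum class Ty { Half, Float };

struct Value {
    Ty type;
    // Non-null if this value is `fpext` of *extSrc (hypothetical modelling).
    const Value *extSrc = nullptr;
};

struct SampleCall {
    std::string callee;  // intrinsic name
    const Value *coord;  // the one coordinate argument we model
};

// Hypothetical target hook: the default target combines nothing.
struct TargetHooks {
    virtual ~TargetHooks() = default;
    virtual std::optional<SampleCall>
    combineIntrinsic(const SampleCall &) const {
        return std::nullopt;
    }
};

// Hypothetical GPU target: folds the fpext only if the subtarget
// accepts half coordinates.
struct ToyGpuHooks : TargetHooks {
    bool hasHalfCoords;
    explicit ToyGpuHooks(bool halfCoords) : hasHalfCoords(halfCoords) {}

    std::optional<SampleCall>
    combineIntrinsic(const SampleCall &call) const override {
        if (!hasHalfCoords)
            return std::nullopt;  // gate: f16 form unavailable here
        const Value *c = call.coord;
        bool isExtFromHalf = c->type == Ty::Float && c->extSrc &&
                             c->extSrc->type == Ty::Half;
        if (!isExtFromHalf)
            return std::nullopt;  // nothing to fold
        // Use the half source directly and switch to the f16 form.
        return SampleCall{"llvm.amdgcn.image.sample.2d.v4f32.f16", c->extSrc};
    }
};

// Generic combiner step: defer to the target, otherwise keep the call.
SampleCall runCombine(const SampleCall &call, const TargetHooks &hooks) {
    if (auto replaced = hooks.combineIntrinsic(call))
        return *replaced;
    return call;
}
```

With a ToyGpuHooks constructed with hasHalfCoords set to true, runCombine returns the f16 call taking the original half value; with it set to false, or with the default TargetHooks, the call is returned unchanged. The generic code never inspects the target itself, which is the separation this patch is aiming for.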

================
Comment at: llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:3781
   auto &TLI = getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
+  auto &TTI = getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
   auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
----------------
lebedev.ri wrote:
> This opens a dangerous floodgates of instcombine not being target-independent canonicalization pass.
That is the point of this change: to allow target-dependent combinations in TargetTransformInfo::instCombineIntrinsic.
IMO, all the target-specific intrinsic combinations in InstCombineCalls.cpp (x86, amdgpu, etc.) can be moved to their respective targets.

I don’t have a great overview of LLVM, so I might be wrong on this.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81728/new/

https://reviews.llvm.org/D81728