[PATCH] D81728: [InstCombine] Add target-specific inst combining
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 12 05:53:54 PDT 2020
Flakebi added a comment.
To add more context: the problem I am facing is that AMDGPU image intrinsics are usually called with float arguments. However, on some subtargets/hardware generations it is possible to call them with half arguments.
If LLVM is compiling for such a subtarget, it is beneficial to combine
%s32 = fpext half %s to float
call <4 x float> @llvm.amdgcn.image.sample.2d.v4f32.f32(…, float %s32, …)
into
call <4 x float> @llvm.amdgcn.image.sample.2d.v4f32.f16(…, half %s, …)
This combines instructions, so I think it belongs in the InstCombine pass. On the other hand, the f16 form of the intrinsics is not available on all targets, so this combination cannot be applied unconditionally; it needs to be gated on the target.
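A minimal sketch of that gating, written as standalone C++ rather than against the LLVM API. Every name here (`FloatType`, `CoordOperand`, `SubtargetModel`, `HasHalfImageCoords`, `foldFPExtIntoImageSample`) is hypothetical and chosen only for illustration; none of them are real LLVM classes or subtarget features.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for illustration only; these are NOT LLVM types.
enum class FloatType { Half, Float };

// Models one coordinate operand of an image-sample call.
struct CoordOperand {
  FloatType Type;        // type the intrinsic currently receives
  bool IsFPExtFromHalf;  // true if the operand is `fpext half %s to float`
};

// Models the subtarget query; the flag name is an assumption.
struct SubtargetModel {
  bool HasHalfImageCoords;  // can image intrinsics take half arguments?
};

// Returns the narrowed operand when the fold is legal on this subtarget,
// and std::nullopt when the call must be left unchanged.
std::optional<CoordOperand>
foldFPExtIntoImageSample(const CoordOperand &Op, const SubtargetModel &ST) {
  // Gate on the target first: without f16 support the fold is invalid.
  if (!ST.HasHalfImageCoords)
    return std::nullopt;
  // Only a float operand produced by extending a half can be narrowed.
  if (Op.Type != FloatType::Float || !Op.IsFPExtFromHalf)
    return std::nullopt;
  // Drop the fpext and pass the original half value directly.
  return CoordOperand{FloatType::Half, false};
}
```

The same operand is rewritten on a subtarget model with `HasHalfImageCoords` set and left alone otherwise, which is the reason the combine cannot live in a purely target-independent code path.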
================
Comment at: llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:3781
auto &TLI = getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
+ auto &TTI = getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
----------------
lebedev.ri wrote:
> This opens a dangerous floodgates of instcombine not being target-independent canonicalization pass.
That is the point of this change, to allow target-dependent combinations in TargetTransformInfo::instCombineIntrinsic.
IMO, all the target-specific intrinsic combinations in InstCombineCalls.cpp (x86, AMDGPU, etc.) can be moved to their respective targets.
I don’t have a great overview of LLVM, so I might be wrong on this.
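The shape of that delegation can be sketched in standalone C++. This is a toy model under stated assumptions: the signature of TargetTransformInfo::instCombineIntrinsic is not shown in this message and is not reproduced here, and `TargetHooks`, `HalfCoordTarget`, `combineIntrinsic`, `runGenericCombine`, and the shortened intrinsic names are all hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model of a per-target hook; NOT the real TargetTransformInfo.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  // Default implementation: the target has no opinion, so nothing changes.
  virtual std::optional<std::string>
  combineIntrinsic(const std::string & /*Name*/) const {
    return std::nullopt;
  }
};

// A target that knows how to narrow one (made-up) intrinsic name.
struct HalfCoordTarget : TargetHooks {
  std::optional<std::string>
  combineIntrinsic(const std::string &Name) const override {
    if (Name == "image.sample.f32")
      return std::string("image.sample.f16");
    return std::nullopt;
  }
};

// The generic combiner stays target-independent: it only asks the hook
// and keeps the original call when the hook declines.
std::string runGenericCombine(const std::string &Name,
                              const TargetHooks &Hooks) {
  if (std::optional<std::string> Replaced = Hooks.combineIntrinsic(Name))
    return *Replaced;
  return Name;
}
```

With this split, the generic pass contains no target names at all; each target's knowledge sits behind the hook, and targets that do not override it get the default no-op.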
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D81728/new/
https://reviews.llvm.org/D81728