[llvm] [AMDGPU] Improve uniform argument handling in InstCombineIntrinsic (PR #105812)
Sameer Sahasrabuddhe via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 23 05:00:08 PDT 2024
================
@@ -440,6 +440,22 @@ static bool canContractSqrtToRsq(const FPMathOperator *SqrtOp) {
SqrtOp->getType()->isHalfTy();
}
+/// Return true if we can easily prove that use U is uniform.
+static bool isTriviallyUniform(const Use &U) {
+  Value *V = U.get();
+  if (isa<Constant>(V))
+    return true;
+  if (auto *I = dyn_cast<Instruction>(V)) {
+    // If I and U are in different blocks then there is a possibility of
+    // temporal divergence.
+    if (I->getParent() != cast<Instruction>(U.getUser())->getParent())
+      return false;
+    if (const auto *II = dyn_cast<IntrinsicInst>(I))
----------------
ssahasra wrote:
Would it be slightly faster if we checked this first, before checking whether it's in the same block? That is, return false if it is not a uniform intrinsic. If the dyn_cast to IntrinsicInst were the outer condition, it would also bring out the fact that we are doing a trivial check on uniform intrinsics only, not on instructions in general.
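
The reordering could look roughly like the sketch below. It uses toy types to show the control flow only: `Node`, `Kind`, and both function names are hypothetical stand-ins, not LLVM classes. It also assumes that the intrinsic branch, which is cut off in the quoted hunk, returns true only for intrinsics known to be always uniform.

```cpp
#include <cassert>

// Toy stand-ins for LLVM values. These are NOT the real LLVM classes.
enum class Kind {
  Constant,         // models isa<Constant>(V)
  Instruction,      // an ordinary, non-intrinsic instruction
  UniformIntrinsic, // an intrinsic call assumed to be always uniform
  Other             // anything else, e.g. a function argument
};

struct Node {
  Kind K;
  int Block; // id of the containing basic block; unused for constants
};

// Order of checks as in the quoted hunk: compare blocks first, then test
// for the intrinsic (the tail of that branch is an assumption).
bool isTriviallyUniformBlockFirst(const Node &Def, const Node &User) {
  if (Def.K == Kind::Constant)
    return true;
  if (Def.K == Kind::Instruction || Def.K == Kind::UniformIntrinsic) {
    // Different blocks: possible temporal divergence.
    if (Def.Block != User.Block)
      return false;
    return Def.K == Kind::UniformIntrinsic;
  }
  return false;
}

// Suggested order: the intrinsic test is the outer condition, so ordinary
// instructions are rejected before any block comparison happens.
bool isTriviallyUniformIntrinsicFirst(const Node &Def, const Node &User) {
  if (Def.K == Kind::Constant)
    return true;
  if (Def.K != Kind::UniformIntrinsic)
    return false;
  // Only a uniform intrinsic reaches the temporal-divergence check.
  return Def.Block == User.Block;
}
```

Under these assumptions both orderings return the same answer for every input. The second one makes it explicit that only uniform intrinsics are ever considered.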
https://github.com/llvm/llvm-project/pull/105812