[llvm] [clang] [AMDGPU] Improve selection of ballot.i64 intrinsic in wave32 mode. (PR #71556)
Pierre van Houtryve via cfe-commits
cfe-commits at lists.llvm.org
Wed Nov 29 01:04:08 PST 2023
================
@@ -961,6 +961,18 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
return IC.replaceInstUsesWith(II, Constant::getNullValue(II.getType()));
}
}
+ if (ST->isWave32() && II.getType()->getIntegerBitWidth() == 64) {
+ // %b64 = call i64 ballot.i64(...)
+ // =>
+ // %b32 = call i32 ballot.i32(...)
+ // %b64 = zext i32 %b32 to i64
+ Function *NewF = Intrinsic::getDeclaration(
+ II.getModule(), Intrinsic::amdgcn_ballot, {IC.Builder.getInt32Ty()});
+ CallInst *NewCall = IC.Builder.CreateCall(NewF, {II.getArgOperand(0)});
+ Value *CastedCall = IC.Builder.CreateZExtOrBitCast(NewCall, II.getType());
+ CastedCall->takeName(&II);
+ return IC.replaceInstUsesWith(II, CastedCall);
----------------
Pierre-vh wrote:
Nit: Can just reuse the same value, e.g.
```
Value *Call = IC.Builder.CreateCall(NewF, {II.getArgOperand(0)});
Call = IC.Builder.CreateZExtOrBitCast(Call, II.getType());
Call->takeName(&II);
return IC.replaceInstUsesWith(II, Call);
```
I also think you should be able to use `IC.Builder.CreateIntrinsic` to create the intrinsic call directly, instead of calling `Intrinsic::getDeclaration` and then `CreateCall`.
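For reference, here is a rough standalone model of why the `ballot.i64` => `ballot.i32` + `zext` rewrite preserves the result in wave32. It is not LLVM code: `LanePredicates`, `ballot32`, `ballot64InWave32`, and `ballot32ThenZExt` are hypothetical names made up for this sketch, and it assumes a wave32 wavefront has exactly 32 lanes with bit N of the ballot mask belonging to lane N.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of a wave32 wavefront: one predicate per lane.
constexpr unsigned kWave32Lanes = 32;
using LanePredicates = std::array<bool, kWave32Lanes>;

// Models ballot.i32: bit N is set iff lane N's predicate is true.
std::uint32_t ballot32(const LanePredicates &Pred) {
  std::uint32_t Mask = 0;
  for (unsigned Lane = 0; Lane < kWave32Lanes; ++Lane)
    if (Pred[Lane])
      Mask |= std::uint32_t{1} << Lane;
  return Mask;
}

// Models ballot.i64 in wave32 mode, computed independently of ballot32.
// Only 32 lanes exist, so bits 32..63 are never set.
std::uint64_t ballot64InWave32(const LanePredicates &Pred) {
  std::uint64_t Mask = 0;
  for (unsigned Lane = 0; Lane < kWave32Lanes; ++Lane)
    if (Pred[Lane])
      Mask |= std::uint64_t{1} << Lane;
  return Mask;
}

// Models the rewritten form: ballot.i32 followed by zext i32 -> i64.
std::uint64_t ballot32ThenZExt(const LanePredicates &Pred) {
  // Widening an unsigned 32-bit value zero-extends it.
  std::uint64_t Wide = static_cast<std::uint64_t>(ballot32(Pred));
  assert((Wide >> 32) == 0 && "upper half must be zero after zext");
  return Wide;
}
```

In this model both forms agree for any set of lane predicates, which is the property the fold relies on.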
https://github.com/llvm/llvm-project/pull/71556