[clang] [llvm] [AMDGPU] Improve selection of ballot.i64 intrinsic in wave32 mode. (PR #71556)
Matt Arsenault via cfe-commits
cfe-commits at lists.llvm.org
Mon Dec 18 06:11:55 PST 2023

================
@@ -961,6 +961,19 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
         return IC.replaceInstUsesWith(II, Constant::getNullValue(II.getType()));
       }
     }
+    if (ST->isWave32() && II.getType()->getIntegerBitWidth() == 64) {
+      // %b64 = call i64 ballot.i64(...)
+      // =>
+      // %b32 = call i32 ballot.i32(...)
+      // %b64 = zext i32 %b32 to i64
+      Value *Call = IC.Builder.CreateZExtOrBitCast(
----------------
arsenm wrote:
Should be CreateZExt; there's no way this can need a bitcast.
https://github.com/llvm/llvm-project/pull/71556
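
To illustrate the point of the comment: as I understand it, a "zext or bitcast" helper only falls back to a bitcast when the source and destination widths are equal, and here the cast is always i32 to i64. The sketch below models that width check in plain C++ without the LLVM headers. The names `CastKind`, `zextOrBitCastKind`, and `widenBallot` are hypothetical and are not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical stand-in for the opcode a "zext or bitcast" helper would pick.
enum class CastKind { ZExt, BitCast };

// Models the width check (an assumption about the helper's behavior):
// a bitcast is chosen only when both widths match, otherwise a zext.
constexpr CastKind zextOrBitCastKind(unsigned SrcBits, unsigned DstBits) {
  return SrcBits == DstBits ? CastKind::BitCast : CastKind::ZExt;
}

// Widening a wave32 ballot mask to 64 bits: the low 32 bits are kept
// and the upper 32 bits are zero, which is what `zext i32 to i64` does.
constexpr std::uint64_t widenBallot(std::uint32_t Mask32) {
  return static_cast<std::uint64_t>(Mask32);
}

// In the wave32 case from the patch the widths are always 32 and 64,
// so the bitcast branch is unreachable.
static_assert(zextOrBitCastKind(32, 64) == CastKind::ZExt,
              "i32 -> i64 is always a zero-extension");
```

Since the bitcast arm can never be taken for i32 to i64, calling the plain zext builder states the intent directly.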