[clang] [llvm] [HLSL] Re-implement countbits with the correct return type (PR #113189)

Mon Oct 21 12:31:13 PDT 2024

================
@@ -461,6 +461,67 @@ class OpLowerer {
     });
   }
 
+  [[nodiscard]] bool lowerCtpopToCBits(Function &F) {
+    IRBuilder<> &IRB = OpBuilder.getIRB();
+    Type *Int32Ty = IRB.getInt32Ty();
+
+    return replaceFunction(F, [&](CallInst *CI) -> Error {
+      IRB.SetInsertPoint(CI);
+      SmallVector<Value *> Args;
+      Args.append(CI->arg_begin(), CI->arg_end());
+
+      Type *RetTy = Int32Ty;
+      Type *FRT = F.getReturnType();
+      if (FRT->isVectorTy()) {
+        VectorType *VT = cast<VectorType>(FRT);
+        RetTy = VectorType::get(RetTy, VT);
+      }
+
+      Expected<CallInst *> OpCall = OpBuilder.tryCreateOp(
+          dxil::OpCode::CBits, Args, CI->getName(), RetTy);
+      if (Error E = OpCall.takeError())
+        return E;
+
+      // If the result type is 32 bits we can do a direct replacement.
+      if (FRT->isIntOrIntVectorTy(32)) {
+        CI->replaceAllUsesWith(*OpCall);
+        CI->eraseFromParent();
+        return Error::success();
+      }
+
+      unsigned CastOp;
+      if (FRT->isIntOrIntVectorTy(16))
+        CastOp = Instruction::ZExt;
----------------
bfavela wrote:

Related to the signed vs unsigned: this is where things I think get confusing. int16_t would almost always be sign extended, not zero extended. But doing a sign extension here doesn't make sense as you'll count 16 extra bits for any negative int16_t.
I think this is why std::bitset basically ignores the type. It's a forcing function to the author to say "yes, I'm purposely counting a signed value now go away"

https://github.com/llvm/llvm-project/pull/113189