[llvm] [clang] [AMDGPU] Improve selection of ballot.i64 intrinsic in wave32 mode. (PR #71556)

Valery Pykhtin via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 21 06:08:47 PST 2023


================
@@ -2314,9 +2314,8 @@ void AMDGPUDAGToDAGISel::SelectBRCOND(SDNode *N) {
     SDValue VCMP = Cond->getOperand(0);
     auto CC = cast<CondCodeSDNode>(Cond->getOperand(2))->get();
     auto *CRHS = dyn_cast<ConstantSDNode>(Cond->getOperand(1));
-    if ((CC == ISD::SETEQ || CC == ISD::SETNE) && CRHS && CRHS->isZero() &&
-        // TODO: make condition below an assert after fixing ballot bitwidth.
-        VCMP.getValueType().getSizeInBits() == ST->getWavefrontSize()) {
+    if ((CC == ISD::SETEQ || CC == ISD::SETNE) && CRHS && CRHS->isZero()) {
+      assert(VCMP.getValueType().getSizeInBits() == ST->getWavefrontSize());
----------------
vpykhtin wrote:

Good point. It looks like I need to perform the transformation in both places, InstCombine and the SelectionDAG combiner, if I want to keep ballot folding at -O0, but I'm not sure this is really needed.

Also, I need to keep the original test running without opt and just add runs with opt.
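
For reference, a toy model of why the narrowing is safe for the zero comparison. This is not LLVM code: it assumes that a 64-bit ballot in wave32 mode behaves like the native 32-bit ballot zero-extended to 64 bits, and the helper names (`ballot32`, `ballot64`, `anyLaneWide`, `anyLaneNative`) are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model, not LLVM code: one predicate bit per lane of a wave32 wavefront.
constexpr unsigned kWaveSize = 32; // assumption: wave32 mode

using Predicates = std::array<bool, kWaveSize>;

// Hypothetical helper: ballot at the native wavefront width.
inline std::uint32_t ballot32(const Predicates &pred) {
  std::uint32_t mask = 0;
  for (unsigned lane = 0; lane < kWaveSize; ++lane)
    if (pred[lane])
      mask |= std::uint32_t{1} << lane;
  return mask;
}

// Hypothetical helper: a 64-bit ballot requested in wave32 mode, modeled
// (by assumption) as the native ballot zero-extended to 64 bits.
inline std::uint64_t ballot64(const Predicates &pred) {
  return static_cast<std::uint64_t>(ballot32(pred));
}

// Branch condition as written against the wide ballot: ballot.i64 != 0.
inline bool anyLaneWide(const Predicates &pred) {
  const std::uint64_t wide = ballot64(pred);
  // Only the low kWaveSize bits can be set in this model.
  assert((wide >> kWaveSize) == 0);
  return wide != 0;
}

// The same condition after narrowing the ballot to the wavefront width.
inline bool anyLaneNative(const Predicates &pred) {
  return ballot32(pred) != 0;
}
```

Under that assumption, both forms agree for every predicate vector, so comparing the narrowed ballot against zero gives the same branch decision as comparing the wide one.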

https://github.com/llvm/llvm-project/pull/71556

