[llvm] Re apply 130577 narrow math for and operand (PR #133896)
Juan Manuel Martinez Caamaño via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 14 00:38:17 PDT 2025
================
@@ -1559,6 +1559,74 @@ void AMDGPUCodeGenPrepareImpl::expandDivRem64(BinaryOperator &I) const {
llvm_unreachable("not a division");
}
+/*
+This will cause non-byte load in consistency, for example:
+```
+ %load = load i1, ptr addrspace(4) %arg, align 4
+ %zext = zext i1 %load to i64
+ %add = add i64 %zext
+```
+Instead of creating `s_and_b32 s0, s0, 1`,
+it will create `s_and_b32 s0, s0, 0xff`.
+We accept this change since the non-byte load assumes the upper bits
+within the byte are all 0.
+*/
+static bool tryNarrowMathIfNoOverflow(Instruction *I,
+ const SITargetLowering *TLI,
+ const TargetTransformInfo &TTI,
+ const DataLayout &DL) {
+ unsigned Opc = I->getOpcode();
+ Type *OldType = I->getType();
+
+ if (Opc != Instruction::Add && Opc != Instruction::Mul)
+ return false;
+
+ unsigned OrigBit = OldType->getScalarSizeInBits();
+
+ if (Opc != Instruction::Add && Opc != Instruction::Mul)
+ llvm_unreachable("Unexpected opcode, only valid for Instruction::Add and "
+ "Instruction::Mul.");
+
+ unsigned MaxBitsNeeded = computeKnownBits(I, DL).countMaxActiveBits();
+
+ MaxBitsNeeded = std::max<unsigned>(bit_ceil(MaxBitsNeeded), 8);
+ Type *NewType = DL.getSmallestLegalIntType(I->getContext(), MaxBitsNeeded);
+ if (!NewType)
+ return false;
+ unsigned NewBit = NewType->getIntegerBitWidth();
+ if (NewBit >= OrigBit)
+ return false;
+ NewType = I->getType()->getWithNewBitWidth(NewBit);
+
+ // Old cost
+ InstructionCost OldCost =
+ TTI.getArithmeticInstrCost(Opc, OldType, TTI::TCK_RecipThroughput);
+ // New cost of new op
+ InstructionCost NewCost =
+ TTI.getArithmeticInstrCost(Opc, NewType, TTI::TCK_RecipThroughput);
+ // New cost of narrowing 2 operands (use trunc)
+ NewCost += 2 * TTI.getCastInstrCost(Instruction::Trunc, NewType, OldType,
----------------
jmmartinez wrote:
I'm pretty sure it doesn't matter, but what if one of the operands is a constant? Truncating a constant just gives a new constant literal, so shouldn't this be `NewCost += trunc_cost * 1` only?
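To make the question concrete, here is a minimal standalone sketch of the alternative accounting. It is not LLVM code: `Operand`, `countTruncs`, and `narrowedCost` are hypothetical names, and the costs are plain integers rather than `InstructionCost`. It assumes a constant operand can be re-emitted at the narrow width for free, so only non-constant operands are charged a trunc.

```cpp
// Standalone model of the cost question above; not LLVM code.
// `Operand`, `countTruncs`, and `narrowedCost` are hypothetical names.
struct Operand {
  bool IsConstant; // true if the operand is a constant literal
};

// Assumption: a constant operand is simply rewritten at the narrow width,
// so only non-constant operands need an explicit trunc instruction.
constexpr unsigned countTruncs(Operand LHS, Operand RHS) {
  return (LHS.IsConstant ? 0u : 1u) + (RHS.IsConstant ? 0u : 1u);
}

// Cost of the narrowed operation plus the truncs it actually requires.
constexpr unsigned narrowedCost(unsigned OpCost, unsigned TruncCost,
                                Operand LHS, Operand RHS) {
  return OpCost + countTruncs(LHS, RHS) * TruncCost;
}
```

With an op cost of 4 and a trunc cost of 1, `narrowedCost(4, 1, Operand{false}, Operand{true})` comes to 5, where an unconditional `2 *` would charge 6. In the pass itself the constant check would presumably be something like `isa<Constant>(I->getOperand(0))`, but that is an assumption about how the patch would be adapted.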
https://github.com/llvm/llvm-project/pull/133896