[PATCH] D138424: [TargetLowering][AArch64] Teach DemandedBits about SVE count intrinsics
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 23 01:36:26 PST 2022
dmgreen added inline comments.
================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:1140
+ return false;
+ int64_t VScaleResultUpperbound = *MaxVScale;
+ if (auto *MulImm = dyn_cast<ConstantSDNode>(Op.getOperand(0))) {
----------------
benmxwl-arm wrote:
> dmgreen wrote:
> > Could this just use construct known bits of both sides and use KnownBits::mul? It might be able to get value out of the low bits then too.
> Please correct me if I've made a silly mistake here, but it does not seem to be exactly KnownBits::mul:
>
> ```
> Optional<unsigned> MaxVScale = Attr.getVScaleRangeMax();
> if (!MaxVScale.has_value())
> return false;
> if (auto *MulImm = dyn_cast<ConstantSDNode>(Op.getOperand(0))) {
> unsigned RequiredBits = Log2_64(*MaxVScale) + 1;
> if (RequiredBits >= BitWidth)
> return false;
> Known.Zero.setHighBits(BitWidth - RequiredBits);
> Known = KnownBits::mul(Known, KnownBits::makeConstant(MulImm->getAPIntValue()));
> }
> return false;
> ```
>
> If the MaxVScale is 16 (5 bits)
>
> The known zero bits are everything above Log2(16 * Mul) + 1.
> The above snippet seems to end up with Log2((2^5 - 1) * Mul) + 1 (which is off by 1 bit)
>
> Also it seems that KnowBits::mul can't handle negative multipliers and always reports no known bits in that case.
>
>
Ah - I was worried about that - because it works on bits and not ranges, the results are not as exact as they could otherwise be. I was also considering the vscale turning into a shift, but that may be SVE specific.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:23297
+ if (auto ElementSize = IsSVECntIntrinsic(Op)) {
+ // The SVE count intrinsics don't support the multiplier immediate so we
----------------
Can you move this into the switch under INTRINSIC_WO_CHAIN. Just so it doesn't need to be checked for every instruction.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:23303
+ // a value that's strictly less than "ALL".
+ unsigned MaxElements = AArch64::SVEMaxBitsPerVector / *ElementSize;
+ unsigned RequiredBits = Log2_64(MaxElements) + 1;
----------------
Is there a reason why this is this based on SVEMaxBitsPerVector and not the maximum value in VScaleRange? Or a combo of both if vscale_range is unbounded for some reason.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:23304
+ unsigned MaxElements = AArch64::SVEMaxBitsPerVector / *ElementSize;
+ unsigned RequiredBits = Log2_64(MaxElements) + 1;
+ unsigned BitWidth = Known.Zero.getBitWidth();
----------------
Nit: Log2_32?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138424/new/
https://reviews.llvm.org/D138424
More information about the llvm-commits
mailing list