[PATCH] D40554: [PowerPC] Fix bugs in sign-/zero-extension elimination

Nemanja Ivanovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 4 15:02:31 PDT 2022


nemanjai added a comment.

As tends to be the case with huge patches such as this, it is difficult to get this patch to converge and it still needs a bit of work. But I think it is pretty close overall.



================
Comment at: llvm/lib/Target/PowerPC/PPCInstr64Bit.td:897
 defm CNTLZW8 : XForm_11r<31,  26, (outs g8rc:$rA), (ins g8rc:$rS),
-                        "cntlzw", "$rA, $rS", IIC_IntGeneral, []>;
+                        "cntlzw", "$rA, $rS", IIC_IntGeneral, []>, ZExt32To64;
 defm CNTTZW8 : XForm_11r<31, 538, (outs g8rc:$rA), (ins g8rc:$rS),
----------------
I don't follow why this is only marked as zero-extended (whereas for example, `andi.` is marked as both). The 32-bit sign bit can only be a zero, so a sign-extend from 32-bits (`extsw`) is guaranteed to essentially be a nop.


================
Comment at: llvm/lib/Target/PowerPC/PPCInstr64Bit.td:962
                         "cntlzd", "$rA, $rS", IIC_IntGeneral,
-                        [(set i64:$rA, (ctlz i64:$rS))]>;
+                        [(set i64:$rA, (ctlz i64:$rS))]>, ZExt32To64;
 defm CNTTZD : XForm_11r<31, 570, (outs g8rc:$rA), (ins g8rc:$rS),
----------------
Similar comment as for the word versions regarding sign extension..


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5234
+  switch (MI->getOpcode()) {
   case PPC::COPY: {
+    Register SrcReg = MI->getOperand(1).getReg();
----------------
I still don't like the deep nesting in this block. It seems like we can get rid of all that deep nesting by rewriting this block as:
```
  case PPC::COPY: {
    Register SrcReg = MI->getOperand(1).getReg();

    // In both ELFv1 and v2 ABI, method parameters and the return value
    // are sign- or zero-extended.
    const MachineFunction *MF = MI->getMF();

    if (!MF->getSubtarget<PPCSubtarget>().isSVR4ABI()) {
      // If this is a copy from another register, we recursively check source.
      auto SrcExt = isSignOrZeroExtended(SrcReg, BinOpDepth, MRI);
      return std::pair<bool, bool>(SrcExt.first || IsSExt,
                                   SrcExt.second || IsZExt);
    }

    const PPCFunctionInfo *FuncInfo = MF->getInfo<PPCFunctionInfo>();
    // We check the ZExt/SExt flags for a method parameter.
    if (MI->getParent()->getBasicBlock() ==
        &MF->getFunction().getEntryBlock()) {
      Register VReg = MI->getOperand(0).getReg();
      if (MF->getRegInfo().isLiveIn(VReg)) {
        IsSExt |= FuncInfo->isLiveInSExt(VReg);
        IsZExt |= FuncInfo->isLiveInZExt(VReg);
        return std::pair<bool, bool>(IsSExt, IsZExt);
      }
    }

    // For a method return value, we check the ZExt/SExt flags in attribute.
    // We assume the following code sequence for method call.
    //   ADJCALLSTACKDOWN 32, implicit dead %r1, implicit %r1
    //   BL8_NOP @func,...
    //   ADJCALLSTACKUP 32, 0, implicit dead %r1, implicit %r1
    //   %5 = COPY %x3; G8RC:%5
    if (SrcReg != PPC::X3) {
      // If this is a copy from another register, we recursively check source.
      auto SrcExt = isSignOrZeroExtended(SrcReg, BinOpDepth, MRI);
      return std::pair<bool, bool>(SrcExt.first || IsSExt,
                                   SrcExt.second || IsZExt);
    }

    const MachineBasicBlock *MBB = MI->getParent();
    MachineBasicBlock::const_instr_iterator II =
        MachineBasicBlock::const_instr_iterator(MI);
    if (II == MBB->instr_begin() || (--II)->getOpcode() != PPC::ADJCALLSTACKUP)
      return std::pair<bool, bool>(IsSExt, IsZExt);

    const MachineInstr &CallMI = *(--II);
    if (!CallMI.isCall() || !CallMI.getOperand(0).isGlobal())
      return std::pair<bool, bool>(IsSExt, IsZExt);

    const Function *CalleeFn =
        dyn_cast<Function>(CallMI.getOperand(0).getGlobal());
    if (!CalleeFn)
      return std::pair<bool, bool>(IsSExt, IsZExt);

    const IntegerType *IntTy = dyn_cast<IntegerType>(CalleeFn->getReturnType());
    if (IntTy && IntTy->getBitWidth() <= 32) {
      const AttributeSet &Attrs = CalleeFn->getAttributes().getRetAttrs();
      IsSExt |= Attrs.hasAttribute(Attribute::SExt);
      IsZExt |= Attrs.hasAttribute(Attribute::ZExt);
    }
    return std::pair<bool, bool>(IsSExt, IsZExt);
  }
```
I have neither compiled nor tested this, so please scrutinize it carefully rather than assuming that my suggestion is verified.


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5290-5292
+    auto SrcExt = isSignOrZeroExtended(SrcReg, BinOpDepth, MRI);
+    return std::pair<bool, bool>(SrcExt.first || IsSExt,
+                                 SrcExt.second || IsZExt);
----------------
Can any of this ever produce anything useful? If I am reading this correctly, this is basically:

- The copy is from X3
- We check if the definition of X3 is sign/zero extended

But this seems pointless since we won't track anything about what defined X3 here.

Edit: actually, the above is so deeply nested that I missed the fact that this code is for non-SVR4 ABI's.


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5335
     // The input registers for others are operand 1 and 2.
     unsigned E = 3, D = 1;
+    if (MI->getOpcode() == PPC::PHI) {
----------------
I understand that these names predate this patch, but since we are modifying it, can we give them more descriptive names? Perhaps `OperandEnd, OperandStride`?


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5119
+
+  // If at least one bit from left in a lower word is masked,
+  // all of 0 to 32-th bits of the output are cleared.
----------------
s/masked/masked out


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5180-5183
+  const PPCInstrInfo *TII =
+      MI->getMF()->getSubtarget<PPCSubtarget>().getInstrInfo();
+  if (TII->isZExt32To64(Opcode))
+    return true;
----------------
Just for consistency, this check is first in the sign extending check... You should probably move it up in this function so it is consistent with that.


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:3859
+    auto SrcExt = isSignOrZeroExtended(SrcReg, BinOpDepth, MRI);
+    return std::pair<bool,bool>(SrcExt.first | IsSExt, SrcExt.second | IsZExt);
+  }
----------------
nemanjai wrote:
> These can probably be logical or's as well. Same with the below ones.
Is there a reason this comment was not addressed and these are still bitwise OR's?


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrP10.td:1322
   def SETBC : XForm_XT5_BI5<31, 384, (outs gprc:$RT), (ins crbitrc:$BI),
-                            "setbc $RT, $BI", IIC_IntCompare, []>;
+                            "setbc $RT, $BI", IIC_IntCompare, []>, SExt32To64;
   def SETBCR : XForm_XT5_BI5<31, 416, (outs gprc:$RT), (ins crbitrc:$BI),
----------------
Did you mean to mark all of these as `ZExt32To64`? The possible results are 0/1 for these. So I suppose they are technically both sign and zero extending.
Furthermore, the negative forms (`setnbc/setnbcr`) are not marked as sign extending even though the values are possible values are -1/0.


================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:1303
                                "mfvsrwz $rA, $XT", IIC_VecGeneral,
-                               [(set i32:$rA, (PPCmfvsr f64:$XT))]>;
+                               [(set i32:$rA, (PPCmfvsr f64:$XT))]>, ZExt32To64;
   // FIXME: Setting the hasSideEffects flag here to match current behaviour.
----------------
It would seem that the `vextu[bhw][lr]x` family of instructions have been missed and they're all zero extending.


================
Comment at: llvm/lib/Target/PowerPC/PPCMIPeephole.cpp:787-800
+          auto is64Bit = [](unsigned Opcode) {
+            return Opcode == PPC::EXTSH8 || Opcode == PPC::EXTSH8_32_64;
           };
           auto isXForm = [] (unsigned Opcode) {
             return Opcode == PPC::LHZX;
           };
           auto getSextLoadOp = [] (bool is64Bit, bool isXForm) {
----------------
This code is so unnecessarily lambda-ey. Can you please commit an NFC patch (I don't care if it is before or after this patch) to flatten this out?


================
Comment at: llvm/lib/Target/PowerPC/PPCMIPeephole.cpp:804-807
+          // The transformation from a zero-extending load to a sign-extending
+          // load is only legal when the displacement is a multiple of 4.
+          // If the displacement is not at least 4 byte aligned, don't perform
+          // the transformation.
----------------
Why? LHA is D-Form (not DS-Form) and LHAX of course doesn't care since it is X-Form.


================
Comment at: llvm/lib/Target/PowerPC/PPCMIPeephole.cpp:862-865
+          // The transformation from a zero-extending load to a sign-extending
+          // load is only legal when the displacement is a multiple of 4.
+          // If the displacement is not at least 4 byte aligned, don't perform
+          // the transformation.
----------------
This is true if we are transforming it to the DS-Form `lwa`. It shouldn't matter for the X-Form `lwax`. Perhaps this code should just set a `bool` variable `IsWordAligned` or something similar and then below, we break out:
```
if (IsWordAligned && (Opc == PPC::LWA || Opc == PPC::LWA_32))
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D40554/new/

https://reviews.llvm.org/D40554



More information about the llvm-commits mailing list