[PATCH] D40554: [PowerPC] Fix bugs in sign-/zero-extension elimination
Nemanja Ivanovic via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 4 15:02:31 PDT 2022
nemanjai added a comment.
As tends to be the case with huge patches such as this, it is difficult to get this patch to converge and it still needs a bit of work. But I think it is pretty close overall.
================
Comment at: llvm/lib/Target/PowerPC/PPCInstr64Bit.td:897
defm CNTLZW8 : XForm_11r<31, 26, (outs g8rc:$rA), (ins g8rc:$rS),
- "cntlzw", "$rA, $rS", IIC_IntGeneral, []>;
+ "cntlzw", "$rA, $rS", IIC_IntGeneral, []>, ZExt32To64;
defm CNTTZW8 : XForm_11r<31, 538, (outs g8rc:$rA), (ins g8rc:$rS),
----------------
I don't follow why this is only marked as zero-extended (whereas for example, `andi.` is marked as both). The 32-bit sign bit can only be a zero, so a sign-extend from 32-bits (`extsw`) is guaranteed to essentially be a nop.
================
Comment at: llvm/lib/Target/PowerPC/PPCInstr64Bit.td:962
"cntlzd", "$rA, $rS", IIC_IntGeneral,
- [(set i64:$rA, (ctlz i64:$rS))]>;
+ [(set i64:$rA, (ctlz i64:$rS))]>, ZExt32To64;
defm CNTTZD : XForm_11r<31, 570, (outs g8rc:$rA), (ins g8rc:$rS),
----------------
Similar comment as for the word versions regarding sign extension..
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5234
+ switch (MI->getOpcode()) {
case PPC::COPY: {
+ Register SrcReg = MI->getOperand(1).getReg();
----------------
I still don't like the deep nesting in this block. It seems like we can get rid of all that deep nesting by rewriting this block as:
```
case PPC::COPY: {
Register SrcReg = MI->getOperand(1).getReg();
// In both ELFv1 and v2 ABI, method parameters and the return value
// are sign- or zero-extended.
const MachineFunction *MF = MI->getMF();
if (!MF->getSubtarget<PPCSubtarget>().isSVR4ABI()) {
// If this is a copy from another register, we recursively check source.
auto SrcExt = isSignOrZeroExtended(SrcReg, BinOpDepth, MRI);
return std::pair<bool, bool>(SrcExt.first || IsSExt,
SrcExt.second || IsZExt);
}
const PPCFunctionInfo *FuncInfo = MF->getInfo<PPCFunctionInfo>();
// We check the ZExt/SExt flags for a method parameter.
if (MI->getParent()->getBasicBlock() ==
&MF->getFunction().getEntryBlock()) {
Register VReg = MI->getOperand(0).getReg();
if (MF->getRegInfo().isLiveIn(VReg)) {
IsSExt |= FuncInfo->isLiveInSExt(VReg);
IsZExt |= FuncInfo->isLiveInZExt(VReg);
return std::pair<bool, bool>(IsSExt, IsZExt);
}
}
// For a method return value, we check the ZExt/SExt flags in attribute.
// We assume the following code sequence for method call.
// ADJCALLSTACKDOWN 32, implicit dead %r1, implicit %r1
// BL8_NOP @func,...
// ADJCALLSTACKUP 32, 0, implicit dead %r1, implicit %r1
// %5 = COPY %x3; G8RC:%5
if (SrcReg != PPC::X3) {
// If this is a copy from another register, we recursively check source.
auto SrcExt = isSignOrZeroExtended(SrcReg, BinOpDepth, MRI);
return std::pair<bool, bool>(SrcExt.first || IsSExt,
SrcExt.second || IsZExt);
}
const MachineBasicBlock *MBB = MI->getParent();
MachineBasicBlock::const_instr_iterator II =
MachineBasicBlock::const_instr_iterator(MI);
if (II == MBB->instr_begin() || (--II)->getOpcode() != PPC::ADJCALLSTACKUP)
return std::pair<bool, bool>(IsSExt, IsZExt);
const MachineInstr &CallMI = *(--II);
if (!CallMI.isCall() || !CallMI.getOperand(0).isGlobal())
return std::pair<bool, bool>(IsSExt, IsZExt);
const Function *CalleeFn =
dyn_cast<Function>(CallMI.getOperand(0).getGlobal());
if (!CalleeFn)
return std::pair<bool, bool>(IsSExt, IsZExt);
const IntegerType *IntTy = dyn_cast<IntegerType>(CalleeFn->getReturnType());
if (IntTy && IntTy->getBitWidth() <= 32) {
const AttributeSet &Attrs = CalleeFn->getAttributes().getRetAttrs();
IsSExt |= Attrs.hasAttribute(Attribute::SExt);
IsZExt |= Attrs.hasAttribute(Attribute::ZExt);
}
return std::pair<bool, bool>(IsSExt, IsZExt);
}
```
I have neither compiled nor tested this, so please scrutinize it carefully rather than assuming that my suggestion is verified.
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5290-5292
+ auto SrcExt = isSignOrZeroExtended(SrcReg, BinOpDepth, MRI);
+ return std::pair<bool, bool>(SrcExt.first || IsSExt,
+ SrcExt.second || IsZExt);
----------------
Can any of this ever produce anything useful? If I am reading this correctly, this is basically:
- The copy is from X3
- We check if the definition of X3 is sign/zero extended
But this seems pointless since we won't track anything about what defined X3 here.
Edit: actually, the above is so deeply nested that I missed the fact that this code is for non-SVR4 ABI's.
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5335
// The input registers for others are operand 1 and 2.
unsigned E = 3, D = 1;
+ if (MI->getOpcode() == PPC::PHI) {
----------------
I understand that these names predate this patch, but since we are modifying it, can we give them more descriptive names? Perhaps `OperandEnd, OperandStride`?
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5119
+
+ // If at least one bit from left in a lower word is masked,
+ // all of 0 to 32-th bits of the output are cleared.
----------------
s/masked/masked out
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:5180-5183
+ const PPCInstrInfo *TII =
+ MI->getMF()->getSubtarget<PPCSubtarget>().getInstrInfo();
+ if (TII->isZExt32To64(Opcode))
+ return true;
----------------
Just for consistency, this check is first in the sign extending check... You should probably move it up in this function so it is consistent with that.
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrInfo.cpp:3859
+ auto SrcExt = isSignOrZeroExtended(SrcReg, BinOpDepth, MRI);
+ return std::pair<bool,bool>(SrcExt.first | IsSExt, SrcExt.second | IsZExt);
+ }
----------------
nemanjai wrote:
> These can probably be logical or's as well. Same with the below ones.
Is there a reason this comment was not addressed and these are still bitwise OR's?
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrP10.td:1322
def SETBC : XForm_XT5_BI5<31, 384, (outs gprc:$RT), (ins crbitrc:$BI),
- "setbc $RT, $BI", IIC_IntCompare, []>;
+ "setbc $RT, $BI", IIC_IntCompare, []>, SExt32To64;
def SETBCR : XForm_XT5_BI5<31, 416, (outs gprc:$RT), (ins crbitrc:$BI),
----------------
Did you mean to mark all of these as `ZExt32To64`? The possible results are 0/1 for these. So I suppose they are technically both sign and zero extending.
Furthermore, the negative forms (`setnbc/setnbcr`) are not marked as sign extending even though the values are possible values are -1/0.
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:1303
"mfvsrwz $rA, $XT", IIC_VecGeneral,
- [(set i32:$rA, (PPCmfvsr f64:$XT))]>;
+ [(set i32:$rA, (PPCmfvsr f64:$XT))]>, ZExt32To64;
// FIXME: Setting the hasSideEffects flag here to match current behaviour.
----------------
It would seem that the `vextu[bhw][lr]x` family of instructions have been missed and they're all zero extending.
================
Comment at: llvm/lib/Target/PowerPC/PPCMIPeephole.cpp:787-800
+ auto is64Bit = [](unsigned Opcode) {
+ return Opcode == PPC::EXTSH8 || Opcode == PPC::EXTSH8_32_64;
};
auto isXForm = [] (unsigned Opcode) {
return Opcode == PPC::LHZX;
};
auto getSextLoadOp = [] (bool is64Bit, bool isXForm) {
----------------
This code is so unnecessarily lambda-ey. Can you please commit an NFC patch (I don't care if it is before or after this patch) to flatten this out?
================
Comment at: llvm/lib/Target/PowerPC/PPCMIPeephole.cpp:804-807
+ // The transformation from a zero-extending load to a sign-extending
+ // load is only legal when the displacement is a multiple of 4.
+ // If the displacement is not at least 4 byte aligned, don't perform
+ // the transformation.
----------------
Why? LHA is D-Form (not DS-Form) and LHAX of course doesn't care since it is X-Form.
================
Comment at: llvm/lib/Target/PowerPC/PPCMIPeephole.cpp:862-865
+ // The transformation from a zero-extending load to a sign-extending
+ // load is only legal when the displacement is a multiple of 4.
+ // If the displacement is not at least 4 byte aligned, don't perform
+ // the transformation.
----------------
This is true if we are transforming it to the DS-Form `lwa`. It shouldn't matter for the X-Form `lwax`. Perhaps this code should just set a `bool` variable `IsWordAligned` or something similar and then below, we break out:
```
if (IsWordAligned && (Opc == PPC::LWA || Opc == PPC::LWA_32))
```
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D40554/new/
https://reviews.llvm.org/D40554
More information about the llvm-commits
mailing list