[PATCH] D93370: [PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns.

Sat Mar 20 15:42:43 PDT 2021

nemanjai requested changes to this revision.
nemanjai added a comment.
This revision now requires changes to proceed.

Thank you for handling this much needed refactoring. My comments are mostly related to readability so this is very close to approval, but let's have another look when you address the comments.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1425
+/// initializeAddrModeMap - Initialize the map that relates the different
+/// instruction formats of load and store instructions to a set of flags.
+/// This ensures the load/store instruction is correctly matched during
----------------
s/instruction formats/addressing modes

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1430
+  AddrModesMap[PPC::AM_DForm] = {
+      PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_WordInt,
+      PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt,
----------------
I think you should add some comments regarding sample instructions that correspond to these sets of flags. For example:
```
PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_WordInt,    // LWZ
PPC::MOF_ZExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt, // LBZ, LHZ
PPC::MOF_SExt | PPC::MOF_RPlusSImm16 | PPC::MOF_SubWordInt, // LHA
...
```

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:2480

+/// provablyDisjointOr - used when computing address flags for selecting
+/// loads and stores. If we have an OR, check if the LHS and RHS are provably
----------------
I think we no longer need to put the name of the function in Doxygen comments.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:2482-2483
+/// loads and stores. If we have an OR, check if the LHS and RHS are provably
+/// disjoint. This is for when we have an OR of disjoint bitfields, we can
+/// codegen it as an add (for better address arithmetic).
+static bool provablyDisjointOr(SelectionDAG &DAG, const SDValue &N) {
----------------
Replace:
```
This is for when we have an OR of disjoint bitfields, we can codegen it as an add (for better address arithmetic).
```
with
```
An OR of two provably disjoint values is equivalent to an ADD. Most PPC load/store instructions compute the effective address as a sum, so doing this conversion is useful.
```

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16781
+      return PPC::AM_DQForm;
+  // If no other forms are selected, return an X-Form instructions can
+  // always be matched.
----------------
s/return an X-Form instructions can always be matched/return an X-Form as it is the most general addressing mode

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16788-16789
+/// the address flags of the load/store instruction that is to be matched.
+/// The address flags are stored in a map, which is then searched through
+/// to determine the optimal load/store instruction format to match by.
+unsigned PPCTargetLowering::computeMOFlags(const SDNode *Parent, SDValue N,
----------------
Remove this part of the comment. That is not done here AFAICT.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16799
+    FlagSet |= PPC::MOF_SubtargetP9;
+    if (Subtarget.isISA3_1())
+      FlagSet |= PPC::MOF_SubtargetP10;
----------------
I think this should be checking for prefixed instructions. It is entirely possible to have
`-mcpu=pwr9 -Xclang -target-feature -Xclang -prefix-instrs` (or `llc -mattr=-prefix-instrs`).

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16812
+
+  auto computeFlagsForAddressComputation = [&](SDValue N) {
+    if (ConstantSDNode *CN = dyn_cast<ConstantSDNode>(N)) {
----------------
This is a large lambda that is only called once. AFAICT, it only captures `FlagSet`. I think it is probably better off as a `static` function and `FlagSet` can be passed by reference.

Removing it will also remove `SetAlignFlagsForImm` from this function and the whole thing should become significantly more readable.

Also as a minor nit, you are less likely to have these mismatches in naming/capitalization such as between `SetAlignFlagsForImm` and `computeFlagsForAddressComputation`. Both are lambdas, one is capitalized as a variable and the other as a function - understandable omission since a lambda is both.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16873
+      FlagSet |= PPC::MOF_DoubleWordInt;
+  } else if (MemVT.isVector() && !MemVT.isFloatingPoint()) { // Vectors only.
+    if (Size == 128)
----------------
s/Vectors only/Integer vectors.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16884
+    else if (MemVT == MVT::f128 || MemVT.isVector())
+      FlagSet |= PPC::MOF_Vector;
+  }
----------------
Seems like we should have another unreachable here in case we end up with an illegal floating point type such as half precision or some weird FP type (such as Intel's 80-bit FP).

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16907
+
+  // For integers, no extension is the same as zero extension.
+  if (MemVT.isScalarInteger() && (FlagSet & PPC::MOF_NoExt)) {
----------------
Add this to the comment:
```
// We set the extension mode to zero extension so we don't have
// to add separate entries in AddrModesMap for stores and loads.
```

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16913-16914
+
+  // Prior to P10, constants that fit in 34-bits on should be marked with
+  // `PPC::MOF_NotAddNorCst` to match by D-Form.
+  if (N.getOpcode() != ISD::ADD && N.getOpcode() != ISD::OR &&
----------------
```
// If we don't have prefixed instructions, 34-bit constants should be
// treated as PPC::MOF_NotAddNorCst so they can match D-Forms.
```

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16915-16918
+  if (N.getOpcode() != ISD::ADD && N.getOpcode() != ISD::OR &&
+      (FlagSet & PPC::MOF_RPlusSImm34) && !(FlagSet & PPC::MOF_AddrIsSImm32) &&
+      !(FlagSet & PPC::MOF_SubtargetP10))
+    FlagSet |= PPC::MOF_NotAddNorCst;
----------------
```
  bool Is34BitConstNoP10 =
      (PPC::MOF_RPlusSImm34 | PPC::MOF_AddrIsSImm32 | PPC::MOF_SubtargetP10) &
      FlagSet == PPC::MOF_RPlusSImm34;
  if (N.getOpcode() != ISD::ADD && N.getOpcode() != ISD::OR &&
      IsNonP1034BitConst)
    FlagSet |= PPC::MOF_NotAddNorCst;
```

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:16985
+      SDValue Op1 = N.getOperand(1);
+      if (isIntS16Immediate(Op1, Imm) && (!Align || isAligned(*Align, Imm))) {
+        Disp = DAG.getTargetConstant(Imm, DL, N.getValueType());
----------------
This condition seems superfluous. Why do we check that the operand is a signed 16-bit constant (and then create another constant with the same value) when we have `PPC::MOF_RPlusSImm16` set? Doesn't the flag already assure us that this is so?

Can we not just assert that this is so?

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:17020
+      // Handle 32-bit sext immediate with LIS + Addr mode.
+      if ((CNType == MVT::i32 || (int64_t)CNImm == (int)CNImm) &&
+          (!Align || isAligned(*Align, CNImm))) {
----------------
Why can't this be
`(CNType == MVT::i32 || isInt<32>(CNImm))`

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:17022
+          (!Align || isAligned(*Align, CNImm))) {
+        int Addr = (int)CNImm;
+        // Otherwise, break this down into LIS + Disp.
----------------
For purposes where the width of the value matters, please refrain from using types with an implicit size and use explicitly sized types (i.e. `int32_t` vs. `int`, `int16_t` vs. `short`).

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.h:678
+
+      // Type extend type flags.
+      MOF_SExt = 1,
----------------
`// Extension mode for integer loads.`

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.h:684-692
+      MOF_NotAddNorCst = 1 << 5,
+      MOF_RPlusSImm16 = 1 << 6,
+      MOF_RPlusLo = 1 << 7,
+      MOF_RPlusSImm16Mult4 = 1 << 8,
+      MOF_RPlusSImm16Mult16 = 1 << 9,
+      MOF_RPlusSImm34 = 1 << 10,
+      MOF_RPlusR = 1 << 11,
----------------
```
      MOF_NotAddNorCst = 1 << 5,      // Not const. or sum of ptr and scalar.
      MOF_RPlusSImm16 = 1 << 6,       // Reg plus signed 16-bit constant.
      MOF_RPlusLo = 1 << 7,           // Reg plus signed 16-bit relocation
      MOF_RPlusSImm16Mult4 = 1 << 8,  // Reg plus 16-bit signed multiple of 4.
      MOF_RPlusSImm16Mult16 = 1 << 9, // Reg plus 16-bit signed multiple of 16.
      MOF_RPlusSImm34 = 1 << 10,      // Reg plus 34-bit signed constant.
      MOF_RPlusR = 1 << 11,           // Sum of two variables.
      MOF_PCRel = 1 << 12,            // PC-Relative relocation.
      MOF_AddrIsSImm32 = 1 << 13,     // A simple 32-bit constant.
```

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.h:698-699
+      MOF_DoubleWordInt = 1 << 17,
+      MOF_ScalarFloat = 1 << 18,
+      MOF_Vector = 1 << 19,
+      MOF_Vector256 = 1 << 20,
----------------
```
      MOF_ScalarFloat = 1 << 18, // Scalar single or double precision.
      MOF_Vector = 1 << 19,      // Vector types and quad precision scalars.
```

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.h:1385
+    /// the address flags of the load/store instruction that is to be matched.
+    /// The address flags are are stored in a map, which is then searched
+    /// through to determine the optimal load/store instruction format.
----------------
s/are are/are

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93370/new/

https://reviews.llvm.org/D93370