[PATCH] D37121: [DivRemHoist] add a pass to move div/rem pairs into the same block (PR31028)

Thu Sep 7 16:54:54 PDT 2017

hfinkel added inline comments.

================
Comment at: lib/Transforms/Scalar/DivRemPairs.cpp:80
+    BasicBlock *DivBB = DivInst->getParent();
+    if (RemBB == DivBB)
+      continue;
----------------
I'd like to remove this check. There are two reasons for this:

 1. Consistency (ending up with the expanded form if we started with the operations in different blocks, but not if they started in the same block, seems unfortunate).

 2. To allow us to get optimal results reliably in the backend in a straightforward way. For example, if a target does not have a legal rem instruction, we get the desired behavor because the rem will be legalized into the code with the div, and SDAG will reuse an existing identical div in the same block. If there is a legal rem instruction, however, this doesn't work automatically. In fact, the PowerPC backend has this comment:

  // PowerPC has no SREM/UREM instructions unless we are on P9
  // On P9 we may use a hardware instruction to compute the remainder.
  // The instructions are not legalized directly because in the cases where the
  // result of both the remainder and the division is required it is more
  // efficient to compute the remainder from the result of the division rather
  // than use the remainder instruction.
  if (Subtarget.isISA3_0()) {
    setOperationAction(ISD::SREM, MVT::i32, Custom);
    setOperationAction(ISD::UREM, MVT::i32, Custom);
    setOperationAction(ISD::SREM, MVT::i64, Custom);
    setOperationAction(ISD::UREM, MVT::i64, Custom);
  } else {
    setOperationAction(ISD::SREM, MVT::i32, Expand);
    setOperationAction(ISD::UREM, MVT::i32, Expand);
    setOperationAction(ISD::SREM, MVT::i64, Expand);
    setOperationAction(ISD::UREM, MVT::i64, Expand);
  }

and if we always expand here, I think we can clean this up.

I think that you can do this just by writing:

  if (HasDivRemOp && RemBB == DivBB)
    continue;

  bool DivDominates = DT.dominates(DivInst, RemInst);

what Danny said earlier about the instruction-level dominance checks being expensive within the same block is true, but there's no extra expense compared to the BB version if the instructions are different BBs. EarlyCSE should ensure that there are at most one div or rem with any particular set of arguments, so I don't think we need to do anything special to deal with algorithmic complexity issues (although if we're really concerned about that, we can use OrderedInstructions).

================
Comment at: lib/Transforms/Scalar/DivRemPairs.cpp:93
+      // remainder when the target has support for it would be wasteful and
+      // potentially misleading to other passes.
+      if (DivDominates)
----------------
I wouldn't phrase this as misleading. Arguably, it's more misleading this way. The point is that, if you decompose here, some later pass might disturb the pattern such that it is not recognizable in the backend.

https://reviews.llvm.org/D37121