[PATCH] D29896: [BypassSlowDivision] Refactor fast division insertion logic (NFC)

Justin Lebar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 15 10:18:48 PST 2017


jlebar added inline comments.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:80
+  unsigned Opcode;
+  Instruction *SlowDiv;
+  IntegerType *SlowType;
----------------
n.bozhenov wrote:
> jlebar wrote:
> > This is actually SlowInstr or InstrToReplace or SlowDivOrRem -- we should be careful here and elsewhere not to call things divs that may be rems.
> During this optimization we generally use 'slow division' to refer to both Div and Rem operations. I believe that strictly following your suggestion would be overkill. For example, the file itself and its main entry point are named BypassSlowDivision.cpp and llvm::bypassSlowDivision. I'm not sure if we should rename them into something like bypassSlowDivisionOrRemainder.
> 
> To make the naming somewhat less confusing I suggest using the full word (Division) to refer to both Div and Rem operations, and use the abbreviations (Div/Rem) to refer to specific operation kinds. According to that, SlowDiv should be renamed into SlowDivision, FastDivInsertionTask into FastDivisionInsertionTask, and isDivisionOp into isDivOp. What do you think?
> 
> During this optimization we generally use 'slow division' to refer to both Div and Rem operations. I believe that strictly following your suggestion would be overkill. For example, the file itself and its main entry point are named BypassSlowDivision.cpp and llvm::bypassSlowDivision. I'm not sure if we should rename them into something like bypassSlowDivisionOrRemainder.

I agree that renaming the file and pass would be overkill.  But I think we can still improve readability by being just a little less strict about the suggestion.

Even if we had a comment explaining that "division" == "div or mod" at the top of the file, that would be us inventing a new naming idiom, which is going to add to the cognitive cost to users of understanding our code.  And realistically speaking, a large fraction of people aren't going to notice this comment at all, in which case they're just going to be lost.

In contrast, `SlowDivOrRem` is the same number of characters as `SlowDivision`, but is unambiguous.  I'd also be OK with `SlowInstr` if you prefer that.

Again I don't think we need to be totally strict about the rule.  I don't think that `FastDivInsertionTask` is ambiguous, for example, although perhaps a better name is FastDivRemInsertionTask would more closely match what it does (it always inserts both instrs).  But the `SlowDiv` member here is, I think, ambiguous, because some of the members here *are* necessarily divs.  Similarly, I think `createDivRuntimeCheck` would be less ambiguous as `insertOperandRuntimeCheck` or maybe something like `insertFastSlowBranch`, and `insertFastDiv` would be better named `insertFastDivAndRem`.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:346
+  FastDivInserter FDI(BypassWidths);
+  FastDivInsertionTask Task;
 
----------------
n.bozhenov wrote:
> jlebar wrote:
> > n.bozhenov wrote:
> > > jlebar wrote:
> > > > I'm totally behind the notion of creating classes that encapsulate our temporary state.  But I'm not wild about the way we use FastDivInsertionTask as basically a bucket of *mutable* state.  We have had really nasty bugs in LLVM with similar designs where we forget to reset one piece of the mutable state.  (I think Chandler was saying that some pass effectively had a global cost cap instead of a per-function cost cap, because they just forgot to set the cost to 0 at the end of the function.)
> > > > 
> > > > I went through a few iterations, and the one that seems best to me right now is to get rid of FastDivInserter, which is already very simple, and make FastDivInsertionTask a one-shot class.
> > > > 
> > > >   DivCacheTy Cache;
> > > >   while (Next != nullptr) {
> > > >     Instruction *I = Next;
> > > >     Next = Next->getNextNode();
> > > > 
> > > >     if (Value *Replacement = FastDivInsertionTask(I, Cache, BypassWidths).getReplacement()) {
> > > >       I->RAUW(Replacement);
> > > >       MadeChange |= true;
> > > >     }
> > > >   }
> > > > 
> > > > WDYT?
> > > Hi Justin,
> > > 
> > > Thanks for reviewing the code so quickly. And I really like the code sample you suggested. If I understand correctly, your main concern is that in my code the Task object gets reused for different instructions during the pass, isn't it? If this is the case, then moving the code from FastDivInserter into FastDivInsertionTask constructor is indeed one of the possible solutions.
> > > 
> > > Another approach is to create a new Task object for each instruction inside an FastDivInserter object and never expose it outside. Something like this:
> > > 
> > > ```
> > > // In bypassSlowDivision:
> > > FastDivInserter FDI;
> > > while (Next != nullptr) {
> > >   Instruction *I = Next;
> > >   Next = Next->getNextNode();
> > >   MadeChange |= FDI.tryReplaceSlowDiv(I);
> > > }
> > > ```
> > > 
> > > ```
> > > // In FastDivInserter::tryReplaceSlowDiv:
> > > FastDivInsertionTask Task(I, BypassWidths);
> > > if (!Task.isSlowDivision())
> > >   return false;
> > > 
> > > Value *R = Task.getReplacement(Cache);
> > > if (!R)
> > >   return false;
> > > 
> > > I->RAUW(R);
> > > return true;
> > > ```
> > > 
> > > Currently I'm playing with these two approaches and will come with an updated patch by tomorrow. Please, comment if you're strongly in favour of one of these approaches.
> > > 
> > > If I understand correctly, your main concern is that in my code the Task object gets reused for different instructions during the pass, isn't it?
> > 
> > Yes.
> > 
> > I'm also a little concerned about the mutable state inside of the Task object used for passing between functions, but I wanted to get this part figured out first.
> > 
> > > Another approach is to create a new Task object for each instruction inside an FastDivInserter object and never expose it outside
> > 
> > Right.  If FastDivInserter has nontrivial complexity that it doesn't make sense to push into FastDivInsertionTask or the outer pass function body, this might make sense.  When I looked at it, I didn't think it did.
> I have updated the patch. I have got rid of FastDivInserter class. And the Task is not reused any more, but a new object is constructed for each instruction.
> 
> As for mutable fields used for passing data between functions, it's possible to get rid of them as well. We could define
> 
> ```
> struct IncomingDivRemPair {
>   BasicBlock *BB; //< PHINode predecessor for the following values.
>   Value *Quotient;
>   Value *Remainder;
> };
> ```
> 
> and return such a structure by value from createSlowBB and createFastBB methods. Later, we could pass a pair of such structures into createDivRemPhiNodes to make the latter both more flexible and easier to understand. The only reason I haven't done this yet is that I don't like very much returning structures by value. Do you think it is worth doing?
> 
Oh yes, I like this much better.

> The only reason I haven't done this yet is that I don't like very much returning structures by value.

If it makes you feel any better, the calling convention returns large structs "by outparam".  :)


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:80
+  bool IsSlowDivision;
+  unsigned Opcode;
+  Instruction *SlowDiv;
----------------
Do we really need this field, since it's just SlowDiv->getOpcode()?


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:82
+  Instruction *SlowDiv;
+  IntegerType *SlowType;
+  IntegerType *BypassType;
----------------
Similarly, do we need this field, since it can be derived from SlowDiv?

You could have a member function for it if it's annoying to get N separate times.  I think a member function would be better because it makes it explicit that this is SlowDiv->getType().


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:84
+  IntegerType *BypassType;
+  BasicBlock *MainBB;
+  Value *Dividend;
----------------
Similarly for the MainBB field.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:86
+  Value *Dividend;
+  Value *Divisor;
+  DivCacheTy &Cache;
----------------
Similarly for the Dividend and Divisor fields.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:98
+  BasicBlock *createFastBB(BasicBlock *Successor);
+  void createDivRemPhiNodes(BasicBlock *ShortBB, BasicBlock *LongBB,
+                            BasicBlock *PhiBB);
----------------
FastBB and SlowBB, to match the functions right above?


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:118
+    return Opcode == Instruction::SDiv || Opcode == Instruction::UDiv;
+  }
+};
----------------
Can some of these be made private and/or deleted now?


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:126
+      ShortRemainderV(nullptr), LongQuotientV(nullptr),
+      LongRemainderV(nullptr) {
+  SlowDiv = I;
----------------
Consider using inline initializers for these, otherwise it's easy to add a member variable and forget to initialize it.

  class Foo {
    void* ptr = nullptr;
  };


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:128
+  SlowDiv = I;
+  Opcode = I->getOpcode();
+
----------------
Can we do this in the initialization list?


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:146
+  SlowType = cast<IntegerType>(I->getType());
+  unsigned bitwidth = SlowType->getBitWidth();
+
----------------
Perhaps

  // Skip division on vector types; only optimize integer instructions.
  auto* SlowType = dyn_cast<IntegerType>(I->getType());
  if (!SlowType) return;
  auto BI = BypassWidths.find(SlowType->getBitWidth());

(Make SlowType a local variable, use dyn_cast instead of checking twice, and use auto for the iterator type.  Also get rid of the comma splice in the comment and add a period.)


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:148
+
+  // Skip if bitwidth is not bypassed
+  BypassWidthsTy::const_iterator BI = BypassWidths.find(bitwidth);
----------------
Nit, please add periods at the end of sentences.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:166
+
+/// Reuses previously computed dividend or remainder from the current BB if
+/// operands and operation are identical. Otherwise calls insertFastDiv to
----------------
previously-computed


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:177
+  DivOpInfo Key(isSignedDiv(), Dividend, Divisor);
+  DivCacheTy::iterator CacheI = Cache.find(Key);
+
----------------
auto


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:181
+    // If previous instance does not exist, try to insert fast div
+    insertFastDiv();
+    CacheI = Cache.find(Key);
----------------
It seems like much less of a layering violation to me if insertFastDiv (possibly renamed to `createFastDivOrRem`) would return Optional<std::pair<Value*, Value*>>.  Then we would update the cache here.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:185
 
-  // If the numerator is a constant, bail if it doesn't fit into BypassType.
-  if (ConstantInt *ConstDividend = dyn_cast<ConstantInt>(Dividend))
-    if (ConstDividend->getValue().getActiveBits() > BypassType->getBitWidth())
-      return false;
+  // If it's still not in Cache, there's nothing we can do.
+  if (CacheI == Cache.end())
----------------
s/Cache/cache/


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:196
+
+/// Add new basic block for slow divide operation and put it before SuccessorBB.
+BasicBlock *FastDivInsertionTask::createSlowBB(BasicBlock *SuccessorBB) {
----------------
slow div and rem operations


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:214
+
+/// Add new basic block for fast divide operation and put it before SuccessorBB.
+BasicBlock *FastDivInsertionTask::createFastBB(BasicBlock *SuccessorBB) {
----------------
fast div and rem operations


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:224
 
   // udiv/urem because optimization only handles positive numbers
+  Value *ShortQV = Builder.CreateUDiv(ShortDividendV, ShortDivisorV);
----------------
While we're here, probably should add missing word: "because *this* optimization only handles positive numbers"


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:253
+
+/// Creates a run-time check to test whether both operands fit the shorter type.
+/// The check is inserted at the end of MainBB.
----------------
"Creates a runtime check to test whether both the divisor and dividend fit into BypassType"?


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:255
+/// The check is inserted at the end of MainBB.
+/// True return value means that the operands fit.
+Value *FastDivInsertionTask::createDivRuntimeCheck() {
----------------
Nit, a linebreak without any intervening whitespace like we have here isn't a meaningful punctuation.  Please flow as a single paragraph, or split into two paragraphs by inserting a blank line.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:282
+/// operands and uses a shorter-faster div/rem instruction when possible and the
+/// longer-slower div/rem instruction otherwise.
+void FastDivInsertionTask::insertFastDiv() {
----------------
This sentence runs on.  We could remove "and the longer-slower div/rem instruction otherwise."  Otherwise, can you split into two sentences?


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:295
 
-  // Remove redundant operation
-  I->eraseFromParent();
-  return true;
+  // Basic Block is split before divide
+  BasicBlock *SuccessorBB = MainBB->splitBasicBlock(SlowDiv);
----------------
Maybe "Split the basic block before the div/rem."


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:297
+  BasicBlock *SuccessorBB = MainBB->splitBasicBlock(SlowDiv);
+  MainBB->getInstList().back().eraseFromParent();
+  BasicBlock *FastBB = createFastBB(SuccessorBB);
----------------
This line could use a comment, I think.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:306
 
-// bypassSlowDivision - This optimization identifies DIV instructions in a BB
-// that can be profitably bypassed and carried out with a shorter, faster
-// divide.
-bool llvm::bypassSlowDivision(
-    BasicBlock *BB, const DenseMap<unsigned int, unsigned int> &BypassWidths) {
-  DivCacheTy DivCache;
+/// This optimization identifies DIV instructions in a BB that can be profitably
+/// bypassed and carried out with a shorter, faster divide.
----------------
Perhaps we should update this comment.


================
Comment at: lib/Transforms/Utils/BypassSlowDivision.cpp:324
+      I->eraseFromParent();
+      MadeChange |= true;
+    }
----------------
I know I wrote it this way initially, but now that I see it...`MadeChange = true`?  :)


https://reviews.llvm.org/D29896





More information about the llvm-commits mailing list