[PATCH] D67392: [ARM][ParallelDSP] Change smlad insertion order

Fri Sep 13 12:09:34 PDT 2019

efriedma added inline comments.

================
Comment at: lib/Target/ARM/ARMParallelDSP.cpp:654
+    else
+      V = DT->dominates(cast<Instruction>(A), cast<Instruction>(B)) ? B : A;
+
----------------
samparker wrote:
> efriedma wrote:
> > You decided not to modify this dominates() call?
> I assumed it wouldn't be worth it... My understanding is that I'd have to create a new ordered block each time, because the block will change after each call.
My general concern here would be that the pass is O(N^2) in the number of transformations in a given BB.  (If you unroll a loop containing a transformable operation N times, for example.)  This contributes to that, constructing the OrderedBB later in InsertParallelMACs contributes to that.  But there are a bunch of other places with similar issues... for example, RecordMemoryOps has a loop that's O(N^2) in the number of loads.

Actually, thinking about it a bit more, I have another concern; you might not be picking a legal insertion point.  The instruction that produces the accumulator could be anything: for example, a phi, or an invoke, or an exception-handling instruction.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67392/new/

https://reviews.llvm.org/D67392