[PATCH] D28147: [LICM] Allow promotion of some stores that are not guaranteed to execute

Thu Dec 29 14:20:59 PST 2016

mkuper added inline comments.

================
Comment at: lib/Transforms/Scalar/LICM.cpp:910
   // It is safe to promote P if all uses are direct load/stores and if at
-  // least one is guaranteed to be executed.
-  bool GuaranteedToExecute = false;
+  // least one store is guaranteed to be executed.
+  bool PromotionIsLegal = false;
----------------
reames wrote:
> You don't need a store.  A load is sufficient for this part.
No, if the load is guaranteed to be executed, but the store is under control flow, then it's not safe to promote, unless you have additional information (next paragraph).

I can try to rewrite this whole thing, but I'm not sure I can make it both precise and clearer.

================
Comment at: lib/Transforms/Scalar/LICM.cpp:253
   // memory references to scalars that we can.
-  if (!DisablePromotion && (Preheader || L->hasDedicatedExits())) {
+  // Don't sink stores from loops without dedicated block exits. Exits
+  // containing indirect branches are not transformed by loop simplify,
----------------
reames wrote:
> You've got a scattering of unrelated trivial cleanup changes here (style fixes, clarifying comments on the original code, etc..).  Please separate those and submit them.  No further review needed, but I'd like to get them out of this patch for future rounds.
I'm not sure all of those changes makes sense while ExitBlocks are still computed lazily. I'll separate as many of them as reasonable, let me know if you still want things unbundled at the next iteration.

================
Comment at: lib/Transforms/Scalar/LICM.cpp:258
+  // is available.
+  if (!DisablePromotion && Preheader && L->hasDedicatedExits()) {
+    // Figure out the loop exits and their insertion points
----------------
reames wrote:
> Changing from the or and exit scheme to the and scheme should be NFC.  Please separate and land without further review.  (In particular, if this turns out *not* to be NFC, I want to be able to bisect to the refactoring.)
So, it's not NFC on its own, because it also requries the sinking the formLCSSARecursively() call out of the if.

Note that:
1) If Changed was true, then "Preheader || L->hasDedicatedExits()" is true, because the arguments to the or guard the previous transofrmations.
2) The "if (Changed)" on line 269 can be true even if promoteLoopAccessesToScalars() made no changes, since it's the same Changed variable, not one defined within the if block.
So we end up running formLCSSARecursively() for *any* change. Which works, by accident, but is really counter-intuitive. And would breaks if DisablePromotion actually happend to be true.

The two changes should be NFC when combined, though, so I'll commit that separately.

================
Comment at: lib/Transforms/Scalar/LICM.cpp:268
+
+    if (!HasCatchSwitch) {
+      SmallVector<Instruction *, 8> InsertPts;
----------------
reames wrote:
> You switching from a lazy to an eager computation of both exit blocks and insert pts.  In this case, it shouldn't matter since hasDedicatedExits also does the same (relatively expensive) computation of the loop exits.
> 
> At some point, we really do need to cache the loop exit walk.... (not for this change)
It's not just that - isGuaranteedToExecute() also does the same computation...
But yes, that's a separate change.

================
Comment at: lib/Transforms/Scalar/LICM.cpp:288
+  // loop used in the outer loop.
+  // FIXME: This is really heavy handed. It would be a bit better to use an
+  // SSAUpdater strategy during promotion that was LCSSA aware and reformed
----------------
reames wrote:
> Use of the Changed variable here is overly conservative.  We actually only need to run this if *store promotion* changed the loop.  If you wanted to do that as a a separate change, it would be a useful cleanup and compile time win.
I think that's actually wrong (and the comment is misleading, I should fix it)- see above for explanation.

I mean, I'm not saying that we *need* to run it in other cases (although I'm pretty sure we do), I'm saying we *did* run it in other cases, and I'd prefer this part to be NFC.

================
Comment at: lib/Transforms/Scalar/LICM.cpp:955
   bool Changed = false;
   for (const auto &ASI : AS) {
     Value *ASIV = ASI.getValue();
----------------
reames wrote:
> This loop is getting really complicated and hard to follow.  I think we need to find an alternate structure here which makes your changes easier to reason about.  
> 
> One idea would be to introduce two variables for the two different things we're trying to establish.  (i.e. LocationIsDereferencable, and CanInsertStoresOnExits.)  The key simplification is that we don't need to track *why* we know we can insert the stores.
> 
> I'm also open to other organizations here.
I agree. I'll see if I can simplify. 

================
Comment at: test/Transforms/LICM/scalar_promote.ll:208
+
+loop:
+  %j = phi i32 [ 0, %entry ], [ %next, %loop ]
----------------
reames wrote:
> Looking at this test, it really should be handled by the thread local code we already have.  Have you looked to see why it isn't?
> 
> In fact, looking at *all* of your tests, the thread local case should handle it...?
> 
> (Note: This is a very important question to that I'll want an answer to before we commit the semantics parts of this code.)
Why would it be handled by the thread-local code?
%local is an alloca, so it's not thread-local memory.

https://reviews.llvm.org/D28147