[PATCH] D27508: [CodeGen] Make MachineInstr::isIdenticalTo() symmetric.

Thu Dec 15 02:34:53 PST 2016

bjope added a comment.

Thanks for your feedback Andy!

================
Comment at: lib/CodeGen/MachineInstr.cpp:1003
   if (isBundle()) {
     // Both instructions are bundles, compare MIs inside the bundle.
     MachineBasicBlock::const_instr_iterator I1 = getIterator();
----------------
andrew.w.kaylor wrote:
> The getOpcode() check above should guarantee that Other.isBundle() will return true, but it would be nice to mention that explicitly in the comment here.  As someone reading this code for the first time, I saw one call to isBundle() followed by a comment saying "Both instructions are bundles" and it took me a minute to understand how we could know that.
Ok, I agree, such an explanation would be good. I actually got confused in the same way myself.
I can update the comment as part of this patch (even though my initial thought was to make a rather minimal fix for the actual fault).

================
Comment at: lib/CodeGen/MachineInstr.cpp:1005
     MachineBasicBlock::const_instr_iterator I1 = getIterator();
-    MachineBasicBlock::const_instr_iterator E1 = getParent()->instr_end();
+    MachineBasicBlock::const_instr_iterator E1 = getBundleEnd(I1);
     MachineBasicBlock::const_instr_iterator I2 = Other.getIterator();
----------------
andrew.w.kaylor wrote:
> These calls to getBundleEnd() are introducing a bit of redundancy.  The implementation of getBundleEnd() iterates from the given instruction until it finds an MI that returns false for isBundledWithSucc().  It should always be a small redundancy, but it could be eliminated by just checking isBundledWithSucc() here.
I'll see if I can do something with isBundledWithSucc instead of using end-iterators. But for someone looking at the code for the first time, then I think using end-iterators are easier to understand (we are after all using iterators for stepping between instructions, and we must be really careful not to dereference the end-iterator etc).

(The old code that were using the end iterator for the basic block was really confusing. And I thought that using the bundle-end-iterator at least made the code less error-prone even if it includes a small redundancy.)

================
Comment at: lib/CodeGen/MachineInstr.cpp:1011
       ++I2;
-      if (I2 == E2 || !I2->isInsideBundle() || !I1->isIdenticalTo(*I2, Check))
+      if (I1 == E1 || I2 == E2)
+        break;
----------------
andrew.w.kaylor wrote:
> I don't really like the way this loop is defined with the actual expected condition for return being checked outside the loop.  You could avoid that if you rewrite it like this:
> ```
> // Loop until we reach the end of both bundles.
> while (I1->isBundledWithSucc() && I2->isBundledWithSucc()) {
>   // If we've reached the end of just one of the two bundles, but not both,
>   // the instructions are not identical.
>   if (!I1->isBundledWithSucc() || !I2->isBundledWithSucc())
>     return false;
>   ...
> }
> 
> ```
I should probably have explained why my loop was designed that way :-)

(1)
My loop also handles the weird case that one bundle is empty (I'm not sure if that is legal, but there are no asserts, and I've seen this happen in our downstream backend when doing late optimisations on bundled code, removing instructions from a bundle). Even though it could be seen as a fault to have empty bundles, my loop is defensive and works even for such weird situations.
With your suggestion I either need to add asserts before the while loop that both bundles are non-empty (if empty bundles should be considered as illegal), or deem the bundles as different if only one of them is empty. The latter check is the one that you moved into the loop, so I would need to duplicate that check outside the loop. So this is one reason why I decided to go for an ugly "while(true)" loop and only have the logic for "reached the end of both bundles" in one place.

(2)
When I look into the future I see that there will be more patches in this code. We need to make it possible to skip past DBG_VALUE when stepping the iterators (or otherwise, for example, BranchFolding could end up doing different optimisation depending on the existence of DBG_VALUE inside the bundle). With that said, maybe the existence of DBG_VALUE inside bundles also is a weird scenario? We have it in our downstream backend, but that could be our own problem/fault.
The idea was to make the loop valid also if the stepping of iterators could skip DBG_VALUE. So I need to check if I reached the end of the bundle somewhere between stepping the iterator and dereferencing the iterator (the recursive call to isIdenticalTo). I'm not really sure how to do that in your example (it does not include stepping of the iterators). All my attempts of having the exit condition as a loop-condition has become messy when considering a future patch where the iterators could be stepped more than one step at a time. For such loops we need additional checks to avoid calling isIdentical when iterating past the bundle end.

My primary goal is to fix the bug. A secondary, much less important, goal is to make it easy to do a future fix about how to handle DBG_VALUE inside the bundle. Maybe I should forget about the secondary goal for now to get the bugfix accepted.
I hope that we at least agree that the old code was incorrect and the isIdenticalTo should be symmetric?

https://reviews.llvm.org/D27508