[PATCH] D11648: InstCombinePHI: Partial simplification of identity operations

Jakub Kuderski jakub.kuderski at arm.com
Tue Aug 4 02:44:55 PDT 2015


kuhar marked 7 inline comments as done.

================
Comment at: lib/Transforms/InstCombine/InstCombinePHI.cpp:838
@@ +837,3 @@
+  if (const auto *Incoming = dyn_cast<Instruction>(&IncomingVal))
+    if (!IC.getDominatorTree()->dominates(Incoming, Terminator))
+      return nullptr;
----------------
jmolloy wrote:
> You need to handle the case where dyn_cast returns nullptr here.
If it returns null, we are fine and can continue (ex. value that is a function parameter).

================
Comment at: lib/Transforms/InstCombine/InstCombinePHI.cpp:843
@@ +842,3 @@
+  // as an incoming value of the PHI node.
+  if (const auto *Operand =
+          dyn_cast<Instruction>(PNUser.getOperand(OpOperandIdx)))
----------------
jmolloy wrote:
> Same as line 838.
saa

================
Comment at: lib/Transforms/InstCombine/InstCombinePHI.cpp:846
@@ +845,3 @@
+    if (!IC.getDominatorTree()->dominates(Operand, Terminator) ||
+        !IC.getDominatorTree()->dominates(Operand, &PN))
+      return nullptr;
----------------
jmolloy wrote:
> I think this second check is unnecessary. As the terminator leads to the block with the PHI in, any instruction that dominates the terminator must also dominate PN.
Well, sometimes it is necessary when things get little more tricky. Consider duff's device with the do-while inside a switch. You end up with such IR:

```
for.cond:                                         ; preds = %for.body, %entry
  %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %cmp = icmp eq i32 %i.0, 100
  br i1 %cmp, label %for.end, label %for.body

for.body:                                         ; preds = %for.cond
  %conv = trunc i32 %i.0 to i16
  %idxprom = sext i32 %i.0 to i64
  %arrayidx = getelementptr inbounds [100 x i16], [100 x i16]* %Array, i64 0, i64 %idxprom
  store i16 %conv, i16* %arrayidx, align 2
  %inc = add nsw i32 %i.0, 1
  br label %for.cond

for.end:                                          ; preds = %for.cond
  %arraydecay = getelementptr inbounds [100 x i16], [100 x i16]* %Array, i64 0, i64 0
  br label %sw.bb.10.i

do.body.i:                                        ; preds = %sw.bb.10.i
  %incdec.ptr.i = getelementptr inbounds i16, i16* %incdec.ptr29.i, i64 1
  %0 = load i16, i16* %incdec.ptr29.i, align 2
  %add2.i = add i16 %add32.i, %0
  %incdec.ptr5.i = getelementptr inbounds i16, i16* %incdec.ptr.i, i64 1
  %1 = load i16, i16* %incdec.ptr.i, align 2
  %add8.i = add i16 %add2.i, %1
  %2 = add i16 %add8.i, %4
  br label %sw.bb.10.i

sw.bb.10.i:                                       ; preds = %do.body.i, %for.end
  %3 = phi i16 [ %add8.i, %do.body.i ], [ 0, %for.end ]
  %from.addr.2.i = phi i16* [ %incdec.ptr5.i, %do.body.i ], [ %arraydecay, %for.end ]
  %n.2.i = phi i32 [ %dec.i, %do.body.i ], [ 17, %for.end ]
  %incdec.ptr11.i = getelementptr inbounds i16, i16* %from.addr.2.i, i64 1
  %4 = load i16, i16* %from.addr.2.i, align 2
  %add14.i = add i16 %3, %4
  %incdec.ptr17.i = getelementptr inbounds i16, i16* %incdec.ptr11.i, i64 1
  %5 = load i16, i16* %incdec.ptr11.i, align 2
  %add20.i = add i16 %add14.i, %5
  %incdec.ptr23.i = getelementptr inbounds i16, i16* %incdec.ptr17.i, i64 1
  %6 = load i16, i16* %incdec.ptr17.i, align 2
  %add26.i = add i16 %add20.i, %6
  %incdec.ptr29.i = getelementptr inbounds i16, i16* %incdec.ptr23.i, i64 1
  %7 = load i16, i16* %incdec.ptr23.i, align 2
  %add32.i = add i16 %add26.i, %7
  %dec.i = add nsw i32 %n.2.i, -1
  %cmp.i = icmp sgt i32 %n.2.i, 1
  br i1 %cmp.i, label %do.body.i, label %sum.exit

```  
Here you want to replace
`%3 = phi i16 [ %add8.i, %do.body.i ], [ 0, %for.end ]`
`%add14.i = add i16 %3, %4`
with
`%2 = add i16 %add8.i, %4` (in for.end)
`%3 = phi i16 [ %2, %do.body.i ], [ %4, %for.end ]`
Which is illegal (detected by the domination check of Operand and PN). 



================
Comment at: lib/Transforms/InstCombine/InstCombinePHI.cpp:856
@@ +855,3 @@
+  for (const auto &U : PNUser.uses())
+    if (IC.getDominatorTree()->dominates(Terminator, U))
+      return nullptr;
----------------
jmolloy wrote:
> What is this doing? What situation can cause whatever this is checking against?
It's checking if the newly inserted instruction would dominate all of its uses. I'm checking against Terminator not to add the new instruction and remove it afterwards (equivalent).
The `if (PN.getIncomingBlock(IncomingValIdx) == PN.getParent())` is essentially detecting some subset of illegal cases of the check in the for loop, but it's the most common one and can be detected without using dominator tree at all.


Repository:
  rL LLVM

http://reviews.llvm.org/D11648







More information about the llvm-commits mailing list