[llvm-dev] When can the dominator tree not contain a node for a basic block?

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Tue Sep 22 16:05:31 PDT 2015



On 09/21/2015 06:11 PM, Joseph Tremoulet wrote:
>
> <pedantry>
>
> FWIW, I've always understood "unreachable nodes are dominated by 
> everything and dominate nothing but other unreachable nodes" to be the 
> conventional interpretation of the formal definition, since universal 
> quantification over an empty set is vacuously true; i.e. " of the 0 
> paths that exist, **all** of them go through any other node".
>
> > we claim … that *every* path from the entry node goes to the 
> unreachable node
>
> I don't think that follows; saying that d /doesn't/ dominate n says 
> that a path exists form start to n that doesn't go through d, but 
> saying that d /does/ dominate n just says something about all paths in 
> a possibly empty set.
>
> > most dominance algorithms will never consider an unreachable node to 
> dominate anything
>
> Agreed, but I think here we're talking about the inverse; whether 
> anything dominates an unreachable node.
>
> </pedantry>
>
> As to the more important point, I've certainly seen dominance queries 
> on unreachable code cause trouble before and agree there's just not a 
> good answer.  I don't know if this would be acceptable for the API in 
> LLVM, but you'd almost want dominates(d, n) to assert that n is 
> reachable and force callers to decide what they want to do with 
> unreachable code.
>
> I agree with your point that confusion/problems arise when people 
> expect pairwise dominance queries to equate to reachability in a tree 
> and "extra" edges to unreachable nodes defeat their expectations.  I 
> think what happens if you go all the way to the other extreme (and 
> effectively use the predicate "d dominates n and n is reachable" in 
> place of "d dominates n") is that sometimes a pass will be processing 
> a fragment of the CFG (like the diamond of an if/else) and expect 
> certain dominance properties on it (like that the head dominates both 
> arms and the join point), which would then be reported as false if the 
> whole fragment is unreachable.  Likewise, one would expect entry to 
> dominate all nodes, etc.
>
> I suppose that many unreachable code fragments have natural heads and 
> you could figure out the "expected" answers for those, but 
> cycle-breaking would still be arbitrary and I think usually the right 
> answer is that the caller should be ignoring or pre-removing 
> unreachable code.
>
I completely utterly agree with this last point.

I just don't have the time to go fix all the callers right now.  :)
>
> Thanks
>
> -Joseph
>
> *From:*llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of 
> *Daniel Berlin via llvm-dev
> *Sent:* Monday, September 21, 2015 12:48 PM
> *To:* Philip Reames <listmail at philipreames.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] When can the dominator tree not contain a 
> node for a basic block?
>
> So, there is no good answer.
>
> It's not documented anywhere, and genericdomtree.h makes assumptions 
> both ways.
>
> For example, :
>
>  void getDescendants(NodeT *R, SmallVectorImpl<NodeT *> &Result) const {
>
> ...
>
>  if (!RN)
>
>        return; // If R is unreachable, it will not be present in the 
> DOM tree. :)
>
> ...
>
> }
>
> vs
>
> dominates()
>
> {
> ...
>
>   // Any unreachable use is dominated, even if Def == User.
>
>    if (!isReachableFromEntry(UseBB))
>
>      return true;
>
>    // Unreachable definitions don't dominate anything.
>
>    if (!isReachableFromEntry(DefBB))
>
>      return false;
>
> }
>
> ...
>
> From what I can tell, the split here is because we expect that nodes 
> that *become unreachable* from entry will still be in the dom tree, 
> but nodes that are completely unreachable generally don't end up in 
> the domtree.  GetNode will work on the former, but not the latter.  
> This seems ... wrong.
>
> This is *dominators*, mind you.
>
> post-dominators goes through explicit machinations (and fails in some 
> cases) when building to try to make sure even unreachable nodes end up 
> in the dominator tree.
>
> My suggestion would be that we clearly think about this, document and 
> define what should happen, declare anything violating these invariants 
> to be a bug, and go fix them.
>
> As to what "the right answers are", this seems harder.
>
> I've previously pointed out why giving dominance answers on 
> unreachable nodes rarely, if ever, makes sense given the formal 
> definition of a dominator:
>  "a node 
> <https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fen.wikipedia.org%2fwiki%2fBasic_block&data=01%7c01%7cjotrem%40microsoft.com%7c0eaa02c7e0ca4450610008d2c2a47ef5%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=ZIpgDcoJDbRUlp1xT%2bvQTEdSshYDO3cT2CLRsVDikB4%3d>*d* 
> /dominates/ a node *n* if every path from the /entry node/ to *n* must 
> go through *d"*
>
> If n is unreachable, it is not dominated  by anything, since no path 
> from the entry node to n exist, and of the 0 paths that exist, none of 
> them go through any other node.  Essentially, right now we claim the 
> opposite - that *every* path from the entry node goes to the 
> unreachable node, and in fact, that it goes through *every* block.
>
> This means, for example, that if you tried to reconstruct the 
> dominator tree by calling dominates on pairs of instructions, you'd 
> get wrong answers (you'd simultaneously try to make the unreachable 
> block a child of every node in the dominator tree, as well as no node 
> in the dominator tree)
>
> In fact, you can show that most dominance algorithms will never 
> consider an unreachable node to dominate anything, even going back to 
> the Lowry and Medlock algorithm from 1969, or even the original 
> Prosser description of a dominator on flow diagrams from 1959:
> "We say box i dominates box j if every path (leading from input to 
> output through the diagram) which passes through box j must also pass 
> through box i. Thus box i dominates box j if box j is subordinate to 
> box i in the program"
>
> Now, whether we care about this or not, it's up in the air.
>
> (Note that forward/reverse unreachable for dom/post-dom is different 
> from "never exits", since never-exiting nodes are reachable)
>
> On Mon, Sep 21, 2015 at 9:26 AM, Philip Reames 
> <listmail at philipreames.com <mailto:listmail at philipreames.com>> wrote:
>
>     When looking into https://llvm.org/bugs/show_bug.cgi?id=24866
>     <https://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fllvm.org%2fbugs%2fshow_bug.cgi%3fid%3d24866&data=01%7c01%7cjotrem%40microsoft.com%7c0eaa02c7e0ca4450610008d2c2a47ef5%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=bNpncZVa7gkv5td%2bmmcfAkf3t%2bffFuZftsg6kJ2Ztpk%3d>,
>     I discovered that the root cause of the crash is that I was
>     expecting every basic block to have a corresponding Node in the
>     dominator tree.  Apparently, the "while.end" basic block in the
>     example does not have a Node in the Dominator Tree.  Can anyone
>     tell me if this is expected?   If so, under what circumstances?
>
>     Interestingly, the crash in question appears to be fairly recently
>     introduced.  A checkout from week before last didn't crash,
>     whereas a checkout from this morning does.  I have yet tried to
>     isolate the change in question; before doing so, I wanted to
>     determine if this was a newly introduced problem (i.e. every BB
>     should have a Node), or merely a newly exposed problem (i.e. not
>     every BB does).
>
>     Philip
>
>     For reference, here's the reproduction command:
>     opt -S <input> -value-tracking-dom-conditions -licm -load-combine
>
>     And here's the IR:
>     ; ModuleID = '../bugpoint-reduced-simplified.bc'
>     target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
>     target triple = "x86_64-pc-windows-msvc18.0.0"
>
>     %struct.c_derived_tbl.2.5.8.11.14.17.23.38.59.80.92.98.104.107.155.183
>     = type { [256 x i32], [256 x i8] }
>
>     ; Function Attrs: nounwind uwtable
>     define void
>     @encode_one_blockX2(%struct.c_derived_tbl.2.5.8.11.14.17.23.38.59.80.92.98.104.107.155.183*
>     nocapture readonly %actbl) #0 {
>     entry:
>       br i1 undef, label %L_KLOOP_01, label %L_KLOOP.preheader
>
>     L_KLOOP_01:                                       ; preds =
>     %while.end, %entry
>       br label %L_KLOOP.preheader
>
>     L_KLOOP_08:                                       ; preds = %while.end
>       br label %L_KLOOP.preheader
>
>     L_KLOOP.preheader:                                ; preds =
>     %L_KLOOP_08, %L_KLOOP_01, %entry
>       %r.2.ph
>     <https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fr.2.ph&data=01%7c01%7cjotrem%40microsoft.com%7c0eaa02c7e0ca4450610008d2c2a47ef5%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=jXpaIGmv6qhI7eaSNnoEvdGICfE53T553k6j7cGti94%3d>
>     = phi i32 [ undef, %L_KLOOP_08 ], [ 0, %entry ], [ undef,
>     %L_KLOOP_01 ]
>       br label %L_KLOOP
>
>     L_KLOOP:                                          ; preds =
>     %while.end, %L_KLOOP.preheader
>       %r.2 = phi i32 [ 0, %while.end ], [ %r.2.ph
>     <https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fr.2.ph&data=01%7c01%7cjotrem%40microsoft.com%7c0eaa02c7e0ca4450610008d2c2a47ef5%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=jXpaIGmv6qhI7eaSNnoEvdGICfE53T553k6j7cGti94%3d>,
>     %L_KLOOP.preheader ]
>       br i1 undef, label %while.body, label %while.end
>
>     while.body:                                       ; preds =
>     %while.body, %L_KLOOP
>       br label %while.body
>
>     while.end:                                        ; preds = %L_KLOOP
>       %shl105 = shl i32 %r.2, 4
>       %add106 = add nsw i32 %shl105, undef ;; <-- THIS is the context
>     instruction.  V is undef
>       %idxprom107 = sext i32 %add106 to i64
>       %arrayidx108 = getelementptr inbounds
>     %struct.c_derived_tbl.2.5.8.11.14.17.23.38.59.80.92.98.104.107.155.183,
>     %struct.c_derived_tbl.2.5.8.11.14.17.23.38.59.80.92.98.104.107.155.183*
>     %actbl, i64 0, i32 0, i64 %idxprom107
>       %0 = load i32, i32* %arrayidx108, align 4, !tbaa !2
>       %arrayidx110 = getelementptr inbounds
>     %struct.c_derived_tbl.2.5.8.11.14.17.23.38.59.80.92.98.104.107.155.183,
>     %struct.c_derived_tbl.2.5.8.11.14.17.23.38.59.80.92.98.104.107.155.183*
>     %actbl, i64 0, i32 1, i64 %idxprom107
>       %1 = load i8, i8* %arrayidx110, align 1, !tbaa !6
>       indirectbr i8* undef, [label %L_KLOOP_DONE, label %L_KLOOP_01,
>     label %L_KLOOP_08, label %L_KLOOP]
>
>     L_KLOOP_DONE:                                     ; preds = %while.end
>       ret void
>     }
>
>     attributes #0 = { nounwind uwtable "disable-tail-calls"="false"
>     "less-precise-fpmad"="false" "no-frame-pointer-elim"="false"
>     "no-infs-fp-math"="false" "no-nans-fp-math"="false"
>     "stack-protector-buffer-size"="8" "target-cpu"="x86-64"
>     "target-features"="+sse,+sse2" "unsafe-fp-math"="false"
>     "use-soft-float"="false" }
>
>     !llvm.module.flags = !{!0}
>     !llvm.ident = !{!1}
>
>     !0 = !{i32 1, !"PIC Level", i32 2}
>     !1 = !{!"clang version 3.8.0 (http://llvm.org/git/clang.git
>     <https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fllvm.org%2fgit%2fclang.git&data=01%7c01%7cjotrem%40microsoft.com%7c0eaa02c7e0ca4450610008d2c2a47ef5%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=5BuQ3CnZsj0Hi7fOkJrCNa9e2n6FCRfyRiMaYYvQrYw%3d>
>     37c860b7d0d43dde1edcd324e00a10a62b2c5776)
>     (http://llvm.org/git/llvm.git
>     <https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fllvm.org%2fgit%2fllvm.git&data=01%7c01%7cjotrem%40microsoft.com%7c0eaa02c7e0ca4450610008d2c2a47ef5%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=wjkU7x%2bTSbgGd6ikqBsqKtOMmAYVUhHV4mtB6VqKMMc%3d>
>     fb73c0262af4183555253a5c8c13025a6e7630bc)"}
>     !2 = !{!3, !3, i64 0}
>     !3 = !{!"int", !4, i64 0}
>     !4 = !{!"omnipotent char", !5, i64 0}
>     !5 = !{!"Simple C/C++ TBAA"}
>     !6 = !{!4, !4, i64 0}
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150922/520ce07f/attachment-0001.html>


More information about the llvm-dev mailing list