[PATCH] D30731: [SLP] Visualize SLP trees with -view-slp-tree

Wed Mar 8 10:16:38 PST 2017

mkuper added a comment.

The SLP part LGTM - it's less broken than the rest of the code, and we need to fix SLP itself to fix it (and avoid things like the Container hack...)

The GraphTraits implementation generally LGTM as well, but that may be because I don't know enough about it to tell a good one from a bad one...

================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:517
+
+    /// The TreeEntry index containing the user of this entry.
+    int UserTreeIdx;
----------------
anemet wrote:
> mssimpso wrote:
> > mkuper wrote:
> > > tl;dr: I think this is broken, but it's not something you can easily fix here.
> > > 
> > > Fair warning: I haven't had time to really think this through, so maybe there's something trivial I'm missing here. Having said that - I just realized yesterday the tree is a lie. It isn't actually a tree, it's a DAG - so "the user" is not well-defined (see line 1095 on the left, and the discussion about the trouble it causes here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435431.html )
> > > 
> > > I think the whole thing should be rewritten, to be explicit about the graph it contains. But right now, I believe the only correct way to look up which tree nodes feed a node T is the way vectorizeTree() does it - look at the (scalar) operands of T's scalars, and map them back to the nodes through ScalarToTreeEntry. I'm not sure trying to wrap GraphTraits around that is a good idea, though.
> > > 
> > > We can either:
> > > 1) Leave this as is, and be explicit about the fact that the rendered graph may be imprecise.
> > > 2) Try to walk the graph "properly".
> > > 3) Put this on hold until the underlying "graph" has a saner representation. I'm not sure when I'll have time to do it, though. If anybody else volunteers...
> > > 
> > > None of those sound appealing, TBH. Not sure what the least evil is.
> > I'd like to better understand the imprecision here. You're essentially saying that the graph output could be missing some edges because a tree entry could have multiple users? And that these uses need not even be of the same vector value (in the unordered load case)? How difficult would it be to walk the DAG properly?
> > tl;dr: I think this is broken, but it's not something you can easily fix here.
> 
> Well not more broken that the entire code then ;).  We are badly missing a FIXME somewhere.
>  
> > Fair warning: I haven't had time to really think this through, so maybe there's something trivial I'm missing here. Having said that - I just realized yesterday the tree is a lie. It isn't actually a tree, it's a DAG - so "the user" is not well-defined (see line 1095 on the left, and the discussion about the trouble it causes here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435431.html )
> > 
> > I think the whole thing should be rewritten, to be explicit about the graph it contains. But right now, I believe the only correct way to look up which tree nodes feed a node T is the way vectorizeTree() does it - look at the (scalar) operands of T's scalars, and map them back to the nodes through ScalarToTreeEntry. I'm not sure trying to wrap GraphTraits around that is a good idea, though.
> 
> I have actually seen that we're trying to recognize existing entries in the tree but my conclusion was that at that point we just just terminate the tree.  I am curious to look at your testcase.
> 
> Anyhow, it's trivial to make the UsingTreeIdx a SmallVector<int, 1> to be able to represent a DAG without bloating the TreeEntry much.  Hopefully we can find a place to easily update it too.  It would also remove my C++ magic to make a single int an iterate-able "container". 
> 
> > We can either:
> > 
> > Leave this as is, and be explicit about the fact that the rendered graph may be imprecise.
> > Try to walk the graph "properly".
> > Put this on hold until the underlying "graph" has a saner representation. I'm not sure when I'll have time to do it, though. If anybody else volunteers...
> > None of those sound appealing, TBH. Not sure what the least evil is.
> 
> Let me look at this today.  Hopefully the testcase you quote is a non-tree case (please confirm).  I can use that to make this more precise.
> 
> Either way, I'd like to check this in one form or another as it's super-useful even if we make it experimental for now.  We can improve it in tree.  If we make the user node a container the data structure is equipped to deal wit this.
> 
@mssimpso : yes. I think we're going to unroll the unordered load case for now, so they'll always be the same value. And, apparently, not as difficult as I thought, because Adam just did it.

@anemet : Yes, not more broken than the entire code. I was going to write "but it's not your fault". :-)
And yes, you're completely right, it is trivial... I didn't consider the fact the initial walk is correct, even if the results it produces is nonsense, so you can just keep using the walk order as is. Thanks!

================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1188
     }
+    E->UserTreeIndices.push_back(UserTreeIdx);
     DEBUG(dbgs() << "SLP: Perfect diamond merge at " << *VL[0] << ".\n");
----------------
I think this is where we want the FIXME. :-)
Can you please add it -not for the entire thing, just that this here is a huge hack that makes the graph used for printing *more* precise than the graph actually used for vectorization...

https://reviews.llvm.org/D30731