[llvm-dev] [RFC] How to manifest information in LLVM-IR, or, revisiting llvm.assume

Wed Dec 18 12:21:28 PST 2019

On 12/18, John McCall wrote:
> On 18 Dec 2019, at 14:18, Doerfert, Johannes wrote:
> > Hi John,
> > 
> > Is it correct to assume that you are in favor of
> >   - changing llvm.assume to be more expressive
> >   - operand bundle uses, especially w/o outlining
> >   - outlining, if we can show it plays well with transformations, e.g.
> >     the binding to the code is "weak"
> 
> Yes, this is all correct.  I’m excited that we’re looking into making this
> less useless.

Perfect! I'll go ahead and prototype something then so we can collect
data and experience.

> > > > > - Anything passed to the predicate function will by default look
> > > > >   like it escapes.  This is particularly true if the predicate
> > > > >   takes local variables by references, which is the easiest and
> > > > >   most straightforwardly correct way for frontends to emit these
> > > > >   predicates.  So this will block basic memory analyses (including
> > > > >   mem2reg!) unless they’re taught to remove or rewrite
> > > > >   assumptions.
> > > > 
> > > > Partially true and we already have that problem though.
> > > > 
> > > > Mem2reg, and others, might need to know about llvm.assume uses but I
> > > > fail to see why they need to rewrite much (in the short term). The
> > > > frontend generated code would naturally look like this:
> > > > 
> > > > %ptr.addr = alloca i32*
> > > > store %ptr, %ptr.addr
> > > > ...
> > > > %ptr.val = load %ptr.addr
> > > > llvm.assume() ["align"(%ptr.val)]
> > > 
> > > I disagree; the natural way to generate this code in frontends will
> > > actually be to take the variable by reference.  We can, of course,
> > > make frontends smart enough to take the variable by value if it’s
> > > obviously only loaded from in the expressions, but if the optimizers
> > > still aren’t generally aware of the intrinsic, that will just mean
> > > that assumptions pessimize slightly more abstracted code.
> > > 
> > > For example, if I had this:
> > > 
> > > ```
> > >   clang::QualType type = …;
> > >   __builtin_assume(!type.hasLocalQualifiers());
> > > ```
> > > 
> > > At a high level, I want to be able to apply mem2reg to the value of
> > > this `QualType`; but at a low level, this method call takes `type`
> > > by reference, and so the predicate function will take it by reference
> > > as well.
> > 
> > At some point we need to realize code is only used in an assumption in
> > order to actually outline. There is no question about that. Where it
> > happens is a good question though. I could write a pass to do that in
> > the IR, in order to test the idea.
> 
> You mean for things like loads that are just passed to the intrinsic?
> I agree, although I think other people who’ve worked with `llvm.assume`
> have noticed that the presence of the load can change optimization in
> ways that are hard to eliminate, e.g. if a load gets hoisted because
> it’s “done twice”.

That's right, and I don't know what the perfect solution looks like.
Maybe always outlining is not so bad after all. I guess we all agree
that we have to minimize the impact of llvm.assume while keeping the
information available and correct, e.g., wrt. control dependences.

A crazy idea we could try further down the road:
  Once we have control dependences [0] we can start "moving" the assume
  calls, e.g., towards function entries. We can hoist a call over
  conditionals when we can keep track of the control dependences and we
  know it would have been reached, e.g., because the original position is
  in the "must-be-executed-context" of the new position [1].

[0] https://reviews.llvm.org/D71578
[1] https://reviews.llvm.org/D65186
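
To make the hoisting idea a bit more concrete, here is a minimal C++
sketch at the source level. It is only an illustration under stated
assumptions, not what a pass would actually do: ASSUME is a hypothetical
helper macro (it maps to __builtin_assume on Clang), and the function
names original, hoisted, guarded, and selfCheck are made up for this
example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a portable stand-in for an assumption.
#if defined(__clang__)
#define ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__)
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define ASSUME(cond) ((void)0)
#endif

// Original placement: the assumption follows a conditional, but every
// path from the function entry reaches it.
int original(const int *p, bool flag) {
  int bias = 0;
  if (flag)
    bias = 1;
  ASSUME(reinterpret_cast<std::uintptr_t>(p) % alignof(int) == 0);
  return *p + bias;
}

// Hoisted placement: fine here, because entering the function implies
// the original position would have been reached.
int hoisted(const int *p, bool flag) {
  ASSUME(reinterpret_cast<std::uintptr_t>(p) % alignof(int) == 0);
  int bias = 0;
  if (flag)
    bias = 1;
  return *p + bias;
}

// Counter-example: because of the early return, the assumption is only
// reached when flag is true. Hoisting it to the entry would require
// recording the control dependence on flag alongside it.
int guarded(const int *p, bool flag) {
  if (!flag)
    return 0;
  ASSUME(p != nullptr);
  return *p;
}

// Both placements must agree whenever the assumption holds.
void selfCheck() {
  int x = 41;
  assert(original(&x, true) == hoisted(&x, true));
  assert(original(&x, false) == hoisted(&x, false));
  assert(guarded(nullptr, false) == 0); // the assumption is never reached
}
```

In original and hoisted the assumption describes the same fact, so moving
it only changes where the information becomes available. In guarded the
assumption says nothing about calls with flag == false, which is the case
where tracking control dependences would matter.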

> > Since we don't have this code I won't speculate about it now. What I
> > will say instead is that we have code to modify the IR in order to pass
> > an argument by value instead of by reference. I am more than happy to
> > make it aware of llvm.assume operand bundle uses and I am very much
> > certain the amount of code needed to transform these from pass by
> > reference to pass by value is negligible.
> 
> Okay.
> 
> > > > Mem2reg should kick in just fine even if %ptr now has an "unknown"
> > > > use. But that "unknown" use is much less problematic than what we
> > > > have now because the user is the `llvm.assume` call and not some
> > > > `ptrtoint` which is used two steps later in an `llvm.assume`.
> > > > 
> > > > If you feel I missed a problem here, please let me know.
> > > > 
> > > > 
> > > > 
> > > > > Unfortunately, I don’t have a great counter-proposal that isn’t a
> > > > > major project.
> > > > > 
> > > > > (The major project is to make the predicates sub-functions within
> > > > > the caller.  This doesn’t fix all the problems, and sub-functions
> > > > > introduce a host of new issues, but they do have the benefit of
> > > > > making the analysis much more obviously intra-procedural.)
> > > > 
> > > > I don't think inter-procedural reasoning is a problem or bad.
> > > > Especially here with internal functions that have a single use, it
> > > > is really not hard to make the connection.
> > > 
> > > It’s certainly not a problem *in theory*.  *In theory* every
> > > intraprocedural analysis can be taught to go interprocedural
> > > into a predicate.
> > 
> > We might have different expectations about how assumes are used or
> > where our analyses/transformations are heading. Not every pass needs to
> > look at assumes at the end of the day. Most information that we use can
> > already be, or should be, described through an attribute, and we have a
> > working way to deduce those interprocedurally. The future, I hope, is
> > dominated by interprocedural analysis and optimization. Given how far
> > we have come this summer alone, and what was said on the IPO panel at
> > the LLVM Developers' Meeting, I am confident we'll get there.
> 
> I suspect this will always be a practical point of tension, and I
> did not come away from that panel feeling like there was anything
> other than a fairly hand-wavey hope that somehow we’d get there.

FWIW, a few people are actively working towards that goal by actually
implementing interprocedural analyses and optimizations.