<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/xhtml; charset=utf-8">
</head>
<body>
<div style="font-family:sans-serif"><div style="white-space:normal">
<p dir="auto">On 18 Dec 2019, at 11:28, Doerfert, Johannes wrote:</p>
</div>
<div style="white-space:normal"><blockquote style="border-left:2px solid #777; color:#777; margin:0 0 5px; padding-left:5px"><p dir="auto">On 12/18, John McCall wrote:</p>
<blockquote style="border-left:2px solid #777; color:#999; margin:0 0 5px; padding-left:5px; border-left-color:#999"><p dir="auto">On 16 Dec 2019, at 18:16, Doerfert, Johannes via llvm-dev wrote:</p>
<blockquote style="border-left:2px solid #777; color:#BBB; margin:0 0 5px; padding-left:5px; border-left-color:#BBB"><p dir="auto">Abstract:<br>
<br>
It is often hard or impossible to encode complex, e.g., non-boolean,<br>
information in an `llvm.assume(i1)`. This RFC describes various problems<br>
we have right now and provides alternative design ideas.<br>
<br>
<br>
<br>
Some Existing Problems:<br>
<br>
A) The boolean requirement.<br>
The current `llvm.assume(i1)` expects a boolean that is known to hold<br>
true at runtime (once the `llvm.assume` call is reached). However,<br>
forming this boolean for "arbitrary" information is hard or even<br>
impossible. Alignment, which is an easy case, already requires 3 extra<br>
instructions, one of which is a `ptrtoint` and therefore not really<br>
optimizer friendly. Dereferenceability, is even scarier. Thus, we are<br>
currently limited to (almost) boolean information when we want to<br>
manifest information in the IR (which happens due to inlining or code<br>
motion, see <a href="https://reviews.llvm.org/D69477" style="color:#BBB">https://reviews.llvm.org/D69477</a> for an example).<br>
<br>
B) The one-use checks.<br>
Various pattern matching passes check the number of uses of a value.<br>
However, while `llvm.assume` is eventually eliminated by the backend<br>
it will still increase the use count of the operand. I doubt we are<br>
able to not increase the use count at all (and keep everything else<br>
working), but what we can do is make sure the uses in "assumptions"<br>
are easy to spot, thus filter. This is not the case right now because<br>
of the additional instructions we need to make the values boolean.<br>
Even if you have `__builtin_assume(ptr);` the `ptr` use will not be<br>
the `llvm.assume` call but a `icmp`.<br>
<br>
C) The side-effect avoidance.<br>
`__assume`, `__builtin_assume`, `__builtin_assume_aligned`, and OpenMP<br>
`omp assume` are all defined to not evaluate their argument, thus to<br>
not cause the side effects that the evaluation of the argument would<br>
otherwise imply. The way we implement this restriction is by not<br>
emitting the argument expression in IR if it might cause a side<br>
effect. We warn the user if that happens. While this is generally<br>
speaking fine, it would be interesting to lift the *implementation*<br>
restriction. One benefit would be that we could implement `assert`<br>
through `__builtin_assume` properly.<br>
<br>
D) The singleton ranges.<br>
An `llvm.assume` will only provide information for a single program<br>
point not a range. Even if the beginning and the end of a range have<br>
an `llvm.assume`, there are cases where the information will not be<br>
as good as a proper range assumption. OpenMP 5.1 introduces such<br>
range assumptions but other situations would benefit from them as<br>
well. Take for example function attributes and inlining. Since we know<br>
they hold for the entire function and not only when it is entered we<br>
could encode the information over the entire range of the inlined<br>
code.<br>
<br>
<br>
Some Site Notes:<br>
<br>
- It seems of little use that we interleave the code for the assumed<br>
expression with the user code. Having the `llvm.assume` allows us to<br>
tie information to a program point, beyond that we just clutter the<br>
function with instructions that we later remove anyway.<br>
<br>
- Reconstructing information from the pattern of instructions that feed<br>
into the `llvm.assume` is also not optimal, especially since we do<br>
not need to "optimize" these instructions anyway.<br>
<br>
- The current (=arbitrary) side-effects of `llvm.assume` are too strong.<br>
We have `inaccessiblemem_or_argmemonly` and we might be able to come<br>
up with something even more specialized for this, e.g.,<br>
`control_dependences_only` to indicate that there are only control<br>
dependences.</p>
</blockquote><p dir="auto">This is all well put; I think you’ve covered the major weaknesses.<br>
<br>
</p>
<blockquote style="border-left:2px solid #777; color:#BBB; margin:0 0 5px; padding-left:5px; border-left-color:#BBB"><p dir="auto">Some Design Ideas:<br>
<br>
1) Use named operand bundles to encode information.<br>
If we want to encode property XYZ for a value %V holds at a certain<br>
program point and the property is dependent on %N we could encode<br>
that as:<br>
`llvm.assume() ["XYZ"(%V, %N)]`<br>
There are various benefits, including:<br>
- It is freely extensible.<br>
- The property is directly tied to the value. Thus, no need for<br>
encoding patterns that introduce extra instructions and uses and<br>
which we need to preserve and decode later.<br>
- Allows dynamic properties even when corresponding attributes do<br>
not, e.g., `llvm.assume() ["align"(%arg_ptr, %N)]` is fine and<br>
once `%N` becomes a constant, or we determine a lower bound, we<br>
can introduce the `align(C)` attribute for `%arg_ptr`.<br>
<br>
2) Outline assumption expression code (with side-effects).<br>
If we potentially have side-effects, or we simply have a non-trivial<br>
expression that requires to be lowered into instructions, we can<br>
outline the assumption expression code and tie it to the<br>
`llvm.assume` via another operand bundle property. It could look<br>
something like this:<br>
`__builtin_assume(foo(i) == bar(j));`<br>
will cause us to generate<br>
```<br>
/// Must return true!<br>
static bool llvm.assume.expression_#27(int i, int j) {<br>
return foo(i) == bar(j);<br>
}<br>
```<br>
and a `llvm.assume` call like this:<br>
`llvm.assume() ["assume_fn"(@llvm.assume.expression_#27, %i, %j))]<br>
So we generate the expression in a new function which we (only) tie to<br>
the `llvm.assume()` through the "assume_fn" operand bundle. This will<br>
make sure we do not accidentally evaluate the code, or assume it is<br>
evaluated and produced side-effects. We can still optimize the code<br>
and use the information that we learn from it at the `llvm.assume`<br>
site though.</p>
</blockquote><p dir="auto">I think outlining is abstractly very promising, but I’m worried about<br>
it impacting existing optimizations:<br>
<br>
- It’s won’t be obvious at all from the IR that the predicate function<br>
is dead code. It would be a shame if we ended up emitting the predicate<br>
function, or some global only it references, because we didn’t delete<br>
all the llvm.assume calls early enough to recognize that it was dead.</p>
</blockquote><p dir="auto">So, we already "delete" llvm.assume and its operands in the backends.<br>
Having a dedicated middle-end pass to do that which also deletes the<br>
associated "assume_fn" doesn't seem hard or complicated to set up.<br>
<br>
Worst case (which I really don't think would ever happen), we emit the<br>
functions which are internal and never referenced. The linker should<br>
strip them.<br>
<br>
<br>
</p>
<blockquote style="border-left:2px solid #777; color:#999; margin:0 0 5px; padding-left:5px; border-left-color:#999"><p dir="auto">- Anything passed to the predicate function will by default look like it<br>
escapes. This is particularly true if the predicate takes local<br>
variables by references, which is the easiest and most straightforwardly<br>
correct way for frontends to emit these predicates. So this will block<br>
basic memory analyses (including mem2reg!) unless they’re taught to<br>
remove or rewrite assumptions.</p>
</blockquote><p dir="auto">Partially true and we already have that problem though.<br>
<br>
Mem2reg, and others, might need to know about llvm.assume uses but I<br>
fail to see why they need to rewrite much (in the short therm). The<br>
frontend generated code would naturally look like this:<br>
<br>
%ptr.addr = alloca i32*<br>
store %ptr, %ptr.addr<br>
...<br>
%ptr.val = load %ptr.addr<br>
llvm.assume() ["align"(%ptr.val)]</p>
</blockquote></div>
<div style="white-space:normal">
<p dir="auto">I disagree; the natural way to generate this code in frontends will<br>
actually be to take the variable by reference. We can, of course, make<br>
frontends smart enough to take the variable by value if it’s<br>
obviously only loaded from in the expressions, but if the optimizers<br>
still aren’t generally aware of the intrinsic, that will just mean<br>
that assumptions pessimize slightly more abstracted code.</p>
<p dir="auto">For example, if I had this:</p>
<pre style="background-color:#F7F7F7; border-radius:5px 5px 5px 5px; margin-left:15px; margin-right:15px; max-width:90vw; overflow-x:auto; padding:5px" bgcolor="#F7F7F7"><code style="background-color:#F7F7F7; border-radius:3px; margin:0; padding:0" bgcolor="#F7F7F7"> Clang::QualType type = …;
__builtin_assume(!type.hasLocalQualifiers());
</code></pre>
<p dir="auto">At a high level, I want to be able to apply mem2reg to the value of<br>
this <code style="background-color:#F7F7F7; border-radius:3px; margin:0; padding:0 0.4em" bgcolor="#F7F7F7">QualType</code>; but at a low level, this method call takes <code style="background-color:#F7F7F7; border-radius:3px; margin:0; padding:0 0.4em" bgcolor="#F7F7F7">type</code><br>
by reference, and so the predicate function will take it by reference<br>
as well.</p>
</div>
<div style="white-space:normal"><blockquote style="border-left:2px solid #777; color:#777; margin:0 0 5px; padding-left:5px"><p dir="auto">Mem2reg should kick in just fine even if %ptr now has a "unknown" use.<br>
But that "unknown" use is much less problematic than what we have now<br>
because the user is the `llvm.assume` call and not some `ptrtoint` which<br>
is used two steps later in an `llvm.assume.<br>
<br>
If you feel I missed a problem here, please let me know.<br>
<br>
<br>
</p>
<blockquote style="border-left:2px solid #777; color:#999; margin:0 0 5px; padding-left:5px; border-left-color:#999"><p dir="auto">Unfortunately, I don’t have a great counter-proposal that isn’t a<br>
major project.<br>
<br>
(The major project is to make the predicates sub-functions within<br>
the caller. This doesn’t fix all the problems, and sub-functions<br>
introduce a host of new issues, but they do have the benefit of<br>
making the analysis much more obviously intra-procedural.)</p>
</blockquote><p dir="auto">I don't think inter-procedural reasoning is a problem or bad. Especially<br>
here with internal functions that have a single use, it is really not<br>
hard to make the connection.</p>
</blockquote></div>
<div style="white-space:normal">
<p dir="auto">It’s certainly not a problem <em>in theory</em>. <em>In theory</em> every<br>
intraprocedural analysis can be taught to go interprocedural<br>
into a predicate.</p>
<p dir="auto">John.</p>
</div>
<div style="white-space:normal"><blockquote style="border-left:2px solid #777; color:#777; margin:0 0 5px; padding-left:5px"><p dir="auto">We can easily teach the Attributor to follow these "pseudo-calls"<br>
in order to derive information from them.<br>
<br>
Thanks,<br>
Johannes</p>
</blockquote></div>
<div style="white-space:normal">
</div>
</div>
</body>
</html>