[LLVMdev] [RFC] Invariants in LLVM

Thu Jul 17 14:31:10 PDT 2014

On 07/17/2014 01:39 PM, Hal Finkel wrote:
> Hello everyone,
>
> I'm starting a new thread, as a follow-up to several older ones, on invariants in LLVM. Fundamentally, invariants are conditions that the optimizer is allowed to assume will be valid during the execution of the program. Many other compilers support a concept like this, MSVC's __assume, for example. GCC has a builtin called __builtin_assume_aligned which also neatly falls into this class.
First of all, thank you for working on this!  I'm thrilled to see 
progress being made here.
>
> In my initial patches, I've named the associated IR-level intrinsic @llvm.invariant(i1) (maybe it should be @llvm.assume, etc. -- please feel free to express an opinion).
My bikeshed preference would be "assume".  "invariant" makes me think 
"loop invariant" - i.e. something which needs to be proven. "assume" is 
a traditional name (in the verification world) for something which can 
be assumed true without verification and can result in invalid results 
if actually untrue.

>
> I'll attempt to highlight the primary issue involved with a simple example:
>
> int foo(int a, int b) {
>    if (b < 0) {
>      __builtin_assume(a > 5);
>      return a > 3;
>    } else {
>      __builtin_assume((a & 3) == 0);
>      return a & 1;
>    }
> }
>
> (with the current patches below, this simplifies as you'd expect)
>
> When simplifying the expression "a > 3" or "a & 1", how do you find the applicable invariants? The invariants are not direct users of the variables they affect, a in this case, but only use those variables transitively (the invariant is a user of the (a > 5) icmp, which is a user of 'a'). Also not all transitive invariant users of the variable apply at any particular point in the execution. Specifically, when we call computeKnownBits on 'a' in the above example, I can't use the invariants without knowing whether I want to know the answer at "return a > 3;" or at "return a & 1;".
This path dependence property is important.  We need to make sure this 
is clearly spelled out in the documentation or it will lead to ongoing 
confusion.
> Generally speaking, I need two things: The instruction specifying the control-flow context, and a DominatorTree in order to efficiently determine if a particular invariant applies at the specified context instruction.
>
> As a result, my currently implementations contains:
>
>   1. Changes to several analysis passes, ValueTracking (and InstSimplify), InstCombine, LazyValueInfo, and ScalarEvolution in order to find and use invariants. Invariants are generally found by doing a depth-limited filtered recursive search of users of the Value being queried for a relevant invariant.
>
>   2. Changes to the APIs of those analysis passes to take optional context instructions and DominatorTree objects, and changes to almost all users of those analysis passed to provide that information when available (this obviously touches a fair amount of code).
>
>   3. Documentation changes, lowering, aliasing-analysis updates, etc.
>
> As with MSVC's __assume(0), which means 'unreachable', I've done the same with @llvm.invariant(false).
>
> Over the past several weeks, I've posted a number of patches comprising a proposed implementation of an invariant intrinsic for LLVM. I've listed them here in reverse order (they apply bottom up).
I'll be happy to help review once we settle on desired semantics. :)
>
> Patches:
>
> http://reviews.llvm.org/D149 - A patch for Clang that adds __builtin_assume (and __assume in MS-compatibility mode) and GCC's __builtin_assume_aligned.
>
> http://reviews.llvm.org/D4568 - Adds awareness of invariants to ScalarEvolution (specifically when looking for loop guards)
> http://reviews.llvm.org/D4567 - A small addition to InstCombine to look for return values with all known bits
> http://reviews.llvm.org/D4543 - Adds a bunch more patterns for recognizing known bits from invariants in value tracking
> http://reviews.llvm.org/D4511 - Adds invariant awareness to LazyValueInfo (so invariants can be used by jump threading and related optimizations)
> http://reviews.llvm.org/D4491 - Adds basic canonicalization of the invariant intrinsics themselves (splitting invariant(a && b), etc.)
> http://reviews.llvm.org/D181 - This adds a scalar-evolution-powered pass, run late in the pipeline, to do more sophisticated updating of pointer alignments based on invariants.
> http://reviews.llvm.org/D4490 - An update to computeKnownBits and other functions in ValueTracking, and almost all callers of these functions, to use invariants to compute known bits of quantities that applicable invariant conditions (this also handles updating basic pointer alignments).
> http://reviews.llvm.org/D180 - Adds "ephemeral" value recognition to the inliner and basic code metrics so that side-effect-free instructions only contributing to invariants don't affect inlining, loop unrolling, etc.
> http://reviews.llvm.org/D178 - Adds the llvm.invariant intrinsic and basic lowering properties.
>
> Some points for discussion [by Philip, edited slightly by me]:
>
>   1. The current patches remove llvm.invariant(undef) as dead. Would it be better to transform this into 'unreachable' as is done for llvm.invariant(false).
>
>   2. Would adding a canonicalization of if(c) { unreachable } to llvm.invariant(c) would be worthwhile?
A couple to add:
3. An "llvm.invariant" has zero code generation cost.  Given that, a lot 
of pattern matching and profitability heuristics will need adjusted to 
ignore them.  I worry about the maintenance cost of this in the 
optimizer.  I can see a couple of possibilities here:
- Canonicalize the placement of "llvm.invariants: at the end of each 
basic block.  This at least reduces the patterns to be matched.
- Remove the invariant instructions entirely from the in flight IR. Use 
the "llvm.invariant" instruction only for serialization, but have the 
conditions stored directly on the basic blocks when in memory.  This 
would reduce the pattern matching problem, but at cost of extra complexity.

4. How will the presence of "llvm.invariants" effect things like code 
placement?  I could see have lots of extra users (which are dropped late 
in the optimizer) causing somewhat surprising results. (i.e. we perform 
CSE, instruction gets placed differently, performance differs)

>
> Finally, I'd like to mention that alternate schemes are certainly possible:
>
>   1. We could 'hide' all invariants inside dominating blocks guarded by never-true conditional branches (this has the benefit of being able to express invariants using expressions that normally might have side effects, and could simplify the process of finding applicable invariants) [a suggestion by Richard and Nick].
I'm not really clear on what this proposal would entail.  Could you 
spell out an example?  (If you think this is worth discussing further.)
>
>   2. We might want to have passes precompute the Value->(set of Invariants) map, and update it as necessary instead of doing transitive-user searches [a suggestion by Chandler].
This seems like something to consider as a future enhancement.
>
>   3. I'm sure there is other design space :-)
>
> If you're interested in this functionality, please provide feedback, use cases, etc. (and help review patches).
Interested!
>
> Thanks in advance,
> Hal
Philip