[LLVMdev] [RFC] Invariants in LLVM

Chandler Carruth chandlerc at google.com
Thu Jul 17 14:44:03 PDT 2014


On Thu, Jul 17, 2014 at 5:31 PM, Philip Reames <listmail at philipreames.com>
wrote:

> 3. An "llvm.invariant" has zero code generation cost.  Given that, a lot
> of pattern matching and profitability heuristics will need to be adjusted
> to ignore them.
>

FWIW, this has been the fundamental point of contention in the entire
design. I've discussed this several times with Andy, Hal, and others. I'll
try to summarize a few points here, although getting all of the points is
beyond me. Also, this is *my* takeaway; I'm probably not doing justice to
other positions here, but hopefully others will chime in.

IMO, this cannot be avoided. This is *the* tradeoff of encoding
assumptions: you risk the complexity of the assumption you are expressing
imposing more optimization costs than you get in simplifications due to the
extra information. I think we should just make this very clear in the
documentation, and then not worry about it too much. The idea is to only
use this to express really important, impossible-to-deduce invariants of
the program or high-level abstraction in the IR. Unless it has really
significant effects, it isn't worth it; don't do it. =/ It may not be the
most theoretically satisfying result, but I think it is practical.
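For concreteness, here is a minimal sketch of the kind of invariant in
question. The intrinsic name and signature (`llvm.invariant(i1)`) follow the
RFC's proposal and may not match the final spelling; the function and values
are hypothetical:

```llvm
; Hypothetical IR: the frontend knows from a source-level guarantee that
; %n is positive -- something the optimizer could never deduce on its own.
declare void @llvm.invariant(i1)

define i32 @clamp_nonneg(i32 %n) {
entry:
  %pos = icmp sgt i32 %n, 0
  call void @llvm.invariant(i1 %pos)
  ; Given the invariant, %cmp is known false and the select folds to %n.
  %cmp = icmp slt i32 %n, 0
  %r = select i1 %cmp, i32 0, i32 %n
  ret i32 %r
}
```

The cost side of the tradeoff is visible even in this sketch: %pos and its
icmp exist only to feed the invariant, so every pass that counts uses or
matches patterns now sees extra instructions it has to learn to ignore.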

The other ideas have been to try to use some metadata encoding scheme to
partition the abstraction from the actual IR. However, all of these IMO
degenerate to specifying a "shadow IR" -- an entirely different IR that we
still have to canonicalize, optimize, and pattern match, but which is
somehow different and separated from the actual IR. I think the complexity
of this is tremendously higher and would be a real problem long-term. I
also don't think that the gain is worth the tradeoff.

> I worry about the maintenance cost of this in the optimizer.


As long as we don't go "crazy" trying to recover the performance, it should
be OK. Hal has already put together the patches needed for his current
approach. We'll likely have to tweak this approach a little, but we don't
need to go through and change every single hasOneUse() check to do
something special with these invariants (or assumptions).


>  I can see a couple of possibilities here:
> - Canonicalize the placement of "llvm.invariants" at the end of each basic
> block.  This at least reduces the patterns to be matched.
>

I think that the use of use-def SSA graph information makes the placement
of these unimportant. On the flip side, sinking an invariant below a call
that might unwind would reduce the scope over which the invariant applies.
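To make the unwind point concrete, a hypothetical sketch (using the RFC's
proposed `llvm.invariant(i1)` spelling; @may_unwind is an assumed external
function):

```llvm
declare void @llvm.invariant(i1)
declare void @may_unwind()

define i32 @g(i32 %x) {
entry:
  %c = icmp ne i32 %x, 0
  ; Placed here, the asserted fact holds everywhere this call dominates.
  call void @llvm.invariant(i1 %c)
  call void @may_unwind()
  ; If the invariant were instead sunk below @may_unwind(), it would no
  ; longer cover the unwind path out of that call, shrinking the region
  ; in which %c may be assumed true.
  %r = zext i1 %c to i32
  ret i32 %r
}
```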

> - Remove the invariant instructions entirely from the in-flight IR. Use the
> "llvm.invariant" instruction only for serialization, but have the
> conditions stored directly on the basic blocks when in memory.  This would
> reduce the pattern matching problem, but at cost of extra complexity.
>

This is very similar to the metadata approach -- it boils down to needing
to have and hold a reasonably complete "IR" which isn't the actual IR.


>
> 4. How will the presence of "llvm.invariants" affect things like code
> placement?  I could see having lots of extra users (which are dropped late
> in the optimizer) causing somewhat surprising results. (i.e., we perform
> CSE, an instruction gets placed differently, performance differs)
>

Somewhat surprising, but I suspect it won't be terribly different. These
won't make it past codegen prep (and may be removed even before LSR or the
like) and so shouldn't really change the scheduling within a basic block.