[PATCH] Infer known bits from dominating conditions

hfinkel at anl.gov hfinkel at anl.gov
Tue Mar 3 16:32:05 PST 2015


> I think all of the issues you raised should be evaluated and addressed.   I'm even willing to do most of the work to do so. However, I would really strongly prefer doing that *in tree*.  You have access to workloads I don't.  So does Nick.  So do other interested parties.  What I want is to check in a shared implementation (or two, or three), and give each of us a chance to measure on our workloads.  Running with a set of flags is generally much easier than building a custom build with a potentially tweaked patch and keeping track of which patch generated which set of results.


Okay. However, being asked to try an alternate implementation technique is part of the code review process. I think there are very good reasons to prefer this alternate strategy if it is practical...

> Now, on the technical side again, I think we're coming from two different perspectives.  My overwhelming concern is compile time. We've tried various attempts at value property propagation in the past and all of them have failed due to that issue.  What I'm aiming to do is find *something* which can at least catch the simple cases and can be turned on without impacting performance.  I'd like to push that as far as possible, but that's a second order concern. You seem to be focused more on the effectiveness side, with a second order interest in performance.   Given past history, I don't think that's the right approach to take here.


This seems a bit unfair.  I am concerned with compile time, but...

1. ValueTracking is currently slow, and the way it is used by InstCombine (etc.) makes it very expensive. Really, we need a way of caching the results (instead of recomputing them each time each instruction is visited, which can happen many times as each instruction's users are processed). To really fix the compile-time issues here, we need to fix this.
2. Should we settle for a less-general analysis to work around (1)? It obviously depends on the price. But let's not compound our problems unnecessarily. Walking all the way up the DT is likely to be too expensive, but maybe searching N dominators (for some small N) is not too expensive. This is the implementation we should desire, however, because it will allow us to do the analysis in a general way, and share that code with the code that handles @llvm.assume. Then, once (1) is fixed, we might be able to raise N somewhat. If we cannot do this, even for a small N, then we'll have to settle for doing something even cheaper, at the price of effectiveness, consistency, and future development time. We might choose to do this, but let's not go there without evidence that it is necessary. Otherwise, if there is some value of N that hits your use cases and meets your compile-time impact targets, let's begin as we mean to go on.
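To make the "search N dominators" idea concrete, here is a toy sketch. This is not LLVM code (the real implementation would walk DominatorTree nodes and call into ValueTracking); it just models an idom chain where some blocks are guarded by an unsigned comparison known true on entry, and shows how bounding the walk by N trades effectiveness for cost:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Toy model (not the LLVM API): each block has an immediate dominator and,
// optionally, a condition "Var <u Bound" known to hold on entry (i.e. the
// block is only reachable via the true edge of that unsigned comparison).
struct Block {
  int IDom = -1; // index of the immediate dominator, -1 for the entry block
  std::optional<std::pair<int, uint32_t>> Cond; // (Var, Bound): Var <u Bound
};

// Examine at most N blocks along the idom chain, starting at BB, looking
// for a condition on Var. If "Var <u Bound" holds and Bound is a power of
// two, every bit at or above log2(Bound) is known zero.
uint32_t knownZeroBits(const std::vector<Block> &Blocks, int BB, int Var,
                       int N) {
  uint32_t KnownZero = 0;
  for (int Steps = 0; BB != -1 && Steps < N; BB = Blocks[BB].IDom, ++Steps) {
    const auto &C = Blocks[BB].Cond;
    if (C && C->first == Var) {
      uint32_t Bound = C->second;
      if (Bound != 0 && (Bound & (Bound - 1)) == 0) // power of two
        KnownZero |= ~(Bound - 1); // bits >= log2(Bound) are zero
    }
  }
  return KnownZero;
}
```

With a chain entry -> guarded -> leaf where the guarded block asserts "v0 <u 8", querying the leaf with N >= 2 yields the high 29 bits known zero, while N = 1 finds nothing; that is exactly the kind of tuning knob being proposed.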

> Also, can you give an example of a code change that concerns you? At least on the surface this seems pretty stable to me.  Early exits from functions and loops gives you information the optimizer can exploit.   Anything else might, but might not.  That seems entirely understandable.


I'm actually concerned about patterns like this:

  $ cat /tmp/c.c 
  void bar(); // pretend this can be inlined.
  void har(); // and/or this too.
  
  void foo1(int a) {
    if (a < 5)
      bar();
    else
      har();
  }
  
  bool foo2(int a) {
    if (a < 5)
      bar();
    else
      har();
  
    return a < 5;
  }

where, as you might expect, in foo2 there is a second use of the comparison. Introducing this second use of the comparison should not affect whether or not ValueTracking can deduce some fact about 'a' when processing the inlined code.

  clang++ -O3 -emit-llvm -S -o - /tmp/c.c
  
  target datalayout = "E-m:e-i64:64-n32:64"
  target triple = "powerpc64-unknown-linux-gnu"
  
  define void @_Z4foo1i(i32 signext %a) #0 {
  entry:
    %cmp = icmp slt i32 %a, 5
    br i1 %cmp, label %if.then, label %if.else
  
  if.then:                                          ; preds = %entry
    tail call void @_Z3barv()
    br label %if.end
  
  if.else:                                          ; preds = %entry
    tail call void @_Z3harv()
    br label %if.end
  
  if.end:                                           ; preds = %if.else, %if.then
    ret void
  }
  
  declare void @_Z3barv() #0
  declare void @_Z3harv() #0
  
  define zeroext i1 @_Z4foo2i(i32 signext %a) #0 {
  entry:
    %cmp = icmp slt i32 %a, 5
    br i1 %cmp, label %if.then, label %if.else
  
  if.then:                                          ; preds = %entry
    tail call void @_Z3barv()
    br label %if.end
  
  if.else:                                          ; preds = %entry
    tail call void @_Z3harv()
    br label %if.end
  
  if.end:                                           ; preds = %if.else, %if.then
    ret i1 %cmp
  }

A user can "do nothing wrong" but, simply by repeating the comparison as part of some other expression, cause known-bit information to change inside otherwise-unrelated blocks. This would be, as I'm sure you appreciate, quite unfortunate.
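To spell out the instability, here is a toy sketch of a hypothetical heuristic (an assumption for illustration, not the actual patch) that only trusts a dominating comparison when it has a single use. foo1 and foo2 have the identical dominating condition `%cmp = icmp slt i32 %a, 5`, but the extra use of %cmp in foo2 flips the answer:

```cpp
#include <cassert>

// Hypothetical heuristic, for illustration only: trust a dominating
// comparison only if it has exactly one use (the conditional branch).
struct DominatingCmp {
  bool LessThanFive; // "icmp slt i32 %a, 5" dominates the queried block
  int NumUses;       // uses of the comparison: 1 in foo1, 2 in foo2
};

// Returns true if we conclude "a < 5" holds in the dominated block.
bool deduceFact(const DominatingCmp &Cmp) {
  if (Cmp.NumUses != 1)
    return false; // bail out on multi-use comparisons
  return Cmp.LessThanFive;
}
```

Under this sketch, the same dominating fact is deduced in foo1 but not in foo2, purely because the user wrote `return a < 5;`, which is the kind of action-at-a-distance I would like to avoid.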

Thanks again!


http://reviews.llvm.org/D7708
