[llvm-dev] Reassociate lose parallelism

Jesper Antonsson via llvm-dev llvm-dev at lists.llvm.org
Wed Sep 12 05:58:57 PDT 2018


I compile this very simple C program:

#define T unsigned
T foo(T a, T b, T c, T d) {
  return (a+b)+(c+d);
}

Before reassociate, the first two adds in the IR are independent of each
other, so they can execute in parallel:

entry:
  %add = add i16 %a, %b
  %add1 = add i16 %c, %d
  %add2 = add i16 %add, %add1
  ret i16 %add2

After reassociate, the adds have been serialized into a single dependency
chain:

entry:
  %add1 = add i16 %b, %a
  %add = add i16 %add1, %c
  %add2 = add i16 %add, %d
  ret i16 %add2

It seems to me that RewriteExprTree() does this, and it contains this comment:
    // Not the last operation.  The left-hand side will be a sub-expression
    // while the right-hand side will be the current element of Ops.
So I gather the serialization is a result of this algorithm.
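
To make the depth difference concrete, here is a minimal C sketch. It is not
LLVM code: the names foo_balanced, foo_linear, linear_depth and
balanced_depth are made up for illustration. The two foo variants mirror the
IR before and after reassociate, and the two helpers count the longest chain
of dependent adds for n operands under each shape.

```c
/* Hypothetical illustration only; none of these names exist in LLVM. */

/* Shape before reassociate: (a+b) and (c+d) do not depend on each other. */
unsigned foo_balanced(unsigned a, unsigned b, unsigned c, unsigned d) {
    return (a + b) + (c + d);
}

/* Shape after reassociate: every add depends on the previous one. */
unsigned foo_linear(unsigned a, unsigned b, unsigned c, unsigned d) {
    return ((b + a) + c) + d;
}

/* Longest dependency chain of a left-linear sum of n operands: n - 1 adds. */
unsigned linear_depth(unsigned n) {
    return n > 1 ? n - 1 : 0;
}

/* Longest dependency chain of a balanced sum of n operands: ceil(log2(n)). */
unsigned balanced_depth(unsigned n) {
    unsigned depth = 0;
    while (n > 1) {
        n = (n + 1) / 2;   /* one level of pairwise adds halves the count */
        depth++;
    }
    return depth;
}
```

Both variants compute the same value because unsigned addition wraps and is
associative. For the four operands above, the linear form has a chain of 3
adds and the balanced form a chain of 2.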

Now, my question is whether the reassociate pass is supposed to care about
the depth of expression trees, or whether a conscious tradeoff has been made
not to care.

(I made a quick hack to bail out if the depth of the original expression
would increase in RewriteExprTree(). Our benchmark suite had the hack kick
in a few times, with a clear improvement in one benchmark and another
benchmark being better in unweighted cycles but worse in loop-weighted
cycles.)
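
The hack itself is not included in this message. As a rough illustration of
the kind of check described, the following C sketch works on a toy binary
expression tree rather than on LLVM's actual data structures. The struct
expr type and the functions expr_depth, expr_leaves and
linearizing_increases_depth are all hypothetical, and the sketch assumes the
rewrite always produces a fully linear chain.

```c
#include <stddef.h>

/* Toy expression node (hypothetical): a leaf has both children NULL. */
struct expr {
    const struct expr *lhs;
    const struct expr *rhs;
};

/* Number of operations on the longest path from the root to a leaf. */
unsigned expr_depth(const struct expr *e) {
    if (e == NULL || (e->lhs == NULL && e->rhs == NULL))
        return 0;
    unsigned l = expr_depth(e->lhs);
    unsigned r = expr_depth(e->rhs);
    return 1 + (l > r ? l : r);
}

/* Number of leaf operands in the tree. */
unsigned expr_leaves(const struct expr *e) {
    if (e == NULL)
        return 0;
    if (e->lhs == NULL && e->rhs == NULL)
        return 1;
    return expr_leaves(e->lhs) + expr_leaves(e->rhs);
}

/* Nonzero if rewriting the tree as a linear chain over the same operands
   (depth n - 1 for n leaves) would be deeper than the original tree. */
int linearizing_increases_depth(const struct expr *root) {
    unsigned n = expr_leaves(root);
    unsigned chain_depth = n > 1 ? n - 1 : 0;
    return chain_depth > expr_depth(root);
}
```

For the (a+b)+(c+d) tree the check fires, since the chain would have depth 3
against an original depth of 2. For a tree that is already a linear chain it
does not.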

Regards,
Jesper

