[llvm-dev] Reassociate lose parallelism
Jesper Antonsson via llvm-dev
llvm-dev at lists.llvm.org
Wed Sep 12 05:58:57 PDT 2018
I compile this very simple C program:
#define T unsigned
T foo(T a, T b, T c, T d) {
    return (a+b)+(c+d);
}
Before reassociate, the first two adds in the IR are independent of each
other, so they can execute in parallel:
entry:
  %add = add i16 %a, %b
  %add1 = add i16 %c, %d
  %add2 = add i16 %add, %add1
  ret i16 %add2
After reassociate, the adds have been serialized into a chain in which
each add depends on the previous one:
entry:
  %add1 = add i16 %b, %a
  %add = add i16 %add1, %c
  %add2 = add i16 %add, %d
  ret i16 %add2
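
To put numbers on the difference between the two shapes, here is a small
C sketch. It is not taken from LLVM, and the function names chain_depth
and balanced_depth are made up for illustration. It only counts how many
adds sit on the longest dependency path for n operands, assuming every
add has the same latency.

```c
#include <assert.h>

/* Longest dependency path, counted in adds, of a left-leaning chain
 * ((x1 + x2) + x3) + ... over n operands. Every add waits for the
 * previous one, so all n - 1 adds are on the critical path. */
unsigned chain_depth(unsigned n) {
    assert(n >= 1);
    return n - 1;
}

/* Longest dependency path of a balanced tree over n operands.
 * Each level pairs up the values that remain, so the result is
 * ceil(log2(n)). */
unsigned balanced_depth(unsigned n) {
    assert(n >= 1);
    unsigned depth = 0;
    while (n > 1) {
        n = (n + 1) / 2;  /* values left after one level of pairwise adds */
        ++depth;
    }
    return depth;
}
```

For the four operands in foo(), chain_depth(4) is 3 and balanced_depth(4)
is 2, which matches the two IR listings above.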
It seems to me that RewriteExprTree() is responsible for this; it contains
this comment:
// Not the last operation. The left-hand side will be a sub-expression
// while the right-hand side will be the current element of Ops.
So I gather the serialization is a result of this algorithm.
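
The shape that comment describes can be modelled with a toy expression
tree. This is only a sketch of the idea, not LLVM's code: the Node type
and the build_chain, depth and eval helpers are hypothetical, and the
sketch ignores everything else the pass does (operand ranking, reuse of
existing instructions, and so on).

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression node: a leaf holds a value, an add has two children. */
typedef struct Node {
    const struct Node *lhs, *rhs;  /* both NULL for a leaf */
    unsigned value;                /* leaf value; unused for adds */
} Node;

/* Build a linear chain over ops[0..n-1]. For every add, the left-hand
 * side is the sub-expression built so far and the right-hand side is
 * the current element of ops. pool must have room for n - 1 nodes. */
const Node *build_chain(const Node *ops, size_t n, Node *pool) {
    assert(n >= 1);
    const Node *acc = &ops[0];
    for (size_t i = 1; i < n; ++i) {
        pool[i - 1].lhs = acc;
        pool[i - 1].rhs = &ops[i];
        pool[i - 1].value = 0;
        acc = &pool[i - 1];
    }
    return acc;
}

/* Number of adds on the longest path from the root down to a leaf. */
unsigned depth(const Node *n) {
    if (n->lhs == NULL)
        return 0;
    unsigned l = depth(n->lhs), r = depth(n->rhs);
    return 1 + (l > r ? l : r);
}

/* Evaluate the tree; the shape does not change the unsigned sum. */
unsigned eval(const Node *n) {
    if (n->lhs == NULL)
        return n->value;
    return eval(n->lhs) + eval(n->rhs);
}
```

Building the chain over four leaves gives a tree of depth 3 whose root
has the last operand on its right-hand side, the same shape as the IR
after reassociate.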
Now, my question is whether the reassociate pass is supposed to care about
the depth of expression trees, or whether a conscious tradeoff has been
made not to care.
(I made a quick hack that bails out of RewriteExprTree() if the rewrite
would increase the depth of the original expression. The hack kicked in a
few times in our benchmark suite: one benchmark improved clearly, and
another was better in unweighted cycles but worse in loop-weighted
cycles.)
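
A depth check of that kind could look roughly like the sketch below. It
is a guess at the general idea, not the actual hack: the function name
rewrite_would_deepen and both parameters are hypothetical, and it assumes
the rewritten expression is always a linear chain over all the leaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bail-out test. original_depth is the number of adds on
 * the longest path of the expression before rewriting; num_ops is the
 * number of leaf operands. A linear chain over num_ops leaves has
 * num_ops - 1 adds on its critical path. */
bool rewrite_would_deepen(unsigned original_depth, unsigned num_ops) {
    assert(num_ops >= 2);
    unsigned linear_depth = num_ops - 1;
    return linear_depth > original_depth;
}
```

For (a+b)+(c+d) the original depth is 2 and there are 4 operands, so the
check returns true and the rewrite would be skipped. An expression that
is already a chain is unaffected.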
Regards,
Jesper