[LLVMdev] Optimization hints for "constant" loads
Philip Reames
listmail at philipreames.com
Tue Sep 9 22:11:28 PDT 2014
I'm looking at how to optimize IR which contains reads from a field
which is known to be initialized exactly once. I've got an idea on how
to approach this, but wanted to see if others have alternate ideas or
similar problems which might spark discussion. It feels like there's a
potentially generally useful optimization hint here if we can generalize
it sufficiently without loosing optimization potential.
The problem:
struct array {
private:
// once initialized 'len' never changes
int len;
// data can be modified at will
char data[0];
public:
static array* make(int len) {
array* a = ... allocate uninitialized space
a->len = len;
return a;
}
};
void access(array* a, int idx) {
if( idx >= 0 && idx <- a->len ) {
a->data[idx] = 5;
}
}
void foo(array* a) {
for(int i = 0; i < a->len; i++) {
access(a, i);
}
}
// assume 'access' is inlined into 'foo' and the loop is unrolled a time
or two
To phrase that again in english, I've got a field which is initialized
once, with naive code which reads from it many times. I know at IR
generation time that a load from array::len is special, but I loose this
information as soon as I generate IR. In particular, I run into
aliasing problems where the location a->len is considered 'mayalias'
with unrelated stores thus preventing value forwarding, LICM, and other
desirable optimizations.
Existing Approaches:
1) Use TBAA - By tagging loads and stores to the two fields of the
array struct as disjoint branches in the TBAA tree, I can inform LLVM
that a load of 'len' never aliases with a store through 'data'. This
mostly works, and enables many loads to be forwarded by GVN, but (due to
the intervening stores) is a complete loss in EarlyCSE and (due to
intervening calls) LICM.
a) Things like http://llvm.org/bugs/show_bug.cgi?id=20805 could
improve the situation in EarlyCSE.
2) Use "invariant.load" metadata - This metadata indicates that the
field loaded from is initialized before the execution of the code being
compiled. In particular, "invariant.load" implies that the load is not
control dependent on any branch, is safe to speculate, and that no write
aliases the location read from. This mostly works, but only if
'array::make' is never visible in the IR. As soon as 'array::make'
gets inlined, all bets are off and mis-compiles may result.
a) Also, in practice, only LICM really knows about
"invariant.load". It would be pretty straight forward to teach both
EarlyCSE and GVN about them though.
http://llvm.org/bugs/show_bug.cgi?id=20806
New Approaches:
(This isn't so much "being proposed" as "being put forward for discussion".)
1) Introduce a new metadata type "initialized-before-load" which implies
that the value of any two loads tagged with the metadata along any
execution path must yield the same result.
This doesn't give much freedom to the 'first' load; it's still control
dependent, can't be reordered with preceding stores, can't be lifted out
of loops, etc... It can however be reordered with following stores or
sunk out of a loop provided the loop body is known to execute at least
once.
The second load has a lot more freedom. Provided that there is always
another load to the same location (with the metadata) provable preceding
it on all paths, it can be reordered freely, lifted over control
branches, lifted out of loops, etc... Importantly, it is also legal to
forward the value of a preceding load to a later load provided that a)
both have the metadata and b) that one load executes strictly before the
other.
A store marked "initialized-before-load" is undefined if there is a load
with the same metadata on the same location preceding it. There may be
*multiple* stores to the location along a path, provided that the first
load is strictly after *all* of them.
This seems staight forward to implement in EarlyCSE and LICM. I haven't
looked closely at GVN, but expect it's probably not hard either.
2) Introduce a slightly different metadata "initialized-once". Semantics
are very similar to the preceding except that there can only be a single
store to the location along any path.
Value forwarding from the store to a following load (with metadata) is
allowed regardless of potentially aliasing intervening stores.
This was actually my original idea, but it has a couple of problems.
First, it breaks on surprisingly common initialization patterns such as
default initialization followed by real initialization. Secondly, I'm
not sure the optimizer would always preserve the "write once" property.
In particular, the optimizer is free to divide a large write into
several smaller ones (assuming the write is not atomic.)
Thoughts? Suggestions? Similar sounding problems this might be able to
solve?
Philip
More information about the llvm-dev
mailing list