[LLVMdev] LICM/store-aliasing of global loads

Tue Jul 22 10:22:35 PDT 2008

On Jul 21, 2008, at 3:51 PM, Stefanus Du Toit wrote:

> Our frontend can guarantee that loads from globals are
> rematerializable and do not alias with any stores in any function in
> the given module. We'd like the optimization passes (and ideally the
> register allocator as well) to be able to use this fact. The globals
> are not constant "forever" but are constant during the calling of any
> given function in the module.
>
> There seem to be two major ways to expose this to the optimization
> passes and code gen:
>  - build a custom alias analysis pass that indicates that these loads
> never alias with any stores in a given function
>  - declare these globals as external constants within the module

If you can convince yourself that no interprocedural optimization
will ever get in trouble, the second approach here sounds reasonable
and simpler. But if the values aren't really constant, it may be
difficult to be sure. Building a custom alias analysis is reasonable
too.

>
> The former should give optimizations like LICM the freedom to move
> these loads around, allow them to be CSE'd, etc.
>
> The latter should technically allow the same freedom to these
> optimizations, but doesn't currently seem to. Furthermore, the latter
> should give the RA enough information to rematerialize these loads
> instead of spilling them if necessary.
>
> Below is a simple example module that illustrates this. It's just a
> memcpy loop copying between two external arrays. With unmodified TOT,
> opt -basicaa -licm for example will not move the invariant loads of @b
> and @a (into %tmp3 and %tmp5) out of the body of the for loop.

Good catch!

One way to fix this would be to have AliasSetTracker pretend that
pointers to constant memory never alias anything. That's a little
sneaky though, so offhand I think an approach such as what's in
your patch is better.

> If I apply the patch found further down, LICM moves the loads out (as
> expected), but of course this is a fairly specific fix.

Slightly better than checking for GlobalVariables directly
is to call AliasAnalysis's pointsToConstantMemory method.
BasicAliasAnalysis's implementation does exactly the same thing,
checking for constant GlobalVariables, but going through the
interface would allow other alias analyses to do more sophisticated
things. Could you submit a patch for this?
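
To illustrate why the indirection helps, here is a minimal standalone
sketch. It is a toy model, not LLVM code: ToyGlobal, ToyAliasAnalysis,
FrontendAA, ToyLoad and canHoistLoad are all hypothetical names, and
"the loop contains stores" stands in for the real alias-set check.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for illustration only; these are NOT LLVM classes.
struct ToyGlobal {
    std::string name;
    bool isConstant;  // declared constant in the module
};

struct ToyAliasAnalysis {
    virtual ~ToyAliasAnalysis() = default;
    // Default behaviour: only globals declared constant qualify.
    virtual bool pointsToConstantMemory(const ToyGlobal &g) const {
        return g.isConstant;
    }
};

// Hypothetical frontend-specific analysis. It assumes the frontend
// guarantees that no store in the module aliases any global load.
struct FrontendAA : ToyAliasAnalysis {
    bool pointsToConstantMemory(const ToyGlobal &) const override {
        return true;
    }
};

struct ToyLoad {
    const ToyGlobal *ptr;  // the global being loaded from
};

// Simplified hoisting test: the pass asks the analysis rather than
// inspecting the global itself, so a smarter analysis can say yes.
bool canHoistLoad(const ToyAliasAnalysis &aa, const ToyLoad &ld,
                  bool loopHasStores) {
    if (aa.pointsToConstantMemory(*ld.ptr))
        return true;        // no store can change the loaded value
    return !loopHasStores;  // otherwise be conservative
}
```

With the default analysis, only loads from constant globals are
hoisted out of a loop that contains stores. Swapping in FrontendAA
changes the answer without touching canHoistLoad, which is the point
of querying the analysis instead of checking the global directly.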

> What's the right way to handle this? Should Basic AA handle this case?
> Will the RA be aware that it can remat these loads or do I need to do
> something else to allow it to know this? Will the scheduler be aware
> that it can reorder them?

It would be nice to have an AA that's smart enough to do things
like this. For now, though, having code use
AliasAnalysis::pointsToConstantMemory should cover many of the
obvious cases.

>
> Obviously I can also move the loads to the entry block of the
> function, but that does not address the RA/scheduling issues and is
> difficult to do in general due to some additional semantics in our
> frontend.

In the scheduling department, LLVM is not yet using any alias
information. You can experiment with the -combiner-alias-analysis and
-combiner-global-alias-analysis options, which use AliasAnalysis
queries and do a pretty good job, but aren't very efficient and
aren't widely tested. Ideally we'd like to do something better here.

For register allocation, LLVM currently has some simple hooks which
individual targets use to specify which loads are rematerializable.
See isReallyTriviallyReMaterializable. Currently this code is all
target-specific and doesn't use AliasAnalysis information, but I
think it could be reasonably generalized to use the new
MachineMemOperand information to be less target-dependent and to
make at least AliasAnalysis::pointsToConstantMemory queries.
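
As a rough sketch of that generalization, again as a standalone toy
model rather than LLVM code (ToyValue, ToyMemOperand, ToyAliasInfo and
isLoadRematerializable are hypothetical names, and the memory operand
is reduced to the two fields this check needs):

```cpp
#include <cassert>

// Toy stand-ins only; none of these are LLVM types.
struct ToyValue {
    bool isConstantGlobal;  // e.g. a global declared constant
};

// Loosely modeled on the idea of a memory operand: a machine-level
// load that still remembers which IR-level value it reads from.
struct ToyMemOperand {
    const ToyValue *value;  // null if the origin is unknown
    bool isVolatile;
};

struct ToyAliasInfo {
    bool pointsToConstantMemory(const ToyValue &v) const {
        return v.isConstantGlobal;
    }
};

// Hypothetical target-independent check: a load may be recomputed
// instead of spilled when it is non-volatile and is known to read
// from constant memory.
bool isLoadRematerializable(const ToyAliasInfo &aa,
                            const ToyMemOperand &mo) {
    if (mo.isVolatile || mo.value == nullptr)
        return false;  // be conservative when nothing is known
    return aa.pointsToConstantMemory(*mo.value);
}
```

A real implementation would also have to make sure the load's address
is still available at the point of rematerialization; the sketch
ignores that.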

Dan