[llvm-commits] byval arg lowering (was: [PATCH, RFC] Fix PR13891 (AliasChain not properly maintained in ScheduleDAGInstrs::buildSchedGraph()))

Thu Oct 4 16:27:37 PDT 2012

On Sep 28, 2012, at 11:59 AM, Akira Hatanaka <ahatanak at gmail.com> wrote:

> Sergei,
> 
> I don't have any code that I can produce using llc. mips doesn't do tail call optimization yet.
> 
> I misunderstood the problem when I asked the question, but what I wanted to know is what measures are taken to ensure the incoming argument area isn't overwritten when a tail call writes its outgoing arguments to the stack before the original values are read. This is a more general question, which probably has nothing to do with how functions with byval arguments are handled.
> 
> Using the following program as an example, if tail call optimization were enabled for mips,
> 
> $ cat tail.c
> int g1, g2;
> 
> typedef struct {
>   int a0, a1, a2, a3, a4;
> } S1;
> 
> static int f2(S1 a) { ... }
> 
> int f1(S1 a) {
>   g2 = a.a4;
>   a.a4 = g1;
>   return f2(a);
> }
> 
> the generated code might look like this:
> 
> $ cat tail.s
> 
> 1. lw      $2,%got(g1)($28)
> 2. lw      $3,16($sp)          // $3 = a.a4. read out a.a4.
> 3. lw      $2,0($2)            // $2 = g1
> 4. sw      $2,16($sp)          // store g1 to stack slot of a.a4.
> 5. lw      $25,%got(f2)($28)
> 6. addiu   $25,$25,%lo(f2)
> 7. lw      $2,%got(g2)($28)
> 8. jr      $25                 // tail call
> 9. sw      $3,0($2)            // g2 = $3
> 
> Since the store (instruction 4) writes to the same location that the earlier load (instruction 2) reads from, we should make sure the schedulers know that the underlying objects may alias.
> 
> I am not sure whether the targets which currently implement tail call optimization have this problem or what is done to prevent it, but I am guessing setting the store's MachinePointerInfo.V to 0 after the node is created will suffice.

Akira,

This is a legitimate concern, but I wasn't able to come up with a test case that breaks. If the caller doesn't fully optimize the tail call, we end up with some stack adjustment that acts as a scheduling barrier.

The best solution I can suggest is to be careful when lowering outgoing args. If we're smart enough to reuse the caller's frame for those args, then we must be smart enough to check whether our own args are byval. If so, we either can't do the optimization, or must lower the caller-frame stack accesses with an "unknown" MachinePointerInfo.
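
To make the rule above concrete, here is a rough sketch in C. It is a toy model, not LLVM code: MemRef, must_order, and lower_outgoing_arg are hypothetical names, and the dependence test is a simplification that assumes two accesses tagged with different known underlying objects never overlap. That assumption is exactly what goes wrong when an outgoing-arg store and a byval load hit the same stack slot under different tags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the alias info attached to a memory operand.
 * This is NOT LLVM's MachinePointerInfo; it only mimics the idea that a
 * null underlying value means "unknown". */
typedef struct {
    const void *underlying; /* NULL: unknown, may alias anything */
    long offset;            /* byte offset from the underlying object */
    long size;              /* access size in bytes */
} MemRef;

/* Toy dependence test: must the scheduler keep these two accesses in order? */
static bool must_order(const MemRef *a, bool a_is_store,
                       const MemRef *b, bool b_is_store) {
    if (!a_is_store && !b_is_store)
        return false;                       /* two loads never conflict */
    if (a->underlying == NULL || b->underlying == NULL)
        return true;                        /* unknown: assume they alias */
    if (a->underlying != b->underlying)
        return false;                       /* assumed disjoint (the risky case) */
    return a->offset < b->offset + b->size &&
           b->offset < a->offset + a->size; /* same object: order if ranges overlap */
}

typedef enum { LOWER_NORMALLY, LOWER_AS_UNKNOWN, NO_TAIL_CALL } OutArgLowering;

/* The rule as a decision. The three flags are hypothetical inputs,
 * not real LLVM hooks. */
static OutArgLowering lower_outgoing_arg(bool reuses_caller_frame,
                                         bool caller_has_byval_args,
                                         bool can_mark_unknown) {
    if (!reuses_caller_frame || !caller_has_byval_args)
        return LOWER_NORMALLY;
    return can_mark_unknown ? LOWER_AS_UNKNOWN : NO_TAIL_CALL;
}
```

In terms of the example, if the load of a.a4 (instruction 2) is tagged with the byval object and the store (instruction 4) is tagged with a separate outgoing-arg slot, this toy test reports no ordering constraint even though both touch 16($sp). Tagging the store as unknown restores the constraint.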

If anyone can find a test case that shows LLVM breaking the above rule, please file a PR. If you come across code that doesn't follow the rule, patches would be great.

-Andy