[PATCH] D73209: [AIX] Implement caller arguments passed in stack memory

Wed Feb 5 07:07:36 PST 2020

cebowleratibm marked 10 inline comments as done.
cebowleratibm added a comment.

Addressed Sean's comments.  Notable change: when a float arg passes in FPR as well as PSA memory we will mark the memLoc custom.  LowerFormalArguments_AIX can ignore custom memLoc.  If a float arg passes in PSA memory and there was no FPR available, then the memLoc will be normal (non-custom.)  I've updated LowerFormalArguments_AIX such that it will pass over custom memloc but continue to report_fatal_error on non-custom memloc.  Zarko will have a patch coming to remove the report_fatal_error and implement memLoc for LowerFormalArguments_AIX.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6990
+      // register.
+      if ((ValVT == MVT::f32 || ValVT == MVT::f64) && I != E &&
+          ArgLocs[I].isMemLoc() && ArgLocs[I].getValNo() == VA.getValNo())
----------------
sfertile wrote:
> Moving the increment of the index into the loop makes this a lot more complicated even though its a small change to the code. If we have to peek the next memloc then its unavoidable, but did we consider marking the memloc we want to skip as custom and having the normal memloc handling skip it?
I can use the customMem to make it cleaner to skip over these.  I will use custom mem when the argument is also passed in FPR.  If the argument is only passed in memory, it will remain a regular memLoc.

================
Comment at: llvm/test/CodeGen/PowerPC/aix-cc-abi.ll:1098
+  %0 = load double, double* @d, align 8
+  %1 = load float, float* @f, align 4
+  call void (i32, i32, i32, i32, i32, i32, ...) @test_stackarg_float2(i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, double %0)
----------------
sfertile wrote:
> This is unused in the test.
carry over from call_test_stackarg_float.  Good catch.

================
Comment at: llvm/test/CodeGen/PowerPC/aix-cc-abi.ll:1117
+; 32BIT-DAG:   $r8 = LI 6
+; 32BIT-DAG:   STFD renamable $f1, 0, %stack.0 :: (store 8 into %stack.0)
+; 32BIT-DAG:   renamable $r9 = LWZ 0, %stack.0 :: (load 4 from %stack.0, align 8)
----------------
sfertile wrote:
> Just to make sure I understand correctly: 
> 
> This store is just for moving the double value into the gprs. We don't have to initialize the register image in the parameter save area (even though the call is variadic) because it maps into the gprs available for argument passing. If the gprs were exhausted but we still have available fprs then we have to initialize both the fpr and the register image in the PSA (this case being tested by the following test).
Your understanding is correct.  call_test_stackarg_float3, passes float args in FPR as well as PSA memory.  call_test_stackarg_float2 tests that the required GPRs are initialized.  The stack store is only an autolocal artifact required to initialize the GPRs.  On Pwr8 up the operation can be done without going through memory but we can't test it yet because it requires vector instructions that we've blocked on AIX.

================
Comment at: llvm/test/CodeGen/PowerPC/aix-cc-abi.ll:1265
+; ASM64PWR4-DAG:   li 9, 7
+; ASM64PWR4-NEXT:  bl .test_stackarg_float3
+; ASM64PWR4-NEXT:  nop
----------------
sfertile wrote:
> Where do we initialize $r10? I don't think we are checking for the asm that should be generated from 
> ```
> ; 64BIT-DAG:   STFD renamable $f1, 0, %stack.0 :: (store 8 into %stack.0)
> ; 64BIT-DAG:   renamable $x10 = LD 0, %stack.0 :: (load 8 from %stack.0)
> ```
I missed it.  We're generating and it belongs in the test.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73209/new/

https://reviews.llvm.org/D73209