[PATCH] D70401: [WIP][RISCV] Implement ilp32e ABI

Thu Dec 12 05:19:35 PST 2019

shiva0217 added inline comments.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1525
   unsigned TwoXLenInBytes = (2 * XLen) / 8;
   if (!IsFixed && ArgFlags.getOrigAlign() == TwoXLenInBytes &&
       DL.getTypeAllocSize(OrigTy) == TwoXLenInBytes) {
----------------
lenary wrote:
> shiva0217 wrote:
> > The variadic argument for ilp32e doesn't need to align to even register. We could also add a test line in vararg.ll.
> I'm not sure I agree with this interpretation of the psABI. The [[ https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#ilp32e-calling-convention | ILP32E Section ]] makes no exception for variadic arguments, and the base calling convention is only defined in relation to `XLEN`, not in terms of stack alignment.
> 
> I will add a test to `vararg.ll` so the behaviour is at least tested. 
It seems to be the current GCC behavior and the following case could observe that double will not align to even pair.
  #include <stdarg.h>
  void va_double (int n, ...) {
    va_list args;
    va_start (args, n);
    if (va_arg (args, double) != 2.0)
      abort ();
    va_end (args);
   }
   int main (int a) {
    va_double (1, 2.0);
    return a;
   }

In a second thought, it seems that non-fixed double arguments may generate incorrect code, even with align even pair.
For ilp32 or lp64 ABI with feature D, stack alignment will be 16, so even pair can make sure when pushing/popping the non-fixed double to/from the stack, it will be 8-byte alignment. For ilp32e with 4-byte alignment, even pair can not guarantee the double will be pushed to stack with 8-byte alignment.

================
Comment at: llvm/test/CodeGen/RISCV/callee-saved-fpr64s.ll:11
+; This test currently fails because the machine code does not validate:
+; RUN: not llc -mtriple=riscv32 -mattr=+d -target-abi ilp32e -verify-machineinstrs < %s
+; It will need FileCheck %s -check-prefix=ILP32-LP64-NO-D
----------------
lenary wrote:
> lenary wrote:
> > shiva0217 wrote:
> > > shiva0217 wrote:
> > > > Jim wrote:
> > > > > shiva0217 wrote:
> > > > > > lenary wrote:
> > > > > > > @shiva0217 I think this test is failing because of the base pointer patch, but I'm not sure. Can you look at the issue? It thinks that x8 gets killed by a store (which I don't think should be using x8), and therefore x8 is not live when we come to the epilog. It's a super confusing issue.
> > > > > > Hi @lenary, it seems that hasBP() return false in this case, the issue trigger by register allocation allocating x8 which should be preserved. I'm not sure why it will happen, I try to write a simple C code to reproduce the case but fail to do that. Could you obtain the C code for the test case?
> > > > > It seems that RISCVRegisterInfo::getReservedRegs doesn't add x8(fp) into reserved registers (TFI->hasFP(MF) return false), then x8 is a candidate register for register allocation. After register allocation, some of fpr64 splitted into stack that makes stack need to be realign (MaxAlignment(8) > StackAlignment(4)), therefore x8 should be used as frame pointer (TFI->hasFP(MF) return true). In emitting epilogue, instructions for fp adjustment is inserted.
> > > > With the investigation from @Jim, here is the simple C could reproduce the case.
> > > >   extern double var;
> > > >   extern void callee();
> > > >   void test(){
> > > >     double val = var;
> > > >     callee();
> > > >     var = val;
> > > >   }
> > > > Thanks, @Jim 
> > > There're might be few ways to fix the issue:
> > > 1. hasFP() return true for ilp32e ABI with feature D
> > > 2. hasFP() return true for ilp32e ABI with feature D and there's is a virtual register with f64 type.
> > > 3. Not allow  ilp32e ABI with feature D.
> > > Given that most of the targets supported double float instructions have stack alignment at least eight bytes to avoid frequently realignment. Would it more reasonable to have a new embedded ABI with stack alignment at least eight bytes to support feature D?
> > @Jim, @shiva0217, thank you very much for tracking down this bug, and providing a small testcase, that's very helpful.
> > 
> > We talked about this on the call this week, and I indicated I was going to go with a solution as close to 2 as I could.
> > 
> > I have since started an investigation (which I hoped would be quicker than it is) of what happens if we implement `canRealignStackFrame` to check if FP is unused, and this also seems to solve the problem. I'm doing some deeper checks (which require implementing parts of the backend around MIR that I haven't looked at before), but I think this might be a better solution? I'll keep this patch updated on when I upload the fix for stack realignment to cover this case. In the case that this fix isn't enough, I'll look to implement solution 2.
> > 
> > In any case, it's evident that allocating a spill slot for a register that has higher spill alignment than the stack slot, is the kernel of the problem, and this may arise again depending on how we choose to implement other extensions.
> > 
> > 
> I couldn't find a reasonable way to check for a virtual (or physical) register of type fp64, without iterating over all the instructions in a function, which I'd prefer not to do.
> 
> So Instead I have implemented option 1 in `hasFP`.
I think option 1 could be a reasonable way to fix the issue.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70401/new/

https://reviews.llvm.org/D70401