[PATCH] Add bulk of returning of values to Mips fast-isel

Tue Feb 10 03:29:20 PST 2015

================
Comment at: lib/Target/Mips/MipsFastISel.cpp:1017-1020
@@ +1016,6 @@
+
+    // Don't bother handling odd stuff for now.
+    if ((VA.getLocInfo() != CCValAssign::Full) &&
+        (VA.getLocInfo() != CCValAssign::BCvt))
+      return false;
+
----------------
rkotler wrote:
> dsanders wrote:
> > Something seems odd here. This condition should be rejecting the rets() and retc() cases from the test case since they should have CCValAssign::SExt. However, the test case is using -fast-isel-abort.
> > 
> > I see that RetCC_O32 doesn't have a CCPromoteToType like the others do which might explain why it isn't SExt but then the question is how does a function guarantee an i16/i8 result is sign extended (and conversely, how does the caller know it doesn't need to do it itself).
> > 
> > Could you explain how the rets() and retc() cases are passing? Are they using CCValAssign::Full here?
> I'm not sure exactly what you are asking here.If you debug this you will see that these tests pass.
> 
> 1005	    CCAssignFn *RetCC = RetCC_Mips;
> (gdb) 
> 1006	    CCInfo.AnalyzeReturn(Outs, RetCC);
> (gdb) 
> 1009	    if (ValLocs.size() != 1)
> (gdb) 
> 1012	    CCValAssign &VA = ValLocs[0];
> (gdb) 
> 1013	    const Value *RV = Ret->getOperand(0);
> (gdb) 
> 1016	    if ((VA.getLocInfo() != CCValAssign::Full) &&
> (gdb) call RV->dump()
>   %0 = load i16* @s, align 2
> (gdb) next
> 1021	    if (!VA.isRegLoc())
> (gdb) print VA.getLocInfo()
> $1 = llvm::CCValAssign::Full
> (gdb) 
> 
> For this test I'm just using rets
> 
> ; RUN: llc -march=mipsel -relocation-model=pic -O0 -mips-fast-isel -fast-isel-abort -mcpu=mips32r2 \
> ; RUN:     < %s | FileCheck %s
> 
> @i = global i32 75, align 4
> @s = global i16 -345, align 2
> @c = global i8 118, align 1
> @f = global float 0x40BE623360000000, align 4
> @d = global double 1.298330e+03, align 8
> 
> ; Function Attrs: nounwind
> define signext i16 @rets() {
> entry:
> ; CHECK-LABEL: rets
>   %0 = load i16* @s, align 2
>   ret i16 %0
> ; CHECK:    	lui	$[[REG_GPa:[0-9]+]], %hi(_gp_disp)
> ; CHECK:	addiu	$[[REG_GPb:[0-9]+]], $[[REG_GPa]], %lo(_gp_disp)
> ; CHECK:	addu	$[[REG_GP:[0-9]+]], $[[REG_GPb]], $25
> ; CHECK:	lw	$[[REG_S_ADDR:[0-9]+]], %got(s)($[[REG_GP]])
> ; CHECK:	lhu	$[[REG_S:[0-9]+]], 0($[[REG_S_ADDR]])
> ; CHECK:	seh	$2, $[[REG_S]]
> ; CHECK:	jr	$ra
> }
> 
> Fast-isel (not just the mips port) uses a kind of fake legalization to deal with sizes that are not supported directly by the machine like chars and shorts. It's a little
> suspect to me but it seems to work and this patch plus the next 12 after this, pass all of test-suite. Even small mistakes in any type of return types not being
> sign extended properly will cause test-suite failures so I think everything is fine here.
> 
> 
> 
> 
> 
I believe I've found the root of my confusion. The promotion from i16 to i32 is being handled by GetReturnInfo() and it is this function that sets the SExt flag in CCValAssign. The strange bit is our SelectionDAG implementation never calls this function nor the function that would normally call it for (TargetLowering::LowerCallTo()). This is going to need looking into at some point.

I see how this code works now. It's promoting types in a different from the way our SelectionDAG implementation which worries me a little but it does look like it's doing the right thing.

> Even small mistakes in any type of return types not being sign extended properly will cause test-suite failures so I think everything is fine here.

Not necessarily. If both caller and callee agree to do the wrong thing then the test-suite will pass despite the calling convention being wrong. Big-endian N32 and N64 had several examples of this, and there was one in O32 too. One concrete example, is if the callee sign-extends when it isn't supposed to and the caller doesn't sign-extend when it is supposed to. In this situation, calling clang-compiled code from clang-compiled code will work but calling gcc-compiled code will not.

================
Comment at: lib/Target/Mips/MipsFastISel.cpp:1186-1198
@@ -1107,14 +1185,15 @@
 
 bool MipsFastISel::emitIntExt(MVT SrcVT, unsigned SrcReg, MVT DestVT,
                               unsigned DestReg, bool IsZExt) {
   if (IsZExt)
     return emitIntZExt(SrcVT, SrcReg, DestVT, DestReg);
   return emitIntSExt(SrcVT, SrcReg, DestVT, DestReg);
 }
 
 unsigned MipsFastISel::emitIntExt(MVT SrcVT, unsigned SrcReg, MVT DestVT,
                                   bool isZExt) {
   unsigned DestReg = createResultReg(&Mips::GPR32RegClass);
-  return emitIntExt(SrcVT, SrcReg, DestVT, DestReg, isZExt);
+  bool Success = emitIntExt(SrcVT, SrcReg, DestVT, DestReg, isZExt);
+  return Success? DestReg: 0;
 }
 
----------------
rkotler wrote:
> rkotler wrote:
> > dsanders wrote:
> > > The inconsistency between the return types of these two overloaded functions is going to really confuse someone when they accidentally get boolean 1 instead of a register number. Going by the change in the patch, I suspect it already has.
> > > 
> > > It doesn't have to be in this patch but please make them consistent ASAP. My preference is to settle on returning the result register since that's usable as a boolean and a register number.
> > Agreed. I'll fix this in a separate patch.
> These two functions have different numbers of parameters.
> 
> In once case you pass the destination register and it returns boolean if it was successful.
> 
> In the other case it allocates a destination register and returns it, returning 0 if it was unable to. This is a common convention in fast-isel.
> 
> Depending on how this function is used, it makes sense to let it allocation the destination or to allocate it yourself.
> 
> 
> 
You've already said you'll fix this in a later patch but you followed up with another comment that appeared to be implying that it's not a problem. Apologies if I've read your second comment incorrectly.

Being given a register number vs allocating one internally isn't the issue. The problem is that the return types can't be used in the same way. 

Consider the following code:
  unsigned ResultReg = emitIntExt(SrcVT, SrcReg, DestVT, isZExt);
  if (ResultReg)
    updateValueMap(I, ResultReg); 
and this code:
  unsigned ResultReg = emitIntExt(SrcVT, SrcReg, DestVT, ResultReg, isZExt);
  if (ResultReg)
    updateValueMap(I, ResultReg); 
Both look correct and compile but the second one is wrong since ResultReg is actually 0 or 1. If we want the function to be overloaded then we need to return semantically compatible types so that we don't leave this kind of trap for other programmers.

================
Comment at: test/CodeGen/Mips/Fast-ISel/retabi.ll:34-35
@@ +33,4 @@
+; CHECK:	lw	$[[REG_S_ADDR:[0-9]+]], %got(s)($[[REG_GP]])
+; CHECK:	lhu	$[[REG_S:[0-9]+]], 0($[[REG_S_ADDR]])
+; CHECK:	seh	$2, $[[REG_S]]
+; CHECK:	jr	$ra
----------------
rkotler wrote:
> dsanders wrote:
> > I think I already know the answer to this question but why not use lh instead of lhu+seh?
> This is happening from different steps combining. This is not an optimizing pass, it's fast-isel.
> 
> As long as 16 bit quantity is just loaded into a register, it does not need to be
> sign extended. later if it is converted to a 32 bit quantify, then the half word needs
> to be sign extended.
Thanks for confirming what I was thinking.

http://reviews.llvm.org/D5920

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/