[llvm-dev] target ABI: call parameters handling

Wed Sep 29 04:42:05 PDT 2021

Hi,

We have repeatedly had severe problems with parameter passing where the 
signext/zeroext attribute was forgotten when building the LLVM IR call 
instruction somewhere during compilation. The latest example of this is 
MLIR: a function related to the async runtime takes an int32_t as 
argument, but the place that builds the call does not add the signext 
attribute to that parameter. As a result, wrong code(!) was generated on 
SystemZ, and only by luck did we discover this when building MLIR with 
gcc (the bug hides with clang). Details at 
https://bugs.llvm.org/show_bug.cgi?id=51898.

The first question one might ask is why there is no assert catching 
this in the SystemZ backend (or in any other target with the same ABI 
handling of arguments). The reason is that there is a third type of 
integer parameter: a struct passed in a register. So if there is neither 
a signext nor a zeroext attribute on an i32 argument, it must be assumed 
that it contains a struct (of four bytes, for example).

I personally feel that something is missing here, and that it is a shame 
that this very serious problem lives on. I fully understand that people 
working on a target without this ABI requirement may not be aware that 
their new code causes wrong code on some platforms. That makes it all 
the more important that they see a test fail as soon as they make the 
mistake.

To implement such an assert in the backend, one idea is to give the 
struct-in-register parameter an explicit attribute. Would it be 
reasonable/doable to invent a new attribute such as StructArg and add it 
to all such arguments throughout the whole LLVM code base? Then we could 
assert that one of signext/zeroext/structarg is present and eliminate 
this type of error.
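
The proposed check can be sketched in a few lines of C. This is only an 
illustration of the rule, not LLVM code: the enum arg_attr, its values 
(including ATTR_STRUCTARG for the not yet existing StructArg attribute) 
and the functions arg_attr_ok and check_call_arg are all hypothetical 
names, and the 64-bit register width is the SystemZ case described in 
this mail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the attribute found on one call argument.
   ATTR_STRUCTARG stands for the proposed (not existing) StructArg. */
enum arg_attr { ATTR_NONE, ATTR_SIGNEXT, ATTR_ZEROEXT, ATTR_STRUCTARG };

/* Proposed rule: an integer argument narrower than the 64-bit register
   must say how its upper bits are to be treated. */
static bool arg_attr_ok(unsigned width_bits, enum arg_attr attr) {
    if (width_bits >= 64)
        return true;          /* already full register width */
    return attr != ATTR_NONE; /* signext, zeroext or structarg required */
}

/* What the backend assert could look like under this assumption. */
static void check_call_arg(unsigned width_bits, enum arg_attr attr) {
    assert(arg_attr_ok(width_bits, attr) &&
           "narrow integer argument lacks signext/zeroext/structarg");
}
```

With such a rule, check_call_arg(32, ATTR_NONE) would fail immediately 
on every target that enables the check, instead of silently producing 
wrong code.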

Some examples:

The SystemZ ABI requires that all integer parameters are always 
extended to full register width (64 bits), either sign- or zero-extended 
depending on the type.

-- For example, this function loads 32 bits which are passed to 
another function:

int foo(int);

int bar(int *a) {
    return foo(*a);
}

=> LLVM IR:

  %0 = load i32, i32* %a
  %call = tail call signext i32 @foo(i32 signext %0)

=> SystemZ MachineInstructions:

  %1:gr64bit = LGF %0:addr64bit, 0, $noreg :: (load (s32) from %ir.a, !tbaa !2)
  $r2d = COPY %1:gr64bit
  CallJG @foo, implicit $r2d

The important mechanism here is that in the LLVM IR the 'signext' 
attribute is added to the parameter %0. That way the backend knows that 
a sign extension is needed (there is no other way to know the 
signedness), and emits the sign-extending load (LGF).

-- A matching example on the callee side would be:

long bar(int a) {
    return ((long) a);
}

=>

  %conv = sext i32 %a to i64
  ret i64 %conv

=>

  Return implicit $r2d

Per the ABI, the 32-bit incoming parameter is required to already have 
been sign-extended, so the backend can simply do nothing. NB! This leads 
to wrong code if the caller forgets about that :-)
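
To make the failure mode concrete, here is a small C model of the 
caller/callee contract. It is a sketch only: reg64, callee_bar, 
pass_signext, pass_unextended and demo_missing_signext are made-up names 
that stand in for the argument register and the two sides of the call, 
and the "stale" upper bits are simply passed in as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit argument register (think r2). */
typedef uint64_t reg64;

/* Callee side of "long bar(int a)": the ABI promises an already
   sign-extended register, so the callee returns it unchanged. */
static int64_t callee_bar(reg64 r2) {
    return (int64_t)r2; /* assumes two's complement conversion */
}

/* Caller that honours signext: widen the int to 64 bits first. */
static reg64 pass_signext(int32_t v) {
    return (reg64)(int64_t)v;
}

/* Caller that forgot signext: only the low 32 bits are written,
   the high 32 bits keep whatever the register held before. */
static reg64 pass_unextended(int32_t v, reg64 stale) {
    return (stale & 0xFFFFFFFF00000000ULL) | (uint32_t)v;
}

/* Self-check: the same argument value yields different results. */
static void demo_missing_signext(void) {
    assert(callee_bar(pass_signext(-1)) == -1);
    assert(callee_bar(pass_unextended(-1, 0)) == 4294967295LL);
}
```

Positive values survive by accident when the stale upper bits happen to 
be zero, which is one way such a bug can stay hidden for a long time.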

-- A struct in register:

struct S {
    short el0;
    short el1;
};

int foo(struct S arg);

void bar(short c) {
    struct S a = {c, c + 1};
    foo(a);
}

=> LLVM IR:

define void @bar(i16 signext %c) local_unnamed_addr #0 {
  %add = add i16 %c, 1
  %a.sroa.4.0.insert.ext = zext i16 %add to i32
  %a.sroa.0.0.insert.ext = zext i16 %c to i32
  %a.sroa.0.0.insert.shift = shl nuw i32 %a.sroa.0.0.insert.ext, 16
  %a.sroa.0.0.insert.insert = or i32 %a.sroa.0.0.insert.shift, %a.sroa.4.0.insert.ext
  %call = tail call signext i32 @foo(i32 %a.sroa.0.0.insert.insert) #2
  ret void
}

The i32 passed to foo does not have any attribute set, and the backend 
does not extend it. An 'int' argument that is missing its signext 
attribute gets exactly the same treatment, which is then wrong... :-(
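
For reference, the zext/shl/or sequence in the IR above corresponds to 
the following C packing. This is a sketch with made-up names 
(pack_struct_s, demo_pack); it only mirrors the arithmetic shown in the 
IR, with el0 in the upper and el1 in the lower 16 bits of the i32.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the two 16-bit members of struct S into one 32-bit value,
   following the zext / shl 16 / or sequence from the IR. */
static uint32_t pack_struct_s(int16_t el0, int16_t el1) {
    uint32_t hi = (uint32_t)(uint16_t)el0; /* zext i16 -> i32 */
    uint32_t lo = (uint32_t)(uint16_t)el1; /* zext i16 -> i32 */
    return (hi << 16) | lo;                /* shl 16, then or */
}

/* Self-check with two sample structs. */
static void demo_pack(void) {
    assert(pack_struct_s(1, 2) == 0x00010002u);
    assert(pack_struct_s(-1, 0) == 0xFFFF0000u);
}
```

The packed value is just a bag of bits with no signedness of its own, 
which is why neither signext nor zeroext is attached to it, and why the 
backend cannot tell it apart from an 'int' whose attribute was 
forgotten.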

/Jonas