[PATCH] D100225: [Clang][AArch64] Coerce integer return values through an undef vector

Andrew Savonichev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 9 12:59:46 PDT 2021


asavonic created this revision.
Herald added subscribers: danielkiss, kristof.beyls.
asavonic requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

If the target ABI requires coercion to a larger type, the higher bits
of the resulting value are supposed to be undefined. However, before
this patch Clang CodeGen generated a `zext` instruction to coerce a
value to a larger type, forcing the higher bits to zero.

This is problematic in some cases:

  struct st {
    int i;
  };
  struct st foo(int i);
  struct st bar(int x) {
    return foo(x);
  }

For AArch64, Clang generates the following LLVM IR:

  define i64 @bar(i32 %x) {
    %call = call i64 @foo(i32 %0)
    %coerce.val.ii = trunc i64 %call to i32
    ;; ... store to alloca and load back
    %coerce.val.ii2 = zext i32 %1 to i64
    ret i64 %coerce.val.ii2
  }

Coercion is done with a `trunc` and a `zext`. After optimizations we
get the following:

  define i64 @bar(i32 %x) local_unnamed_addr #0 {
  entry:
    %call = tail call i64 @foo(i32 %x)
    %coerce.val.ii2 = and i64 %call, 4294967295
    ret i64 %coerce.val.ii2
  }

The compiler has to preserve the semantics of the `zext` instruction,
even though no extension or truncation is required in this case.
The extra `and` instruction also prevents tail call optimization.
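
To illustrate why the mask cannot be dropped, the `trunc`/`zext` pair
can be modeled in plain C. This is only a sketch: the function names
below are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the old coercion:
   trunc i64 -> i32 followed by zext i32 -> i64. */
uint64_t coerce_with_zext(uint64_t call_result) {
    uint32_t low = (uint32_t)call_result; /* trunc: keep the low 32 bits */
    return (uint64_t)low;                 /* zext: high 32 bits become zero */
}

/* The single `and` that the optimizer reduces the pair to. */
uint64_t coerce_with_mask(uint64_t call_result) {
    return call_result & UINT64_C(0xFFFFFFFF);
}

/* Both forms agree, and both differ from the input whenever
   the callee left non-zero bits in the upper half. */
void check_zext_model(void) {
    uint64_t raw = UINT64_C(0x1122334455667788);
    assert(coerce_with_zext(raw) == coerce_with_mask(raw));
    assert(coerce_with_zext(raw) != raw);
}
```

Since the result can differ from the callee's raw return value, the
mask is observable and has to stay between the call and the `ret`.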

To preserve the information that the higher bits are undefined, the
patch replaces `zext` with a sequence of an `insertelement` and a
`bitcast`:

  define i64 @_Z3bari(i32 %x) local_unnamed_addr #0 {
  entry:
    %call = tail call i64 @_Z3fooi(i32 %x) #2
    %coerce.val.ii = trunc i64 %call to i32
    %coerce.val.vec = insertelement <2 x i32> undef, i32 %coerce.val.ii, i8 0
    %coerce.val.vec.ii = bitcast <2 x i32> %coerce.val.vec to i64
    ret i64 %coerce.val.vec.ii
  }

InstCombiner can then fold this sequence into a no-op, which allows
tail call optimization.
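
The reason the fold is legal can be sketched in C as well. The model
below assumes a little-endian target, where lane 0 of the `<2 x i32>`
vector maps to the low 32 bits of the `i64`, and it stands in for
`undef` with an arbitrary caller-chosen value. The function names are
hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of insertelement + bitcast (little-endian assumed):
   lane 0 holds the value, lane 1 is undef, modeled here as `undef_hi`. */
uint64_t coerce_through_vector(uint32_t value, uint32_t undef_hi) {
    return ((uint64_t)undef_hi << 32) | value;
}

/* Lane 1 may hold anything, so one permitted choice is the upper half
   of the original call result. With that choice the whole sequence is
   the identity on the call result. */
uint64_t fold_to_nop(uint64_t call_result) {
    uint32_t low = (uint32_t)call_result;          /* trunc i64 -> i32 */
    uint32_t high = (uint32_t)(call_result >> 32); /* one legal pick for undef */
    return coerce_through_vector(low, high);
}

void check_vector_model(void) {
    uint64_t raw = UINT64_C(0x1122334455667788);
    assert(fold_to_nop(raw) == raw);
    /* The low half is preserved no matter what the undef lane holds. */
    assert((uint32_t)coerce_through_vector(7u, 0xDEADBEEFu) == 7u);
}
```

With `zext`, no such choice exists because the upper half is pinned to
zero, which is the difference the patch relies on.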


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D100225

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/arm64-arguments.c
  clang/test/CodeGenCXX/trivial_abi.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D100225.336547.patch
Type: text/x-patch
Size: 8978 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20210409/181fa880/attachment-0001.bin>

