[PATCH] D94597: [X86] Lower calls with clang.arc.attachedcall bundle

Ahmed Bougacha via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 27 10:42:44 PDT 2021


ab added a comment.

In D94597#2632887 <https://reviews.llvm.org/D94597#2632887>, @fhahn wrote:

> Unfortunately the ObjC runtime matches different sequences on AArch64 and X86. This is documented here: https://github.com/opensource-apple/objc4/blob/master/runtime/objc-object.h#L974
>
> On AArch64, the callee only checks for `mov x29, x29` in the caller and the position of the claim/autorelease runtime call does not matter, which makes things much simpler. On X86_64, it specifically looks for the marker instructions, immediately followed by the runtime call. So we want to make sure that it is very hard to break up this specific sequence, to avoid missing out on the runtime optimization.

Interesting, I didn't know that!  I guess the mov rax/rdi was a natural part of the sequence on x86_64, and the original runtime check was opportunistically based on the existing codegen.

So, looking at the patch, this looks correct, but I'm still mildly uncomfortable with the `sel` bit surviving that late.  Here's a maybe-terrible idea: would it work if you replaced the `i64 0/1` in the bundle with the actual pointer to the called function?  e.g.,

  %r = call i8* @foo(i64 %c, i64 %b, i64 %a) [ "clang.arc.attachedcall"(i8*(i8*)* @_objc_retainAutoreleasedReturnValue) ]

On AArch64 you would just use the bundle with no arguments (though "attachedcall" may not be the best name at that point). On x86 the frontend passes the actual function it wants; the backend doesn't care which one it is, just turns the GV into a target globaladdr, and emits it as a call later.
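
To make the suggestion concrete, here is a rough sketch of the three bundle shapes side by side: the `i64` selector form from the patch, the function-pointer form proposed for x86, and the argument-less form proposed for AArch64. The function names `@current_form`, `@suggested_x86`, and `@suggested_aarch64` are hypothetical, `@foo` is just the placeholder callee from the example above, the runtime function name is copied as written there, and the choice of `i64 0` is arbitrary. This uses typed-pointer syntax and has not been checked against the verifier rules in the patch.

```llvm
; Placeholder declarations so the sketch is self-contained.
declare i8* @foo(i64, i64, i64)
declare i8* @_objc_retainAutoreleasedReturnValue(i8*)

; Form in the patch under review: the bundle carries an i64 selector (0 or 1).
define i8* @current_form(i64 %a, i64 %b, i64 %c) {
  %r = call i8* @foo(i64 %c, i64 %b, i64 %a) [ "clang.arc.attachedcall"(i64 0) ]
  ret i8* %r
}

; Suggested x86 form: the frontend names the runtime function directly,
; so the backend never has to decode a selector.
define i8* @suggested_x86(i64 %a, i64 %b, i64 %c) {
  %r = call i8* @foo(i64 %c, i64 %b, i64 %a) [ "clang.arc.attachedcall"(i8* (i8*)* @_objc_retainAutoreleasedReturnValue) ]
  ret i8* %r
}

; Suggested AArch64 form: an empty bundle, since only the marker matters there.
define i8* @suggested_aarch64(i64 %a, i64 %b, i64 %c) {
  %r = call i8* @foo(i64 %c, i64 %b, i64 %a) [ "clang.arc.attachedcall"() ]
  ret i8* %r
}
```

With the function-pointer form, the `sel` bit no longer needs to survive into the backend; the bundle operand itself is the thing to call.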


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94597/new/

https://reviews.llvm.org/D94597


