[LLVMdev] AVX calling convention?
Eli Friedman
eli.friedman at gmail.com
Thu Sep 5 13:44:55 PDT 2013
On Thu, Sep 5, 2013 at 1:23 PM, Erik Schnetter <schnetter at gmail.com> wrote:
> I am tracking down an x86-64 code generation problem that has to do with
> AVX instructions. The symptom is: a function is called, and the upper half
> of the function argument (which is short16) is zero. This happens only when
> I compile code with pocl, but not when I use clang and/or llc manually.
>
> I tracked this down to the following. The call site looks like
>
> vmovdqa 24064(%rsp), %ymm0
> vmovdqa %ymm0, (%rsp)
> vzeroupper
> callq __Z14convert_char16Dv16_s
>
> which passes the argument on the stack. The callee, however, begins with
>
> __Z14convert_char16Dv16_s: ## @_Z14convert_char16Dv16_s
> .cfi_startproc
> ## BB#0: ## %entry
> pushq %rbp
> Ltmp2:
> .cfi_def_cfa_offset 16
> Ltmp3:
> .cfi_offset %rbp, -16
> movq %rsp, %rbp
> Ltmp4:
> .cfi_def_cfa_register %rbp
> vextractf128 $1, %ymm0, %xmm1
>
> which expects the argument in %ymm0. However, the vzeroupper in the caller
> just destroyed part of %ymm0...
>
> My question is:
>
> What decides this calling convention? I know that standard x86-64 should
> pass arguments in %xmm0, not %ymm0. Are there e.g. command line options,
> CPU attributes, or target triples that would modify this? Or should this
> be filed as a bug report? However, this may also be a bug in pocl, as I
> haven't been able to reproduce this without pocl.
>
>
The calling convention should be clear from the LLVM IR. Make sure the
caller and callee use the same calling convention markings.
You might get strange results if one translation unit has AVX and/or AVX2
enabled, and the other has them disabled: the CPU features modify the calling
convention for AVX/AVX2 vectors.
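
To make the point about CPU features concrete, here is a minimal sketch in C,
assuming a GCC- or Clang-compatible compiler with the vector_size extension.
The short16/char16 typedefs and the body of convert_char16 are illustrative
stand-ins, not the actual pocl code. With AVX enabled (for example -mavx) a
256-bit argument is normally passed in %ymm0; without it, the compiler
typically passes it in memory, which matches the caller/callee mismatch
shown above.

```c
#include <stdint.h>

/* GCC/Clang vector extension. The names mirror the OpenCL types in the
   thread but are defined here only for illustration. */
typedef int16_t short16 __attribute__((vector_size(32))); /* 256 bits */
typedef int8_t  char16  __attribute__((vector_size(16))); /* 128 bits */

/* Hypothetical stand-in for the callee: truncate each lane to 8 bits.
   noinline keeps a real call, so the argument is actually passed. */
__attribute__((noinline))
char16 convert_char16(short16 v)
{
    char16 r;
    for (int i = 0; i < 16; ++i)
        r[i] = (int8_t)v[i];
    return r;
}

/* Caller: returns how many of the 16 lanes arrived intact. */
int lanes_intact(void)
{
    short16 v;
    for (int i = 0; i < 16; ++i)
        v[i] = (int16_t)(i + 1);

    char16 r = convert_char16(v);

    int ok = 0;
    for (int i = 0; i < 16; ++i)
        ok += (r[i] == (int8_t)(i + 1));
    return ok;
}
```

When both functions live in one translation unit, as here, they are built
with the same feature set and all 16 lanes survive regardless of the flags.
The failure described above needs the caller and the callee to be compiled
with different settings. Comparing the assembly produced with and without
-mavx (via -S) should show where the argument is placed in each case.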
-Eli