[PATCH] D127982: [X86][FP16] Enable vector support for FP16 emulation

Tue Jul 19 01:58:24 PDT 2022

bkramer added a comment.

In D127982#3661389 <https://reviews.llvm.org/D127982#3661389>, @pengfei wrote:

> Hi @aeubanks, I think this is an inherent problem in the application that this patch merely exposes. The diff in the assembly is as expected. The problem is the `align 16` in the IR below:
>
>   %fusion = load ptr, ptr %buffer_table, align 8, !invariant.load !0, !dereferenceable !2, !align !1
>   store half 0xH6056, ptr %fusion, align 16, !alias.scope !3, !noalias !6
>
> which makes codegen select `movdqa`, while the flaky crashes suggest that `%fusion` is not always 16-byte aligned.

The pointers in `%buffer_table` are known to be always 16-byte aligned, so this shouldn't be a problem. If I run this with `llc -mcpu=skx` I get

  movq    (%rcx), %rax  # load %fusion into %rax
  ...
  vmovdqu %ymm0, 44(%rax)
  vmovdqa 44(%rax), %xmm0  # crash here

I haven't figured out yet why this happens, but adding 44 to a 16-byte aligned pointer can never produce a 16-byte aligned address (44 mod 16 = 12), so the aligned `vmovdqa` load from `44(%rax)` cannot be correct.
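
To make the arithmetic concrete, here is a minimal sketch in Python. The helper `misalignment` and the sample base addresses are hypothetical and stand in for possible values of `%fusion`; only the offset 44 and the 16-byte alignment come from the assembly above.

```python
def misalignment(base: int, offset: int, alignment: int = 16) -> int:
    """Return (base + offset) mod alignment; 0 means the address is aligned."""
    return (base + offset) % alignment


# Hypothetical 16-byte aligned base addresses standing in for %fusion.
bases = [0x0, 0x10, 0x7FFF_0000, 0x5555_5555_5550]

# Every aligned base gives the same residue, because the base contributes
# 0 mod 16 and the offset alone decides the result: 44 mod 16 = 12.
residues = {misalignment(b, 44) for b in bases}
print(residues)  # {12}
```

The residue does not depend on the base, so no 16-byte aligned value of `%fusion` can make `44(%rax)` a valid operand for an aligned 16-byte load.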

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D127982