[PATCH] D127982: [X86][FP16] Enable vector support for FP16 emulation
Benjamin Kramer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 19 01:58:24 PDT 2022
bkramer added a comment.
In D127982#3661389 <https://reviews.llvm.org/D127982#3661389>, @pengfei wrote:
> Hi @aeubanks, I think this is an inherent problem in the application that is merely exposed by this patch. The diff in the assembly is as expected. The problem is the `align 16` in the IR below:
>
>   %fusion = load ptr, ptr %buffer_table, align 8, !invariant.load !0, !dereferenceable !2, !align !1
>   store half 0xH6056, ptr %fusion, align 16, !alias.scope !3, !noalias !6
>
> which makes codegen select `movdqa`, while the flaky crashes indicate that `%fusion` is not always 16-byte aligned.
The pointers in `%buffer_table` are known to always be 16-byte aligned, so this shouldn't be a problem. If I run this with `llc -mcpu=skx`, I get:
  movq (%rcx), %rax          # load %fusion into %rax
  ...
  vmovdqu %ymm0, 44(%rax)
  vmovdqa 44(%rax), %xmm0    # crash here
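I haven't figured out yet why this happens, but adding 44 to a 16-byte-aligned pointer can never produce a 16-byte-aligned address (44 mod 16 = 12), so the aligned load at `44(%rax)` is wrong no matter how `%fusion` itself is aligned.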
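To make the arithmetic concrete, here is a minimal Python sketch of the alignment check. The helper name `misalignment` and the sample base addresses are made up for illustration; only the offset 44 and the 16-byte requirement of an aligned `xmm` load come from the assembly above.

```python
def misalignment(address: int, alignment: int) -> int:
    """Return how many bytes `address` lies past the previous `alignment` boundary."""
    return address % alignment


ALIGN = 16   # alignment that vmovdqa requires for an xmm memory operand
OFFSET = 44  # displacement used in the assembly above

# Hypothetical 16-byte-aligned base addresses, chosen only for illustration.
bases = [0x0, 0x10, 0x7F00, 0x7FFF_FFF0]

# Every aligned base plus 44 lands 12 bytes past a 16-byte boundary.
residues = {misalignment(base + OFFSET, ALIGN) for base in bases}
print(residues)  # {12}
```

Since the residue is 12 for every 16-byte-aligned base, the fault does not depend on the runtime value of `%fusion`, only on the aligned instruction being selected for that offset.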
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D127982/new/
https://reviews.llvm.org/D127982