[llvm-dev] avx512 JIT backend generates wrong code on <4 x float>
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Wed Jun 29 12:48:25 PDT 2016
Hi Frank,
I recommend trying trunk LLVM. AVX-512 development has been very active recently.
-Hal
----- Original Message -----
> From: "Frank Winter via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "LLVM Dev" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, June 29, 2016 2:41:39 PM
> Subject: [llvm-dev] avx512 JIT backend generates wrong code on <4 x float>
>
> Hi!
>
> When compiling the attached module with the JIT engine on an Intel KNL,
> I see wrong code being emitted. I attach a complete exploit program
> that demonstrates the bug in LLVM 3.8. It loads and JIT-compiles the
> module and prints the assembly. I stumbled on this because the result
> of an actual calculation was wrong, so it is not only the printed
> assembly that is wrong; the generated machine code is wrong as well.
>
> When I execute the exploit program on an Intel KNL, the following
> output is produced:
>
> CPU name = knl
> -sse4a,-avx512bw,cx16,-tbm,xsave,-fma4,-avx512vl,prfchw,bmi2,adx,-xsavec,fsgsbase,avx,avx512cd,avx512pf,-rtm,popcnt,fma,bmi,aes,rdrnd,-xsaves,sse4.1,sse4.2,avx2,avx512er,sse,lzcnt,pclmul,avx512f,f16c,ssse3,mmx,-pku,cmov,-xop,rdseed,movbe,-hle,xsaveopt,-sha,sse2,sse3,-avx512dq,
> Assembly:
> .text
> .file "module_KFxOBX_i4_after.ll"
> .globl adjmul
> .align 16, 0x90
> .type adjmul,@function
> adjmul:
> .cfi_startproc
> leaq (%rdi,%r8), %rdx
> addq %rsi, %r8
> testb $1, %cl
> cmoveq %rdi, %rdx
> cmoveq %rsi, %r8
> movq %rdx, %rax
> sarq $63, %rax
> shrq $62, %rax
> addq %rdx, %rax
> sarq $2, %rax
> movq %r8, %rcx
> sarq $63, %rcx
> shrq $62, %rcx
> addq %r8, %rcx
> sarq $2, %rcx
> movq %rax, %rdx
> shlq $5, %rdx
> leaq 16(%r9,%rdx), %rsi
> orq $16, %rdx
> movq 16(%rsp), %rdi
> addq %rdx, %rdi
> addq 8(%rsp), %rdx
> .align 16, 0x90
> .LBB0_1:
> vmovaps -16(%rdx), %xmm0
> vmovaps (%rdx), %xmm1
> vmovaps -16(%rdi), %xmm2
> vmovaps (%rdi), %xmm3
> vmulps %xmm3, %xmm1, %xmm4
> vmulps %xmm2, %xmm1, %xmm1
> vfmadd213ss %xmm4, %xmm0, %xmm2
> vfmsub213ss %xmm1, %xmm0, %xmm3
> vmovaps %xmm2, -16(%rsi)
> vmovaps %xmm3, (%rsi)
> addq $1, %rax
> addq $32, %rsi
> addq $32, %rdi
> addq $32, %rdx
> cmpq %rcx, %rax
> jl .LBB0_1
> retq
> .Lfunc_end0:
> .size adjmul, .Lfunc_end0-adjmul
> .cfi_endproc
>
>
> .section ".note.GNU-stack","",@progbits
>
> end assembly!
>
>
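> The 'vfmadd213ss' instruction is 'Fused Multiply-Add of Scalar
> Single-Precision Floating-Point' (and 'vfmsub213ss' is its
> multiply-subtract counterpart); both compute only the lowest lane.
> Since the module operates on <4 x float> and the surrounding 'vmulps'
> instructions are packed, these should be SIMD vector instructions as
> well. Note that the KNL has 16-wide float SIMD, while the exploit
> module uses only 4-wide vectors. However, the backend should be able
> to handle this.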
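
To make the scalar-versus-packed difference concrete, here is a minimal Python sketch that models the two instruction forms lane by lane. It assumes the 213 operand order (destination = second source * destination + third source), models a <4 x float> as a plain list, and ignores the single rounding of a real FMA. The function names fmadd213_packed and fmadd213_scalar are made up for illustration; they are not LLVM or Intel APIs.

```python
# Lane-by-lane model of the "213" operand order: dst = src2 * dst + src3.
# Assumption: only lane coverage is modeled; FMA rounding is ignored.

def fmadd213_packed(dst, src2, src3):
    """Packed form (what a vfmadd213ps would do): every lane is computed."""
    return [b * a + c for a, b, c in zip(dst, src2, src3)]


def fmadd213_scalar(dst, src2, src3):
    """Scalar form (vfmadd213ss): lane 0 is computed, lanes 1..3 keep dst."""
    return [src2[0] * dst[0] + src3[0]] + list(dst[1:])


# Hypothetical register contents, named after the operands of
# "vfmadd213ss %xmm4, %xmm0, %xmm2" (AT&T syntax, destination last).
xmm2 = [1.0, 2.0, 3.0, 4.0]      # destination and multiplicand
xmm0 = [10.0, 10.0, 10.0, 10.0]  # multiplier
xmm4 = [0.5, 0.5, 0.5, 0.5]      # addend

print("packed:", fmadd213_packed(xmm2, xmm0, xmm4))  # [10.5, 20.5, 30.5, 40.5]
print("scalar:", fmadd213_scalar(xmm2, xmm0, xmm4))  # [10.5, 2.0, 3.0, 4.0]
```

With the scalar form, lanes 1 to 3 of the stored result are simply the old contents of the destination register, which is consistent with a wrong numerical result rather than a crash.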
>
> Unless someone has further ideas, I will file an official bug report.
>
> Frank
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory