[llvm-dev] avx512 JIT backend generates wrong code on <4 x float>
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 30 10:04:40 PDT 2016
----- Original Message -----
> From: "Frank Winter" <fwinter at jlab.org>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Dev" <llvm-dev at lists.llvm.org>
> Sent: Thursday, June 30, 2016 11:49:34 AM
> Subject: Re: [llvm-dev] avx512 JIT backend generates wrong code on <4 x float>
>
> Hi Hal!
>
> Thanks, but unfortunately it didn't help. The exact same assembler
> instructions are generated for both 3.8 (yesterday) and trunk (from
> today).
>
> So, this really looks like a bug.
Okay. Please file a bug report.
-Hal
>
> Best,
> Frank
>
> On 06/29/2016 03:48 PM, Hal Finkel wrote:
> > Hi Frank,
> >
> > I recommend trying trunk LLVM. AVX-512 development has been very
> > active recently.
> >
> > -Hal
> >
> > ----- Original Message -----
> >> From: "Frank Winter via llvm-dev" <llvm-dev at lists.llvm.org>
> >> To: "LLVM Dev" <llvm-dev at lists.llvm.org>
> >> Sent: Wednesday, June 29, 2016 2:41:39 PM
> >> Subject: [llvm-dev] avx512 JIT backend generates wrong code on <4 x float>
> >>
> >> Hi!
> >>
> >> When compiling the attached module with the JIT engine on an Intel KNL,
> >> I see wrong code being emitted. I attach a complete reproducer program
> >> that shows the bug in LLVM 3.8. It loads and JIT-compiles the module
> >> and prints the assembly. I stumbled on this because the result of an
> >> actual calculation was wrong, so it is not only the printed assembly
> >> that is wrong; the emitted machine code is wrong as well.
> >>
> >> When I execute the reproducer program on an Intel KNL, the following
> >> output is produced:
> >>
> >> CPU name = knl
> >> -sse4a,-avx512bw,cx16,-tbm,xsave,-fma4,-avx512vl,prfchw,bmi2,adx,-xsavec,fsgsbase,avx,avx512cd,avx512pf,-rtm,popcnt,fma,bmi,aes,rdrnd,-xsaves,sse4.1,sse4.2,avx2,avx512er,sse,lzcnt,pclmul,avx512f,f16c,ssse3,mmx,-pku,cmov,-xop,rdseed,movbe,-hle,xsaveopt,-sha,sse2,sse3,-avx512dq,
> >> Assembly:
> >> .text
> >> .file "module_KFxOBX_i4_after.ll"
> >> .globl adjmul
> >> .align 16, 0x90
> >> .type adjmul,@function
> >> adjmul:
> >> .cfi_startproc
> >> leaq (%rdi,%r8), %rdx
> >> addq %rsi, %r8
> >> testb $1, %cl
> >> cmoveq %rdi, %rdx
> >> cmoveq %rsi, %r8
> >> movq %rdx, %rax
> >> sarq $63, %rax
> >> shrq $62, %rax
> >> addq %rdx, %rax
> >> sarq $2, %rax
> >> movq %r8, %rcx
> >> sarq $63, %rcx
> >> shrq $62, %rcx
> >> addq %r8, %rcx
> >> sarq $2, %rcx
> >> movq %rax, %rdx
> >> shlq $5, %rdx
> >> leaq 16(%r9,%rdx), %rsi
> >> orq $16, %rdx
> >> movq 16(%rsp), %rdi
> >> addq %rdx, %rdi
> >> addq 8(%rsp), %rdx
> >> .align 16, 0x90
> >> .LBB0_1:
> >> vmovaps -16(%rdx), %xmm0
> >> vmovaps (%rdx), %xmm1
> >> vmovaps -16(%rdi), %xmm2
> >> vmovaps (%rdi), %xmm3
> >> vmulps %xmm3, %xmm1, %xmm4
> >> vmulps %xmm2, %xmm1, %xmm1
> >> vfmadd213ss %xmm4, %xmm0, %xmm2
> >> vfmsub213ss %xmm1, %xmm0, %xmm3
> >> vmovaps %xmm2, -16(%rsi)
> >> vmovaps %xmm3, (%rsi)
> >> addq $1, %rax
> >> addq $32, %rsi
> >> addq $32, %rdi
> >> addq $32, %rdx
> >> cmpq %rcx, %rax
> >> jl .LBB0_1
> >> retq
> >> .Lfunc_end0:
> >> .size adjmul, .Lfunc_end0-adjmul
> >> .cfi_endproc
> >>
> >>
> >> .section ".note.GNU-stack","",@progbits
> >>
> >> end assembly!
> >>
> >>
> >> The 'vfmadd213ss' and 'vfmsub213ss' instructions are scalar: 'Fused
> >> Multiply-Add of Scalar Single-Precision Floating-Point'. They operate
> >> only on the lowest element, but the module works on <4 x float>, so
> >> packed SIMD instructions should have been emitted. Note that the KNL
> >> has 16-wide float SIMD, while the reproducer module uses only 4 lanes.
> >> However, the backend should be able to handle this.
> >>
> >> Unless I receive further ideas I will file an official bug report.
> >>
> >> Frank
> >>
> >>
> >>
>
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
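
A note on the scalar-versus-packed distinction discussed in the quoted
report: the sketch below models only the lane behaviour of the two
instruction forms, in Python. The function names and the lane values are
made up for illustration and do not come from the attached module. It uses
an ordinary multiply and add on Python floats, so it does not reproduce the
single rounding step of a real fused multiply-add.

```python
# Lane-level model of the "213" FMA forms (AT&T operand order in the dump:
# `vfmadd213ss %xmm4, %xmm0, %xmm2` means dst=xmm2, src2=xmm0, src3=xmm4).
# Assumption: plain a*b+c is used here, so rounding is NOT fused.

def fmadd213_scalar(dst, src2, src3):
    """Scalar form (ss): lane 0 = src2[0]*dst[0] + src3[0];
    the remaining lanes of dst pass through unchanged."""
    return [src2[0] * dst[0] + src3[0]] + list(dst[1:])

def fmadd213_packed(dst, src2, src3):
    """Packed form (ps): every lane i = src2[i]*dst[i] + src3[i]."""
    return [b * a + c for a, b, c in zip(dst, src2, src3)]

# Hypothetical <4 x float> register contents, chosen to be exact in binary.
xmm2 = [1.0, 2.0, 3.0, 4.0]
xmm0 = [10.0, 10.0, 10.0, 10.0]
xmm4 = [0.5, 0.5, 0.5, 0.5]

scalar = fmadd213_scalar(xmm2, xmm0, xmm4)
packed = fmadd213_packed(xmm2, xmm0, xmm4)

print("scalar:", scalar)  # only lane 0 is updated
print("packed:", packed)  # all four lanes are updated
```

With the scalar form, lanes 1 to 3 keep their old values, which is
consistent with the wrong numerical results described in the report when
the IR asks for a full <4 x float> operation.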