[llvm-dev] avx512 JIT backend generates wrong code on <4 x float>
Frank Winter via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 30 09:49:34 PDT 2016
Hi Hal!
Thanks, but unfortunately it didn't help. The exact same assembler
instructions are generated for both 3.8 (yesterday) and trunk (from today).
So, this really looks like a bug.
Best,
Frank
On 06/29/2016 03:48 PM, Hal Finkel wrote:
> Hi Frank,
>
> I recommend trying trunk LLVM. AVX-512 development has been very active recently.
>
> -Hal
>
> ----- Original Message -----
>> From: "Frank Winter via llvm-dev" <llvm-dev at lists.llvm.org>
>> To: "LLVM Dev" <llvm-dev at lists.llvm.org>
>> Sent: Wednesday, June 29, 2016 2:41:39 PM
>> Subject: [llvm-dev] avx512 JIT backend generates wrong code on <4 x float>
>>
>> Hi!
>>
>> When compiling the attached module with the JIT engine on an Intel KNL,
>> I see wrong code being emitted. I attach a complete exploit program
>> that shows the bug in LLVM 3.8. It loads and JIT-compiles the module
>> and prints the assembly. I stumbled on this because the result of an
>> actual calculation was wrong. So it is not only the printed assembly
>> that is wrong; the generated machine code is wrong too.
>>
>> When I execute the exploit program on an Intel KNL, the following
>> output is produced:
>>
>> CPU name = knl
>> -sse4a,-avx512bw,cx16,-tbm,xsave,-fma4,-avx512vl,prfchw,bmi2,adx,-xsavec,fsgsbase,avx,avx512cd,avx512pf,-rtm,popcnt,fma,bmi,aes,rdrnd,-xsaves,sse4.1,sse4.2,avx2,avx512er,sse,lzcnt,pclmul,avx512f,f16c,ssse3,mmx,-pku,cmov,-xop,rdseed,movbe,-hle,xsaveopt,-sha,sse2,sse3,-avx512dq,
>> Assembly:
>> .text
>> .file "module_KFxOBX_i4_after.ll"
>> .globl adjmul
>> .align 16, 0x90
>> .type adjmul,@function
>> adjmul:
>> .cfi_startproc
>> leaq (%rdi,%r8), %rdx
>> addq %rsi, %r8
>> testb $1, %cl
>> cmoveq %rdi, %rdx
>> cmoveq %rsi, %r8
>> movq %rdx, %rax
>> sarq $63, %rax
>> shrq $62, %rax
>> addq %rdx, %rax
>> sarq $2, %rax
>> movq %r8, %rcx
>> sarq $63, %rcx
>> shrq $62, %rcx
>> addq %r8, %rcx
>> sarq $2, %rcx
>> movq %rax, %rdx
>> shlq $5, %rdx
>> leaq 16(%r9,%rdx), %rsi
>> orq $16, %rdx
>> movq 16(%rsp), %rdi
>> addq %rdx, %rdi
>> addq 8(%rsp), %rdx
>> .align 16, 0x90
>> .LBB0_1:
>> vmovaps -16(%rdx), %xmm0
>> vmovaps (%rdx), %xmm1
>> vmovaps -16(%rdi), %xmm2
>> vmovaps (%rdi), %xmm3
>> vmulps %xmm3, %xmm1, %xmm4
>> vmulps %xmm2, %xmm1, %xmm1
>> vfmadd213ss %xmm4, %xmm0, %xmm2
>> vfmsub213ss %xmm1, %xmm0, %xmm3
>> vmovaps %xmm2, -16(%rsi)
>> vmovaps %xmm3, (%rsi)
>> addq $1, %rax
>> addq $32, %rsi
>> addq $32, %rdi
>> addq $32, %rdx
>> cmpq %rcx, %rax
>> jl .LBB0_1
>> retq
>> .Lfunc_end0:
>> .size adjmul, .Lfunc_end0-adjmul
>> .cfi_endproc
>>
>>
>> .section ".note.GNU-stack","",@progbits
>>
>> end assembly!
>>
>>
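>> The instructions 'vfmadd213ss' and 'vfmsub213ss' are scalar:
>> 'vfmadd213ss' is 'Fused Multiply-Add of Scalar Single-Precision
>> Floating-Point', so it computes only the lowest element of the
>> register. Those should be SIMD vector (packed) instructions that
>> operate on all four floats. Note that the KNL has 16-wide float SIMD,
>> while the exploit module uses only 4. However, the backend should be
>> able to handle this.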
>>
>> Unless I receive further ideas I will file an official bug report.
>>
>> Frank
>>
>>
>>
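
To make the scalar-versus-packed point from the quoted message concrete, here is a small Python model of the lane behaviour. It is only a sketch: the function names `fmadd213ps` and `fmadd213ss` and the sample values are made up for illustration, plain Python floats stand in for single-precision lanes, and the single rounding of a real fused operation is not modelled. It assumes the documented "213" operand order (destination times second source, plus third source) and that the scalar form leaves the upper three lanes of the destination untouched.

```python
# Model of a 4 x float XMM register as a list of four Python floats.
# Assumption: lane 0 is the lowest element of the register.

def fmadd213ps(dst, src2, src3):
    """Packed form: every lane computes src2 * dst + src3."""
    return [s2 * d + s3 for d, s2, s3 in zip(dst, src2, src3)]

def fmadd213ss(dst, src2, src3):
    """Scalar form: only lane 0 is computed; lanes 1..3 keep dst's old values."""
    return [src2[0] * dst[0] + src3[0]] + list(dst[1:])

# Hypothetical register contents, named after the registers in
# 'vfmadd213ss %xmm4, %xmm0, %xmm2' from the listing (AT&T operand order,
# so %xmm2 is the destination).
xmm2 = [1.0, 2.0, 3.0, 4.0]
xmm0 = [2.0, 2.0, 2.0, 2.0]
xmm4 = [10.0, 10.0, 10.0, 10.0]

packed = fmadd213ps(xmm2, xmm0, xmm4)
scalar = fmadd213ss(xmm2, xmm0, xmm4)

print("packed:", packed)  # packed: [12.0, 14.0, 16.0, 18.0]
print("scalar:", scalar)  # scalar: [12.0, 2.0, 3.0, 4.0]
```

With the scalar form, the value later written by 'vmovaps %xmm2, -16(%rsi)' would carry a correct lane 0 and three lanes that are just the loaded input, which matches the symptom of a wrong numerical result.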