[llvm-dev] invalid code generated on Windows x86_64 using skylake-specific features
Andrew Kelley via llvm-dev
llvm-dev at lists.llvm.org
Mon Oct 2 21:55:00 PDT 2017
The crashes are gone, but I'm still getting weird behavior with native CPU
features turned on. Example:

const assert = @import("std").debug.assert;

test "f128" {
    if (make_f128(1.0) == 1.1) @panic("wrong");
}

fn make_f128(x: f128) -> f128 { x }

; Function Attrs: nobuiltin nounwind
define internal fastcc fp128 @make_f128(fp128) unnamed_addr #0 !dbg !357 {
Entry:
  %x = alloca fp128, align 16
  store fp128 %0, fp128* %x, align 16
  call void @llvm.dbg.declare(metadata fp128* %x, metadata !362, metadata !219), !dbg !363
  %1 = load fp128, fp128* %x, align 16, !dbg !364
  ret fp128 %1, !dbg !367
}

; Function Attrs: nobuiltin nounwind
define fastcc void @f128() #0 !dbg !312 {
Entry:
  %0 = call fastcc fp128 @make_f128(fp128 0xL00000000000000003FFF000000000000), !dbg !315
  %1 = fcmp fast oeq fp128 %0, 0xLA0000000000000003FFF199999999999, !dbg !317
  br i1 %1, label %Then, label %Else, !dbg !317

Then:                                             ; preds = %Entry
  call void @panic(%"[]u8"* bitcast ({ i8*, i64 }* @7 to %"[]u8"*)), !dbg !318
  unreachable, !dbg !318

Else:                                             ; preds = %Entry
  ret void, !dbg !319
}

This calls the panic function even though the two f128 values are clearly
not equal. When I revert to not using target-native features, the test
passes.
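
For reference, the two fp128 literals in the IR above can be decoded by hand. The following is a minimal C++ sketch, assuming that LLVM's 0xL notation prints the low 64 bits first and the high 64 bits second (which is consistent with the 1.0 constant, whose 0x3FFF exponent sits in the second half). The names Quad, quad_to_double and quad_bits_equal are made up for this illustration and are not part of LLVM or Zig.

```cpp
#include <cstdint>
#include <cstring>

// IEEE 754 binary128 value split into two 64-bit halves (hypothetical helper type).
struct Quad {
    uint64_t lo; // low 64 bits of the significand
    uint64_t hi; // sign (1 bit), exponent (15 bits), top 48 significand bits
};

// The two constants from the IR, with the halves in the assumed 0xL order.
const Quad kLhs = {0x0000000000000000ULL, 0x3FFF000000000000ULL};
const Quad kRhs = {0xA000000000000000ULL, 0x3FFF199999999999ULL};

// Narrow a normal binary128 value to double by truncating the significand
// to 52 bits. Exact only when the dropped low 60 bits are zero. Zero,
// subnormals, infinities, NaNs and out-of-range exponents are not handled.
double quad_to_double(Quad q) {
    uint64_t sign = q.hi >> 63;
    int64_t exp = static_cast<int64_t>((q.hi >> 48) & 0x7FFF) - 16383 + 1023;
    uint64_t frac = ((q.hi & 0xFFFFFFFFFFFFULL) << 4) | (q.lo >> 60);
    uint64_t bits = (sign << 63) | (static_cast<uint64_t>(exp) << 52) | frac;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Bitwise comparison. For normal, non-NaN values, different bit patterns
// mean the values compare unequal under an ordered-equal comparison.
bool quad_bits_equal(Quad a, Quad b) {
    return a.lo == b.lo && a.hi == b.hi;
}
```

Under these assumptions, quad_to_double(kLhs) gives 1.0 and quad_to_double(kRhs) gives the double nearest 1.1 (the second literal looks like a double-precision 1.1 widened to fp128), and quad_bits_equal(kLhs, kRhs) is false, so the constants in the IR are themselves distinct.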
On Tue, Oct 3, 2017 at 12:34 AM, Andrew Kelley <superjoe30 at gmail.com> wrote:
> I tried __chkstk_ms from compiler-rt which has this definition:
>
> DEFINE_COMPILERRT_FUNCTION(___chkstk_ms)
>         push   %rcx
>         push   %rax
>         cmp    $0x1000,%rax
>         lea    24(%rsp),%rcx
>         jb     1f
> 2:
>         sub    $0x1000,%rcx
>         test   %rcx,(%rcx)
>         sub    $0x1000,%rax
>         cmp    $0x1000,%rax
>         ja     2b
> 1:
>         sub    %rax,%rcx
>         test   %rcx,(%rcx)
>         pop    %rax
>         pop    %rcx
>         ret
> END_COMPILERRT_FUNCTION(___chkstk_ms)
>
>
> except I called it __chkstk since that's the symbol that LLVM generated a
> dependency on. And it passed all my tests, with optimizations on and off.
>
> Can anyone shed some light on this?
>
> On Tue, Oct 3, 2017 at 12:14 AM, Andrew Kelley <superjoe30 at gmail.com>
> wrote:
>
>> I figured it out. I was using this implementation of __chkstk from
>> compiler-rt:
>>
>> DEFINE_COMPILERRT_FUNCTION(___chkstk)
>>         push   %rcx
>>         cmp    $0x1000,%rax
>>         lea    16(%rsp),%rcx     // rsp before calling this routine -> rcx
>>         jb     1f
>> 2:
>>         sub    $0x1000,%rcx
>>         test   %rcx,(%rcx)
>>         sub    $0x1000,%rax
>>         cmp    $0x1000,%rax
>>         ja     2b
>> 1:
>>         sub    %rax,%rcx
>>         test   %rcx,(%rcx)
>>
>>         lea    8(%rsp),%rax      // load pointer to the return address into rax
>>         mov    %rcx,%rsp         // install the new top of stack pointer into rsp
>>         mov    -8(%rax),%rcx     // restore rcx
>>         push   (%rax)            // push return address onto the stack
>>         sub    %rsp,%rax         // restore the original value in rax
>>         ret
>> END_COMPILERRT_FUNCTION(___chkstk)
>>
>> (source https://github.com/llvm-project/llvm-project-20170507/blob/release_50/compiler-rt/lib/builtins/x86_64/chkstk2.S)
>>
>> When I replaced it with a simple `ret`, everything worked.
>>
>> The disassembled ntdll implementation is:
>>
>> __chkstk:
>> 1800a9f60: 48 83 ec 10                  subq    $16, %rsp
>> 1800a9f64: 4c 89 14 24                  movq    %r10, (%rsp)
>> 1800a9f68: 4c 89 5c 24 08               movq    %r11, 8(%rsp)
>> 1800a9f6d: 4d 33 db                     xorq    %r11, %r11
>> 1800a9f70: 4c 8d 54 24 18               leaq    24(%rsp), %r10
>> 1800a9f75: 4c 2b d0                     subq    %rax, %r10
>> 1800a9f78: 4d 0f 42 d3                  cmovbq  %r11, %r10
>> 1800a9f7c: 65 4c 8b 1c 25 10 00 00 00   movq    %gs:16, %r11
>> 1800a9f85: 4d 3b d3                     cmpq    %r11, %r10
>> 1800a9f88: 73 15                        jae     21 <__chkstk+0x3F>
>> 1800a9f8a: 66 41 81 e2 00 f0            andw    $61440, %r10w
>> 1800a9f90: 4d 8d 9b 00 f0 ff ff         leaq    -4096(%r11), %r11
>> 1800a9f97: 45 84 1b                     testb   (%r11), %r11b
>> 1800a9f9a: 4d 3b d3                     cmpq    %r11, %r10
>> 1800a9f9d: 75 f1                        jne     -15 <__chkstk+0x30>
>> 1800a9f9f: 4c 8b 14 24                  movq    (%rsp), %r10
>> 1800a9fa3: 4c 8b 5c 24 08               movq    8(%rsp), %r11
>> 1800a9fa8: 48 83 c4 10                  addq    $16, %rsp
>>
>> On Mon, Oct 2, 2017 at 1:37 PM, Reid Kleckner <rnk at google.com> wrote:
>>
>>> Can you post test.obj somewhere, and maybe the LLVM IR if you can get
>>> it? If it really was reading address 0xFFFFFFFFFFFFFFFF, then RBP must
>>> have been completely corrupted, probably by the prologue.
>>>
>>> On Sat, Sep 30, 2017 at 6:27 PM, Andrew Kelley via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> I suspect that there are 2 issues here:
>>>>
>>>> * I have incorrect alignment somewhere
>>>> * MSVC / .pdb / CodeView debugging is not working correctly.
>>>>
>>>> I think the latter would help solve the former.
>>>>
>>>> I will send out a new email later talking about the issues I'm having
>>>> debugging llvm-generated binaries with MSVC.
>>>>
>>>> On Sat, Sep 30, 2017 at 3:33 PM, Andrew Kelley <superjoe30 at gmail.com>
>>>> wrote:
>>>>
>>>>> I have this code, which works fine on MacOS and Linux hosts:
>>>>>
>>>>> const char *target_specific_cpu_args;
>>>>> const char *target_specific_features;
>>>>> if (g->is_native_target) {
>>>>>     target_specific_cpu_args = ZigLLVMGetHostCPUName();
>>>>>     target_specific_features = ZigLLVMGetNativeFeatures();
>>>>> } else {
>>>>>     target_specific_cpu_args = "";
>>>>>     target_specific_features = "";
>>>>> }
>>>>>
>>>>> g->target_machine = LLVMCreateTargetMachine(target_ref,
>>>>>         buf_ptr(&g->triple_str),
>>>>>         target_specific_cpu_args, target_specific_features,
>>>>>         opt_level, reloc_mode, LLVMCodeModelDefault);
>>>>>
>>>>>
>>>>>
>>>>> char *ZigLLVMGetHostCPUName(void) {
>>>>>     std::string str = sys::getHostCPUName();
>>>>>     return strdup(str.c_str());
>>>>> }
>>>>>
>>>>> char *ZigLLVMGetNativeFeatures(void) {
>>>>>     SubtargetFeatures features;
>>>>>
>>>>>     StringMap<bool> host_features;
>>>>>     if (sys::getHostCPUFeatures(host_features)) {
>>>>>         for (auto &F : host_features)
>>>>>             features.AddFeature(F.first(), F.second);
>>>>>     }
>>>>>
>>>>>     return strdup(features.getString().c_str());
>>>>> }
>>>>>
>>>>> On this windows laptop that I am testing on, I get these values:
>>>>>
>>>>> target_specific_cpu_args: skylake
>>>>>
>>>>> target_specific_features:
>>>>> +sse2,+cx16,-tbm,-avx512ifma,-avx512dq,-fma4,+prfchw,+bmi2,+xsavec,
>>>>> +fsgsbase,+popcnt,+aes,+xsaves,-avx512er,-avx512vpopcntdq,-clwb,
>>>>> -avx512f,-clzero,-pku,+mmx,-lwp,-xop,+rdseed,-sse4a,-avx512bw,
>>>>> +clflushopt,+xsave,-avx512vl,-avx512cd,+avx,-rtm,+fma,+bmi,+rdrnd,
>>>>> -mwaitx,+sse4.1,+sse4.2,+avx2,+sse,+lzcnt,+pclmul,-prefetchwt1,+f16c,
>>>>> +ssse3,+sgx,+cmov,-avx512vbmi,+movbe,+xsaveopt,-sha,+adx,-avx512pf,+sse3
>>>>>
>>>>>
>>>>> It successfully creates a binary, but when run, the binary crashes with:
>>>>>
>>>>> Unhandled exception at 0x00007FF7C9913BA7 in test.exe: 0xC0000005:
>>>>> Access violation reading location 0xFFFFFFFFFFFFFFFF.
>>>>>
>>>>> The disassembly of the crashed instruction is:
>>>>>
>>>>> 00007FF7C9913BA7 vmovdqa xmmword ptr [rbp-20h],xmm0
>>>>>
>>>>> There is no callstack or source in the MSVC debugger. The .pdb
>>>>> produced is 64KB exactly. The file was linked with:
>>>>>
>>>>> lld -NOLOGO -DEBUG -MACHINE:X64 /SUBSYSTEM:console -OUT:.\test.exe
>>>>> -NODEFAULTLIB -ENTRY:_start ./zig-cache/test.obj ./zig-cache/builtin.obj
>>>>> ./zig-cache/compiler_rt.obj ./zig-cache/kernel32.lib
>>>>>
>>>>>
>>>>> When I change the call to LLVMCreateTargetMachine so that both
>>>>> target_specific_cpu_args and target_specific_features are the empty
>>>>> string, the produced binary is valid and runs successfully.
>>>>>
>>>>> Is this an LLVM bug? Am I using the API incorrectly? Is there more
>>>>> information I can provide to LLVM-dev mailing list that would make it
>>>>> easier to help me?
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>
>>
>
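
One possible reading of the two compiler-rt routines quoted above, offered as a sketch rather than a confirmed diagnosis: both run the same page-probing loop, but ___chkstk also installs the lowered stack pointer itself (`mov %rcx,%rsp`) and restores rax, while ___chkstk_ms only probes and returns with rsp unchanged. If the calling prologue then performs its own `sub %rax, %rsp` after the call (an assumption here; the thread does not show the generated prologue), the ___chkstk variant would lower rsp twice. The C++ model below only tracks that arithmetic. Regs, probe_addresses, model_chkstk, model_chkstk_ms and caller_adjust are hypothetical names used for illustration.

```cpp
#include <cstdint>
#include <vector>

// Only the two registers that matter for this model.
struct Regs {
    uint64_t rsp; // caller's stack pointer before the call
    uint64_t rax; // requested allocation size in bytes
};

// Addresses touched by the probe loop that both quoted routines share.
// rcx starts at the caller's rsp in both (lea 16(%rsp) after one push,
// lea 24(%rsp) after two pushes).
std::vector<uint64_t> probe_addresses(uint64_t caller_rsp, uint64_t size) {
    std::vector<uint64_t> probes;
    uint64_t rcx = caller_rsp;
    uint64_t rax = size;
    if (rax >= 0x1000) {            // "jb 1f" skips the loop for small sizes
        do {
            rcx -= 0x1000;
            probes.push_back(rcx);  // test %rcx,(%rcx)
            rax -= 0x1000;
        } while (rax > 0x1000);     // "ja 2b"
    }
    rcx -= rax;
    probes.push_back(rcx);          // final probe at caller_rsp - size
    return probes;
}

// ___chkstk as quoted: returns with rsp already lowered and rax restored.
Regs model_chkstk(Regs in) { return Regs{in.rsp - in.rax, in.rax}; }

// ___chkstk_ms as quoted: probes only, rsp and rax come back unchanged.
Regs model_chkstk_ms(Regs in) { return in; }

// ASSUMPTION: the caller subtracts rax from rsp itself after the call.
Regs caller_adjust(Regs r) { return Regs{r.rsp - r.rax, r.rax}; }
```

With a starting rsp of 0x100000 and a request of 0x2800 bytes, caller_adjust(model_chkstk_ms(...)) ends at 0x100000 - 0x2800, whereas caller_adjust(model_chkstk(...)) ends at 0x100000 - 0x5000. If the assumption about the prologue holds, that double adjustment would be consistent with the frame corruption Reid suspected and with the probe-only variant passing the tests.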