[cfe-dev] compiling CUDA w/ -fdebug-default-version=5 generates invalid PTX
David Blaikie via cfe-dev
cfe-dev at lists.llvm.org
Thu Dec 3 12:39:12 PST 2020
On Thu, Dec 3, 2020 at 12:29 PM Artem Belevich via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
>
>
> On Thu, Dec 3, 2020 at 11:57 AM Alexey.Bataev <a.bataev at outlook.com>
> wrote:
>
>> Hi Artem, here is what I found about this.
>>
>> These labels are emitted only if DWARF 4 or 5 is used. They are required
>> for emission of the DW_AT_call_site attribute. The info about callsites
>> also emitted for lineinfo emission with debug info for profiling like in
>> your example (-O1 -gmlt).
>>
> Thank you for looking into this.
>
>
>> Call/CallUni instruction is treated as a separate instruction by the
>> debug info generator and it treats it as a separate call and emits labels
>> for it.
>>
> Would it help if we were to fold the whole call.uni into a single line? It
> appears that swarf injects the label after the call instruction line gets
> printed.
> What is the label supposed to point at? At the call instruction itself? Or
> at the return point?
>
I believe it's immediately after the call.
Testing an x86 example:
$ cat test.c
void f1();
void f2();
void f3() {
f1();
f2();
}
$ clang-tot -gdwarf-5 test.c -c -O3 && llvm-dwarfdump-tot test.o | grep
"DW_TAG\|_call_" && llvm-objdump -d test.o
0x0000000c: *DW_TAG*_compile_unit
0x00000023: *DW_TAG*_subprogram
DW_AT*_call_*all_calls (true)
0x0000002e: *DW_TAG_call_*site
DW_AT*_call_*origin (0x0000003b)
DW_AT*_call_*return_pc (0x0000000000000008)
0x00000034: *DW_TAG_call_*site
DW_AT*_call_*origin (0x00000041)
DW_AT*_call_*tail_call (true)
DW_AT*_call_*pc (0x000000000000000b)
0x0000003b: *DW_TAG*_subprogram
0x0000003f: *DW_TAG*_unspecified_parameters
0x00000041: *DW_TAG*_subprogram
0x00000045: *DW_TAG*_unspecified_parameters
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f3>:
0: 50 pushq %rax
1: 31 c0 xorl %eax, %eax
3: e8 00 00 00 00 callq 0x8 <f3+0x8>
8: 31 c0 xorl %eax, %eax
a: 59 popq %rcx
b: e9 00 00 00 00 jmp 0x10 <f3+0x10>
But the tail call uses the jump location, because it can't do anything else.
You can try to mark the CallUni instructions as hasDelaySlot = 1 or stop
>> treating it as a call instruction in NVPTXInstrInfo.td. Can't say which one
>> is better/correct. Looks like the representation for call/call.uni is not
>> quite compatible with the debug info
>>
> David suggested not allowing -fdebug-default-version=5 to override the
> DWARF version provided by NVPTX back-end. That would make sense,
> considering that we can't handle the newer DWARF versions anyways.
>
> Let me see if I can fix the lineinfo generation first.
>
Yeah, -gdwarf-5 and -fdebug-default-version=5 -g should behave the same. So
whatever it is that's disabling/downgrading to DWARFv2 for NVPTX when the
user uses -gdwarf-5 shoudl do the same for -fdebug-default-version=5
>
> --Artem
>
>> -------------
>> Best regards,
>> Alexey Bataev
>>
>> 12/2/2020 6:00 PM, Artem Belevich пишет:
>>
>> Hi, Alexey!
>>
>> I've ran into an odd case with debug info generation in NVPTX.
>>
>> Reproduction:
>> ------------------------
>> __device__ __attribute__((noinline)) void bar() { printf("Hi!"); }
>> __global__ void foo() { bar(); }
>> int main(){}
>> ------------------------
>>
>> $ clang++ -v --cuda-gpu-arch=sm_70 --cuda-device-only -fdebug-default-version=5
>> a.cu -gmlt -O1 -c
>>
>> Compilation fails due to a syntax error reported by ptxas.
>> The reason for the error is that clang generates a label in the middle of
>> a `call.uni` instuction. E.g:
>>
>> { // callseq 1, 0
>> .reg .b32 temp_param_reg;
>> call.uni
>> Ltmp14:
>> _Z3barv,
>> (
>> );
>> } // callseq 1
>>
>> The odd part is that we're only generating line info and there is no
>> DWARF in the generated PTX.
>> It appears that this behavior is triggered by `-dwarf-version=5` passed
>> to cc1.
>> Looks like another case where PTX syntax breaks DWARF generator
>> assumptions.
>>
>> It's possible to work around it with an additional `-Xarch_device
>> -fdebug-default-version=2`,
>> but I'd appreciate it if you could take a look and see if that could be
>> fixed.
>>
>> --
>> --Artem Belevich
>>
>>
>
> --
> --Artem Belevich
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20201203/8f9c329c/attachment-0001.html>
More information about the cfe-dev
mailing list