[cfe-dev] compiling CUDA w/ -fdebug-default-version=5 generates invalid PTX

Artem Belevich via cfe-dev cfe-dev at lists.llvm.org
Thu Dec 3 12:28:49 PST 2020

On Thu, Dec 3, 2020 at 11:57 AM Alexey.Bataev <a.bataev at outlook.com> wrote:

> Hi Artem, here is what I found about this.
> These labels are emitted only if DWARF 4 or 5 is used. They are required
> for emission of the DW_AT_call_site attribute. The info about callsites
> also emitted for lineinfo emission with debug info for profiling like in
> your example (-O1 -gmlt).
Thank you for looking into this.

> Call/CallUni instruction is treated as a separate instruction by the debug
> info generator and it treats it as a separate call and emits labels for it.
Would it help if we were to fold the whole call.uni into a single line? It
appears that swarf injects the label after the call instruction line gets
What is the label supposed to point at? At the call instruction itself? Or
at the return point?

You can try to mark the CallUni instructions as hasDelaySlot = 1 or stop
> treating it as a call instruction in NVPTXInstrInfo.td. Can't say which one
> is better/correct. Looks like the representation for call/call.uni is not
> quite compatible with the debug info
David suggested not allowing -fdebug-default-version=5 to override the
DWARF version provided by NVPTX back-end. That would make sense,
considering that we can't handle the newer DWARF versions anyways.

Let me see if I can fix the lineinfo generation first.


> -------------
> Best regards,
> Alexey Bataev
> 12/2/2020 6:00 PM, Artem Belevich пишет:
> Hi, Alexey!
> I've ran into an odd case with debug info generation in NVPTX.
> Reproduction:
> ------------------------
> __device__ __attribute__((noinline)) void bar() { printf("Hi!"); }
> __global__ void foo() { bar(); }
> int main(){}
> ------------------------
> $ clang++ -v --cuda-gpu-arch=sm_70 --cuda-device-only -fdebug-default-version=5
> a.cu -gmlt -O1 -c
> Compilation fails due to a syntax error reported by ptxas.
> The reason for the error is that clang generates a label in the middle of
> a `call.uni` instuction. E.g:
>         { // callseq 1, 0
>         .reg .b32 temp_param_reg;
>         call.uni
> Ltmp14:
>         _Z3barv,
>         (
>         );
>         } // callseq 1
> The odd part is that we're only generating line info and there is no DWARF
> in the generated PTX.
> It appears that this behavior is triggered by `-dwarf-version=5` passed to
> cc1.
> Looks like another case where PTX syntax breaks DWARF generator
> assumptions.
> It's possible to work around it with an additional `-Xarch_device
> -fdebug-default-version=2`,
> but I'd appreciate it if you could take a look and see if that could be
> fixed.
> --
> --Artem Belevich

--Artem Belevich
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20201203/fd5e091c/attachment.html>

More information about the cfe-dev mailing list