[llvm-dev] Debug info for Cuda

Wed Nov 8 02:34:36 PST 2017

I don't understand the use case, or the reasons for blaming the ptxas compiler here.

>> a) Supports DWARF-2 only.
What would you like to achieve with DWARF-3+ that you cannot do with
DWARF-2?

>> b) Labels are allowed only in code section (only in functions).
What is the use case here that needs labels outside functions?

>> c) Does not support label arithmetic in DWARF sections.
Same here. Please explain the use case.

>> d) Debug info must point to the sections, not to labels inside
>> these sections.

>> e) Sections themselves must be enclosed in braces
>>     “.section .debug_info {…}”
Again, why is this a limitation?

>> i) .debug_frame section is emitted by ptxas compiler.
>>     DW_AT_frame_base must be set to dwarf::DW_FORM_data1
>>     dwarf::DW_OP_call_frame_cfa value.
> I doubt that's a problem.
Why is this a problem?

On Tue, Nov 7, 2017 at 1:33 AM, Alexey Bataev via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On 06.11.2017 14:56, Robinson, Paul wrote:
> >> Hi everybody,
> >> As you know, the Cuda/NVPTX target has very limited debug info
> >> support in Clang/LLVM. Currently, LLVM supports emitting only
> >> line-number debug info.
> >> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM
> >> translates the source code to LLVM IR, which is then lowered to a PTX
> >> (parallel thread execution) intermediate file. This PTX file is a
> >> special kind of assembler code in text format, which contains the
> >> code itself plus (possibly) debug info. This PTX file is then compiled
> >> by the ptxas tool into the CUDA binary representation.
> >>
> >> Debug info representation in PTX file.
> >> ========================
> >> According to the PTX Writer's Guide to Interoperability, section
> >> "Debug Information"
> >> (http://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/index.html#debug-information),
> >> debug information must be encoded in DWARF (Debug With Arbitrary
> >> Record Format). The responsibility for generating debug information is
> >> split between the PTX producer and the PTX-to-SASS backend. The PTX
> >> producer is responsible for emitting binary DWARF into the PTX file,
> >> using the .section and .b8, .b16, .b32, and .b64 directives in PTX. This
> >> should contain the .debug_info and .debug_abbrev sections, and possibly
> >> optional sections .debug_pubnames and .debug_aranges. These sections
> >> are standard DWARF2 sections that refer to labels and registers in the
> >> PTX.
> >>
> >> The PTX-to-SASS backend is responsible for generating the .debug_line
> >> section from the .file and .loc directives in the PTX file. This
> >> section maps source lines to SASS addresses. The PTX-to-SASS backend
> >> also generates the .debug_frame section.
> > All this sounds like the standard division of responsibilities between
> > an LLVM code generator and the assembler.
> >
> >> LLVM is able to emit debug info in DWARF, but the ptxas compiler has
> >> some limitations that make it hard to adapt LLVM to emit correct
> >> debug info in PTX files.
> >>
> >> Limitations/features of the PTX format/ptxas compiler.
> >> ==================================
> >> a) Supports DWARF-2 only.
> > IIRC, Darwin had a similar restriction until recently.
> >
> >> b) Labels are allowed only in the code section (only in functions).
> > If you have static/global variables, I guess their locations would
> > have to be described using a section+offset expression?  Normally
> > we emit a location attribute that is just a reference to a label
> > for the variable.
> >
> >> c) Does not support label arithmetic in DWARF sections.
> >>     “.b32 L1 - L2” as the size of the section is not allowed, so the
> >> section sizes should be calculated explicitly.
> > MachO has a similar restriction, this should not be a problem if you
> > can do something like:
> >     L3 = L1 - L2
> >     .b32 L3
> Nope, it is not supported.
> >
> >> d) Debug info must point to the sections, not to labels inside these
> >> sections.
> >>     “.b32 .debug_abbrevs”
> > Offhand for DWARF-2 I can't think of a reference that couldn't be done
> > this way.
> >
> >> e) Sections themselves must be enclosed in braces
> >>     “.section .debug_info {…}”
> >> f) Frame info is non-register based
> >>     Based on function local “__local_depot” array, that represents the
> >> stack frame.
> >> g) All variables must have the non-standard DW_AT_address_class
> >> attribute so that the debugger knows the address class of the
> >> variable (global or local). The DWARF standard does support this
> >> attribute, but it can be applied to pointer/reference types only, not
> >> to variables.
> > For variables it would be more usual to use DW_AT_segment for this.
> > But that's an agreement that the compiler and debugger need to reach.
> >
> >> h) The first label in the function must follow the debug location macro.
> >> In LLVM, it is followed by the debug location macro.
> > I am not 100% sure what you mean by this, but I think it has to do with
> > the fact that LLVM attaches locations to instructions, not labels.  It
> > might or might not be easy to work around this; there might be an
> > unfortunate interaction with how emitting line-0 records works.
> >
> >> i) .debug_frame section is emitted by the ptxas compiler.
> >>     DW_AT_frame_base must be set to dwarf::DW_FORM_data1
> >> dwarf::DW_OP_call_frame_cfa value.
> > I doubt that's a problem.
> >
> >> j) Strings cannot be referenced by labels; instead, they must be
> >> inlined in the sections as arrays of chars.
> > LLVM used to do inline strings, but switched to the .debug_str section
> > quite a while ago.  On the other hand, I spent a little time maybe a
> > year ago looking into whether we could emit short strings inline as a
> > space-saving measure, and decided it was feasible.  (I didn't do it
> > because the space savings was really trivial.)  So I think doing this
> > would not be terribly hard.
> >
> >> Some changes in LLVM are required to support all these
> >> limitations/features in the output PTX files.
> >>
> >> Required changes in LLVM.
> >> ==================
> >> •include/llvm/CodeGen/AsmPrinter.h.
> >>     •Add “virtual MCSymbol *getFunctionFrameSymbol(const
> >> MachineFunction *MF) const” for non-register-based frame info.
> >>     •Override “NVPTXMCAsmPrinter.cpp” to return the name of the
> >> “__local_depot” frame storage.
> >> •Add ”cuda-gdb” specific tuning.
> > Note that our design philosophy for "tuning" is that a tuning option
> > unpacks into other separate flags.  Not a problem, just an observation.
> >
> >>     •Inlined strings must be used in sections, not string references.
> >>     •Label arithmetic is replaced by the absolute section size
> >> evaluation.
> > This one isn't a debug-info tuning decision, it's how your assembler
> > works and so is a target decision.
> >
> >>     •Use “AsmPrinter::doInitialization()” instead of NVPTX-custom
> >> manual initialization.
> >>     •Local variable addresses are emitted as “__local_depot” + <var offset>.
> >> •Add NVPTX specific “NVPTXMCAsmStreamer” class.
> >>     •Requires moving the “MCAsmStreamer” class declaration into the includes.
> >>     •Overrides emission of the labels (names of the section are emitted
> >> instead).
> >>     •Overrides emission of the sections (emit braces)
> >>     •Overrides string emission (as sequence of bytes, not as strings)
> >>     •Overrides emission of files/locations debug info
> >> Required changes in Clang.
> >> =================
> >> •Add option “-gcuda-gdb” to driver.
> >>     •Emit cuda-gdb compatible debug info (DWARF-2 by default + CudaGDB
> >> tuning).
> >> •Add options “-g --dont-merge-basicblocks --return-at-end” to “ptxas”
> >> call.
> >>     •ptxas is able to translate debug information only if the -O0
> >> optimization level is used. This means that we can use an
> >> optimization level above O0 in LLVM, but still have to use O0 when
> >> calling the ptxas compiler.
> >>
> >> This approach was implemented in https://github.com/clang-ykt to
> >> support debug info emission for NVPTX target when generating code for
> >> OpenMP offloading constructs. You can try to use it.
> > I haven't looked at your code but all the things you describe seem
> > reasonably feasible.  Certainly the details of what you want to do
> > to the emitted DWARF are fine; I am less sure about the assembler
> > details, but if you have a worked example that makes it likely that
> > part is okay as well.
> > --paulr
> >
>
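
To check that limitations (c), (e) and (j) above are being read the same
way by everyone, here is a rough Python sketch of what a PTX producer
would have to do: compute a length field itself instead of emitting
"L1 - L2", store a string inline as bytes, and wrap the result in a braced
section. The function names (emit_bytes, emit_section, toy_unit) are
hypothetical and are not part of LLVM or ptxas, and the payload is a toy,
not a valid DWARF compile unit.

```python
from typing import List


def emit_bytes(data: bytes, per_line: int = 8) -> List[str]:
    """Render raw bytes as PTX .b8 directives, a few values per line."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append(".b8 " + ", ".join(str(b) for b in chunk))
    return lines


def emit_section(name: str, payload: bytes) -> str:
    """Wrap a section body in braces, as in '.section .debug_info {...}'."""
    body = "\n".join(emit_bytes(payload))
    return f".section {name}\n{{\n{body}\n}}"


def toy_unit(producer: str) -> bytes:
    """Build a toy payload: a 4-byte length followed by an inline string.

    The length is computed by the producer rather than left to the
    assembler (limitation c), and the string is stored inline as
    NUL-terminated bytes rather than referenced by label (limitation j).
    """
    inline_string = producer.encode("ascii") + b"\x00"
    length = len(inline_string)  # what "L1 - L2" would otherwise compute
    return length.to_bytes(4, "little") + inline_string


if __name__ == "__main__":
    print(emit_section(".debug_info", toy_unit("clang")))
```

If this matches the intent, none of it looks hard on the producer side; it
only means the sizes have to be known before the section text is written.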

-- 
*Disclaimer: Views, concerns, thoughts, questions, ideas expressed in this
mail are of my own and my employer has no take in it. *
Thank You.
Madhur D. Amilkanthwar