<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#ffffff">
    On 31.05.2011 19:45, Devang Patel wrote:
    <blockquote
      cite="mid:28E8AD70-77FD-4FB3-A37C-E3A30284047D@apple.com"
      type="cite"><br>
      <div>
        <div>On May 31, 2011, at 10:36 AM, <a moz-do-not-send="true"
            href="mailto:trash-stuff@gmx.de">trash-stuff@gmx.de</a>
          wrote:</div>
        <br class="Apple-interchange-newline">
        <blockquote type="cite"><span class="Apple-style-span"
            style="border-collapse: separate; font-family: Verdana;
            font-style: normal; font-variant: normal; font-weight:
            normal; letter-spacing: normal; line-height: normal;
            orphans: 2; text-indent: 0px; text-transform: none;
            white-space: normal; widows: 2; word-spacing: 0px;
            font-size: medium;">On 31.05.2011 19:22, Devang Patel wrote:
            <blockquote
              cite="mid:63165691-E116-4435-9188-7976D35830BB@apple.com"
              type="cite"><br>
              <div>
                <div>On May 30, 2011, at 11:11 AM,<span
                    class="Apple-converted-space"> </span><a
                    moz-do-not-send="true"
                    href="mailto:trash-stuff@gmx.de">trash-stuff@gmx.de</a><span
                    class="Apple-converted-space"> </span>wrote:</div>
                <br class="Apple-interchange-newline">
                <blockquote type="cite"><span class="Apple-style-span"
                    style="border-collapse: separate; font-family:
                    Verdana; font-style: normal; font-variant: normal;
                    font-weight: normal; letter-spacing: normal;
                    line-height: normal; orphans: 2; text-indent: 0px;
                    text-transform: none; white-space: normal; widows:
                    2; word-spacing: 0px; font-size: medium;">Hi all,<br>
                    <br>
                    I am processing DWARF line and column information in
                    (x86 and ARM) executables in order to produce a
                    mapping from the machine instructions back to the
                    original source code (C/C++). Using the line numbers
                    is quite straightforward ("libdwarf" [1] is doing
                    the work for me). But when comparing the column numbers
                    (extracted from the DWARF line table) with the
                    corresponding source code locations, it becomes
                    clear that they are not very "useful".<br>
                    <br>
                    Consider the following small example (C++):<br>
                    <blockquote><tt> 1: #include <iostream><br>
                         2: #include <ctime><br>
                         3: #include <cstdlib><br>
                         4: using namespace std;<br>
                         5: int main() {<br>
                         6:    int j = 0; cin >> j; long sum = (j
                        < 0 ? -5 : 4) + rand();<br>
                         7:    for(int i = 0; i < j; i++) { sum +=
                        j*j-2; cout << (sum / 2) << endl; }<br>
                         8:    srand(time(NULL));<br>
                         9:    double d = rand() / 10.341; int t =
                        (int)d+j*sum;<br>
                        10:    cout << sum << d << t
                        << j;<br>
                        11:    return (0);<br>
                        12: }</tt><br>
                    </blockquote>
                    Compiling this with "clang++ Main.cpp -g -O3 -o
                    column" results in the following location information
                    within the generated executable:<br>
                    <blockquote><tt>$ dwarfdump -l column<br>
                        <br>
                        .debug_line: line number info for a single cu<br>
                        Source lines (from CU-DIE at .debug_info offset
                        11):<br>
                          <source file>     [line,column]    
                        <pc>    //<new stmt or basic block<br>
                        .../locale_facets.h:  [868, 2]    0x80488f0  //
                        new statement</tt><br>
                      <tt>               [...]</tt><br>
                      <tt>.../Main.cpp:   </tt><tt>     <span
                          class="Apple-converted-space"> </span></tt><tt>[ 
                        8, 2]    0x804896f  // new statement</tt><br>
                      <tt>.../Main.cpp:   </tt><tt>     <span
                          class="Apple-converted-space"> </span></tt><tt>[ 
                        9,28]    0x8048983  // new statement</tt><br>
                      <tt>.../ostream:  </tt><tt>     <span
                          class="Apple-converted-space"> </span></tt><tt> 
                        [165, 9]    0x8048990  // new statement</tt><br>
                      <tt>.../Main.cpp:  </tt><tt>      </tt><tt><span
                          class="Apple-converted-space"> </span>[ 
                        9,28]    0x80489a0  // new statement</tt><br>
                      <tt>.../ostream:   </tt><tt>      <span
                          class="Apple-converted-space"> </span></tt><tt>[209,
                        9]    0x80489ac  // new statement</tt><br>
                      <tt>.../Main.cpp:   </tt><tt>     <span
                          class="Apple-converted-space"> </span></tt><tt>[ 
                        9,28]    0x80489b5  // new statement</tt><br>
                      <tt>.../ostream:   </tt><tt>      <span
                          class="Apple-converted-space"> </span></tt><tt>[209,
                        9]    0x80489bb  // new statement</tt><br>
                      <tt>               [...]</tt><br>
                      <tt>.../basic_ios.h:      [ 48, 2]    0x8048a23 
                        // new statement // end of text sequence</tt><br>
                    </blockquote>
                    Now, have a look at source code line 9. The
                    extracted debug info above says that there are three
                    "instruction sets" (beginning at<span
                      class="Apple-converted-space"> </span><tt>0x8048983,<span
                        class="Apple-converted-space"> </span></tt><tt>0x80489a0</tt><span
                      class="Apple-converted-space"> </span>and<span
                      class="Apple-converted-space"> </span><tt>0x80489b5</tt><span
                      class="Apple-converted-space"> </span>respectively)
                    which correspond to line 9. But all of them are
                    labeled with column number 28! As far as I
                    understand, this does not contribute any further
                    information to support my task (= mapping assembler
                    code back to the source lines or even to statements
                    within a line). Did I miss anything?<br>
                  </span></blockquote>
              </div>
              <br>
              <div>You are looking at the line table produced at -O3,
                i.e. after the aggressive optimizer has had opportunities to
                optimize the code. Try -O0 and see if it helps.</div>
            </blockquote>
            First of all, thanks for your reply!<br>
            <br>
            I've already checked at -O0, but it results in the same
            information.</span></blockquote>
        <div><br>
        </div>
        <div>You mean the instructions with a given line and column
          number do not match the source code construct at that
          location?<br>
        </div>
      </div>
    </blockquote>
    No, they do.<br>
    <blockquote
      cite="mid:28E8AD70-77FD-4FB3-A37C-E3A30284047D@apple.com"
      type="cite">
      <div><br>
        <blockquote type="cite"><span class="Apple-style-span"
            style="border-collapse: separate; font-family: Verdana;
            font-style: normal; font-variant: normal; font-weight:
            normal; letter-spacing: normal; line-height: normal;
            orphans: 2; text-indent: 0px; text-transform: none;
            white-space: normal; widows: 2; word-spacing: 0px;
            font-size: medium;"> (The documentation about "Source Level
            Debugging with LLVM" says "<b>LLVM debug information always
              provides information to accurately read the source-level
              state of the program, regardless of which LLVM
              optimizations have been run</b>, and without any
            modification to the optimizations themselves." [1])<br>
          </span></blockquote>
        <br>
      </div>
      <div>It means the instructions with a given line and column number
        match the source code construct at that line/col number. It
        does not mean that the optimizer/code generator will not reorder
        instructions. It also does not mean that the optimizer/code generator
        will not emit instructions without line number information. It
        means that, if there is line number information, it maps to the
        source construct as accurately as possible.</div>
    </blockquote>
    Yes, that matches my understanding, too. But I thought that clang
    would be able to emit <b>more</b> than one (different) column
    number per line. In my example, there are <b>three</b> entries in
    the DWARF line table for line number 9 (in Main.cpp), but all of
    them contain the <b>same</b> information. As a consequence, the
    associated assembler instructions are all mapped to the same source
    line, so the column information seems useless. What additional
    information do the column numbers provide?<br>
    <br>
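    To make the question concrete, here is a minimal sketch in C++ of
    the check I have in mind: collect the distinct column numbers that
    the line table records for one source line. The <tt>LineRow</tt>
    struct and the function names are hypothetical (they are not
    libdwarf types), and the rows are copied by hand from the dwarfdump
    listing above, leaving out the elided ones.<br>
    <br>

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// One decoded line-table row (hypothetical struct, not a libdwarf type).
struct LineRow {
    std::string file;
    unsigned line;
    unsigned column;
    std::uint32_t pc;
};

// Rows copied from the dwarfdump listing; the elided "[...]" rows are left out
// and the truncated paths are kept exactly as they were printed.
inline std::vector<LineRow> sampleRows() {
    return {
        {".../Main.cpp", 8, 2, 0x804896f},
        {".../Main.cpp", 9, 28, 0x8048983},
        {".../ostream", 165, 9, 0x8048990},
        {".../Main.cpp", 9, 28, 0x80489a0},
        {".../ostream", 209, 9, 0x80489ac},
        {".../Main.cpp", 9, 28, 0x80489b5},
        {".../ostream", 209, 9, 0x80489bb},
    };
}

// Collect the distinct column numbers recorded for one (file, line) pair.
inline std::set<unsigned> distinctColumns(const std::vector<LineRow>& rows,
                                          const std::string& file,
                                          unsigned line) {
    std::set<unsigned> columns;
    for (const LineRow& row : rows) {
        if (row.file == file && row.line == line) {
            columns.insert(row.column);
        }
    }
    return columns;
}

// Usage sketch: line 9 of Main.cpp has three rows but only one column value.
inline void checkSample() {
    const std::set<unsigned> columns =
        distinctColumns(sampleRows(), ".../Main.cpp", 9);
    assert(columns.size() == 1);
    assert(*columns.begin() == 28);
}
```

    With these rows, line 9 of Main.cpp yields a single column value
    (28), so the column field cannot tell the three entries apart.<br>
    <br>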
    I extracted the assembler instructions for the 9th line (x86):<br>
    <tt>.../Main.cpp: 9<br>
          double d = rand() / 10.341; int t = (int)d+j*sum;<br>
                                    ^<br>
      8048983:    e8 40 fe ff ff           call   80487c8
      <rand@plt><br>
      8048988:    89 c7                    mov    %eax,%edi<br>
      804898a:    8b 5d f0                 mov    -0x10(%ebp),%ebx<br>
      804898d:    0f af de                 imul   %esi,%ebx<br>
      80489a0:    f2 0f 2a c7              cvtsi2sd %edi,%xmm0<br>
      80489a4:    f2 0f 5e 05 f0 8a 04     divsd  0x8048af0,%xmm0<br>
      80489ab:    08 <br>
      80489b5:    f2 0f 2c f0              cvttsd2si %xmm0,%esi<br>
      80489b9:    01 de                    add    %ebx,%esi</tt><br>
    <br>
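    One way to assign each instruction to a line-table row is to pick
    the last row whose address is not greater than the instruction's
    address. The following C++ sketch shows that lookup under some
    assumptions: <tt>LineRow</tt>, <tt>sampleRows</tt> and
    <tt>rowForAddress</tt> are hypothetical names (not libdwarf API),
    the rows are copied by hand from the dwarfdump listing, and
    end-of-sequence handling is ignored.<br>
    <br>

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One decoded line-table row (hypothetical struct, not a libdwarf type).
struct LineRow {
    std::string file;
    unsigned line;
    unsigned column;
    std::uint32_t pc;
};

// Rows copied from the dwarfdump listing; the elided "[...]" rows are left out.
inline std::vector<LineRow> sampleRows() {
    return {
        {".../Main.cpp", 8, 2, 0x804896f},
        {".../Main.cpp", 9, 28, 0x8048983},
        {".../ostream", 165, 9, 0x8048990},
        {".../Main.cpp", 9, 28, 0x80489a0},
        {".../ostream", 209, 9, 0x80489ac},
        {".../Main.cpp", 9, 28, 0x80489b5},
        {".../ostream", 209, 9, 0x80489bb},
    };
}

// Return the last row whose pc is <= addr, or nullptr if addr lies before
// the first row. Precondition: rows are sorted by ascending pc.
// Simplification: an address past the final row still maps to that row,
// because end-of-sequence markers are not modelled here.
inline const LineRow* rowForAddress(const std::vector<LineRow>& rows,
                                    std::uint32_t addr) {
    assert(std::is_sorted(rows.begin(), rows.end(),
                          [](const LineRow& a, const LineRow& b) {
                              return a.pc < b.pc;
                          }));
    // upper_bound finds the first row that starts after addr.
    auto next = std::upper_bound(rows.begin(), rows.end(), addr,
                                 [](std::uint32_t value, const LineRow& row) {
                                     return value < row.pc;
                                 });
    if (next == rows.begin()) {
        return nullptr;
    }
    return &*(next - 1);
}
```

    For example, looking up 0x804898d (the <tt>imul</tt> above) in these
    rows returns the row at 0x8048983, i.e. line 9, column 28.<br>
    <br>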
    I hope that makes it clearer... ;-)<br>
    <br>
    BTW, any hints to my cross-compilation-related question?<br>
    <br>
    Best regards<br>
      Adrian<br>
  </body>
</html>