[LLVMdev] Expressiveness of column numbers in dwarf using clang 3.0?

trash-stuff at gmx.de trash-stuff at gmx.de
Tue May 31 10:36:54 PDT 2011


On 31.05.2011 19:22, Devang Patel wrote:
>
> On May 30, 2011, at 11:11 AM, trash-stuff at gmx.de wrote:
>
>> Hi all,
>>
>> I am processing DWARF line and column information in (x86 and ARM) 
>> executables in order to produce a mapping from the machine 
>> instructions back to the original source code (C/C++). Using the line 
>> numbers is quite straightforward ("libdwarf" [1] is doing the work 
>> for me.) But when comparing the column numbers (extracted from the DWARF 
>> line table) with the corresponding source code locations, it becomes 
>> clear that they are not very "useful".
>>
>> Consider the following small example (C++):
>>
>>      1: #include <iostream>
>>      2: #include <ctime>
>>      3: #include <cstdlib>
>>      4: using namespace std;
>>      5: int main() {
>>      6:    int j = 0; cin >> j; long sum = (j < 0 ? -5 : 4) + rand();
>>      7:    for(int i = 0; i < j; i++) { sum += j*j-2; cout << (sum / 2) << endl; }
>>      8:    srand(time(NULL));
>>      9:    double d = rand() / 10.341; int t = (int)d+j*sum;
>>     10:    cout << sum << d << t << j;
>>     11:    return (0);
>>     12: }
>>
>> Compiling this with "clang++ Main.cpp -g -O3 -o column" results in the 
>> following location information within the generated executable:
>>
>>     $ dwarfdump -l column
>>
>>     .debug_line: line number info for a single cu
>>     Source lines (from CU-DIE at .debug_info offset 11):
>>     <source file>     [line,column] <pc>    //<new stmt or basic block
>>     .../locale_facets.h:  [868, 2]    0x80488f0  // new statement
>>                    [...]
>>     .../Main.cpp: [  8, 2]    0x804896f  // new statement
>>     .../Main.cpp: [  9,28]    0x8048983  // new statement
>>     .../ostream:   [165, 9]    0x8048990  // new statement
>>     .../Main.cpp: [  9,28]    0x80489a0  // new statement
>>     .../ostream: [209, 9]    0x80489ac  // new statement
>>     .../Main.cpp: [  9,28]    0x80489b5  // new statement
>>     .../ostream: [209, 9]    0x80489bb  // new statement
>>                    [...]
>>     .../basic_ios.h:      [ 48, 2]    0x8048a23  // new statement // end of text sequence
>>
>> Now, have a look at source code line 9. The extracted debug info 
>> above says that we have 3 "instruction sets" (beginning at 
>> 0x8048983, 0x80489a0 and 0x80489b5 respectively) which correspond to 
>> line 9. But all of them are labeled with column number 28! According 
>> to my understanding, this does not contribute any further information 
>> to support my task (= mapping assembler code back to the source lines 
>> or even to statements within a line). Did I miss anything?
>
> You are looking at the line table produced at -O3, i.e. after the 
> aggressive optimizer has had opportunities to optimize the code. Try 
> -O0 and see if it helps.
First of all, thanks for your reply!

I've already checked that at -O0 but it results in the same information. 
(The documentation about "Source Level Debugging with LLVM" says "*LLVM 
debug information always provides information to accurately read the 
source-level state of the program, regardless of which LLVM 
optimizations have been run*, and without any modification to the 
optimizations themselves." [1])
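
To make the check concrete, here is a minimal sketch of the comparison, assuming the line table rows have already been decoded (e.g. via libdwarf) into a simple struct. The names LineRow, columnsPerLine and sampleRows are hypothetical and only serve as illustration; the sample rows are the Main.cpp entries from the dwarfdump output quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical row type: one already-decoded entry of the DWARF line table.
struct LineRow {
    unsigned line;      // source line number
    unsigned column;    // source column number
    std::uint64_t pc;   // address of the first instruction of the row
};

// For every source line, collect the set of distinct column numbers
// that appear in the line table.
std::map<unsigned, std::set<unsigned>>
columnsPerLine(const std::vector<LineRow>& rows) {
    std::map<unsigned, std::set<unsigned>> result;
    for (const LineRow& r : rows) {
        result[r.line].insert(r.column);
    }
    return result;
}

// The Main.cpp rows copied from the dwarfdump output quoted above.
std::vector<LineRow> sampleRows() {
    return {
        {8, 2, 0x804896f},
        {9, 28, 0x8048983},
        {9, 28, 0x80489a0},
        {9, 28, 0x80489b5},
    };
}
```

If a line has several rows but only one distinct column, as line 9 does here, the column adds nothing beyond the line number for telling the statements on that line apart.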

Any other ideas?

Best regards
   Adrian

[1] http://llvm.org/docs/SourceLevelDebugging.html#debugopt
