[PATCH] D58952: [llvm] Skip over empty line table entries.
Paul Robinson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 6 14:59:32 PST 2019
probinson added inline comments.
================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp:871
+ // In some cases, e.g. first instruction in a function, the compiler generates
+ // two entries, both with the same address. We want the last one.
LineTable::RowIter RowPos = std::lower_bound(
----------------
probinson wrote:
> dblaikie wrote:
> > aprantl wrote:
> > > probinson wrote:
> > > > dblaikie wrote:
> > > > > probinson wrote:
> > > > > > aprantl wrote:
> > > > > > > Doesn't the DWARF spec require all PC values in the line table to be strictly monotonically increasing? How is it possible to have more than one entry at the same address?
> > > > > > The DWARF spec's model is based on increasing PC values, however the encoding allows advancing PC by 0. So it's entirely possible to have a line table that (for example) advances line by 2 and PC by 0.
> > > > > I couldn't find specific wording about that - but please do take a look, quite possible I missed it.
> > > > >
> > > > > LLVM's been generating line tables like this for a while (the test updates show evidence of that) & it might be justifiable to fix this code even if it's technically invalid DWARF, given how long we've been generating it. (to my mind, it is wasted DWARF - it doesn't describe any location, so I think it should be correct to remove it if we wanted to, but not a high priority bug (@echristo reckons there might be consumers using these empty ranges to find the start of a function or the like))
> > > > The non-normative text on the first couple pages of section 6.2 describe the line table as notionally "a large matrix, with one row for each instruction" (p.149 of DWARF 5). However, the normative text describes how special opcodes are formed (section 6.2.5.1, pp. 160-162) without requiring "operation advance" to be nonzero. This permits a special opcode to increment the line number without changing the (address, op_index) tuple.
> > > > It's skating on thin ice because DWARF doesn't explicitly give a meaning to this, but the encoding allows it.
> > > That's quite interesting for a completely unrelated reason: If the linetable encoding actually supports more than one entry per PC, we could change the implementation of getMergedLocation to emit both locations and then teach the debugger to disambiguate the locations based on DW_AT_call_* information or something.
> > >
> > > > it might be justifiable to fix this code even if it's technically invalid DWARF, given how long we've been generating it.
> > > Agreed.
> > I don't know/think I'd say "the linetable encoding actually supports more than one entry per PC"
> >
> > I think each pair of addresses in the line table is the half-open range [X, Y), so if you have two lines after each other you have [X, X) - I would argue that is the empty range (it certainly is with C++ iterators, for instance), not X. So it doesn't describe any addresses at all.
> Not so fast... We don't encode the line table until deep in the bowels of the assembler. getMergedLocation is working out what to attach to an IR instruction, which doesn't know how to attach more than one location. Teach it that first, and getMergedLocation can return you a list.
> It will confuse the heck out of consumers that aren't expecting it, and this will be especially weird for profilers.
> I think this particular dark corner of the spec might be exploitable, but it won't be smooth going.
>
> As far as the current patch is concerned, I am uncertain. Having addr2line do something consistent seems likely to do more good than harm, but I can imagine this could cause some confusion in other clients.
Treating the zero-increment case as an empty range has some appeal. However, it can be confusing. You tell the debugger to plant a breakpoint on foo(), which the debugger will tell you is on line 12; and yet when you hit said breakpoint, you're told that address is on line 15, because things got folded and optimized and the instruction range for lines 12-14 has gone to nothing.
Maybe we hand-wave that away as "what do you expect from optimized code" but it's still not a great debugging experience.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D58952/new/
https://reviews.llvm.org/D58952
More information about the llvm-commits
mailing list