[LLVMdev] Questions before moving the new debug info hierarchy into place

Frédéric Riss friss at apple.com
Fri Feb 20 11:34:37 PST 2015


> On Feb 20, 2015, at 11:14 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> 
>> On 2015-Feb-20, at 09:44, Frédéric Riss <friss at apple.com> wrote:
>> 
>> Speaking of deduplication of filenames: I think we discussed this in the early stages of your work, but I just wanted to make sure I remember correctly. The new debug hierarchy will allow us to implement specific uniquing behavior, right? So we will be able to tweak MDFile to unique:
>> 
>> !MDFile(filename: "path/to/file", directory: "/path/to/dir")
>> !MDFile(filename: "to/file", directory: "/path/to/dir/path")
>> 
>> into the same object? I did some work in this area, and in the current form it’s not really possible to do cleanly (I don’t remember the details, but I know I punted on it until we get smart uniquing capability).
> 
> IIRC, there was some talk about how the DWARF spec requires that
> references to a `directory` need to be the CWD of the compiler.

If you refer to a conversation we had, I’m pretty sure I was referring to this text in the standard about the directories listed in the line table:
"Entries in this sequence describe each path that was searched for included source files in this compilation. (The paths include those directories specified explicitly by the user for the compiler to search and those the compiler searches without explicit direction.) Each path entry is either a full path name or is relative to the current directory of the compilation."

I think that at the time I had interpreted that as “if the user passes -I../include to the compiler, the line table should list ../include as a directory”. I think my interpretation was too strict, though, and that we could canonicalize the file/dir names as we’d like.

The compilation dir will always be there as a special entry. It’s the implicit entry 0 in the directory list, but its value is not stored in the line table; it is a normal string that is referenced by the DW_TAG_compile_unit DIE (through its DW_AT_comp_dir attribute). If it is extracted from an MDFile, we need to make sure this one doesn’t get mangled, but I think this is the only one we really need to preserve.
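
To make that layout concrete, here is a rough Python sketch of a directory table where index 0 is implicit and means the compilation directory. This is only an illustration of the scheme described above, not LLVM code: `build_dir_table` is a hypothetical helper, it assumes POSIX paths and a `comp_dir` without a trailing slash, and it stores absolute directories although the standard also allows entries relative to the compilation directory.

```python
import posixpath


def build_dir_table(comp_dir, file_paths):
    """Return (include_dirs, files) for a simplified line-table layout.

    Directory index 0 is implicit and stands for comp_dir, so comp_dir is
    never stored in include_dirs. Stored directories are numbered from 1.
    """
    include_dirs = []  # include_dirs[i] is directory index i + 1
    files = []         # (basename, directory index) pairs
    for path in file_paths:
        # Relative paths are taken to be relative to the compilation dir.
        if not posixpath.isabs(path):
            path = posixpath.join(comp_dir, path)
        directory, name = posixpath.split(path)
        if directory == comp_dir:
            index = 0  # implicit entry, value lives in the compile unit
        else:
            if directory not in include_dirs:
                include_dirs.append(directory)
            index = include_dirs.index(directory) + 1
        files.append((name, index))
    return include_dirs, files
```

Note that the compilation directory itself is passed through untouched, which matches the point that it is the one string that must not get mangled.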

> It's not clear to me *which* references need that, but I guess
> we need to sort out exactly what the requirements are so we know
> which way to canonicalize things.  I wonder if this only matters
> for DW_TAG_compile_unit?  If so, we could canonicalize freely
> everywhere else.  (For example, we could store a `path:` node
> everywhere (dropping the filename/directory distinction, or
> canonicalizing it to basename/dirname, etc.), and add a `cwd:`
> to the compile unit.)

As stated above, I think we are free to canonicalize as we wish.
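
As a minimal sketch of what such a canonicalization could look like for the two MDFile examples quoted earlier, the snippet below joins directory and filename and re-splits the result into dirname/basename. `canonical_file_key` is a hypothetical name, not part of LLVM or DIBuilder, and the sketch assumes POSIX-style paths and does nothing about `..` components.

```python
import posixpath


def canonical_file_key(filename, directory):
    """Canonicalize a (filename, directory) pair to (dirname, basename).

    Two pairs that spell the same full path with a different split point
    map to the same key, so they could be uniqued into one node.
    """
    full = posixpath.join(directory, filename)
    return posixpath.split(full)


a = canonical_file_key("path/to/file", "/path/to/dir")
b = canonical_file_key("to/file", "/path/to/dir/path")
same = a == b  # both spell /path/to/dir/path/to/file
```

The compile unit's own directory would have to be kept outside of this scheme, since that string needs to stay exactly as the compiler saw it.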

> David also pointed out (in a previous thread about this) that
> the frontend (or `DIBuilder`) might be the right location for
> canonicalization.  Certainly, we couldn't canonicalize `..`
> references without access to the filesystem (at least not on
> POSIX platforms), but maybe we don't care about that anyway?

What I attempted to do in the past was exactly that: canonicalize ‘..’ in the paths (the lldb folks would have loved to be able to consider file identifiers as unique in the line table).
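
For illustration, a purely lexical folding of ‘..’ can be sketched in Python with `posixpath.normpath`. The `fold_dotdot` wrapper is a hypothetical name used only here. As Duncan notes, this is not safe in general on POSIX: if a component before the ‘..’ is a symlink, the folded path can name a different file, and only a filesystem query could tell.

```python
import posixpath


def fold_dotdot(path):
    """Fold '.' and '..' components lexically, without touching the filesystem.

    Caveat: if any component preceding a '..' is a symlink, the result may
    refer to a different file than the original path did.
    """
    return posixpath.normpath(path)


folded = fold_dotdot("/path/to/dir/../include/foo.h")
relative = fold_dotdot("../include")  # a leading '..' cannot be folded
```

With this kind of folding, `-I../include` relative to the compilation directory and the equivalent absolute spelling would end up as the same directory string, which is what would let file identifiers be treated as unique.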

Fred

