[Lldb-commits] [PATCH] [2/2] Fix Path support on Windows

Zachary Turner zturner at google.com
Fri Aug 1 10:36:08 PDT 2014


Actually now I'm confusing myself.  I didnt' change the const char * to a
StringRef yet, so that doesn't need fixing yet, only need to keep it in
mind in the future if we change raw strings to StringRefs.


On Fri, Aug 1, 2014 at 10:26 AM, Zachary Turner <zturner at google.com> wrote:

> Ahh, good point about the null const char*.  I'll definitely fix that
> before submitting.  It seems like we crossed paths, but did you see my
> message about submitting the optimization as a subsequent patch?  Are you
> ok with that?
>
>
> On Fri, Aug 1, 2014 at 10:20 AM, Greg Clayton <gclayton at apple.com> wrote:
>
>>
>> > On Jul 31, 2014, at 10:18 AM, Zachary Turner <zturner at google.com>
>> wrote:
>> >
>> > Just to expand a little more since my last post, the PathSyntax to the
>> constructor and to SetFile() has a default value, which will do the right
>> thing 99% of the time by selecting Host syntax.  You only need to specify
>> it if you know what you're doing, so to speak.  But without being able to
>> set it, there would be no way to have FileSpec represent a path on a remote
>> machine.
>>
>> So Jim changed my mind after speaking with him about specifying the
>> PathSyntax, so I agree that is the right way to go.
>>
>> >
>> > As for the performance cost of normalization, SmallString<64> only does
>> a malloc if the string is longer than 64 characters.  I could try
>> increasing the number, I think 128 would cover probably 95% of all file
>> paths.  I agree in principle that it would be nice to not normalize the
>> string on every file set, one idea would be to never normalize it if
>> PathSyntax = Posix (since Normalized form is by definition Posix form) and
>> always normalize it otherwise.
>>
>> I still think we should avoid making the copy if it is already
>> normalized. I would be fine if posix never normalized but it does for other
>> syntaxes.
>>
>> >  But then to avoid the copy I would need to change the SetFile()'s
>> signature to accept a SmallString instead of a const char*.
>>
>> Why can't Normalize and Denormalize return a bool: false if no
>> normalization was a needed, and true if it was normalized. Then just have
>> FileSpec::SetFile() use a llvm::StringRef after the call to normalize which
>> will contain the raw "pathname" if no normalization was needed or the
>> contents of "normalized" if it was normalized.
>>
>>
>> > I actually thinking changing raw strings to LLVM strings is a good
>> change, but it's probably going to result in some unrelated churn fixing up
>> function signatures elsewhere, so doing that iteratively as a separate
>> change might make more sense.
>>
>> Feel free to make two FileSpec::SetFile() functions:
>>
>> void
>> FileSpec::SetFile (const char *path_cstr);
>>
>> void
>> FileSpec::SetFile (const llvm::StringRef &path);
>>
>> Then the const char * version would call the llvm::StringRef version. We
>> don't to allow just the llvm::StringRef version because if you pass in NULL
>> StringRef will crash with an assert.
>>
>> > Thoughts?
>>
>> Let me know what you think about the above comments.
>>
>> Greg
>>
>> >
>> >
>> > On Wed, Jul 30, 2014 at 6:50 PM, Greg Clayton <gclayton at apple.com>
>> wrote:
>> > Looks ok, a few questions:
>> >
>> > Why does the user ever want to specify the PathSyntax? Shouldn't it
>> just be something that can be queried and be set automagically?
>> >
>> > What if I did:
>> >
>> > FileSpec f("C:\Users\me\foo.txt", false, ePathSyntaxPosix);
>> >
>> > If I set it incorrectly, seems like it might "do the wrong thing" in
>> some calls. It might be nice to use a C++11 typed bitfield to store the
>> PathSyntax to keep the size of the FileSpec class down to as small as
>> possible (uint8_t).
>> >
>> > All the other functionality seems fine.
>> >
>> > A few things to keep in mind:
>> >
>> > All debug info is parsed and stored into FileSpec objects. They must
>> match exactly what the user types. What if the user types in:
>> >
>> > (lldb) breakpoint set --file C:/Users/me/foo.txt
>> >
>> > Note the wrong slashes. Are we alway normalizing a path to be native so
>> that this kind of compare will always work even if the user types it in
>> wrong?
>> >
>> > Seems like we shouldn't allow the user to specify the PathSyntax or we
>> can run into problems.
>> >
>> > Also seems like we should check if a string needs to be normalized
>> before calling Normalize and allocating memory and copying and replacing
>> things which can be expensive (think parsing thousands of line table files).
>> >
>> > So I would vote to:
>> > 1 - fix it so the user never specifies PathSyntax, but can query it if
>> desired
>> > 2 - I would rather not normalize the path on every
>> FileSpec::SetFile(...) if we can avoid it as it will be expensive
>> > 3 - If we do need to normalize it every time, either have Normalize
>> return false and not touch the llvm::SmallString<64> argument (no mallocs)
>> if it doesn't need to be normalized to avoid expensive allocations if not
>> required.
>> >
>> > > On Jul 30, 2014, at 4:49 PM, Zachary Turner <zturner at google.com>
>> wrote:
>> > >
>> > >>> ! In D4675#11, @deepak2427 wrote:
>> > >> Thanks for fixing this. I haven't had a chance to test this patch
>> with our cross-platform issues yet, have to look at some other issues first.
>> > >> Just a couple of doubts I had.
>> > >>
>> > >> - Are the paths always going to default as ePathSyntaxHostNative and
>> then converted based on the Host?
>> > >> - In 733       llvm::SmallString<64> denormalized; , is the length
>> of 64 enough?
>> > >> - I'm not that familiar with LLVM data structures, however would it
>> be better to keep llvm::SmallString itself for the Normalize/Denormalize
>> functions to keep it consistent?
>> > >
>> > > Paths default to SyntaxHostNative because that makes the semantics
>> post-patch equivalent to pre-patch.  Normalized form is basically unix
>> form, so if the syntax ends up as ePathSyntaxPosix (either because it was
>> explicitly specified or because host was specified and your host is posix),
>> then normalization and de-normalization are both no-ops.
>> > >
>> > > SmallString<64> just means that it keeps 64 bytes on the stack, and
>> if it grows longer than that, it will fallback to help allocation.  It
>> still supports arbitrarily long strings, but paths tend to be small, so
>> that's why a small number was chosen.
>> > >
>> > > http://reviews.llvm.org/D4675
>> > >
>> > >
>> > >
>> > > _______________________________________________
>> > > lldb-commits mailing list
>> > > lldb-commits at cs.uiuc.edu
>> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
>> >
>> >
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-commits/attachments/20140801/da16597f/attachment.html>


More information about the lldb-commits mailing list