[Lldb-commits] [PATCH] D17107: [lldb] Unicode support on Win32

Zachary Turner via lldb-commits lldb-commits at lists.llvm.org
Mon Feb 22 15:24:03 PST 2016

zturner added inline comments.

Comment at: lldb/trunk/source/Core/Disassembler.cpp:881
@@ -878,3 +880,3 @@
-    FILE *test_file = fopen (file_name, "r");
+    FILE *test_file = nullptr;
+#if _WIN32
cameron314 wrote:
> cameron314 wrote:
> > zturner wrote:
> > > Would LLDB's `File` class work here instead of using raw APIs?  I see you fixed the code in `lldb_private::File`, so perhaps simply using that class here instead of a `FILE*` would allow you to reuse the fix from that class.
> > No idea. Probably :-) I'll look into it, and make this change if it's trivial.
> I don't trust myself to make this change without introducing a bug. But I'll get rid of the `#ifdef`.
One possibility is to revert this portion of the patch (along with any other changes that you don't feel comfortable making) back to their original state.  This will still leave certain parts of unicode functionality broken, but it would reduce the surface area of "things that might be wrong with the patch" and give people a chance to make sure nothing else is seriously broken.  Then, we can do these other "more difficult" locations as a followup patch.  The nice thing about that is that if it does end up breaking something, and we have to revert it, it doesn't revert 100% of the work towards getting Unicode working, but only a couple of points.

Comment at: lldb/trunk/source/Host/common/FileSpec.cpp:237
@@ +236,3 @@
+    int stat_result;
+#ifdef _WIN32
+    llvm::SmallVector<UTF16, 128> wpath;
cameron314 wrote:
> zturner wrote:
> > cameron314 wrote:
> > > amccarth wrote:
> > > > zturner wrote:
> > > > > Maybe you can just call `GetStats()` here
> > > > Instead of another #ifdef here, can't we just re-use the GetFileStats function above?
> > > Probably, yes, but it requires a FileSpec. I'm not sure what creating a temporary one implicates. Surely there was a reason it wasn't already called before?
> > This is another reason I argued for moving this to `FileSystem` so that we wouldn't need a `FileSpec` here.  
> Actually, it seems the stat is just done to see if the path exists. `llvm::sys::fs::exists` does this already, I'll call that instead. It does the proper string conversion and uses the right Win32 API call on Windows.
Even better :)

Comment at: lldb/trunk/source/Host/common/FileSpec.cpp:829
@@ -799,2 +828,3 @@
 #ifdef _WIN32
-    auto attrs = ::GetFileAttributes (resolved_path);
+    llvm::SmallVector<UTF16, 128> wpath;
+    if (!llvm::convertUTF8ToUTF16String (resolved_path, wpath))
cameron314 wrote:
> cameron314 wrote:
> > zturner wrote:
> > > I think a `FileSystem::IsSymbolicLink()` here would be better, and then have `File` call `FileSpec::IsSymbolicLink()`
> > > 
> > > There's currently some confusion and overlap about what goes in `FileSpec` and what goes in `FileSystem`, but my general opinion is that wherever possible, anything that queries the filesystem should be a static method in `Host/FileSystem`, and any other classes who want to do the same thing can call the methods in `FileSystem`.  This way you don't have to, for example, construct a `FileSpec` from a string just to check if something is a symbolic link since you could just call the low level method directly.
> > I agree, but this is outside the scope of my patch :-)
> > This method already existed, and was already written this way. I merely fixed the way it calls into the OS by doing a string conversion, which is completely independent of the existing file system code's design.
> By the way, all the `FileSystem` methods seem to accept `FileSpec`s as input, not raw paths. I agree it would be nice if `FileSystem` abstracted the OS-level FS, which all the other higher-level parts (`FileSpec`, `File`) then depended on. But that would require changing the API rather significantly.
Yea, it's a bit unfortunate.  In reality there should be overloads that take `FileSpecs` and `std::strings` or `llvm::StringRefs`, and then only 1 implementation that the others re-use.  But right now there aren't.  

Comment at: lldb/trunk/source/Host/windows/Windows.cpp:210
@@ +209,3 @@
+    wchar_t wpath[PATH_MAX];
+    if (wchar_t* wresult = _wgetcwd(wpath, PATH_MAX))
+    {
cameron314 wrote:
> zturner wrote:
> > Why can't we just convert to the result of `_wgetcwd` directly to UTF8?
> Not sure I understand. That's what this code does? There's a tad of extra logic needed because the `path` passed in is allowed to be `NULL`, indicating we should allocate a buffer for the caller.
Ahh that extra logic is what I was referring to.  I wasn't aware of those semantics of `getcwd`.  Can you add a comment explaining that, since it's not obvious from looking at the code.

Comment at: lldb/trunk/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp:1175
@@ -1166,1 +1174,3 @@
+    fp = _wfopen((const wchar_t *)wpath.data(), (const wchar_t *)wmode.data());
     fp = fopen(path, mode);
cameron314 wrote:
> zturner wrote:
> > Here's another example of an #ifdef in common code that we will need to do something about.  Is there a constructor of the `File` class that takes a path and a mode?
> There is, but the mode is an enum and not a string. I'll get rid of the `#ifdef`, though.
We have a function somewhere that converts a mode string to the enum.  I think it's in `File.h` or `File.cpp`.  Let me know if you can't find it.



More information about the lldb-commits mailing list