[llvm-bugs] [Bug 46236] New: directory_iterator assert crash on Unicode input on Windows

via llvm-bugs llvm-bugs at lists.llvm.org
Sun Jun 7 16:44:05 PDT 2020


https://bugs.llvm.org/show_bug.cgi?id=46236

            Bug ID: 46236
           Summary: directory_iterator assert crash on Unicode input on
                    Windows
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows 2000
            Status: NEW
          Severity: normal
          Priority: P
         Component: Support Libraries
          Assignee: unassignedbugs at nondot.org
          Reporter: andrey at futoin.org
                CC: llvm-bugs at lists.llvm.org

In short: the UTF-8 byte length of the path is used to index the UTF-16
buffer in the condition checks, which leads to an out-of-range assertion.

Why: a path can take more bytes in UTF-8 than it takes wchar_t units in
UTF-16, so Path.size() - 1 can point past the end of PathUTF16.
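
To illustrate the length mismatch, here is a minimal standalone sketch. It
does not use any LLVM code: utf16Length and byteLengthIndexInRange are
hypothetical helpers written for this example, and the input is assumed to be
well-formed UTF-8. The sample path is "C:\" followed by six Cyrillic letters,
written as escaped bytes so the result does not depend on source encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not an LLVM API): number of UTF-16 code units needed
// to represent a UTF-8 string. Assumes the input is well-formed UTF-8.
std::size_t utf16Length(const std::string &Utf8) {
  std::size_t Units = 0;
  for (unsigned char Byte : Utf8) {
    if ((Byte & 0xC0) == 0x80)
      continue; // Continuation byte: belongs to a code point already counted.
    // Lead bytes 0xF0 and above start a 4-byte sequence, which needs a
    // surrogate pair (2 units) in UTF-16; everything else needs 1 unit.
    Units += (Byte >= 0xF0) ? 2 : 1;
  }
  return Units;
}

// Mirrors the index used by the unpatched check: is "UTF-8 size - 1" a valid
// index into a buffer that holds only the UTF-16 code units?
bool byteLengthIndexInRange(const std::string &Utf8) {
  if (Utf8.empty())
    return false;
  return Utf8.size() - 1 < utf16Length(Utf8);
}

// Small self-check of the two cases discussed in the report.
void checkExample() {
  // "C:\" plus six Cyrillic letters, 2 bytes each in UTF-8.
  const std::string Cyrillic =
      "C:\\\xD0\xB4\xD0\xB0\xD0\xBD\xD0\xBD\xD1\x8B\xD0\xB5";
  assert(Cyrillic.size() == 15);             // 3 + 6 * 2 bytes
  assert(utf16Length(Cyrillic) == 9);        // 3 + 6 code units
  assert(!byteLengthIndexInRange(Cyrillic)); // index 14 is past the end

  // Pure ASCII: both lengths agree, so the bug stays hidden.
  assert(byteLengthIndexInRange("C:\\data"));
}
```

For ASCII-only paths the two lengths are equal, which is probably why the
problem goes unnoticed until a non-ASCII path is passed in.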

Severity: unlikely to be critical, but I have spotted the problem in
third-party software. It seems only a few LLVM/clang internals use the
functionality.

A short, obvious fix that should make the problem clear (untested):

diff --git a/llvm/lib/Support/Windows/Path.inc b/llvm/lib/Support/Windows/Path.inc
index ec62e656ddf..49fc8dbdfb0 100644
--- a/llvm/lib/Support/Windows/Path.inc
+++ b/llvm/lib/Support/Windows/Path.inc
@@ -941,32 +941,32 @@ static basic_file_status status_from_find_data(WIN32_FIND_DATAW *FindData) {
                            FindData->ftLastWriteTime.dwHighDateTime,
                            FindData->ftLastWriteTime.dwLowDateTime,
                            FindData->nFileSizeHigh, FindData->nFileSizeLow);
 }

 std::error_code detail::directory_iterator_construct(detail::DirIterState &IT,
                                                      StringRef Path,
                                                      bool FollowSymlinks) {
   SmallVector<wchar_t, 128> PathUTF16;

   if (std::error_code EC = widenPath(Path, PathUTF16))
     return EC;

   // Convert path to the format that Windows is happy with.
   if (PathUTF16.size() > 0 &&
-      !is_separator(PathUTF16[Path.size() - 1]) &&
-      PathUTF16[Path.size() - 1] != L':') {
+      !is_separator(PathUTF16[PathUTF16.size() - 1]) &&
+      PathUTF16[PathUTF16.size() - 1] != L':') {
     PathUTF16.push_back(L'\\');
     PathUTF16.push_back(L'*');
   } else {
     PathUTF16.push_back(L'*');
   }

   //  Get the first directory entry.
   WIN32_FIND_DATAW FirstFind;
   ScopedFindHandle FindHandle(::FindFirstFileExW(
       c_str(PathUTF16), FindExInfoBasic, &FirstFind, FindExSearchNameMatch,
       NULL, FIND_FIRST_EX_LARGE_FETCH));
   if (!FindHandle)
     return mapWindowsError(::GetLastError());

   size_t FilenameLen = ::wcslen(FirstFind.cFileName);
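
The patched condition can also be tried outside of LLVM. The sketch below is
only an approximation under stated assumptions: makeSearchPattern and
isSeparatorW are hypothetical names, std::wstring stands in for
SmallVector<wchar_t, 128>, and the separator test is simplified to '\' and
'/'. The point is that the last character is taken from the wide buffer
itself, never from the UTF-8 length.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the separator test used in the patch.
static bool isSeparatorW(wchar_t C) { return C == L'\\' || C == L'/'; }

// Builds the wildcard pattern for a directory search, following the patched
// logic: inspect the last element of the wide buffer (back()), so the index
// is always in range regardless of how long the UTF-8 form was.
std::wstring makeSearchPattern(std::wstring PathUTF16) {
  if (!PathUTF16.empty() && !isSeparatorW(PathUTF16.back()) &&
      PathUTF16.back() != L':') {
    PathUTF16 += L"\\*"; // No trailing separator or colon: add "\*".
  } else {
    PathUTF16 += L'*';   // Empty, or already ends in a separator or colon.
  }
  return PathUTF16;
}
```

For example, makeSearchPattern(L"C:\\dir") yields L"C:\\dir\\*", while a path
that already ends in a separator or a colon only gets the trailing '*'.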
