[libcxx-commits] [PATCH] D91137: [5/N] [libcxx] Convert paths to/from the right narrow code page for narrow strings on windows

Thu Dec 10 00:42:08 PST 2020

curdeius added inline comments.

================
Comment at: libcxx/include/filesystem:1448
+  _VSTD::wstring __w;
+  _CVT()(back_inserter(__w), __tmp.data(), __tmp.data() + __tmp.size());
+  return path(__w);
----------------
mstorsjo wrote:
> curdeius wrote:
> > Is there a patch where you add `__w.reserve(...)`? If not, this one seems like a good candidate.
> > If I'm not mistaken, widening can decrease the number of characters, so that might get tricky (like going through the input twice, once to count the output size, then doing the real conversion), but at least a FIXME note would be great.
> > BTW, you probably know that, `MultiByteToWideChar` will return the required output buffer size without doing the conversion if you pass 0 as `cchWideChar`.
> I guess I could call `reserve()` here, using the length of the UTF-8 string as the size. Converting from UTF-8 to wchar will in most cases make the string shorter, so the size allocated by `reserve()` should be enough in most conceivable cases. For strings consisting mostly of ASCII chars, the difference in length shouldn't be large, so such a rough estimate shouldn't hurt much.
> 
> Yeah, I know that `MultiByteToWideChar` can count the needed output size; this is used in e.g. `size_t __size = __char_to_wide(__str, nullptr, 0);` further up in the same patch when operating on the native narrow charset. I'd rather not mix `MultiByteToWideChar` with libc++'s `__widen_from_utf8`, especially since the exact length isn't needed beforehand here: that conversion allocates more as needed.
I agree that mixing `MultiByteToWideChar` with `__widen_from_utf8` doesn't seem like a good idea.
But I'm in favor of what you suggested: just reserve a possibly larger size than needed.
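
For illustration only, here is a rough standalone sketch of the "reserve using the UTF-8 length" idea. It is not the patch's code and does not use libc++'s `__widen_from_utf8`; `widen_utf8` is a hypothetical helper, and `std::u16string` stands in for a 16-bit `wchar_t` string as on Windows. The sketch assumes that each UTF-8 sequence of 1 to 3 bytes yields one UTF-16 code unit and each 4-byte sequence yields two, so the input byte count is an upper bound on the output length. Invalid bytes are replaced with U+FFFD, and overlong or surrogate encodings are not rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption for illustration, not libc++ API):
// converts UTF-8 to UTF-16, reserving the input byte count up front.
std::u16string widen_utf8(const std::string& utf8) {
  std::u16string out;
  // Rough estimate: never too small, at most 3x too large (3-byte input).
  out.reserve(utf8.size());

  std::size_t i = 0;
  while (i < utf8.size()) {
    unsigned char lead = static_cast<unsigned char>(utf8[i]);
    char32_t cp = 0;
    std::size_t len = 1;
    if (lead < 0x80)                { cp = lead;        len = 1; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
    else { out.push_back(u'\uFFFD'); ++i; continue; }  // invalid lead byte

    if (i + len > utf8.size()) {  // sequence truncated at end of input
      out.push_back(u'\uFFFD');
      break;
    }

    bool ok = true;
    for (std::size_t k = 1; k < len; ++k) {
      unsigned char c = static_cast<unsigned char>(utf8[i + k]);
      if ((c & 0xC0) != 0x80) { ok = false; break; }  // not a continuation
      cp = (cp << 6) | (c & 0x3F);
    }
    if (!ok || cp > 0x10FFFF) {   // skip one byte and resynchronize
      out.push_back(u'\uFFFD');
      ++i;
      continue;
    }
    i += len;

    if (cp >= 0x10000) {          // needs a surrogate pair (2 units, 4 bytes)
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    } else {
      out.push_back(static_cast<char16_t>(cp));
    }
  }

  // The estimate used for reserve() was an upper bound.
  assert(out.size() <= utf8.size());
  return out;
}
```

With this estimate a pure-ASCII path reserves exactly what it needs, while a path made up entirely of 3-byte characters reserves three times as many code units as it ends up using.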

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91137/new/

https://reviews.llvm.org/D91137