[libcxx Windows][PATCH] mbsnrtowcs and wcsnrtombs are broken.

Howard Hinnant hhinnant at apple.com
Sat Mar 30 08:52:13 PDT 2013


Below is an anonymously donated patch for libcxx on Windows.  For those interested in libcxx on Windows, please review and recommend either accepting or rejecting this patch.

Thanks,
Howard

------------------
mbsnrtowcs and wcsnrtombs are broken.
------------------------------------
The Win32/mingw64 libcxx implementations of mbsnrtowcs and wcsnrtombs (in Win32\support.cpp) are broken.
They've been broken for a long time, by the date of this:
http://stackoverflow.com/questions/7528363/implementing-mbsnrtowcs-in-terms-of-mbsrtowcs
How this affects libcxx
-----------------------
Functionality that is dependent on mbsnrtowcs and wcsnrtombs crash 100% of the time.
A lot of important functionality is affected. For instance:
mbsnrtowcs_l and wcsnrtombs_l (in locale.cpp) depend on the above functions.
Dependent on those are do_in and do_out; and on those, the iostream classes.
Through this dependency, all these functions are broken.
So even wcout and wcin are broken. wcout << L"Hello" just crashes.
This also affects an estimated 20-30 test cases; though they suffer from other issues too; that means you can't easily verify this, but I have.
If you can imagine what else is broken if the above is true, tell me so I can verify if it's broken/fixed.
What's wrong in essence
-----------------------
The problem is that mbsnrtowcs is implemented in terms of mbsrtowcs; and wcsnrtombs in terms of wcsrtombs. In each case the former function should return a valid pointer to the latter which should propogate it back to libcxx which depends on it being valid.
The bug is that the former function doesn't even *try* to propogate it and the issue is the latter function doesn't even provide it on Win32!
The reason is the latter functions don't conform between Windows and Mac. On Windows they just return NULL; on Mac/Linux they do the right thing and produce a valid pointer. libcxx expects the latter Mac/Linux implementation and will crash with it.
The fixes
---------
The purpose of the first "minimal" patch attached is to fix an overwrite bug and a memory leak bug to make the code at least try to propogate the pointers back. This is: a) less buggy; b) easier to understand because it's at least conceptually correct; c) should MS ever fix the second functions the first would work; d) it forms a sensible base, should the "maximal" patch attached, which is what is what is really being proposed, be rejected or reverted.
Beyond this, the minimal patch is practically useless.
The solution is to rewrite these functions to conform to how libcxx and linux/mac expects; and to how the original author of these functions' FIXME requested. That's what I've done and that's the "maximal" patch, attached.
If you want to bail now and return to the short version above, there's still time!
What's wrong in detail
----------------------
The problem with mbsnrtowcs
---------------------------
If you read the documentation for mbsnrtowcs, and there's some here:
http://man7.org/linux/man-pages/man3/mbsnrtowcs.3.html
it states:
"2. The nms limit forces a stop, or len non-L'\0' wide characters have been
          stored at dest.  In this case *src is left pointing to the next
          multibyte sequence to be converted"
If you then read the code below, from win32\support.cpp, you'll see there's no code that satisfies that clause. There's nothing that sets *src "pointing to the next multibyte sequence to be converted". What you can see is just my comments and *+proposed* code to do that, by adding an "else" path that *should* do that if non-conformance didn't get in the way.
// 1. FIXME: use wcrtomb and avoid copy
size_t mbsnrtowcs( wchar_t *__restrict dst, const char **__restrict src,
                   size_t nmc, size_t len, mbstate_t *__restrict ps )
{
    char* local_src = new char[nmc+1]; // 1. new + strncpy, a potential performance problem according to the FIXME.
    char* nmcsrc = local_src;
    strncpy( nmcsrc, *src, nmc );
    nmcsrc[nmc] = '\0';
    const size_t result = mbsrtowcs( dst, const_cast<const char **>(&nmcsrc), len, ps );
    // propagate error
    if( nmcsrc == NULL ) // 2. *src is only set on this condtion, there's no else.
        *src = NULL;
+   else // Proposed
+       *src = *src + local_src - nwcsrc; // 3. Leave *src pointing to the next character to convert *in the callers* buffer?
    delete[] local_src;
    return result;
}
So it doesn't work on Win32 as previously explained. (it doesn't make things worse though).
It *would* work *if* mbsrtowcs (the function called by the above one) were implemented as described here:
http://man7.org/linux/man-pages/man3/mbsrtowcs.3.html
because it says (in section 2 again) that a pointer to the next character to convert is returned.
But on Mingw64/Win32, which is more MeSsy than Macy/Linuxy, it doesn't work because it actually conforms to this documentation here:
http://msdn.microsoft.com/en-us/library/esth3h8c(v=vs.80).aspx
and that doesn't say mbsrtowcs returns a valid pointer. I hoped that might have been a documentational oversight, but I've tested it and it isn't. The pointer is always NULL on Windows.
So despite the Windows function having the same name as on other platform, it doesn't conform or work the same way. Go figure.
This means there is no way (that I can think of) to salvage the code above with a minimal patch along the lines I have shown while the solution is based on the current functions mentioned; because the pointer needed just isn't returned on Win32.
The problem with wcsnrtombs
---------------------------
Below is *wcsnrtombs*, from win32\support.cpp. It's affected similarly to mbsnrtowcs, but more broken.
If you read the comments and the +added and -removed lines, and see how those lines relate to the documentation, some is here:
http://man7.org/linux/man-pages/man3/wcsnrtombs.3.html
you'll see the code is missing the same else path to propogate the "next character to convert" pointer, like mbsnrtowcs.
You'll also the -current code has an incorrect delete[].
// 1. FIXME: use wcrtomb and avoid copy
size_t wcsnrtombs( char *__restrict dst, const wchar_t **__restrict src,
                   size_t nwc, size_t len, mbstate_t *__restrict ps )
{
    wchar_t* local_src = new wchar_t[nwc]; // 1. new, potential performance problem and 2. Should be nwc + 1 to avoid overwrite.
    wchar_t* nwcsrc = local_src;
    wcsncpy(nwcsrc, *src, nwc);
    nwcsrc[nwc] = '\0'; // 2. Causes the overwrite without the +1 above.
    const size_t result = wcsrtombs( dst, const_cast<const wchar_t **>(&nwcsrc), len, ps );
    // propogate error
    if( nwcsrc == NULL ) // 3. What about else?
        *src = NULL;
+   else // Proposed
+       *src = *src + local_src - nwcsrc; // 3. Leave *src pointing to next character to convert, in the callers buffer.
-    delete[] nwcsrc; // 4. DON'T delete a random point in our allocated buffer. It doesn't make sense and will crash.
+    delete[] local_src; //4. Proposed, DO delete the our allocated buffer, but at the start.
    return result;
}
Even if the function the above one calls (wcsrtombs) returned a valid "next character to convert" pointer, i.e. it conformed to the documentation here:
http://man7.org/linux/man-pages/man3/wcsrtombs.3.html
the -current code above would still not work; it would try to delete a pointer into it's own allocated buffer instead of the buffer itself.
Of course, on Mingw/Win32, it's Microsoft's implementation that's in effect:
http://msdn.microsoft.com/en-us/library/esth3h8c(v=vs.80).aspx
so that doesn't happen.
What happens is the function is consistent with kin from earlier, and also doesn't return a useful pointer; it's returns NULL. That causes libcxx to delete the NULL pointer instead of it's previously allocated buffer and that causes a memory leak. But it's the returning of NULL as the next character to convert pointer, that causes the problems again; because libcxx expects a meaningful value not NULL. So wcsnrtombs then crashes in the same way that mbsnrtowcs does.
The solution and why
--------------------
All of the above was to explain what is wrong, the extent of it, and why the most "obvious" solution doesn't and won't work; and why a new solution is needed; and why it can't be based on the existing code (AFAIK).
The initially proposed code (in the examples above) does improve the baseline, i.e. it removes some bugs (the overwrite and memory leak), and makes things conceptually correct, and improves conformance. For that reason I've attached it as a "minimal patch". But without the next character to convert pointer fix, it leaves the situation otherwise practically the same unless Microsoft fixed their implementations.
However, even that might not be the end of it; the functions on apple/linux implement further features, like a measuring facility when the destination pointer is NULL. Microsoft's derivative functions don't mention this. libcxx may or may not depend on this feature, but I can't assume functions outside of libcxx won't depend on it. So again, to gain those features, a complete rewrite is probably required.
I've tried the minimal fix explained above and it works as well as it's defined to. i.e. things are better but not practicaly one bit.
The "maximal" patch fixes all of the bugs/issues mentioned, including the measuring feature, and it implements it using the method stated in the original authors FIXME comments that you can see above.
Disclaimer
----------
The patch (and supporting test cases) are wholly written by me, no one else, and based on documentation (not code) provided by the non Microsoft links already given above. There is no copyright attached based on these statements and my only claim is the right to contribute this code to others should I/they wish it. If there are any immediate issues with it, I am willing to make reasonable efforts to fix them.
Ideal application (not assumuing this is possible nor necessary)
-----------------
The two test cases attached are designed to expose the faults in the existing libcxx mingw32 versions of these functions.
wctest.cpp relates to wcsnrtombs.
mbctest.cpp relates to mbsnrtowcs.
The test cases should ideally be included in the libcxx test suite, because they test functionality that the suite isn't already testing *in this way* and they do so without using iostreams. This is of value because most/all of the other wide character related tests use iostreams and as that is broken too for other reasons (a seek/ftell problem) these patches won't appear to fix the the existing libcxx test cases until those problems are fixed; but these new test cases should appear to start working immediately after the patches are committed.
The test cases are simple and designed to check the related functions return value, returned pointer and returned character data.
They are known to work on mingw64 and expected to work on Linux / Mac. They will not work with Microsoft's SDK because the related function are't present, but they should otherwise compile with Microsoft's compiler. Ideally they should be reviewed to confirm this and their accuracy.
The test cases should be compiled and run with asserts enabled. When they work, i.e. no message is issued. When they don't, they should print a message and assert. If the test cases are submitted before the patches are applied, they should work on Mac/Linux and fail on Windows. After the patches are applied, the test cases should work on Windows/Mingw64 too.
Where that leaves things
------------------------
Only once the attached patches AND all the prior ones I've sent have been applied, will the two attached test cases work. That will likely be the only noticable difference from the perspective of the libcxx's test suite.
Only when the next patch is applied that I'll send later (relating to ftell/fseek), will the 20-30 more test cases (that I mentioned earlier) pass on Win32/Mingw64.
These functions are so fiddly one shouldn't expect this to be the last word on the subject (doh!).
Sorry for being over explanatory, I'm just trying to help you get as strong of a mental picture of what's happening on a platform you can't test, as possible.
Don't let the words make it sound like there's a lot changing here or a lot of risk. None of this relates the Apple platform other than the test cases; and for the affected platform things are already so broken where this patch targets, it can't make that platform any worse. :)

This is the difference of running testit on c:\libcxx\test\input.output.
I didn't run it on the full suite since I changed my fix, and it takes so long.
But once you apply all this I will send the full report later that will show before all these patches were applied and after, on the whole set of test cases.
 
Before:
----------------------------------------------------
sections without tests   : 0
sections with failures   : 26
sections without failures: 104
                       +   ----
total number of sections : 130
----------------------------------------------------
number of tests failed   : 61
number of tests passed   : 296
                       +   ----
total number of tests    : 357
****************************************************

After:
----------------------------------------------------
sections without tests   : 0
sections with failures   : 12
sections without failures: 118
                       +   ----
total number of sections : 130
----------------------------------------------------
number of tests failed   : 24
number of tests passed   : 333
                       +   ----
total number of tests    : 357
****************************************************

-------------- next part --------------
A non-text attachment was scrubbed...
Name: fstream_windows.patch
Type: application/octet-stream
Size: 7992 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130330/78860500/attachment.obj>


More information about the cfe-commits mailing list