[libcxx-commits] [PATCH] D98207: [SystemZ][z/OS] Missing wchar functions libc++

Daniel McIntosh via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Wed Jun 9 10:49:50 PDT 2021


DanielMcIntosh-IBM requested changes to this revision.
DanielMcIntosh-IBM added a comment.
This revision now requires changes to proceed.

If this is taken from windows, I will point out that, on windows wide characters are only 16 bits (on most systems, wchar_t is 32 bits). This may result in some assumptions in the implementation which aren't true for z/OS. I'm particularly concerned about line 39 potentially writing more than `dest_remaining` characters (TBH, I'm a little surprised this part works correctly on windows). This could result in writing past the end of dst (i.e. a buffer overflow). This would then be made even worse by `dest_remaining -= char_size` not setting dest to 0 at that point, and thus the `! dest_remaining` check not triggering. Assuming it has the correct license, using the z/OS implementation of `wcsrtombs` as a reference might be better (unlike `wcsnrtombs` which is a POSIX extention, `wcsrtombs` is a part of the C and C++ standards).

If you don't end up re-writing this based on the z/OS implementation of `wcsrtombs`, I've made several comments regarding how to clean up the control flow. Most are small, but together I think they will result in a significant improvement to readability.



================
Comment at: libcxx/src/support/ibm/wcsnrtombs.cpp:35-36
+  while ( source_converted != max_source_chars ) {
+      if (! dest_remaining)
+          break;
+      wchar_t c = (*src)[source_converted];
----------------
Any reason this can't just be a part of the loop condition? i.e. `while (source_converted != max_source_chars && dest_remaining != 0)`


================
Comment at: libcxx/src/support/ibm/wcsnrtombs.cpp:45-55
+      if (char_size > 0) {
+          if (c == L'\0') {
+              terminator_found = true;
+              break;
+          }
+          ++source_converted;
+          if (dst)
----------------
As per LLVM coding standards we should probably invert this condition so it is an early-exit. i.e. `if (char_size <= 0) { have_result = true; break; }`


================
Comment at: libcxx/src/support/ibm/wcsnrtombs.cpp:48
+              terminator_found = true;
+              break;
+          }
----------------
This one takes a little bit more work, but I think it's cleaner to return here instead of break. (Also, return is an even earlier exit than break, so I think that would be preferred by the LLVM coding standards).

We know control will jump to line 59, and the `if (terminator_found)` on line 60 is only true if we got to line 59 from this `break`. Thus, we can replace line 47 and 48 with
```
if (dst)
    *src = NULL;
return dest_converted;
```
Then, get rid of the `terminator_found` state variable, and simplify lines 59-64.
I think this will also probably enable further simplification of the control flow in a few places, but at this point I've recommended so many changes it's hard to keep track.


================
Comment at: libcxx/src/support/ibm/wcsnrtombs.cpp:66-68
+  if (have_result && char_size < 0) {
+      return static_cast<size_t>(-1);
+  }
----------------
If I'm not mistaken, when `char_size < 0`, `have_result` is guaranteed to be true (if the loop doesn't run then `char_size == 0`, and if `char_size <= 0` in the loop, then we set `have_result = true` and immediately exit). Thus, we can get rid of the `have_result` state variable.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98207/new/

https://reviews.llvm.org/D98207



More information about the libcxx-commits mailing list