[libcxx-commits] [PATCH] D118851: [SystemZ]:[z/OS]:[libcxx]: fix isascii function to work for z/OS

Arthur O'Dwyer via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Tue Mar 1 08:15:25 PST 2022


Quuxplusone added inline comments.


================
Comment at: libcxx/include/__locale:685
+#if defined(__MVS__)
+        return __c ? true : true;
+#else
----------------
SeanP wrote:
> Quuxplusone wrote:
> > NancyWang2222 wrote:
> > > ldionne wrote:
> > > > NancyWang2222 wrote:
> > > > > NancyWang2222 wrote:
> > > > > > Mordante wrote:
> > > > > > > Is this intended to always return true, if you please just return `true`. You can use a `(void) __c;` to silence compiler diagnostics.
> > > > > > clang will emit error if argument __c is not used in function. that is why I checked __c. 
> > > > > I will add void(__c); 
> > > > Why is this intended to always return true?
> > > isascii function is not right test for z/OS,  char type is between [0-255] which is valid character  indexing in __tab function on z/OS for both ASCII and EBCDIC. 
> > IIUC now (based on @SeanP's comment below), `__is_valid_in_encoding(x)` means no more or less than "`0 <= x && x < N`, where `N` is the number of elements in the array that was passed to the constructor on line 681." For most platforms this `N` is 128; for z/OS in particular, this `N` is 256.
> > 
> > Is this `N` always the same as `ctype::table_size` from lines 777–779? What goes wrong if we make this function simply
> > ```
> > static bool __is_in_range(size_t __c) { return __c < table_size; }
> > ```
> > ? See
> > https://en.cppreference.com/w/cpp/locale/ctype_char/classic_table
> > https://en.cppreference.com/w/cpp/locale/ctype_char/table
> > 
> > Personally (although @ldionne is welcome to overrule me) I'd like to move away from the existing code's use of `isascii(x)` as a shorthand for `x < 128`, because this bounds check doesn't really have anything to do with ASCII, and it's silly to have to read a man page just to find out it means `x < 128`. If we mean `x < table_size`, we should just say that.
> Good observation.  However, I noticed that table_size==256.  I don't think that we can assume that `ctype<char>::classic_table` and `__tab_` have the same range.  The code as it is will return false for anything out of range of `__tab_` for `ctype<char>::classic_table`.  If they should have the same size then we would have to update table_size for non-z/OS platforms.  It is currently set to 256 for all platforms.  I worry this will break people.  
> 
> If you think the `__is_in_range()` fucntion should return table_size, I'd rather do that in a separate patch to avoid mixing it in with making things work for z/OS.
My current thought is that there are at least two, maybe three, changes going on in this PR:
(1) Rename `isascii(x)` to `__is_valid_in_encoding(x)`. I agree that `isascii` is the wrong name, but I still hypothesize that the right name is nothing specifically to do with "encoding," but more like `__is_valid_table_index`. Sometimes the table in question is `__tab_`; sometimes it's `classic_table()`; do we want two different functions depending on which table we're talking about? Are all these tables guaranteed to have the same size? IMHO this renaming should be its own PR.
(2) Specifically for z/OS, change the behavior of `__is_valid_in_encoding` (or whatever its name(s) ends up as). This seems uncontroversial to me, but sadly is blocked on (1).
(3?) Other ad-hoc EBCDICifications, like replacing uses of `x - 'a'`? I see some non-functional diffs involving `L'A' - L'a'` and so on, but I'm not sure if there might be some functional diffs hiding among them. Anyway, I would imagine these would be uncontroversial, if they exist.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118851/new/

https://reviews.llvm.org/D118851



More information about the libcxx-commits mailing list