[libcxx-commits] [PATCH] D118930: [SystemZ]:[z/OS]:[libcxx]: fix the mask in stage2_float_loop function

Nikolas Klauser via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Fri Feb 18 12:25:41 PST 2022


philnik added a comment.

There should also be a test changed somewhere in `libcxx/test/std/localization`, or is this only part of the fix?



================
Comment at: libcxx/include/locale:549
     {
-        if (__a_end == __a || (__a_end[-1] & 0x5F) == (__exp & 0x7F))
+        if (__a_end == __a || (_VSTD::toupper(__a_end[-1]) == _VSTD::toupper(__exp)))
         {
----------------
SeanP wrote:
> ldionne wrote:
> > This is going to be really naive, but can you explain why `exp & 0x7F` this is even equivalent to `toupper(exp)`?
> > 
> > I understand the equivalence for `c & 0x5f` based on the table below, but I don't see the link for `0x7F`:
> > 
> > ```
> > 0x41  01000001    A
> > 0x61  01100001    a
> > 0x42  01000010    B
> > 0x62  01100010    b
> > 0x43  01000011    C
> > 0x63  01100011    c
> > 0x44  01000100    D
> > 0x64  01100100    d
> > 0x45  01000101    E
> > 0x65  01100101    e
> > 0x46  01000110    F
> > 0x66  01100110    f
> > 0x47  01000111    G
> > 0x67  01100111    g
> > 0x48  01001000    H
> > 0x68  01101000    h
> > 0x49  01001001    I
> > 0x69  01101001    i
> > [...]
> > 0x58  01011000    X
> > 0x78  01111000    x
> > 0x59  01011001    Y
> > 0x79  01111001    y
> > 0x5A  01011010    Z
> > 0x7A  01111010    z
> > 
> > 0x5F  01011111
> > 0x7F  01111111
> > 0x80  10000000
> > ```
> > 
> The 0x7f is tied to the "|0x80" below.  It is an ugly hack that takes advantage of the fact all ascii characters are in the range of 0-127.  The hack enables or disables the check on line 558 (`(__x&0x5f)) == __exp`) from matching the letter 'E' as a valid character in the mantissa or the start of the exponent.  When the high order bit is set in __exp, the test on line 558 will match 'a' and 'A', but never 'e' or 'E',  but the test here will match 'e' and 'E'.  
> 
> This hack doesn't work for ebcdic since the characters are in the range of 0-255.  To make the code work with both we changed the code to make `__exp` 'e' instead of 'E' plus the high order bit (line 560) and then on this line we upper case both `__exp` and the character to get the case insensitive compare and just upper case the character on line 558 to get the context sensitive compare.
> 
> I recall the logic is:
> - line 549 - match any 'e'/'E' in the string
> - line 558 - only match the first 'e'/'E'
> 
> Making `__exp` lower case instead of setting the high order bit achieves the same results.
Same below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118930/new/

https://reviews.llvm.org/D118930



More information about the libcxx-commits mailing list