[libcxx-commits] [PATCH] D118930: [SystemZ]:[z/OS]:[libcxx]: fix the mask in stage2_float_loop function
Nikolas Klauser via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Fri Feb 18 12:25:41 PST 2022
philnik added a comment.
There should also be a test changed somewhere in `libcxx/test/std/localization`, or is this only part of the fix?
================
Comment at: libcxx/include/locale:549
{
- if (__a_end == __a || (__a_end[-1] & 0x5F) == (__exp & 0x7F))
+ if (__a_end == __a || (_VSTD::toupper(__a_end[-1]) == _VSTD::toupper(__exp)))
{
----------------
SeanP wrote:
> ldionne wrote:
> > This is going to be really naive, but can you explain why `exp & 0x7F` this is even equivalent to `toupper(exp)`?
> >
> > I understand the equivalence for `c & 0x5f` based on the table below, but I don't see the link for `0x7F`:
> >
> > ```
> > 0x41 01000001 A
> > 0x61 01100001 a
> > 0x42 01000010 B
> > 0x62 01100010 b
> > 0x43 01000011 C
> > 0x63 01100011 c
> > 0x44 01000100 D
> > 0x64 01100100 d
> > 0x45 01000101 E
> > 0x65 01100101 e
> > 0x46 01000110 F
> > 0x66 01100110 f
> > 0x47 01000111 G
> > 0x67 01100111 g
> > 0x48 01001000 H
> > 0x68 01101000 h
> > 0x49 01001001 I
> > 0x69 01101001 i
> > [...]
> > 0x58 01011000 X
> > 0x78 01111000 x
> > 0x59 01011001 Y
> > 0x79 01111001 y
> > 0x5A 01011010 Z
> > 0x7A 01111010 z
> >
> > 0x5F 01011111
> > 0x7F 01111111
> > 0x80 10000000
> > ```
> >
> The 0x7f is tied to the "|0x80" below. It is an ugly hack that takes advantage of the fact all ascii characters are in the range of 0-127. The hack enables or disables the check on line 558 (`(__x&0x5f)) == __exp`) from matching the letter 'E' as a valid character in the mantissa or the start of the exponent. When the high order bit is set in __exp, the test on line 558 will match 'a' and 'A', but never 'e' or 'E', but the test here will match 'e' and 'E'.
>
> This hack doesn't work for ebcdic since the characters are in the range of 0-255. To make the code work with both we changed the code to make `__exp` 'e' instead of 'E' plus the high order bit (line 560) and then on this line we upper case both `__exp` and the character to get the case insensitive compare and just upper case the character on line 558 to get the context sensitive compare.
>
> I recall the logic is:
> - line 549 - match any 'e'/'E' in the string
> - line 558 - only match the first 'e'/'E'
>
> Making `__exp` lower case instead of setting the high order bit achieves the same results.
Same below.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D118930/new/
https://reviews.llvm.org/D118930
More information about the libcxx-commits
mailing list