[cfe-dev] gcc switch -fexec-charset=IBM-1047 to generate EBCDIC character constants

Fri Apr 11 02:21:19 PDT 2014

This post concerns integers written as character constants, e.g.,
'a'.  String literals, e.g., "a", are not an issue in the assembler
code produced.

Consider a trivial program:

   int main(void) {char a='a';return a;}

Compiling with

   clang -target s390x-linux-gnu -S -D__x86_64__ -O2 test.c

Gets me this assembler:

main:
        lghi    %r2, 97
        br      %r14

This is all correct (97 is 0x61, ASCII 'a'), as z/Linux is an ASCII
operating system.

However, there are other operating systems for IBM's z/Architecture that
use the EBCDIC encoding, and there one wants 0x81 for 'a'.
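
To make the difference concrete, here is a minimal sketch of the
letter mapping.  The function name is hypothetical and it covers only
the basic Latin letters, whose EBCDIC positions I believe are the same
across the common code pages, IBM-1047 included.  It uses hex values
rather than character constants on purpose, so that its meaning does
not depend on the execution character set of whatever compiles it.

```c
/* Hypothetical helper, for illustration only: map an ASCII letter to
 * its EBCDIC code point.  Returns -1 for anything that is not a basic
 * Latin letter.  EBCDIC letters sit in three non-contiguous runs. */
int ascii_letter_to_ebcdic(unsigned char c)
{
    unsigned base;  /* EBCDIC code of 'a' (0x81) or 'A' (0xC1) */
    unsigned idx;   /* 0 for a/A ... 25 for z/Z */

    if (c >= 0x61 && c <= 0x7A) {        /* ASCII a..z */
        base = 0x81;
        idx = c - 0x61u;
    } else if (c >= 0x41 && c <= 0x5A) { /* ASCII A..Z */
        base = 0xC1;
        idx = c - 0x41u;
    } else {
        return -1;
    }

    if (idx < 9)
        return (int)(base + idx);               /* a-i: 0x81..0x89 */
    if (idx < 18)
        return (int)(base + 0x10 + (idx - 9));  /* j-r: 0x91..0x99 */
    return (int)(base + 0x21 + (idx - 18));     /* s-z: 0xA2..0xA9 */
}
```

With this, ascii_letter_to_ebcdic(0x61) yields 0x81, the value wanted
in the lghi above in place of 97.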

If the constant 'a' could pass through the compilation system as a
character constant (even if the assembler does not support such
constants), I would have less than a SMOP (small matter of
programming).  As it is now, the generated code is indistinguishable
from that for a = 0x61, which should not be converted to EBCDIC.
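
The two cases I need to tell apart look like this.  The function
names are made up for the example, and it assumes an ASCII execution
character set, which is what clang gives me today.  Under that
assumption both functions return 97, and I would expect identical
assembler for both, so nothing downstream can tell which one was
written as a character.

```c
/* Written as a character constant: should become 0x81 on an EBCDIC
 * target. */
int from_char_constant(void)
{
    char a = 'a';
    return a;
}

/* Written as a plain integer: must stay 0x61 on every target. */
int from_hex_constant(void)
{
    char a = 0x61;
    return a;
}
```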

Any pointer to where I should start hacking would be greatly appreciated.

An alternative would be a new target triple, e.g., s390-zvm-cms, but
one would still wish to specify the target code page, as Germans are
likely to want a different one from the French.  And presumably that
also means a new back end (?).

Finally, the gcc implementation is not optimal because the conversion is
also applied to strings, in particular the format strings passed to
printf().  That severely messes up the format checking, as the EBCDIC
string is then scanned for an ASCII %, which is not helpful.
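
A sketch of the symptom, not of gcc's actual checking code: the byte
arrays and the function name below are made up for illustration.  The
format "%d" is 0x25 0x64 in ASCII and, as far as I know, 0x6C 0x84 in
EBCDIC, so a scan for the ASCII percent sign finds no conversion in
the converted string.

```c
#include <stddef.h>

/* The format string "%d" in both encodings, spelled out as bytes. */
const unsigned char fmt_ascii[]  = { 0x25, 0x64, 0x00 };
const unsigned char fmt_ebcdic[] = { 0x6C, 0x84, 0x00 };

/* Count occurrences of the byte `marker` in a NUL-terminated string. */
size_t count_byte(const unsigned char *s, unsigned char marker)
{
    size_t n = 0;

    for (; *s != 0x00; s++) {
        if (*s == marker)
            n++;
    }
    return n;
}
```

Here count_byte(fmt_ascii, 0x25) is 1, but count_byte(fmt_ebcdic, 0x25)
is 0; the percent sign only turns up in the EBCDIC string when one
looks for 0x6C.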