[llvm-dev] RFC: Adding support for the z/OS platform to LLVM and clang
Kai Peter Nacke via llvm-dev
llvm-dev at lists.llvm.org
Tue Jun 16 05:50:51 PDT 2020
> > 2) Add patches to Clang to allow EBCDIC and ASCII (ISO-8859-1) encoded
> > input source files. This would be done at the file open time to allow the
> > rest of Clang to operate as if the source was UTF-8 and so require no
> > changes downstream. Feedback on this plan is welcome from the Clang
> > community.
>
> Would it be correct to assume that this EBCDIC -> UTF-8 mapping
> would be as prescribed by
> UTF-EBCDIC / IBM CDRA, notably for the control characters that do
> not map exactly?
> Notably, if the execution encoding is EBCDIC, is '0x06' equivalent
> to '0086', etc?
>
> The question "Is Unicode sufficient to represent all characters
> present in the input source without using the Private Use Area?" is one
> that is relevant to both Clang and the C/C++ standard. (I do hope that
> it is the case!)
The current goal is to make only minimal changes to the frontend to enable
reading of EBCDIC-encoded files. For this, we use the auto-conversion
service of z/OS UNIX System Services
(https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.bpxb200/xpascii.htm),
together with file tagging and setting the CCSID for the program and
for opened files. The auto-conversion service supports round-trip
conversion between EBCDIC and Enhanced ASCII, which makes bootstrapping
with EBCDIC source files possible.
Of course, more complete UTF-8 support is a valid implementation
alternative.
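To illustrate what "round-trip conversion" means here, and how a control
character such as 0x06 can change its numeric value in the process, here
is a minimal sketch in Python. It does not use the z/OS auto-conversion
service at all. It assumes Python's built-in "cp037" codec as a stand-in
for the EBCDIC code page (the code page actually used on a given z/OS
system may differ), and the helper names ebcdic_to_utf8 and
utf8_to_ebcdic are hypothetical.

```python
# Sketch only: Python's "cp037" codec stands in for the EBCDIC code page.
# This is an assumption for illustration, not the z/OS conversion service.
EBCDIC = "cp037"


def ebcdic_to_utf8(raw: bytes) -> bytes:
    """Convert EBCDIC bytes to UTF-8, as a reader might at file-open time."""
    return raw.decode(EBCDIC).encode("utf-8")


def utf8_to_ebcdic(raw: bytes) -> bytes:
    """Inverse conversion, used here only to check the round trip."""
    return raw.decode("utf-8").encode(EBCDIC)


# A small source file, encoded as EBCDIC, survives the round trip.
source = "int main() { return 0; }\n".encode(EBCDIC)
assert utf8_to_ebcdic(ebcdic_to_utf8(source)) == source

# Every possible byte value survives the round trip under this code page.
all_bytes = bytes(range(256))
assert utf8_to_ebcdic(ebcdic_to_utf8(all_bytes)) == all_bytes

# Control characters do not all keep their numeric value: in this codec's
# table, EBCDIC byte 0x06 decodes to the C1 control U+0086.
assert b"\x06".decode(EBCDIC) == "\u0086"
```

Whether the conversion used by the auto-conversion service maps control
characters the same way as this codec is exactly the kind of detail the
question above is asking about; the sketch only shows the shape of the
issue.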
Best regards,
Kai Nacke
IT Architect
IBM Deutschland GmbH