[llvm-dev] RFC: Adding support for the z/OS platform to LLVM and clang

Kai Peter Nacke via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 16 05:50:51 PDT 2020


> > 2) Add patches to Clang to allow EBCDIC and ASCII (ISO-8859-1) encoded
> > input source files. This would be done at the file open time to allow the
> > rest of Clang to operate as if the source was UTF-8 and so require no
> > changes downstream. Feedback on this plan is welcome from the Clang
> > community.
> Would it be correct to assume that this EBCDIC -> UTF-8 mapping 
> would be as prescribed by
> UTF-EBCDIC / IBM CDRA, notably for the control characters that do 
> not map exactly?
> Notably, if the execution encoding is EBCDIC, is '0x06' equivalent 
> to '0086', etc?
> 
> The question "Is Unicode sufficient to represent all characters
> present in the input source without using the Private Use Area?" is one that
> is relevant to both Clang and the C/C++ standard. (I do hope that
> it is the case!)

The current goal is to make only minimal changes to the frontend to enable
reading of EBCDIC-encoded files. For this, we use the auto-conversion
service of z/OS UNIX System Services
(https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.bpxb200/xpascii.htm),
together with file tagging and setting the CCSID for the program and
for opened files. The auto-conversion service supports round-trip
conversion between EBCDIC and Enhanced ASCII. With it, bootstrapping with
EBCDIC source files is possible.
Of course, more complete UTF-8 support is a valid implementation
alternative.
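
To illustrate what "round-trip conversion" means in practice, here is a
minimal sketch in Python. It does not call the z/OS auto-conversion service
and is not the Clang implementation. It assumes Python's built-in cp037 codec
as a stand-in for the EBCDIC code page (Python's standard library does not
ship IBM-1047) and latin-1 (ISO-8859-1) as the ASCII side. The function names
are hypothetical and exist only for this example.

```python
# Illustration only: Python's stdlib codecs stand in for the z/OS service.
# ASSUMPTION: cp037 is used as the EBCDIC code page because IBM-1047 is not
# available in Python's standard library; the two differ in a few characters.
EBCDIC = "cp037"
ASCII_SIDE = "latin-1"  # ISO-8859-1


def ebcdic_to_latin1(data: bytes) -> bytes:
    """Convert EBCDIC bytes to ISO-8859-1 bytes via Unicode code points."""
    return data.decode(EBCDIC).encode(ASCII_SIDE)


def latin1_to_ebcdic(data: bytes) -> bytes:
    """Convert ISO-8859-1 bytes back to EBCDIC bytes."""
    return data.decode(ASCII_SIDE).encode(EBCDIC)


def is_round_trip_safe() -> bool:
    """Check that all 256 byte values survive EBCDIC -> latin-1 -> EBCDIC."""
    all_bytes = bytes(range(256))
    converted = ebcdic_to_latin1(all_bytes)
    # Every input byte must map to a distinct output byte, and converting
    # back must reproduce the original byte sequence exactly.
    return len(set(converted)) == 256 and latin1_to_ebcdic(converted) == all_bytes


if __name__ == "__main__":
    source = "int main() { return 0; }\n".encode(EBCDIC)
    print(ebcdic_to_latin1(source).decode(ASCII_SIDE), end="")
    print("round-trip safe:", is_round_trip_safe())
    # The EBCDIC control character 0x06 discussed in the quoted question:
    print("0x06 ->", hex(ebcdic_to_latin1(b"\x06")[0]))
```

Because both encodings are single-byte and the mapping is one-to-one, a source
file can be converted on open and converted back on write without loss, which
is the property that makes bootstrapping from EBCDIC sources workable.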

Best regards,
Kai Nacke
IT Architect

IBM Deutschland GmbH
Vorsitzender des Aufsichtsrats: Sebastian Krause
Geschäftsführung: Gregor Pillen (Vorsitzender), Agnes Heftberger, Norbert 
Janzen, Markus Koerner, Christian Noll, Nicole Reimer
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, 
HRB 14562 / WEEE-Reg.-Nr. DE 99369940



