<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 1/30/2018 8:18 AM, Sean Perry wrote:<br>
</div>
<blockquote type="cite"
cite="mid:OFDF0274BD.5452F6F3-ON00258225.00555EAF-85258225.00598A2E@notes.na.collabserv.com">
<p><font size="2">clang and llvm aren't performing any conversion
right now. Everything assumes the input, output and exec
charsets are UTF-8. One user scenario I am trying to enable is
the input charset being EBCDIC for a system where EBCDIC is
the charset. Doing this is non-trivial and exposes the issues
I outlined below and most likely more (eg. debug info).</font><br>
</p>
</blockquote>
<br>
Please don't mix together the issues of compiling for an EBCDIC
target and running LLVM on an EBCDIC host. I understand it's kind
of tied together from your perspective (since the end result you
want is a native compiler which runs on an EBCDIC target), but LLVM
is always built as a cross-compiler, so we need to consider them
separately to get a reasonable result.<br>
<br>
If you're cross-compiling UTF-8-encoded source code on a UTF-8 host
to an EBCDIC target, you need conversions in a few places in clang:
specifically, symbol names need to be translated when IR is
generated, and string/character literals need to be translated by
the lexer. The LLVM backend might also need to convert certain
strings which are emitted into object files.<br>
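<p>To make the literal translation concrete, here is a rough Python
sketch (not clang code). The helper name <code>translate_literal</code>
is hypothetical, and <code>cp037</code> is only an assumed EBCDIC code
page; a real target would pick its own.</p>
<pre>
```python
# Illustrative model of what the lexer would do to a string literal when
# the execution charset is EBCDIC. Nothing here is a clang API.
EXEC_CHARSET = "cp037"  # assumption: one common EBCDIC code page


def translate_literal(text):
    """Return the bytes a string literal would occupy in the target."""
    return text.encode(EXEC_CHARSET)


# The same two characters have different byte values in each encoding.
utf8_bytes = "Hi".encode("utf-8")
ebcdic_bytes = translate_literal("Hi")
print(utf8_bytes.hex(), ebcdic_bytes.hex())
```
</pre>
<p>The source text stays UTF-8 throughout; only the bytes emitted for
the literal change.</p>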
<br>
If you're cross-compiling EBCDIC-encoded source code on a UTF-8 host
to a UTF-8 target, you need a conversion in exactly one place: the
input source code needs to be converted to UTF-8, once.<br>
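<p>A minimal sketch of that single conversion, again in Python rather
than clang code. <code>load_source</code> is a hypothetical name and
<code>cp037</code> is an assumed input code page.</p>
<pre>
```python
# Convert the raw input buffer to UTF-8 once, up front, so that every
# later stage can keep assuming UTF-8.
SOURCE_CHARSET = "cp037"  # assumption: the input file's EBCDIC code page


def load_source(raw):
    """Decode EBCDIC input bytes and re-encode them as UTF-8."""
    return raw.decode(SOURCE_CHARSET).encode("utf-8")


ebcdic_source = "int x;".encode(SOURCE_CHARSET)
print(load_source(ebcdic_source))
```
</pre>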
<br>
If you're cross-compiling EBCDIC-encoded source code on a UTF-8 host
to an EBCDIC target, you need both of the above conversions.<br>
<br>
If you're compiling LLVM/clang for an EBCDIC host, everything
becomes complicated because both LLVM and clang assume they're
running in an ASCII-compatible locale; the issues you're describing
are primarily related to this. You probably want to leave this for
last because a lot of the changes involved will be controversial,
and it'll be easier to convince everyone it's useful if you have a
usable target.<br>
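<p>As one example of the kind of ASCII assumption involved, a range
check on <code>'a'..'z'</code> is exact in ASCII but not in EBCDIC,
where the lowercase letters are not contiguous. This is a Python
illustration with a hypothetical helper name and <code>cp037</code> as
the assumed code page; it is not taken from the LLVM sources.</p>
<pre>
```python
import string


def accepted_by_range_check(charset):
    """List the characters whose code lies between 'a' and 'z'."""
    lo = "a".encode(charset)[0]
    hi = "z".encode(charset)[0]
    return [bytes([b]).decode(charset) for b in range(lo, hi + 1)]


for charset in ("ascii", "cp037"):
    accepted = accepted_by_range_check(charset)
    # Characters the range check lets through that are not a..z.
    extras = [c for c in accepted if c not in string.ascii_lowercase]
    print(charset, len(accepted), len(extras))
```
</pre>
<p>In ASCII the range holds exactly the 26 letters; in cp037 it also
admits characters that sit in the gaps between the letter runs.</p>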
<p>-Eli<br>
</p>
<pre class="moz-signature" cols="72">--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</body>
</html>