[PATCH] [2/6] Convert non-printing characters to their octal sequence before emitting #line directive or __FILE__ macro
Yunzhong_Gao at playstation.sony.com
Wed Sep 11 14:22:52 PDT 2013
> -----Original Message-----
> From: Arthur O'Dwyer [mailto:arthur.j.odwyer at gmail.com]
> Sent: Wednesday, September 11, 2013 1:35 PM
> To: Gao, Yunzhong; cfe-commits
> Cc: reviews+D1291+public+fd336e303f55df78 at llvm-reviews.chandlerc.com
> Subject: Re: [PATCH] [2/6] Convert non-printing characters to their octal
> sequence before emitting #line directive or __FILE__ macro
> If #include directives will use UTF-8, then __FILE__ must also use UTF-8, so
> that this will work:
> #include __FILE__
> And I would expect #line directives also to use UTF-8. The only good rationale
> I can imagine is that you're dealing with badly behaved third-party generators
> such as lex/yacc which dump malformed #line directives into the source file.
> The patch looks good to me, but the stated rationale is misleading; I don't
> think this patch helps with anything on a well-behaved system (even one
> where the filesystem charset is Shift-JIS). It merely helps Clang not-barf on
> malformed input (such as that produced by a badly behaved lex/yacc).
> my $.02,
For some reason, your replies just won't appear in Phabricator while Eli's went
in just fine. Weird.
I think a UTF-8 encoded source file should not contain Shift-JIS encoded lines like this:
But it is okay to have lines like this:
You might be right that the current patch does not help the compiler find the included file,
because the compiler will attempt a UTF-8 to Unicode translation on the Shift-JIS file name.
The patch only makes sure that you do not have strange characters in the preprocessed file.
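To illustrate the kind of escaping I mean, here is a minimal sketch. The helper name escapeForLineDirective is hypothetical and this is not the code from the patch. It also assumes that '"' and '\' get backslash-escaped so the result is still a valid string literal, and that every byte outside printable ASCII becomes a three-digit octal escape:

```cpp
#include <string>

// Hypothetical helper, NOT the function from the patch: returns Name with
// every byte outside printable ASCII written as a three-digit octal escape.
// Escaping '"' and '\\' as well is an assumption made here so that the
// result can sit inside a string literal.
std::string escapeForLineDirective(const std::string &Name) {
  std::string Out;
  for (unsigned char C : Name) {
    if (C == '"' || C == '\\') {
      // Keep the character, but protect it with a backslash.
      Out += '\\';
      Out += static_cast<char>(C);
    } else if (C >= 0x20 && C < 0x7F) {
      // Printable ASCII is copied through unchanged.
      Out += static_cast<char>(C);
    } else {
      // Everything else (control characters, bytes >= 0x80) becomes \ooo.
      Out += '\\';
      Out += static_cast<char>('0' + ((C >> 6) & 7));
      Out += static_cast<char>('0' + ((C >> 3) & 7));
      Out += static_cast<char>('0' + (C & 7));
    }
  }
  return Out;
}
```

With something like this, a file name byte 0x83 would be emitted as \203 in the #line directive or __FILE__ string, so the preprocessed output stays plain ASCII regardless of the encoding of the original name.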
The equivalent UTF-8 encoded file name, like the following, might help the compiler find the file: