[PATCH] [2/6] Convert non-printing characters to their octal sequence before emitting #line directive or __FILE__ macro

Yunzhong Gao Yunzhong_Gao at playstation.sony.com
Wed Sep 11 14:22:52 PDT 2013


  > -----Original Message-----
  > From: Arthur O'Dwyer [mailto:arthur.j.odwyer at gmail.com]
  > Sent: Wednesday, September 11, 2013 1:35 PM
  > To: Gao, Yunzhong; cfe-commits
  > Cc: reviews+D1291+public+fd336e303f55df78 at llvm-reviews.chandlerc.com
  > Subject: Re: [PATCH] [2/6] Convert non-printing characters to their octal
  > sequence before emitting #line directive or __FILE__ macro
  >
  > If #include directives will use UTF-8, then __FILE__ must also use UTF-8, so
  > that this will work:
  >
  >     #include __FILE__
  >
  > And I would expect #line directives also to use UTF-8. The only good rationale
  > I can imagine is that you're dealing with badly behaved third-party generators
  > such as lex/yacc which dump malformed #line directives into the source file.
  >
  > The patch looks good to me, but the stated rationale is misleading; I don't
  > think this patch helps with anything on a well-behaved system (even one
  > where the filesystem charset is Shift-JIS). It merely helps Clang not-barf on
  > malformed input (such as that produced by a badly behaved lex/yacc).
  >
  > my $.02,
  > -Arthur

  For some reason, your replies just won't appear in Phabricator while Eli's went
  in just fine. Weird.

  I think a UTF-8 encoded source file should not contain Shift-JIS encoded lines like this:
  #include "こんにちは.c"

  But it is okay to have lines like this:
  #include "\202\261\202\361\202\311\202\277\202\315.c"

  You might be right that the current patch does not help the compiler find the included file,
  because the compiler will attempt a UTF-8 to Unicode translation on the Shift-JIS file name.
  The patch only makes sure that no strange characters end up in the preprocessed file.
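
  To make the conversion concrete, here is a minimal sketch of the kind of escaping
  described above. It is not the code from the patch: the name escapeNonPrinting is
  hypothetical, and the sketch assumes that "non-printing" means any byte outside
  printable ASCII (0x20 through 0x7E). It does not handle backslashes or quotes.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper, not the function from the patch: replaces every byte
// outside printable ASCII (0x20..0x7E) with a three-digit octal escape.
// Backslashes and double quotes are passed through unchanged in this sketch.
std::string escapeNonPrinting(const std::string &name) {
  std::string out;
  for (unsigned char c : name) {
    if (c >= 0x20 && c <= 0x7E) {
      out += static_cast<char>(c);
    } else {
      char buf[5];  // backslash + three octal digits + terminating NUL
      std::snprintf(buf, sizeof(buf), "\\%03o", static_cast<unsigned>(c));
      out += buf;
    }
  }
  return out;
}
```

  Given the Shift-JIS bytes 0x82 0xB1 followed by ".c", this helper would produce the
  text \202\261.c, which is the form used in the #include line above.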

  A file name written the same way but with UTF-8 encoded bytes, such as the following, might
  help the compiler find the file:
  #include "\343\203\231\343\203\274\343\202\267\343\203\203\343\202\257.c"
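
  One way to see the difference between the two escaped names is to check whether their
  bytes form well-formed UTF-8. The sketch below is only an illustration with a
  hypothetical name (looksLikeUtf8); it is a simplified structural check, not what Clang
  does internally, and it does not reject every overlong or surrogate encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is structurally well-formed UTF-8.
// Simplified: validates lead and continuation byte patterns only.
bool looksLikeUtf8(const std::string &s) {
  std::size_t i = 0;
  while (i < s.size()) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t extra;  // number of continuation bytes expected after the lead byte
    if (c < 0x80) extra = 0;
    else if (c >= 0xC2 && c <= 0xDF) extra = 1;
    else if (c >= 0xE0 && c <= 0xEF) extra = 2;
    else if (c >= 0xF0 && c <= 0xF4) extra = 3;
    else return false;  // stray continuation byte or invalid lead byte

    if (s.size() - i <= extra) return false;  // sequence is truncated
    for (std::size_t k = 1; k <= extra; ++k) {
      unsigned char cc = static_cast<unsigned char>(s[i + k]);
      if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
    }
    i += extra + 1;
  }
  return true;
}
```

  With this check, the Shift-JIS name fails at its first byte (0x82 cannot start a UTF-8
  sequence), while the UTF-8 name passes.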

http://llvm-reviews.chandlerc.com/D1291


