[cfe-commits] r68975 - in /cfe/trunk: lib/CodeGen/CodeGenModule.cpp test/CodeGen/illegal-UTF8.m

Mon Apr 13 13:02:50 PDT 2009

On Mon, Apr 13, 2009 at 12:46 PM, Chris Lattner <clattner at apple.com> wrote:
>
> On Apr 13, 2009, at 12:43 PM, Eli Friedman wrote:
> > On Mon, Apr 13, 2009 at 12:08 PM, Steve Naroff <snaroff at apple.com>
> > wrote:
> >> Author: snaroff
> >> Date: Mon Apr 13 14:08:08 2009
> >> New Revision: 68975
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=68975&view=rev
> >> Log:
> >> Fixed crasher in <rdar://problem/6780904> [irgen] Assertion failed:
> >> (Result == conversionOK && "UTF-8 to UTF-16 conversion failed"),
> >> function GetAddrOfConstantCFString, file CodeGenModule.cpp, line
> >> 1063.
> >
> > We should not be letting invalid strings through Sema.  Either the
> > Lexer or Sema needs to deal with this; it needs to either error out or
> > warn and "fix" the string to use a 0xFFFD.
> >
> > I would suggest reverting this fix because it does nothing but hide
> > the issue.
>
> Perhaps I don't understand the issue fully, but why is "\xff\xff"
> necessarily a unicode string?

I definitely don't understand the issue fully, but I agree with Eli's
sentiment. If the string is supposed to be treated as Unicode but has
errors, then it should be fixed in the AST and a warning (or error) emitted.
If it isn't supposed to be treated as Unicode, then IRgen shouldn't try to
convert it.

In any case, IRgen should not have to deal with a conversion failure.
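
To make the "fix it before IRgen" idea concrete, here is a rough sketch of
the kind of check the Lexer or Sema could run on a string that is meant to
be UTF-8. This is not Clang code: `validSequenceLength`, `SanitizedString`
and `sanitizeUTF8` are hypothetical names invented for illustration, and
replacing each offending byte with one U+FFFD is just one simple policy.
The caller would use `hadErrors` to decide whether to emit a diagnostic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Clang API): returns the length (1-4) of the
// well-formed UTF-8 sequence starting at s[i], or 0 if it is ill-formed.
static std::size_t validSequenceLength(const std::string &s, std::size_t i) {
  auto byte = [&](std::size_t k) { return static_cast<unsigned char>(s[k]); };
  auto isCont = [&](std::size_t k) {
    return k < s.size() && (byte(k) & 0xC0) == 0x80;
  };

  unsigned char b0 = byte(i);
  if (b0 < 0x80)
    return 1; // ASCII
  if (b0 >= 0xC2 && b0 <= 0xDF)
    return isCont(i + 1) ? 2 : 0;
  if (b0 >= 0xE0 && b0 <= 0xEF) {
    if (!isCont(i + 1) || !isCont(i + 2))
      return 0;
    unsigned char b1 = byte(i + 1);
    if (b0 == 0xE0 && b1 < 0xA0)
      return 0; // overlong encoding
    if (b0 == 0xED && b1 > 0x9F)
      return 0; // UTF-16 surrogate range
    return 3;
  }
  if (b0 >= 0xF0 && b0 <= 0xF4) {
    if (!isCont(i + 1) || !isCont(i + 2) || !isCont(i + 3))
      return 0;
    unsigned char b1 = byte(i + 1);
    if (b0 == 0xF0 && b1 < 0x90)
      return 0; // overlong encoding
    if (b0 == 0xF4 && b1 > 0x8F)
      return 0; // above U+10FFFF
    return 4;
  }
  return 0; // stray continuation byte, 0xC0/0xC1, or 0xF5-0xFF
}

// Hypothetical result type: the repaired text plus a flag the caller can
// use to decide whether to warn or error.
struct SanitizedString {
  std::string text;
  bool hadErrors;
};

// Copies well-formed sequences through unchanged and replaces every
// offending byte with U+FFFD (encoded in UTF-8 as EF BF BD).
SanitizedString sanitizeUTF8(const std::string &in) {
  SanitizedString out{std::string(), false};
  for (std::size_t i = 0; i < in.size();) {
    std::size_t n = validSequenceLength(in, i);
    if (n == 0) {
      out.text += "\xEF\xBF\xBD";
      out.hadErrors = true;
      ++i;
    } else {
      assert(n <= 4 && i + n <= in.size());
      out.text.append(in, i, n);
      i += n;
    }
  }
  return out;
}
```

With something like this run up front, a later UTF-8 to UTF-16 conversion
of `text` has no ill-formed input left to trip over, which is the point:
the failure is reported once, early, instead of surfacing in IRgen.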

 - Daniel

> -Chris