[cfe-commits] [PATCH] Support for universal character names in identifiers

Eli Friedman eli.friedman at gmail.com
Wed Dec 19 13:18:15 PST 2012


On Tue, Dec 18, 2012 at 11:01 PM, Chris Lattner <clattner at apple.com> wrote:
>
> On Dec 18, 2012, at 8:40 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>
>>>> Oh, I see... so the idea is to hack up getCharAndSize instead of
>>>> calling isUCNAfterSlash/ConsumeUCNAfterSlash where we expect a UCN,
>>>> use a marker which essentially means "saw a UCN".
>>>>
>>>> Seems like a workable approach; I don't think it actually helps any
>>>> with error recovery (I'm pretty sure we can't diagnose anything
>>>> without knowing what kind of token we're forming), but I think it will
>>>> make the patch simpler.  I'll try to hack up a new version of my
>>>> patch.
>>>
>>> Attached.
>>
>> And, I've discovered a rather large weakness of this approach:
>> actually writing a correct implementation of getCharAndSizeSlow which
>> returns a special value for UCNs is painful at best.  I might have to
>> abandon this route.
>
> How terrible would it be to make getChar* return a uint32_t codepoint?  Would that fix the problem?

That doesn't even help; the issue is that checking for a UCN itself
requires calling getCharAndSize, and I'm not sure how to structure it
correctly without, e.g., blowing out the stack on code with a long run
of consecutive backslashes.  Thinking about it a bit more, it isn't
that tricky; it's just more code than I was expecting to write.

-Eli


