[cfe-commits] [PATCH] Comment parsing: resolve HTML character references (e.g., &amp; -> &)

Dmitri Gribenko gribozavr at gmail.com
Fri Jul 27 13:38:16 PDT 2012


On Wed, Jul 25, 2012 at 3:05 PM, Jordan Rose <jordan_rose at apple.com> wrote:
>
> On Jul 25, 2012, at 14:59 , Dmitri Gribenko <gribozavr at gmail.com> wrote:
>
>> On Wed, Jul 25, 2012 at 2:56 PM, Jordan Rose <jordan_rose at apple.com> wrote:
>>>
>>> On Jul 25, 2012, at 14:54 , Dmitri Gribenko <gribozavr at gmail.com> wrote:
>>>
>>>> On Wed, Jul 25, 2012 at 2:51 PM, Jordan Rose <jordan_rose at apple.com> wrote:
>>>>> This seems like a very bad idea when I have this in a comment:
>>>>>
>>>>> <em>0&lt;i</em>
>>>>>
>>>>> If you expand the '&lt;', you end up with invalid HTML. Entities are
>>>>> supposed to be entities when they come out the other end.
>>>>
>>>> '&lt;' will be expanded in the internal representation.  HTML renderer
>>>> will escape HTML special characters back.
>>>
>>> …as long as my test case is emitted unchanged, I don't mind, but I think it's non-trivial to expand entities in "<em>0&lt;i</em>" and keep track of which "<" are supposed to be escaped.
>>
>> '<em>' is a separate AST node and "0&lt;i" is (three) plain text nodes,
>> so it is actually simple.
>>
>> Added your example to tests.
>
> Oh right. Forgot you were already lexing HTML. Okay, once again sorry for the noise.

Thank you for looking at this!

Committed r160890, r160891.
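
To illustrate the point discussed above, here is a rough sketch of why
resolving character references in the internal representation is safe when
tags and text are separate nodes. This is not the committed code: Node,
parseComment and renderHTML are hypothetical names made up for this example,
and only three references (&lt;, &gt;, &amp;) are handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node type for illustration only; not Clang's comment AST.
struct Node {
  enum Kind { Tag, Text } kind;
  std::string value; // Tag: raw markup, e.g. "<em>"; Text: decoded characters
};

// Split the input into tag nodes and text nodes. Each recognized character
// reference becomes its own text node holding the decoded character.
std::vector<Node> parseComment(const std::string &in) {
  static const struct { const char *name; char ch; } refs[] = {
      {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'}};
  std::vector<Node> nodes;
  std::size_t i = 0;
  while (i < in.size()) {
    if (in[i] == '<') {
      std::size_t end = in.find('>', i);
      if (end == std::string::npos) { // stray '<': keep it as plain text
        nodes.push_back({Node::Text, "<"});
        ++i;
      } else {
        nodes.push_back({Node::Tag, in.substr(i, end - i + 1)});
        i = end + 1;
      }
    } else if (in[i] == '&') {
      bool matched = false;
      for (const auto &r : refs) {
        const std::string name = r.name;
        if (in.compare(i, name.size(), name) == 0) {
          nodes.push_back({Node::Text, std::string(1, r.ch)});
          i += name.size();
          matched = true;
          break;
        }
      }
      if (!matched) { // unknown reference: keep the '&' as plain text
        nodes.push_back({Node::Text, "&"});
        ++i;
      }
    } else {
      std::size_t end = in.find_first_of("<&", i);
      if (end == std::string::npos)
        end = in.size();
      nodes.push_back({Node::Text, in.substr(i, end - i)});
      i = end;
    }
  }
  return nodes;
}

// Emit HTML: tag nodes pass through unchanged, text nodes are re-escaped.
std::string renderHTML(const std::vector<Node> &nodes) {
  std::string out;
  for (const Node &n : nodes) {
    if (n.kind == Node::Tag) {
      out += n.value;
      continue;
    }
    for (char c : n.value) {
      switch (c) {
      case '<': out += "&lt;"; break;
      case '>': out += "&gt;"; break;
      case '&': out += "&amp;"; break;
      default:  out += c;
      }
    }
  }
  return out;
}
```

With this sketch, parsing "<em>0&lt;i</em>" gives a tag node, three text
nodes ("0", "<", "i") and a closing tag node, and rendering those nodes
produces the original string again, because only text nodes are escaped.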

Dmitri

-- 
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/



