PATCH: In -traditional mode, ignore token pasting and stringification (PR16371)

Ahmed Bougacha ahmed.bougacha at gmail.com
Fri Mar 13 17:56:58 PDT 2015


Forgive the necromancy, but I noticed this in the testcase:

+#define FOO_NO_PASTE(a, b) test(b##a)
+FOO_NO_PASTE(foo,bar)
+/* CHECK {{^}}test(bar##foo){{$}}

The "CHECK:" colon is missing.  Currently this doesn't check anything,
and adding the colon actually fails the test (I get "test(bar##bar)").
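
For reference, a rough sketch of the hunk with the colon added, assuming
the intended expected output is still bar##foo. Only the CHECK line
changes; the comment's terminator and the RUN line sit outside the quoted
hunk and are not shown:

```c
#define FOO_NO_PASTE(a, b) test(b##a)
FOO_NO_PASTE(foo,bar)
/* CHECK: {{^}}test(bar##foo){{$}}
```

With the colon in place FileCheck actually matches the line, which is
how the bar##bar output shows up as a failure.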

-Ahmed


On Mon, Jul 8, 2013 at 6:02 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> On Mon, Jul 8, 2013 at 5:24 PM, Austin Seipp <aseipp at pobox.com> wrote:
>> On Mon, Jul 8, 2013 at 3:17 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> On Fri, Jul 5, 2013 at 5:57 PM, Austin Seipp <aseipp at pobox.com> wrote:
>>>> Well, just to be clear, there's absolutely no intention of completely
>>>> emulating -traditional's behavior. We don't need full emulation, just
>>>> [Feature X].
>>>
>>> This is exactly the argument that was used last time a feature was
>>> added to our -traditional-cpp implementation. If we keep incrementally
>>> adding features, it seems very likely that we'll end up with a
>>> poorly-designed implementation, and no point along the way where we
>>> could say "at this point we stop and rewrite".
>>
>> Point taken.
>>
>>>> The example I brought up here about expansions in literal quotations
>>>> was to point out that this patch 'implements' behavior found in GCC's
>>>> -traditional mode, but with an exception. Of course, that's really all
>>>> Clang's -traditional mode is anyway: a small collection of *some* of
>>>> GCC's behaviors, with caveats even at that. So, that considered, I think
>>>> this is fine: the patch has relatively small impact/scope, and is
>>>> pretty simple on top of that.
>>>
>>> It seems unfortunate for neither stringization nor macro expansion
>>> into string literals to work. That said, if anyone actually cares
>>> about the latter, maybe we can persuade them to design and implement a
>>> proper traditional preprocessor.
>>>
>>>> On Fri, Jul 5, 2013 at 6:50 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>>> On Fri, Jul 5, 2013 at 3:44 PM, Austin Seipp <aseipp at pobox.com> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Attached is a patch that makes the preprocessor ignore token pasting
>>>>>> (##) and stringification (#) when in -traditional mode. This makes it
>>>>>> behave more like GCC[1].
>>>>>>
>>>>>> This change fixes PR16371, and is needed for Clang to function
>>>>>> properly as a preprocessor for Haskell (in the Glasgow Haskell
>>>>>> Compiler). If you're curious and look at the bug, I made some
>>>>>> incorrect assumptions about the behavior of -traditional for GCC (and
>>>>>> attached a bad patch), but this fixes the problem in the principled
>>>>>> way. And the patch is simpler, which is good too.
>>>
>>> There's something distasteful about the whole approach here. Haskell's
>>> lexing rules are not the same as C or C++'s. It's wrong to use a
>>> standard C preprocessor in Haskell (for instance, ' will be
>>> mistreated, and with GHC extensions so will #), and it's also wrong to
>>> use a traditional C preprocessor (for instance, macros will be
>>> expanded inside string literals).
>>
>> (I don't disagree that our current situation is a wee bit unfortunate,
>> just for the record.)
>>
>>> This is not the only tweak you'll need to get Clang's preprocessor to
>>> preprocess Haskell properly. For instance, consider:
>>>
>>>   MACRO(foo') + MACRO(foo')
>>>
>>> A proper Haskell preprocessor would treat foo' as a single token.
>>> Clang will treat ') + MACRO(foo' as a single token.
>>
>> Well, this situation is pretty unlikely as it stands because GCC
>> doesn't correctly lex this either. End-users can force GHC to use
>> something like cpphs (which has the correct behavior), but very few
>> packages actually do this, while the number of packages that use the
>> preprocessor itself is very high. So really, lexing rules are very
>> rarely violated in such a way. Which leads to...
>>
>>> Have you considered using cpphs instead?
>>
>> Yes, and there have been some ideas of writing a proper traditional
>> preprocessor library and integrating it into GHC (cpphs is GPL, which
>> is off limits from an API standpoint). However, in light of this work
>> (which is not your problem, and a longer term thing for us), I merely
>> attempted to make GHC use Clang, and discovered this deficiency in
>> -traditional's behaviour. And so the story leads us here...
>>
>> I was also unaware of the NetBSD tradcpp work which Joerg mentioned,
>> which is also promising as a possible replacement for the longer term.
>>
>>> I agree with your comments; implementing a fundamentally
>>> character-based preprocessor on top of our current token-based
>>> preprocessor isn't the right approach in the long term. But
>>> pragmatically, this is a small point fix to disable a feature that
>>> should not be enabled in our existing token-based traditional
>>> preprocessor, so I don't think this is a big deal. If there really is
>>> a demand for it, I find that more compelling than the slippery-slope
>>> argument.
>>
>> In any case, I appreciate the timely feedback and review. Thanks.
>
> Patch committed as r185896.
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits


