r200787 - Fix whitespace handling in empty macro expansions

Harald van Dijk harald at gigawatt.nl
Mon Feb 24 15:09:01 PST 2014


On 24/02/14 21:58, Richard Smith wrote:
> On Sun, Feb 9, 2014 at 1:01 AM, Harald van Dijk <harald at gigawatt.nl
> <mailto:harald at gigawatt.nl>> wrote:
> 
>     On 05/02/14 23:16, Richard Smith wrote:
>     > On Wed, Feb 5, 2014 at 1:08 PM, Harald van Dijk
>     <harald at gigawatt.nl <mailto:harald at gigawatt.nl>
>     > <mailto:harald at gigawatt.nl <mailto:harald at gigawatt.nl>>> wrote:
>     >
>     >     On 05/02/14 00:42, Richard Smith wrote:
>     >     > Here's an easy test:
>     >     >
>     >     > #define foo() bar() b
>     >     > #define bar()
>     >     > a
>     >     > foo()
>     >     > c
>     >     >
>     >     > This should produce:
>     >     >
>     >     > # 1 "..."
>     >     >
>     >     >
>     >     > a
>     >     >  b
>     >     > c
>     >     >
>     >     > ... but currently produces:
>     >     >
>     >     > #1 "..."
>     >     >
>     >     >
>     >     > a b
>     >     >
>     >     > c
>     >
>     >     Thanks Richard, I don't know how I managed to miss that.
>     >
>     >     Setting Token::StartOfLine as mentioned gets this right in the
>     lexer.
>     >     However, even though the b token in the output has both
>     >     Token::StartOfLine and Token::LeadingSpace set,
>     >     PrintPPOutputPPCallbacks::HandleFirstTokOnLine uses the column
>     number to
>     >     determine whether to print spaces, not the flag. As a result,
>     the output
>     >     becomes not
>     >
>     >     a
>     >      b
>     >     c
>     >
>     >     but
>     >
>     >     a
>     >     b
>     >     c
>     >
>     >     Given that the lexer behaviour is now right, would it be okay
>     to leave
>     >     this as it is, and only add a test to check that b is on a
>     separate
>     >     line, like attached? (I used first/second/third instead of
>     a/b/c in an
>     >     attempt to get FileCheck to print a better suggestion when the
>     test
>     >     fails, but it neither helps nor hurts.)
>     >
>     >
>     > I'm not sure whether we have consumers of -E that care about this
>     > detail, but I think this is an unrelated bug. Consider:
>     >
>     > #define foo(x) x y
>     > foo()
>     >
>     > This also misses the leading space from the output. So:
>     >
>     > 1) I think your patch is fine as-is, and
>     > 2) I think we should fix PrintPreprocessedTokens to correctly
>     handle the
>     > first token on the line having leading space but being expanded from a
>     > token in the first column.
> 
>     All right. Assuming that leading white space in column 1 is the only
>     case that gets mishandled, it seems like a special exception would
>     suffice. The attached (on top of my previous patch) passes testing. But
>     I am not all that familiar with this part of the code.
> 
> 
> Patch looks good, committed as r202070.

Thank you!

>     Note that the code below checks for # as a special exception: if that
>     appears at the start of a line, a space is output without changing
>     ColNo. While I do like consistency, using that approach would cause two
>     spaces to be printed instead of one if # should appear in column 1, with
>     leading white space.
> 
> 
> It seems somewhat reasonable to insert the space and stay in column 1 in
> that case -- that way, downstream tools that support the "preprocessed
> source" input format can just strip the leading space in a line that
> starts with " #" and not worry about later column numbers getting messed up.

Ah, I see how you read my previous message. Sorry for the confusion,
that wasn't what I meant. What I meant was that the code directly below does

if (ColNo <= 1 && Tok.is(tok::hash))
  OS << ' ';

and copying that approach, to get

if (ColNo <= 1 && Tok.hasLeadingSpace())
  OS << ' ';

if (ColNo <= 1 && Tok.is(tok::hash))
  OS << ' ';

would have problems, so I just wanted to mention why I didn't do that.
The ColNo variable is local to HandleFirstTokOnLine, never passed to any
other function, and not visible, even indirectly, in either the output
or in other functions.

Cheers,
Harald van Dijk



More information about the cfe-commits mailing list