r200787 - Fix whitespace handling in empty macro expansions

Richard Smith richard at metafoo.co.uk
Mon Feb 24 15:23:02 PST 2014


On Mon, Feb 24, 2014 at 3:09 PM, Harald van Dijk <harald at gigawatt.nl> wrote:

> On 24/02/14 21:58, Richard Smith wrote:
> > On Sun, Feb 9, 2014 at 1:01 AM, Harald van Dijk <harald at gigawatt.nl
> > <mailto:harald at gigawatt.nl>> wrote:
> >
> >     On 05/02/14 23:16, Richard Smith wrote:
> >     > On Wed, Feb 5, 2014 at 1:08 PM, Harald van Dijk
> >     <harald at gigawatt.nl <mailto:harald at gigawatt.nl>
> >     > <mailto:harald at gigawatt.nl <mailto:harald at gigawatt.nl>>> wrote:
> >     >
> >     >     On 05/02/14 00:42, Richard Smith wrote:
> >     >     > Here's an easy test:
> >     >     >
> >     >     > #define foo() bar() b
> >     >     > #define bar()
> >     >     > a
> >     >     > foo()
> >     >     > c
> >     >     >
> >     >     > This should produce:
> >     >     >
> >     >     > # 1 "..."
> >     >     >
> >     >     >
> >     >     > a
> >     >     >  b
> >     >     > c
> >     >     >
> >     >     > ... but currently produces:
> >     >     >
> >     >     > #1 "..."
> >     >     >
> >     >     >
> >     >     > a b
> >     >     >
> >     >     > c
> >     >
> >     >     Thanks Richard, I don't know how I managed to miss that.
> >     >
> >     >     Setting Token::StartOfLine as mentioned gets this right in the
> >     lexer.
> >     >     However, even though the b token in the output has both
> >     >     Token::StartOfLine and Token::LeadingSpace set,
> >     >     PrintPPOutputPPCallbacks::HandleFirstTokOnLine uses the column
> >     number to
> >     >     determine whether to print spaces, not the flag. As a result,
> >     the output
> >     >     becomes not
> >     >
> >     >     a
> >     >      b
> >     >     c
> >     >
> >     >     but
> >     >
> >     >     a
> >     >     b
> >     >     c
> >     >
> >     >     Given that the lexer behaviour is now right, would it be okay
> >     to leave
> >     >     this as it is, and only add a test to check that b is on a
> >     separate
> >     >     line, like attached? (I used first/second/third instead of
> >     a/b/c in an
> >     >     attempt to get FileCheck to print a better suggestion when the
> >     test
> >     >     fails, but it neither helps nor hurts.)
> >     >
> >     >
> >     > I'm not sure whether we have consumers of -E that care about this
> >     > detail, but I think this is an unrelated bug. Consider:
> >     >
> >     > #define foo(x) x y
> >     > foo()
> >     >
> >     > This also misses the leading space from the output. So:
> >     >
> >     > 1) I think your patch is fine as-is, and
> >     > 2) I think we should fix PrintPreprocessedTokens to correctly
> >     handle the
> >     > first token on the line having leading space but being expanded
> from a
> >     > token in the first column.
> >
> >     All right. Assuming that leading white space in column 1 is the only
> >     case that gets mishandled, it seems like a special exception would
> >     suffice. The attached (on top of my previous patch) passes testing.
> But
> >     I am not all that familiar with this part of the code.
> >
> >
> > Patch looks good, committed as r202070.
>
> Thank you!
>
> >     Note that the code below checks for # as a special exception: if that
> >     appears at the start of a line, a space is output without changing
> >     ColNo. While I do like consistency, using that approach would cause
> two
> >     spaces to be printed instead of one if # should appear in column 1,
> with
> >     leading white space.
> >
> >
> > It seems somewhat reasonable to insert the space and stay in column 1 in
> > that case -- that way, downstream tools that support the "preprocessed
> > source" input format can just strip the leading space in a line that
> > starts with " #" and not worry about later column numbers getting messed
> up.
>
> Ah, I see how you read my previous message. Sorry for the confusion,
> that wasn't what I meant. What I meant was that the code directly below
> does
>
> if (ColNo <= 1 && Tok.is(tok::hash))
>   OS << ' ';
>
> and copying that approach, to get
>
> if (ColNo <= 1 && Tok.hasLeadingSpace())
>   OS << ' ';
>
> if (ColNo <= 1 && Tok.is(tok::hash))
>   OS << ' ';
>
> would have problems, so I just wanted to mention why I didn't do that.
> The ColNo variable is local to HandleFirstTokOnLine, never passed to any
> other function, and not visible, even indirectly, in either the output
> or in other functions.


 Ah, right. That means that:

#define HASH #
#define EMPTY
#define FOO EMPTY HASH
FOO

... will get one space before the '#', not two. '-fpreprocessed' doesn't
seem to care whether there's one space or two here, so I don't think this
matters.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140224/43f5701d/attachment.html>


More information about the cfe-commits mailing list