<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Feb 24, 2014 at 3:09 PM, Harald van Dijk <span dir="ltr"><<a href="mailto:harald@gigawatt.nl" target="_blank">harald@gigawatt.nl</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On 24/02/14 21:58, Richard Smith wrote:<br>
> On Sun, Feb 9, 2014 at 1:01 AM, Harald van Dijk <<a href="mailto:harald@gigawatt.nl">harald@gigawatt.nl</a><br>
</div><div class="">> <mailto:<a href="mailto:harald@gigawatt.nl">harald@gigawatt.nl</a>>> wrote:<br>
><br>
> On 05/02/14 23:16, Richard Smith wrote:<br>
> > On Wed, Feb 5, 2014 at 1:08 PM, Harald van Dijk<br>
> <<a href="mailto:harald@gigawatt.nl">harald@gigawatt.nl</a> <mailto:<a href="mailto:harald@gigawatt.nl">harald@gigawatt.nl</a>><br>
</div><div><div class="h5">> > <mailto:<a href="mailto:harald@gigawatt.nl">harald@gigawatt.nl</a> <mailto:<a href="mailto:harald@gigawatt.nl">harald@gigawatt.nl</a>>>> wrote:<br>
> ><br>
> > On 05/02/14 00:42, Richard Smith wrote:<br>
> > > Here's an easy test:<br>
> > ><br>
> > > #define foo() bar() b<br>
> > > #define bar()<br>
> > > a<br>
> > > foo()<br>
> > > c<br>
> > ><br>
> > > This should produce:<br>
> > ><br>
> > > # 1 "..."<br>
> > ><br>
> > ><br>
> > > a<br>
> > > b<br>
> > > c<br>
> > ><br>
> > > ... but currently produces:<br>
> > ><br>
> > > #1 "..."<br>
> > ><br>
> > ><br>
> > > a b<br>
> > ><br>
> > > c<br>
> ><br>
> > Thanks Richard, I don't know how I managed to miss that.<br>
> ><br>
> > Setting Token::StartOfLine as mentioned gets this right in the<br>
> lexer.<br>
> > However, even though the b token in the output has both<br>
> > Token::StartOfLine and Token::LeadingSpace set,<br>
> > PrintPPOutputPPCallbacks::HandleFirstTokOnLine uses the column<br>
> number to<br>
> > determine whether to print spaces, not the flag. As a result,<br>
> the output<br>
> > becomes not<br>
> ><br>
> > a<br>
> > b<br>
> > c<br>
> ><br>
> > but<br>
> ><br>
> > a<br>
> > b<br>
> > c<br>
> ><br>
> > Given that the lexer behaviour is now right, would it be okay<br>
> to leave<br>
> > this as it is, and only add a test to check that b is on a<br>
> separate<br>
> > line, like attached? (I used first/second/third instead of<br>
> a/b/c in an<br>
> > attempt to get FileCheck to print a better suggestion when the<br>
> test<br>
> > fails, but it neither helps nor hurts.)<br>
> ><br>
> ><br>
> > I'm not sure whether we have consumers of -E that care about this<br>
> > detail, but I think this is an unrelated bug. Consider:<br>
> ><br>
> > #define foo(x) x y<br>
> > foo()<br>
> ><br>
> > This also misses the leading space from the output. So:<br>
> ><br>
> > 1) I think your patch is fine as-is, and<br>
> > 2) I think we should fix PrintPreprocessedTokens to correctly<br>
> handle the<br>
> > first token on the line having leading space but being expanded from a<br>
> > token in the first column.<br>
><br>
> All right. Assuming that leading white space in column 1 is the only<br>
> case that gets mishandled, it seems like a special exception would<br>
> suffice. The attached (on top of my previous patch) passes testing. But<br>
> I am not all that familiar with this part of the code.<br>
><br>
><br>
> Patch looks good, committed as r202070.<br>
<br>
</div></div>Thank you!<br>
<div class=""><br>
> Note that the code below checks for # as a special exception: if that<br>
> appears at the start of a line, a space is output without changing<br>
> ColNo. While I do like consistency, using that approach would cause two<br>
> spaces to be printed instead of one if # should appear in column 1, with<br>
> leading white space.<br>
><br>
><br>
> It seems somewhat reasonable to insert the space and stay in column 1 in<br>
> that case -- that way, downstream tools that support the "preprocessed<br>
> source" input format can just strip the leading space in a line that<br>
> starts with " #" and not worry about later column numbers getting messed up.<br>
<br>
</div>Ah, I see how you read my previous message. Sorry for the confusion,<br>
that wasn't what I meant. What I meant was that the code directly below does<br>
<br>
if (ColNo <= 1 && Tok.is(tok::hash))<br>
OS << ' ';<br>
<br>
and copying that approach, to get<br>
<br>
if (ColNo <= 1 && Tok.hasLeadingSpace())<br>
OS << ' ';<br>
<br>
if (ColNo <= 1 && Tok.is(tok::hash))<br>
OS << ' ';<br>
<br>
would have problems, so I just wanted to mention why I didn't do that.<br>
The ColNo variable is local to HandleFirstTokOnLine, never passed to any<br>
other function, and not visible, even indirectly, in either the output<br>
or in other functions.</blockquote><div><br></div><div> Ah, right. That means that:</div><div><br></div><div>#define HASH #</div><div>#define EMPTY</div><div>#define FOO EMPTY HASH</div><div>FOO</div><div><br></div><div>
... will get one space before the '#', not two. '-fpreprocessed' doesn't seem to care whether there's one space or two here, so I don't think this matters.</div></div></div></div>