[cfe-commits] [PATCH] Fix crash printing diagnostic range spanning macros

Chandler Carruth chandlerc at google.com
Thu Nov 1 23:14:31 PDT 2012


Thanks for the ping. =] This is definitely the direction I was
imagining, and I really like the approach, Feel free to commit this at
any point and rotate this to post-commit discussion as needed. I think
its a strict improvement and would be happy to do further refinements
incrementally.

One meta comment, I think you might want more test cases to check for
other edge case patterns, or for ranges that *shouldn't* work in
addition to the ones that should.

Index: lib/Frontend/DiagnosticRenderer.cpp
===================================================================
--- lib/Frontend/DiagnosticRenderer.cpp (revision 166711)
+++ lib/Frontend/DiagnosticRenderer.cpp (working copy)
@@ -18,6 +18,7 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallString.h"
 #include <algorithm>
 using namespace clang;
@@ -218,6 +219,42 @@
   emitIncludeLocation(Loc, PLoc, SM);
 }

+// Helper function to fix up the source ranges so the start and end
+// have the same FileID, and that FileID is somewhere in the
+// caller chain for Loc.

I didn't really follow this comment. Let me see if I've understood the
implementation itself correctly:

Because FileIDs aren't actually identifying the *file*, but in fact
distinguish one macro expansion from another, such that if we have two
source locations both of which are macro source locations, their
FileIDs are equal iff they are expanded from the same macro. Thus we
can walk down the macro expansion stack of the start and end locations
of each of the input ranges, and if we find the same macro expansion
as has been selected for the given caret location, use that macro
expansion location's spelling location.

Ok, if I've gotten this all correct, this makes *lots* of sense to me.
I particularly like the way it will "resynchronize" even when their
are intervening macros on one side or the other side of the range. If
there is a way to make this a bit more clear in the comments, that
would be awesome. I wish "FileID" weren't so misleading...

+static void MapDiagnosticRanges(SourceLocation Loc,
+                            SmallVectorImpl<CharSourceRange>& Ranges,
+                            SmallVectorImpl<CharSourceRange>& SpellingRanges,
+                            const SourceManager *SM) {

Style nit-pick: 'mapDiagnosticRanges'? Also, please don't use
80-column justification to pick strange indentations. I'd much rather
have a line break after the '(' and indent the argument list 4 spaces
instead of 2.

+  FileID LocFileID = SM->getFileID(Loc);

Do you want to do anything other than map to spelling location for the
ranges in the event that this is not a macro expansion location?
Wonder if it would be better to short circuit.... probably not because
you need to push stuff into the output range anyways.

+  for (SmallVectorImpl<CharSourceRange>::iterator I = Ranges.begin(),
+       E = Ranges.end();
+       I != E; ++I) {
+    SourceLocation Start = I->getBegin(), End = I->getEnd();
+    bool IsTokenRange = I->isTokenRange();
+
+    while (Start.isMacroID() && SM->getFileID(Start) != LocFileID)
+      Start = SM->getImmediateMacroCallerLoc(Start);
+
+    while (End.isMacroID() && SM->getFileID(End) != LocFileID) {
+      // The computation of the next End is an inlined version of
+      // getImmediateMacroCallerLoc, except it chooses the end of an
+      // expansion range.
+      if (SM->isMacroArgExpansion(End)) {
+        End = SM->getImmediateSpellingLoc(End);
+      } else {
+        End = SM->getImmediateExpansionRange(End).second;
+      }
+    }

Should this do anything special if one end of the range finds a macro
expansion location, and the other doesn't? That is, should we insist
that *both* locations FileIDs match the LocFileID if one of them does
at this point?

One interesting option here would be to find the start (or end)
location of the this macro definition and if the start (or end) is not
in the same macro expansion as Loc, but the end (or start) is in the
same expansion, use the start (or end) of the expansion instead. This
would essentially saturate the range at the macro definition boundary.

+
+    Start = SM->getSpellingLoc(Start);
+    End = SM->getSpellingLoc(End);
+    SpellingRanges.push_back(CharSourceRange(SourceRange(Start, End),
+                                             IsTokenRange));
+  }

One unfortunate aspect of this re-synchronization style mapping is
that it is O(N^2). Do you think it would be worth adjusting the input
ranges to be the immediate caller loc when we find one that matches
the Loc's FileID? I don't think that would miss any cases as all we
really need is the ability to skip over macro expansion frames that
exist in the range stack but not in the Loc stack. Still, might be
more complex, and I certainly hope that having an N^2 algorithm here
really isn't a problem in practice...

On Wed, Oct 31, 2012 at 2:27 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Oct 29, 2012 at 4:07 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> On Thu, Oct 25, 2012 at 6:09 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>> On Thu, Oct 25, 2012 at 3:32 PM, Chandler Carruth <chandlerc at google.com> wrote:
>>>> On Thu, Oct 25, 2012 at 2:57 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>>> On Thu, Oct 25, 2012 at 1:27 PM, Chandler Carruth <chandlerc at google.com> wrote:
>>>>>> On Thu, Oct 25, 2012 at 9:38 AM, Jordan Rose <jordan_rose at apple.com> wrote:
>>>>>>> I think that's the best we can do. Even if the range had the beginning
>>>>>>> before the end (say, by trying to highlight the entirety of both macros), it
>>>>>>> wouldn't be "correct".  We should not show ranges that don't correspond to
>>>>>>> something meaningful in the text.
>>>>>>
>>>>>> I actually think we can do a bit better.
>>>>>
>>>>> Yes, we could completely change what we display, but I'm not really
>>>>> interested in embarking on a large architectural project at the
>>>>> moment.
>>>>>
>>>>>>> ...though arguably we could show a line with all macros expanded, and put
>>>>>>> the range there. But that's a big change in what you expect from diagnostic
>>>>>>> printing, and it wouldn't work in IDEs anyway.
>>>>>>
>>>>>> We get pretty close with the macro backtrace. I have sometimes
>>>>>> wondered if we should start the error with a synthetic preprocessed
>>>>>> snippet, and then give the code the user wrote in the first note, and
>>>>>> descend through the macro expansions in subsequent notes.
>>>>>> Alternatively, we could add a final note to the macro backtrace that
>>>>>> shows the fully preprocessed source, but that seems more likely to be
>>>>>> ignored.
>>>>>
>>>>> Hmm, interesting; please file a bug. :)
>>>>>
>>>>>>>
>>>>>>> Jordan
>>>>>>>
>>>>>>>
>>>>>>> On Oct 24, 2012, at 19:36 , Eli Friedman <eli.friedman at gmail.com> wrote:
>>>>>>>
>>>>>>> Patch attached.  Fixes a crash on a testcase like the following:
>>>>>>>
>>>>>>> +#define BAD_CONDITIONAL_OPERATOR (2<3)?4:5
>>>>>>> +int x = BAD_CONDITIONAL_OPERATOR+BAD_CONDITIONAL_OPERATOR;
>>>>>>>
>>>>>>> We try to print a source range which starts at the 5 in the first
>>>>>>> expansion, and ends just after the 3 in the second expansion.
>>>>>>
>>>>>> My suggestion would be this:
>>>>>>
>>>>>> When you have a source range to highlight, and it's start or stop
>>>>>> location occurs within a macro, grow it to the start (or stop, resp.)
>>>>>> of the macro info's expansion location. This should be the start of
>>>>>> where the macro got expanded into the code.
>>>>>>
>>>>>> Then, if there the diagnostic location itself is inside a macro, as
>>>>>> you do the macro backtrace walk you'll need to address the fixme in
>>>>>> the diagnostic code to actually walk the source ranges back through
>>>>>> the macro backtrace as well, and at each level to the analogous
>>>>>> transform to grow the range at that level.
>>>>>
>>>>> We already do this; we just don't do it correctly for the case where
>>>>> the start and/or end locations come from a different expansion than
>>>>> the caret.
>>>>
>>>> Yes, but do we do the first paragraph correctly? I think we can do the
>>>> first paragraph and fix the crash/misbehavior you're talking about.
>>>
>>> I see what you mean.  New patch attached.
>>
>> Ping.
>
> Chandler, were you planning on reviewing this?
>
> -Eli



More information about the cfe-commits mailing list