[PATCH] D80508: [AST] Fix the source range for TagDecl if there is no matched } brace.

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jun 2 02:45:59 PDT 2020


hokein marked an inline comment as done.
hokein added inline comments.


================
Comment at: clang/lib/AST/Decl.cpp:4129
 SourceRange TagDecl::getSourceRange() const {
-  SourceLocation RBraceLoc = BraceRange.getEnd();
-  SourceLocation E = RBraceLoc.isValid() ? RBraceLoc : getLocation();
+  SourceLocation E = BraceRange.getBegin().isValid() ?
+                     BraceRange.getEnd() : getLocation();
----------------
sammccall wrote:
> hokein wrote:
> > sammccall wrote:
> > > I'm confused about this change.
> > > 
> > > We were using BR.end if it was valid, now we're using it if BR.start is valid. So this changes behavior in two cases:
> > >  - BR.start is valid, BR.end is invalid: old was {getOuterLocStart(), getLocation()}. new is {getOuterLocStart(), invalid}
> > >  - BR.end is valid, BR.start is invalid: old was {getOuterLocStart(), BraceRange.getEnd()}. new is {getOuterLocStart(), getLocation()}
> > > 
> > > These both seem worse to me, what am I missing?
> > > 
> > > Patch description refers to the first case, I wonder if this is just a bug in clangd cancelling out a bad AST (does it just mark tokens until EOF if the end location is invalid?)
> > > We were using BR.end if it was valid, now we're using it if BR.start is valid. So this changes behavior in two cases:
> > 
> > > BR.start is valid, BR.end is invalid: old was {getOuterLocStart(), getLocation()}. new is {getOuterLocStart(), invalid}
> > > BR.end is valid, BR.start is invalid: old was {getOuterLocStart(), BraceRange.getEnd()}. new is {getOuterLocStart(), getLocation()}
> > 
> > case 1), I think this is intended, though it looks like worse at first glance :( Invalid source location has subtle semantic in clang -- "BR.end() is invalid" implies that there is no `}` in the source code.
> > 
> > And  `Missing-}`  namespaceDecl in today's clang is using invalid loc (not the NameLoc) as the end loc:
> > 
> > ```
> > // `-NamespaceDecl 0x8d27728 </tmp/t.cpp:2:1, <invalid sloc>> col:11 abc
> > namespace abc {
> > // missing }
> > ```
> > 
> > case 2) is less interesting, I wonder whether we will encounter this case in practice, tried and failed to come up with an example to it happen. I think we can preserve the old behavior in this patch.
> > 
> > 
> > > I wonder if this is just a bug in clangd cancelling out a bad AST (does it just mark tokens until EOF if the end location is invalid?)
> > 
> > I assume "clangd" you mean "clang"? I think clang's behavior is fine, the AST is recovered well, it is an issue that the TagDecl::getSourceRange doesn't correctly claim the range of TagDecl.
> > 
> > e.g. 
> > ```
> > class Test {
> >   int field;
> > // missing }
> > // `-CXXRecordDecl 0x96dd738 </tmp/t.cpp:1:1, col:7> col:7 class Test definition
> > //   |-CXXRecordDecl 0x96dd878 <col:1, col:7> col:7 implicit class Test
> > //   `-FieldDecl 0x96dd940 <line:2:3, col:7> col:7 field 'int'
> > ```
> > 
> > And we may not set the BraceRange.end to the EOF position -- as it changes the semantic, making it valid implies we have a `}` in the source code.
> > 
> No, I meant clangd. Invalid source ranges hit codepaths in Selection.cpp that isn't deliberately handling them, and we're getting "random" results that may happen to do what you want in some cases, but...
> 
> e.g. mayHit returns true if the range isn't valid, which is probably good, but the reason is bizarre: it thinks the range is from a macro expansion.
> 
> However claimRange seems to claim no tokens. So I'd guess hover, find references etc on ABC wouldn't work after this patch, so there's regressions there too.
> 
> Still even if it breaks clangd, the most important question here is how *should* the source range be represented. Is a half-broken range actually useful? Maybe it's namespacedecl that should be "fixed".
ok, that's fair enough, so the current situation is 

- braceRange has its semantics, we should keep the EndLoc as Invalid;
- getSourceRange() returns a wrong range ({getOuterLocStart(), getLocation()}, {getOuterLocStart(), invalid} are not ideal), which doesn't reflect the AST;

Idea: we can add a new member `EndOfLoc` in TagDecl, and getSourceRange returns {getOuterLocStart(), EndOfLoc}, but this will increase the TagDecl size by 8 bytes, WDYT?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80508/new/

https://reviews.llvm.org/D80508





More information about the cfe-commits mailing list