[PATCH] D137213: [clang][modules] NFCI: Pragma diagnostic mappings: write/read FileID instead of SourceLocation

Duncan P. N. Exon Smith via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 29 08:24:09 PST 2022


dexonsmith added inline comments.


================
Comment at: clang/lib/Serialization/ASTReader.cpp:6343
              "Invalid data, missing pragma diagnostic states");
-      SourceLocation Loc = ReadSourceLocation(F, Record[Idx++]);
-      auto IDAndOffset = SourceMgr.getDecomposedLoc(Loc);
-      assert(IDAndOffset.first.isValid() && "invalid FileID for transition");
-      assert(IDAndOffset.second == 0 && "not a start location for a FileID");
+      FileID FID = ReadFileID(F, Record, Idx);
+      assert(FID.isValid() && "invalid FileID for transition");
----------------
alexfh wrote:
> alexfh wrote:
> > jansvoboda11 wrote:
> > > alexfh wrote:
> > > > dexonsmith wrote:
> > > > > eaeltsin wrote:
> > > > > > This doesn't work as before, likely because ReadFileID doesn't do TranslateSourceLocation.
> > > > > > 
> > > > > > Our tests fail.
> > > > > > 
> > > > > > I tried calling TranslateSourceLocation here and the tests passed:
> > > > > > ```
> > > > > >       SourceLocation Loc = Diag.SourceMgr->getComposedLoc(FID, 0);
> > > > > >       SourceLocation Loc2 = TranslateSourceLocation(F, Loc);
> > > > > >       auto IDAndOffset = SourceMgr.getDecomposedLoc(Loc2);
> > > > > > 
> > > > > >       // Note that we don't need to set up Parent/ParentOffset here, because
> > > > > >       // we won't be changing the diagnostic state within imported FileIDs
> > > > > >       // (other than perhaps appending to the main source file, which has no
> > > > > >       // parent).
> > > > > >       auto &F = Diag.DiagStatesByLoc.Files[IDAndOffset.first];
> > > > > > ```
> > > > > > 
> > > > > > Sorry I don't know the codebase, so this fix is definitely ugly :) But it shows the problem.
> > > > > > 
> > > > > I don't think that's the issue, since `ReadFileID()` calls `TranslateFileID`, which should seems like it should be equivalent.
> > > > > 
> > > > > However, I notice that the post-increment for `Idx` got dropped! Can you try replacing the line of code with the following and see if that fixes your tests (without any extra TranslateSourceLocation logic)?
> > > > > ```
> > > > > lang=c++
> > > > > FileID FID = ReadFileID(F, Record, Idx++);
> > > > > ```
> > > > > 
> > > > > If so, maybe you can contribute that fix with a reduced testcase? I suggest adding me, @vsapsai, @Bigcheese, and @jansvoboda11 as reviewers.
> > > > > 
> > > > > @alexfh, maybe you can check if this fixes your tests as well?
> > > > > 
> > > > > (If this is the issue, it's a bit surprising we don't have existing tests covering this case... and embarrassing I missed it when reviewing initially!)
> > > > I've noticed the dropped `Idx` post-increment as well, but I went a step further and looked at the `ReadFileID` implementation, which is actually doing a post-increment itself, and accepts `Idx` by reference:
> > > > ```
> > > >   FileID ReadFileID(ModuleFile &F, const RecordDataImpl &Record,
> > > >                     unsigned &Idx) const {
> > > >     return TranslateFileID(F, FileID::get(Record[Idx++]));
> > > >   }
> > > > ```
> > > > 
> > > > Thus, it seems to be correct. But what @eaeltsin  has found is actually a problem for us.  I'm currently trying to make an isolated test case, but it's quite tricky (as header modules are =\). It may be the case that our build setup relies on something clang doesn't explicitly promises, but the fact is that the behavior (as observed by our build setup) has changed. I'll try to revert the commit for now to get us unblocked and provide a test case as soon as I can.
> > > Thanks for helping out @dexonsmith, we did have the week off.
> > > 
> > > @eaeltsin @alexfhDone, are you able to provide the failing test case? I'm happy to look into it and re-land this with a fix.
> > I've spent multiple hours trying to extract an observable test case. It turned out to be too hairy of a yaq to shave: for each compilation a separate sandboxed environment is created with a separate symlink tree with just the inputs necessary for that action. Some of the inputs are prebuilt module files (e.g. for libc++) that are version-locked with the compiler. So far @jgorbe and I could reduce this to four compilation steps with their own symlink trees with inputs. While I could figure out some of the factors that affect reproducibility (for example, symlinks are important, since making a deep copy of the input directories makes the issue disappear), it will take a few more hours of concentrated yak shaving to bring this to a shareable state. I'm not sure I have much more time to sink into investigating this. 
> > 
> > It seems like examining code based on @eaeltsin's finding may be a more fruitful path to synthesizing a regression test. Could you try following that path?
> One more observation: `-fmodules-embed-all-files` and `-Wno-mismatched-tags` compiler options turned out to be important.
Maybe @eaeltsin can help, but I don't see any reason to think that testcase will be easier. Typically we don't revert without a testcase or at least some way to understand the problem and make progress.

(Maybe @jansvoboda11 has ideas for extra instrumentation in the compiler to better understand what's going on with your setup?)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137213/new/

https://reviews.llvm.org/D137213



More information about the cfe-commits mailing list