[PATCH] D136624: [clang][modules] Account for non-affecting inputs in `ASTWriter`

Jan Svoboda via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 1 13:17:15 PDT 2022


jansvoboda11 marked 5 inline comments as done.
jansvoboda11 added inline comments.


================
Comment at: clang/include/clang/Serialization/ASTWriter.h:449-452
+  /// Exclusive prefix sum of the lengths of preceding non-affecting inputs.
+  std::vector<SourceLocation::UIntTy> NonAffectingInputOffsetAdjustments;
+  /// Exclusive prefix sum of the count of preceding non-affecting inputs.
+  std::vector<unsigned> NonAffectingInputFileIDAdjustments;
----------------
dexonsmith wrote:
> Can you collect a histogram for how big these vectors are? Can we avoid pointer chasing in the common case by making them `SmallVector` of some size during lookup?
> 
Usually 4-6 elements. Making them a `SmallVector<T, 8>` didn't affect performance, though.


================
Comment at: clang/lib/Serialization/ASTWriter.cpp:5282-5283
+
+SourceLocation::UIntTy
+ASTWriter::getAdjustment(SourceLocation::UIntTy Offset) const {
+  if (PP->getSourceManager().isLoadedOffset(Offset) ||
----------------
dexonsmith wrote:
> How often does `getAdjustment()` return the same answer in consecutive calls? If at all common, this would likely benefit from a peephole:
> ```
> lang=c++
> Optional<SLocRange> ASTWriter::CachedAdjustmentRange;
> Optional<UIntTy> ASTWriter::CachedAdjustment;
> 
> SourceLocation::UIntTy
> ASTWriter::getAdjustment(SourceLocation::UIntTy Offset) const {
>   // Check for 0.
>   //
>   // How fast is "isLoadedOffset()"? Can/should we add a peephole, or is it just bit
>   // manipulation? (I seem to remember it checking the high bit or something, but if
>   // it's doing some sort of look up, maybe it should be in the slow path so it can
>   // get cached by LastAdjustment.)
>   if (PP->getSourceManager().isLoadedOffset(Offset) ||
>       NonAffectingInputs.empty())
>     return 0;
> 
>   // Check CachedAdjustment.
>   if (CachedAdjustment && CachedAdjustmentRange->includes(Offset))
>     return *CachedAdjustment;
> 
>   // Call getAdjustmentSlow, which updates CachedAdjustment and CachedAdjustmentRange.
>   // It's out-of-line so that `getAdjustment` can easily be inlined without inlining
>   // the slow path.
>   //
>   // LastAdjustmentRange would be the size of the "gap" between this adjustment
>   // level and the next one (end would be UINTMAX if it's after the last
>   // non-affecting range).
>   return getAdjustmentSlow(Offset);
> }
> ```
> 
Not that often, see my top-level comment.


================
Comment at: clang/lib/Serialization/ASTWriter.cpp:5289-5290
+                ? NonAffectingInputs.end()
+                : llvm::lower_bound(NonAffectingInputs,
+                                    PP->getSourceManager().getFileID(Offset));
+  unsigned Idx = std::distance(NonAffectingInputs.begin(), It);
----------------
jansvoboda11 wrote:
> dexonsmith wrote:
> > Why do you need to call `getFileID()` here?
> > 
> > Instead, I would expect this to be a search through a range of offsets (e.g., see my suggestion at https://reviews.llvm.org/D106876#3869247 -- `DroppedMMs` contains SourceLocations, not FileIDs).
> > 
> > Two benefits:
> > 1. You don't need to call `getFileID()` to look up an offset.
> > 2. You can merge adjacent non-affecting files (shrinking the search/storage significantly).
> > 
> > 
> My reasoning was that if we search through a range of offsets, we're doing conceptually the same thing as `getFileID()` (which already has some optimizations baked in). Maybe the non-affecting files are indeed adjacent and we'll be able to merge most of them. I'll give it a shot and report back.
This ended up being faster due to merging of non-affecting files. Thanks for the suggestion!


================
Comment at: clang/test/Modules/non-affecting-module-maps-source-locations.m:32
+
+// RUN: %clang_cc1 -I %t/first -I %t/second -I %t/third -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache %t/tu.m -o %t/tu.o
----------------
dexonsmith wrote:
> dexonsmith wrote:
> > This is exercising the code, but it could do one better and check if the output PCMs are bit-for-bit identical when we (now) expect them to be.
> > 
> > Maybe you could do this by having two run lines: one that includes `-I %t/second` and another that doesn't. Then check if the output PCMs are equal.
> (Or if the PCM isn't bit-for-bit identical yet, maybe at least the AST block should be...)
Yes, I'll probably drop this test entirely and just check the PCM files are bit-for-bit identical when a non-affecting file is not loaded at all.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136624/new/

https://reviews.llvm.org/D136624



More information about the cfe-commits mailing list