[PATCH] D106876: Remove non-affecting module maps from PCM files.

Duncan P. N. Exon Smith via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 19 12:45:43 PDT 2022


dexonsmith added a comment.

As an end goal, it seems strictly better for build systems / distribution / artifact storage / caching if the output PCM has been canonicalized as if the module maps weren't found, rather than tracking both which module maps were discovered and which to ignore. Ideally, we'd find a way to canonicalize the output efficiently and correctly... along the lines of this patch, but with a bug fix somewhere.

It may not be too hard/expensive to translate the source locations. E.g., you could build up a vector:

  SmallVector<SourceLocRange> DroppedMMs;
  
  // fill up DroppedMMs with MMs...
  ...
  
  // sort to create map.
  sort(DroppedMMs);
  
  // accumulate offsets.
  SmallVector<uint64_t> AccumulatedOffset;
  uint64_t Total = 0;
  AccumulatedOffset.reserve(DroppedMMs.size() + 1);
  AccumulatedOffset.push_back(0);
  for (const auto &MM : DroppedMMs)
    AccumulatedOffset.push_back(Total += MM.size());
  
  // later, translate during serialization
  Error serializeSourceLoc(SourceLoc SL) {
    if (DroppedMMs.empty())
      return serializeSourceLocImpl(SL);
    // first dropped MM that is not entirely before SL.
    auto I = llvm::lower_bound(DroppedMMs, SL);
    assert((I == DroppedMMs.end() || !I->contains(SL)) &&
           "Serializing a location from an ignored module map???");
    return serializeSourceLocImpl(SL - AccumulatedOffset[I - DroppedMMs.begin()]);
  }

Then a `lower_bound` into `DroppedMMs` gives the index into `AccumulatedOffset` that tells you how much to adjust any given SourceLoc by. Serializing a source location would go through this map. Presumably, the number of dropped files will be relatively small (number of ignored module maps) so the binary search should be fast. Probably there would be good peepholes for common cases (such as tracking `LastDroppedMM` to avoid repeating the same search).
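
For illustration only, here is a self-contained version of the same idea using plain `uint64_t` offsets and the standard library rather than Clang's types. `DroppedRange` and `OffsetTranslator` are hypothetical names invented for this sketch, not anything in Clang, and it assumes the dropped ranges are half-open and non-overlapping:

  #include <algorithm>
  #include <cassert>
  #include <cstdint>
  #include <utility>
  #include <vector>

  // Hypothetical stand-in for the offset range occupied by one dropped
  // module map: [Begin, End).
  struct DroppedRange {
    uint64_t Begin;
    uint64_t End;
    uint64_t size() const { return End - Begin; }
    bool contains(uint64_t Loc) const { return Begin <= Loc && Loc < End; }
  };

  class OffsetTranslator {
    std::vector<DroppedRange> Dropped; // sorted by Begin, non-overlapping
    // AccumulatedOffset[K] is the total size of the first K dropped ranges.
    std::vector<uint64_t> AccumulatedOffset;

  public:
    explicit OffsetTranslator(std::vector<DroppedRange> Ranges)
        : Dropped(std::move(Ranges)) {
      std::sort(Dropped.begin(), Dropped.end(),
                [](const DroppedRange &A, const DroppedRange &B) {
                  return A.Begin < B.Begin;
                });
      AccumulatedOffset.reserve(Dropped.size() + 1);
      AccumulatedOffset.push_back(0);
      uint64_t Total = 0;
      for (const DroppedRange &R : Dropped)
        AccumulatedOffset.push_back(Total += R.size());
    }

    // Shift Loc down by the total size of all dropped ranges before it.
    uint64_t translate(uint64_t Loc) const {
      // First range that ends after Loc; every earlier range lies
      // entirely before Loc.
      auto I = std::lower_bound(
          Dropped.begin(), Dropped.end(), Loc,
          [](const DroppedRange &R, uint64_t L) { return R.End <= L; });
      assert((I == Dropped.end() || !I->contains(Loc)) &&
             "translating a location inside a dropped range");
      return Loc - AccumulatedOffset[I - Dropped.begin()];
    }
  };

Comparing on `End` in the binary search is a choice made for this sketch: it makes `I` the only range that could contain `Loc`, so the assertion checks the right candidate.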

A further (more involved) approach would be to separate module maps into a separate SourceManager, so that their source locations don't affect other input files. Then only module map source locations would need to be translated during serialization. (Now that FileManager has the capability to remap file contents, I think the commit that merged the SourceManagers could be effectively reverted.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106876/new/

https://reviews.llvm.org/D106876


