[PATCH] D136624: [clang][modules] Account for non-affecting inputs in `ASTWriter`

Duncan P. N. Exon Smith via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 28 18:46:19 PDT 2022


dexonsmith added a comment.

In D136624#3893001 <https://reviews.llvm.org/D136624#3893001>, @jansvoboda11 wrote:

> I tried implementing your suggestion (merging ranges of adjacent non-affecting files and avoiding `FileID` lookups), but the numbers from `-ftime-trace` are very noisy. I got more stable data by measuring clock cycles and instruction counts, but nothing conclusive yet.
>
> Compilation of `CompilerInvocation.cpp` with implicit modules.
>
> - previous approach with vector + `FileID` lookup: +0.64% cycles and +1.68% instructions,
> - current approach with merged `SourceRange`s: +0.38% cycles and +1.11% instructions.
>
> I'll post here as I experiment more and get more data.

Nice; that seems like a bit of an improvement.

I'm curious; are system modules allowed to be non-affecting yet, or are they still assumed to be affecting? (It's the system modules that I think are most likely to be adjacent.)

My intuition is that there is likely some peephole that would be quite effective here, even if it wouldn't be useful for general `getFileID()` lookups.

- I already suggested "same as last lookup?"... I'm curious if that'll help. Maybe that's already in `getFileID()`, but now that you've factored out that call, it could be useful to replicate.
- You could also try: "past the last non-affecting module?"
- You could also try: "before the first non-affecting module?"
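
To make the three checks above concrete, here is a minimal sketch. It does not use the real `SourceManager` or `ASTWriter` types: `OffsetRange`, `NonAffectingRanges`, and `adjustmentFor` are hypothetical names, plain integers stand in for source-location offsets, and the sketch assumes the ranges are sorted, non-overlapping, and that queried offsets never fall inside a non-affecting range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a half-open offset range [Begin, End) covered by
// one or more non-affecting files.
struct OffsetRange {
  std::uint32_t Begin;
  std::uint32_t End;
};

// Hypothetical helper (not a clang API): for a given offset, returns the total
// size of all non-affecting ranges that end at or before that offset.
class NonAffectingRanges {
public:
  // Assumes SortedRanges is sorted by Begin and the ranges do not overlap.
  explicit NonAffectingRanges(std::vector<OffsetRange> SortedRanges)
      : Ranges(std::move(SortedRanges)) {
    std::uint32_t Total = 0;
    Cumulative.reserve(Ranges.size());
    for (const OffsetRange &R : Ranges) {
      Total += R.End - R.Begin;
      Cumulative.push_back(Total); // Size of ranges [0, i] combined.
    }
  }

  std::uint32_t adjustmentFor(std::uint32_t Offset) {
    if (Ranges.empty())
      return 0;
    // Peephole: before the first non-affecting range ends.
    if (Offset < Ranges.front().End)
      return 0;
    // Peephole: past the last non-affecting range.
    if (Offset >= Ranges.back().End)
      return Cumulative.back();
    // Peephole: same gap between ranges as the previous slow-path lookup.
    if (LastIndex + 1 < Ranges.size() && Offset >= Ranges[LastIndex].End &&
        Offset < Ranges[LastIndex + 1].End)
      return Cumulative[LastIndex];
    // Slow path: binary search for the first range that ends after Offset.
    auto It = std::upper_bound(
        Ranges.begin(), Ranges.end(), Offset,
        [](std::uint32_t O, const OffsetRange &R) { return O < R.End; });
    // The early returns above guarantee It is neither begin() nor end().
    LastIndex = static_cast<std::size_t>(It - Ranges.begin()) - 1;
    return Cumulative[LastIndex];
  }

private:
  std::vector<OffsetRange> Ranges;
  std::vector<std::uint32_t> Cumulative;
  std::size_t LastIndex = 0;
};
```

Whether any of these fast paths pays off depends on the actual distribution of lookups, which is what the data collection below is meant to establish.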

I suspect you could collect some data to guide this, such as, for loaded locations (you could ignore "local" locations since they already have a peephole):

- Histogram of "loaded" vs. "between" vs. "after" non-affecting modules.
- Histogram of "same as last" vs. "same as last-1" vs. "different from last 2".
- [...]
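
One way such histograms might be gathered is sketched below. Everything here is hypothetical instrumentation, not existing clang code: `LookupStats` and `record` are made-up names, a "bucket" is assumed to be the number of non-affecting ranges that precede the looked-up location, and "same as last-1" is approximated by remembering the two most recent distinct buckets.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Where a lookup landed relative to the non-affecting ranges.
enum Position { BeforeFirst = 0, Between = 1, AfterLast = 2 };
// How a lookup relates to the most recent distinct buckets.
enum Recency { SameAsLast = 0, SameAsSecondLast = 1, Different = 2 };

// Hypothetical counters for guiding the choice of peephole.
struct LookupStats {
  std::array<std::size_t, 3> ByPosition{};
  std::array<std::size_t, 3> ByRecency{};

  // Bucket: number of non-affecting ranges preceding the location.
  // NumRanges: total number of non-affecting ranges.
  void record(std::size_t Bucket, std::size_t NumRanges) {
    if (Bucket == 0)
      ++ByPosition[BeforeFirst];
    else if (Bucket == NumRanges)
      ++ByPosition[AfterLast];
    else
      ++ByPosition[Between];

    if (Bucket == Last) {
      ++ByRecency[SameAsLast];
      return; // History of distinct buckets is unchanged.
    }
    if (Bucket == SecondLast)
      ++ByRecency[SameAsSecondLast];
    else
      ++ByRecency[Different];
    // Shift the history of the two most recent distinct buckets.
    SecondLast = Last;
    Last = Bucket;
  }

private:
  static constexpr std::size_t None = static_cast<std::size_t>(-1);
  std::size_t Last = None;
  std::size_t SecondLast = None;
};
```

If "same as last" dominates, a one-entry cache is probably enough; if "before first" or "after last" dominates, the two boundary checks would likely cover most lookups without any search.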

Other things that might be useful to know:

- What effect is the merging having (or would it have)? I.e., what's the histogram of "adjacent" non-affecting files? For example: 9 ranges of non-affecting files, made up of two blocks of 5 files and seven blocks of 1 file (not adjacent to any others).
- Is there a change in cycles/instructions when the module cache is hot? (presumably the common case)
- Are the PCM artifacts smaller?
- Are the PCMs bit-for-bit identical now when a non-affecting module is added to the input? (If not, why not?)
- What's the data for implicitly-discovered, explicitly-built modules?
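
For the first question, the block-size histogram could be computed with something like the following sketch. The names `FileRange`, `MergedRange`, `mergeAdjacent`, and `blockSizeHistogram` are hypothetical, and the sketch assumes each non-affecting file is a half-open offset range, the input is sorted, and two files count as adjacent when one range's end equals the next range's begin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical half-open offset range [Begin, End) of one non-affecting file.
struct FileRange {
  std::uint32_t Begin;
  std::uint32_t End;
};

// A run of adjacent non-affecting files merged into a single range.
struct MergedRange {
  std::uint32_t Begin;
  std::uint32_t End;
  std::size_t NumFiles;
};

// Merges files whose ranges touch (End == next Begin). Assumes Sorted is
// ordered by Begin and contains no overlapping ranges.
std::vector<MergedRange> mergeAdjacent(const std::vector<FileRange> &Sorted) {
  std::vector<MergedRange> Out;
  for (const FileRange &F : Sorted) {
    if (!Out.empty() && Out.back().End == F.Begin) {
      Out.back().End = F.End; // Extend the current block.
      ++Out.back().NumFiles;
    } else {
      Out.push_back({F.Begin, F.End, 1}); // Start a new block.
    }
  }
  return Out;
}

// Maps "files per merged block" to "number of blocks of that size".
std::map<std::size_t, std::size_t>
blockSizeHistogram(const std::vector<MergedRange> &Merged) {
  std::map<std::size_t, std::size_t> Histogram;
  for (const MergedRange &M : Merged)
    ++Histogram[M.NumFiles];
  return Histogram;
}
```

With the example above (two blocks of 5 adjacent files plus seven isolated files), 17 per-file entries would collapse to 9 merged ranges, so the benefit of merging depends heavily on how often large blocks occur in practice.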


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136624/new/

https://reviews.llvm.org/D136624


