[PATCH] D112915: [clang][modules] Track number of includes per submodule

Duncan P. N. Exon Smith via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Nov 17 13:05:45 PST 2021


dexonsmith added inline comments.


================
Comment at: clang/include/clang/Serialization/ModuleFile.h:399
+  /// include information.
+  llvm::StringMap<llvm::SmallVector<uint64_t, 64>> SubmoduleIncludedFiles;
+
----------------
jansvoboda11 wrote:
> dexonsmith wrote:
> > Each StringMapEntry is going to have a pretty big allocation here, for a 512B vector. Given that it doesn't need to be after creation, it'd be more efficient to use this pattern:
> > ```
> > lang=c++
> > llvm::StringMap<ArrayRef<uint64_t>> SubmoduleIncludedFiles;
> > SpecificBumpPtrAlloc<uint64_t> SubmoduleIncludedFilesAlloc;
> > 
> > // later
> > MutableArrayRef<uint64_t> Files(SubmoduleIncludedFiles.Allocate(Record.size()), Record.size());
> > llvm::copy(Record, Files.begin());
> > SubmoduleIncludedFiles[Key] = Files;
> > ```
> > 
> > That said, I feel like this should just be:
> > ```
> > lang=c++
> > DenseMap<StringRef, StringRef> SubmoduleIncludedFiles;
> > ```
> > The key can point at the name of the submodule, which must already exist somewhere without needing a StringMap to create a new copy of it. The value is a lazily deserialized blob.
> I switched to `StringRef` value in the latest revision. Unfortunately, had to use `std::string` as the key instead of `StringRef`, since `getFullModuleName` constructs the string on heap. That forced me to use `std::map`, too. I'll explore using something else entirely as the key.
Oh, if the key isn't being kept alive elsewhere, you can/should use `StringMap<StringRef>`, which is strictly better than `std::map<std::string, StringRef>` (except for non-deterministic iteration order). I suggested otherwise because I assumed the key already existed.

Also, if the map isn't being deleted from (much), might as well bump-ptr-allocate it:
```
lang=c++
BumpPtrAllocator SubmoduleIncludedFilesAlloc;
StringMap<StringRef, BumpPtrAllocator> SubmoduleIncludedFiles; // iniitalized with teh Alloc above.
```

> I'll explore using something else entirely as the key.

Not necessarily important; just if the key was already (e.g., stored in its `Module` or something) you might as well use it. Unless, maybe if the `Module` is known to be constructed already, you could use its address?


================
Comment at: clang/include/clang/Serialization/ModuleFile.h:399
+  /// include information.
+  llvm::StringMap<llvm::SmallVector<uint64_t, 64>> SubmoduleIncludedFiles;
+
----------------
dexonsmith wrote:
> jansvoboda11 wrote:
> > dexonsmith wrote:
> > > Each StringMapEntry is going to have a pretty big allocation here, for a 512B vector. Given that it doesn't need to be after creation, it'd be more efficient to use this pattern:
> > > ```
> > > lang=c++
> > > llvm::StringMap<ArrayRef<uint64_t>> SubmoduleIncludedFiles;
> > > SpecificBumpPtrAlloc<uint64_t> SubmoduleIncludedFilesAlloc;
> > > 
> > > // later
> > > MutableArrayRef<uint64_t> Files(SubmoduleIncludedFiles.Allocate(Record.size()), Record.size());
> > > llvm::copy(Record, Files.begin());
> > > SubmoduleIncludedFiles[Key] = Files;
> > > ```
> > > 
> > > That said, I feel like this should just be:
> > > ```
> > > lang=c++
> > > DenseMap<StringRef, StringRef> SubmoduleIncludedFiles;
> > > ```
> > > The key can point at the name of the submodule, which must already exist somewhere without needing a StringMap to create a new copy of it. The value is a lazily deserialized blob.
> > I switched to `StringRef` value in the latest revision. Unfortunately, had to use `std::string` as the key instead of `StringRef`, since `getFullModuleName` constructs the string on heap. That forced me to use `std::map`, too. I'll explore using something else entirely as the key.
> Oh, if the key isn't being kept alive elsewhere, you can/should use `StringMap<StringRef>`, which is strictly better than `std::map<std::string, StringRef>` (except for non-deterministic iteration order). I suggested otherwise because I assumed the key already existed.
> 
> Also, if the map isn't being deleted from (much), might as well bump-ptr-allocate it:
> ```
> lang=c++
> BumpPtrAllocator SubmoduleIncludedFilesAlloc;
> StringMap<StringRef, BumpPtrAllocator> SubmoduleIncludedFiles; // iniitalized with teh Alloc above.
> ```
> 
> > I'll explore using something else entirely as the key.
> 
> Not necessarily important; just if the key was already (e.g., stored in its `Module` or something) you might as well use it. Unless, maybe if the `Module` is known to be constructed already, you could use its address?
> I switched to `StringRef` value in the latest revision. Unfortunately, had to use `std::string` as the key instead of `StringRef`, since `getFullModuleName` constructs the string on heap. That forced me to use `std::map`, too. I'll explore using something else entirely as the key.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112915/new/

https://reviews.llvm.org/D112915



More information about the cfe-commits mailing list