[PATCH] D97204: [RFC] Clang 64-bit source locations

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 24 15:18:45 PST 2021


rsmith added a comment.

Thanks for doing this!

The 8-9% memory hit is better than I'd feared, but still seems uncomfortably large. I've left comments on a couple of places where I think we could substantially reduce this.
The performance regression seems large enough that people will notice, but perhaps not prohibitively huge. I assume that's all caused by increasing the size of the working set.

Can we avoid a libclang ABI break if we don't allow the use of 64-bit source locations for builds with 32-bit pointers?

It would be useful to get data points from more users.



================
Comment at: clang/include/clang/AST/DeclBase.h:1763-1784
+    static_assert(sizeof(DeclContextBitfields) <= 16,
+                  "DeclContextBitfields is larger than 16 bytes!");
+    static_assert(sizeof(TagDeclBitfields) <= 16,
+                  "TagDeclBitfields is larger than 16 bytes!");
+    static_assert(sizeof(EnumDeclBitfields) <= 16,
+                  "EnumDeclBitfields is larger than 16 bytes!");
+    static_assert(sizeof(RecordDeclBitfields) <= 16,
----------------
I think allowing these to grow is likely a bad tradeoff -- we're paying 8 bytes per declaration in order to make only `ObjCContainerDecl`s smaller (actually, not even that -- there'll be 4 bytes of padding in an `ObjCContainerDecl` either way in 64-bit-`SourceLocation` mode). Can you try moving the `SourceLocation` out of `ObjCContainerDeclBitfields` and into `ObjCContainerDecl` and see if that makes a noticeable difference to the memory overhead of this patch?


================
Comment at: clang/include/clang/AST/Stmt.h:1154
   Stmt(StmtClass SC) {
-    static_assert(sizeof(*this) <= 8,
+    static_assert(sizeof(*this) <= 16,
                   "changing bitfields changed sizeof(Stmt)");
----------------
Oof, we're wasting a lot of space here. Lots of `Stmt` subclasses only need a couple of bytes of bitfields, but we're padding them all out to 16 because some of them store a `SourceLocation`. It'd be nice to estimate how much of that we could get back by moving source locations to the derived classes, if you're prepared to do the work there. (I suspect that would result in too many `#ifdef`s to be reasonable to check in, though.)


================
Comment at: clang/include/clang/Analysis/ProgramPoint.h:90-111
+  union PtrOrSLoc {
+    PtrOrSLoc() {}
+    PtrOrSLoc(const void *P) {
+      memset(Raw.buffer, 0, sizeof(Raw.buffer));
+      Ptr = P;
+    }
+    PtrOrSLoc(SourceLocation L) {
----------------
Can we reasonably say that 32-bit clang builds don't support 64-bit `SourceLocation`s, and keep using a `void*` (or `uintptr_t`) here?


================
Comment at: clang/lib/Serialization/ASTReader.cpp:3893-3894
 
-    uint32_t SLocOffset =
-        endian::readNext<uint32_t, little, unaligned>(Data);
+    SourceLocation::UIntType SLocOffset =
+        endian::readNext<SourceLocation::UIntType, little, unaligned>(Data);
     uint32_t IdentifierIDOffset =
----------------
I think we should ensure the on-disk AST file format doesn't depend on the new macro. Going to 64 bits unconditionally here is probably fine.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97204/new/

https://reviews.llvm.org/D97204



More information about the cfe-commits mailing list