[PATCH] D76594: [clang][AST] Support AST files larger than 512M

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Apr 14 05:18:10 PDT 2020


sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.

We're starting to see quite a lot of production crashes with this and would very much like a fix.
At this point it seems like we're best moving ahead and if someone comes up with a better idea, we can improve it later.



================
Comment at: clang/include/clang/Serialization/ASTBitCodes.h:220
     /// Source range/offset of a preprocessed entity.
     struct DeclOffset {
+      /// Raw source location. The unsigned i.e. 32-bit integer is enough for
----------------
DmitryPolukhin wrote:
> sammccall wrote:
> > Is there one of these for every decl in the module? It seems like we're probably giving up a good fraction of the 4% increase for just using absolute 64 bit offsets everywhere :-( Is there still a significant gain from using section-relative elsewhere?
> > 
> > If there are indeed lots of these, giving up 4 bytes to padding (in addition to the wide offset) seems unfortunate and because we memcpy the structs into the AST file seems like a sad reason :-)
> > Can we align this to 4 bytes instead?
> > (e.g. by splitting into two fields and encapsulating the few direct accesses, though there's probably a neater way)
> Yes, it seems that all referenced decls should have an entry in this table. 4% increase was counted on previous diff before I discovered DeclOffsets and TypeOffsets. This diff uses relative offsets for previous cases and increase is only 2 additional 64-bit integers per file. DeclOffsets are about 4.7% and TypeOffsets are about 2.5% of the whole file. With this diff both number will be 2x, splitting DeclOffset in two separate arrays could make the increase only 1.5x for DeclOffsets. If approach from this diff looks fine, I can split array of DeclOffset into 2 separate arrays. It could be a bit less efficient because of worse memory locality but will save some space.
Or even just 32-bit BitOffsetLow and BitOffsetHigh fields so the overall struct is 32-bit aligned... then you'd still be local?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76594/new/

https://reviews.llvm.org/D76594





More information about the cfe-commits mailing list