[PATCH] D109632: [clang] de-duplicate methods from AST files

Richard Howell via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 14 16:46:11 PDT 2021


rmaz added a comment.

In D109632#3000469 <https://reviews.llvm.org/D109632#3000469>, @vsapsai wrote:

> I'm a little bit confused here. Where is the speed-up coming from? Is it because ObjCMethodList for `init` not being serialized? I'm asking because right now I don't see how `seen` helps with that.
>
> My current understanding is that `Sema::addMethodToGlobalList` is too expensive and `seen` reduces the number of calls to it. So maybe it makes sense to invest into making it faster? For example, `Sema::MatchTwoMethodDeclarations` can return early if both method decl pointers are equal. But I haven't done any measurements, that's just an example.

The speedup is coming from reducing the number of times `Sema::addMethodToGlobalList` is called when looking up methods loaded from modules. From what I can see each serialized module will contain an ObjCMethodList with all the methods from all visible modules at the point of serialization. This means that if you have a lot of modules that redeclare `-(id)init`, which is a common pattern in our code, then each serialized module will contain a lot of shared method decls, which we do not de-duplicate. To illustrate this, here are the top 5 instance method counts sorted by amount of duplicates (using pointer comparison) from a pass of the `ReadMethodPoolVisitor` for a single file:

| **selector** | **# methods** | **# unique** | **# duplicated** |
| init         | 9825          | 1652         | 8173             |
| init         | 2155          | 930          | 1225             |
| copy         | 1398          | 633          | 765              |
| init         | 1027          | 270          | 757              |
| init         | 948           | 410          | 538              |
|

Without having a set to deduplicate I'm not sure how we could make `Sema::addMethodToGlobalList` fast enough, wouldn't it require a list traversal for each insert, making it O(n^2)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109632/new/

https://reviews.llvm.org/D109632



More information about the cfe-commits mailing list