[PATCH] D109632: [clang] de-duplicate methods from AST files

Fri Oct 1 14:14:30 PDT 2021

vsapsai added a comment.

In D109632#3033489 <https://reviews.llvm.org/D109632#3033489>, @rmaz wrote:

> In D109632#3032381 <https://reviews.llvm.org/D109632#3032381>, @vsapsai wrote:
>
>> What's interesting, I was able to trigger more diagnostic. Specifically, I got warnings about `length` ambiguity because in NSStatusItem it is CGFloat <https://developer.apple.com/documentation/appkit/nsstatusitem/1529402-length>, while in NSString it is NSUInteger. I also had more diagnostic that's unclear how it is triggered. It can be a project issue or a bug somewhere else, need to check it more carefully.
>
> Yes, I also had a couple of files fail to compile due to mismatched (or differently matched) selectors.

Diagnostic about `-length` is due to isAcceptableMethodMismatch <https://github.com/llvm/llvm-project/blob/4cdee8de6bad3cf38165bfb7788e58f2469a22ce/clang/lib/Sema/SemaDeclObjC.cpp#L3441-L3461> returning different results depending on the order of methods. Probably other diagnostic is caused by a similar issue. Will need to add tests and to fix `isAcceptableMethodMismatch`.

>> In theory, "set dedupe" approach should be slower than "no external" as we are iterating through shared dependencies. But in practice it isn't, which is puzzling. My current theory is that "no external" also feeds into correctness, so we might be doing more [correct] method overloading checks.
>
> Well, wouldn't the tradeoff there be that we now have to descend into all dependent modules during selector lookup when we didn't have to previously? And this would do more work as only a small subset of those modules would have a selector match, which in the current case has already been handled and serialized on a single method list.

My assumption was that all dependent modules are in memory at this point. And we visit transitive modules only once, so I don't expect it to be a big performance hit (though I can be wrong). And deserializing methods from different modules shouldn't be more work because we are deserializing fewer methods than with "set dedupe". Maybe the order of methods matters? I can see us short circuiting in some cases but not the others. Though that's not a very plausible explanation for 20-25% discrepancy in compilation time.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109632/new/

https://reviews.llvm.org/D109632