[PATCH] D23488: ThinLTO: add early "dead-stripping" on the Index

Teresa Johnson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 22 16:34:14 PST 2016


tejohnson added a comment.

In https://reviews.llvm.org/D23488#630029, @mehdi_amini wrote:

> Forgot to ask, did you collect any statistics?
>  I'd be interested in things like (for let say clang itself):
>
> - ThinLink time
> - Link time
> - Number of imported function
> - Number of imported module
> - Number of internalization
> - Whatever else you think about :)


No, I can collect some stats next week after the holiday. The compile time data may take a bit longer to get (have to run when my machine is otherwise mostly idle).



================
Comment at: include/llvm/LTO/LTO.h:389
+    /// Keep track if the symbol was referenced by the regular LTO partition.
+    bool VisibleToRegularLTOPartition = false;
+
----------------
mehdi_amini wrote:
> tejohnson wrote:
> > mehdi_amini wrote:
> > > tejohnson wrote:
> > > > mehdi_amini wrote:
> > > > > tejohnson wrote:
> > > > > > pcc wrote:
> > > > > > > mehdi_amini wrote:
> > > > > > > > pcc wrote:
> > > > > > > > > Don't you just need one flag: VisibleOutsideThinLTO (or something).
> > > > > > > > I think one flag should be enough, but shouldn't we have the same flag for LTO internalization?
> > > > > > > > 
> > > > > > > Wouldn't that be redundant with Partition == 0?
> > > > > > You're right, we only need one flag (forgot that the VisibleToRegularObj flag was added to GlobalResolutions for this patch). I will combine them as suggested.
> > > > > You're right! 
> > > > > So do we need to record every ThinLTO module as an individual partition? When do we use this?
> > > > > We could have `enum partition { Unkown, Global, LTO, ThinLTO }`?
> > > > > So do we need to record every ThinLTO module as an individual partition? When do we use this?
> > > > 
> > > > To detect when symbols are used by multiple ThinLTO modules (i.e. exported). E.g. it transitions from Unknown -> partition1  when first called with Partition=partition1, then from partition1 -> External if invoked again for the same symbol with a different partition2.
> > > > To detect when symbols are used by multiple ThinLTO modules (i.e. exported). E.g. it transitions from Unknown -> partition1 when first called with Partition=partition1, then from partition1 -> External if invoked again for the same symbol with a different partition2.
> > > 
> > > The way partitions are handled and the flag VisibleOutsideThinLTO is still not clear to me, we know when symbols are exported from the index itself why do we need partitions for that?
> > > 
> > > 
> > > The way partitions are handled and the flag VisibleOutsideThinLTO is still not clear to me, we know when symbols are exported from the index itself why do we need partitions for that?
> > 
> > At some point we have to collate that information to be able to look up which guid are exported from their modules (for creating the ExportedGUIDs set for internalization). We could detect cross module references among ThinLTO modules by walking the index, but since we have the information here already, it seems more efficient to just keep track of it. Otherwise we need to walk the whole index, comparing module paths, etc.
> > At some point we have to collate that information to be able to look up which guid are exported from their modules (for creating the ExportedGUIDs set for internalization). We could detect cross module references among ThinLTO modules by walking the index, but since we have the information here already, it seems more efficient to just keep track of it. Otherwise we need to walk the whole index, comparing module paths, etc.
> 
> Do we have an example that would cover the following case:
> 
> ```
> // main.cpp
> void bar1();
> void bar2();
> void baz();
> void foo1() {
>   bar1();
> }
> void foo2() {
>   bar2();
> }
> int main() {
>   baz()
> }
> ````
> 
> And 
> 
> ````
> // bar.cpp
> void bar1() {
> }
> void bar2() {
>   foo2(); // The cycle shouldn't prevent dead-stripping.
> }
> void baz() {
>    // Adding a call to bar() here should not prevent internalization.
> }
> ````
> 
> With only main exported by the linker, I expect us to be able to:
> 
> 1) dead-strip from the index foo() and bar()
> 2) "internalize" (drop) foo() and bar() eagerly from the IR
> 
> I think the current logic should be OK. But I still find the partition thing and the `ExportedGUIDs` confusing.
> Do we have an example that would cover the following case:
>
> ```
> // main.cpp
> void bar1();
> void bar2();
> void baz();
> void foo1() {
>   bar1();
> }
> void foo2() {
>   bar2();
> }
> int main() {
>   baz()
> }
> ````
>
> And
>
> ````
> // bar.cpp
> void bar1() {
> }
> void bar2() {
>   foo2(); // The cycle shouldn't prevent dead-stripping.
> }
> void baz() {
>    // Adding a call to bar() here should not prevent internalization.

bar1 or bar2? Adding a call to bar2 would prevent its dead stripping and prevent internalization. Adding a call to bar1 should not prevent its internalization.

> }
> ````
>
> With only main exported by the linker, I expect us to be able to:
>
> 1) dead-strip from the index foo() and bar()

I don't have a cycle case like foo2/bar2, will add. The existing test case covers the foo1/bar1 case. Although note if calls to either bar2 or bar1 are added to baz as you suggest above, it will prevent their dead stripping.

> 2) "internalize" (drop) foo() and bar() eagerly from the IR

Yes the existing test case ensures we internalize where appropriate, even when there were cross-module references (but where both caller/callee were dead per new index-based analysis), and that we remove them as appropriate. Other than the cycle case, which I'll add, the above cases are covered by the new test case in the patch.

>
> I think the current logic should be OK. But I still find the partition thing and the `ExportedGUIDs` confusing.


================
Comment at: lib/LTO/LTO.cpp:879
           Res.second.Partition == GlobalResolution::External)
         ExportedGUIDs.insert(GlobalValue::getGUID(Res.second.IRName));
     }
----------------
mehdi_amini wrote:
> Could we *not* add DeadSymbols to `ExportedGUIDs` here and remove the `DeadSymbols.count(GUID)` from the test below?
Good idea.


================
Comment at: lib/LTO/ThinLTOCodeGenerator.cpp:727
             ExportList->second.count(GUID)) ||
-           GUIDPreservedSymbols.count(GUID);
+           (!DeadSymbols.count(GUID) && GUIDPreservedSymbols.count(GUID));
   };
----------------
Actually, in ThinLTOCodeGenerator checks here and elsewhere it looks like this test can just be !DeadSymbols.count(GUID) rather than (!DeadSymbols.count(GUID) && GUIDPreservedSymbols.count(GUID)). Here the same GUIDPreservedSymbols set was passed computeDeadSymbols, so they would already be excluded from the DeadSymbol set. ThinLTOCodeGenerator does not seem to distinguish between references outside of ThinLTO vs cross-references between ThinLTO modules, which in fact means that the dead analysis will be conservative here...


https://reviews.llvm.org/D23488





More information about the llvm-commits mailing list