[llvm-dev] Expected behavior of lld during LTO for global symbols (Attr Internal/Common)

Rui Ueyama via llvm-dev llvm-dev at lists.llvm.org
Mon Jun 24 01:08:24 PDT 2019


The direct cause of this issue is that lld previously converted common
symbols to defined symbols before passing input files to LTO, whereas
after r360841 they are passed to LTO as common symbols. Restoring the old
behavior is easy: we can convert common symbols to defined symbols before
LTO, as we did before. Below is a patch that does that; I confirmed that it
restores the original behavior for the reported issue.
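
As a rough illustration of what "converting common symbols to defined
symbols" amounts to, here is a minimal sketch in C. It is a toy model, not
lld's code: ToySymbol, SymKind, align_to and toy_replace_common_symbols are
hypothetical names made up for this example, and the real
replaceCommonSymbols() works on lld's own symbol and section classes. The
only idea it is meant to show is that each common symbol gets a slot in
.bss at a suitably aligned offset and is treated as an ordinary definition
from then on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol record; not lld's actual data structure. */
enum SymKind { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINED };

struct ToySymbol {
    const char *name;
    enum SymKind kind;
    uint64_t size;       /* size in bytes */
    uint64_t alignment;  /* must be a power of two */
    uint64_t bss_offset; /* meaningful only once kind == SYM_DEFINED */
};

/* Round value up to the next multiple of a power-of-two alignment. */
static uint64_t align_to(uint64_t value, uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Turn every common symbol into a defined symbol placed in a toy .bss.
   Symbols of any other kind are left untouched.
   Returns the total number of .bss bytes consumed. */
uint64_t toy_replace_common_symbols(struct ToySymbol *syms, size_t n) {
    uint64_t bss_end = 0;
    for (size_t i = 0; i < n; i++) {
        if (syms[i].kind != SYM_COMMON)
            continue;
        bss_end = align_to(bss_end, syms[i].alignment);
        syms[i].bss_offset = bss_end;
        syms[i].kind = SYM_DEFINED;
        bss_end += syms[i].size;
    }
    return bss_end;
}
```

With a 1-byte common symbol followed by an 8-byte, 8-aligned one, this
sketch places them at offsets 0 and 8 and reports a .bss size of 16. In
this model, running the conversion before LTO means LTO only ever sees
SYM_DEFINED entries, which corresponds to the pre-r360841 behavior.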

The question is why LTO cannot internalize common symbols under some
conditions. It looks like LTO can internalize them if the only input files
are bitcode files, but not if there is also a DSO among the inputs, even if
that DSO doesn't contain any symbols. I don't fully understand what is
going on yet. I'll try to investigate tomorrow.

diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 008a6cd7954..d9deddbf357 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1789,6 +1789,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
   if (!Config->Relocatable)
     Symtab->scanVersionScript();

+  // Replace common symbols with regular symbols, so that common
+  // symbols in input object files appear as regular symbols in .bss
+  // in the output.
+  replaceCommonSymbols();
+
   // Do link-time optimization if given files are LLVM bitcode files.
   // This compiles bitcode files into real object files.
   //
@@ -1798,6 +1803,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
   if (errorCount())
     return;

+  // LTO may have introduced new common symbols, so convert them
+  // to regular defined symbols.
+  if (!BitcodeFiles.empty())
+    replaceCommonSymbols();
   // If -thinlto-index-only is given, we should create only "index
   // files" and not object files. Index file creation is already done
   // in addCombinedLTOObject, so we are done if that's the case.
@@ -1879,7 +1889,6 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
   if (!Config->Relocatable)
     InputSections.push_back(createCommentSection());

-  // Replace common symbols with regular symbols.
-  replaceCommonSymbols();

   // Do size optimizations: garbage collection, merging of SHF_MERGE sections

On Fri, Jun 21, 2019 at 8:39 PM Rui Ueyama <ruiu at google.com> wrote:

> Let me investigate.
>
> On Fri, Jun 21, 2019 at 5:08 PM Mani, Suresh <Suresh.Mani at amd.com> wrote:
>
>> Thanks for the info Teresa,
>>
>>
>>
>> Regards
>>
>> M Suresh
>>
>>
>>
>> *From:* Teresa Johnson <tejohnson at google.com>
>> *Sent:* Thursday, June 20, 2019 7:15 PM
>> *To:* Mani, Suresh <Suresh.Mani at amd.com>
>> *Cc:* Rui Ueyama <ruiu at google.com>; llvm-dev <llvm-dev at lists.llvm.org>
>> *Subject:* Re: [llvm-dev] Expected behavior of lld during LTO for global
>> symbols (Attr Internal/Common)
>>
>>
>>
>> I haven't had a chance to look, but as mentioned, the linker resolution
>> for the symbol says it is exported, which explains the LTO side behavior.
>> Someone from the linker side will probably need to see what changed after
>> that patch in the symbol info they are giving LTO. If you want, you can
>> debug lld's BitcodeCompiler::add to see what info is different in the
>> Resols array that gets passed to LTO for that symbol, or what else is
>> different in the Sym used to generate the resolution. Both of those are
>> examined in LTO::addModuleToGlobalRes when we note that the symbol is
>> external.
>>
>>
>>
>> Teresa
>>
>>
>>
>> On Thu, Jun 20, 2019 at 2:36 AM Mani, Suresh <Suresh.Mani at amd.com> wrote:
>>
>> Hi Teresa,
>>
>>
>>
>> Can you please let me know if there is any update on this issue.
>>
>>
>>
>> Thanks
>>
>> M Suresh
>>
>>
>>
>> *From:* Teresa Johnson <tejohnson at google.com>
>> *Sent:* Tuesday, June 11, 2019 7:23 PM
>> *To:* Rui Ueyama <ruiu at google.com>
>> *Cc:* Mani, Suresh <Suresh.Mani at amd.com>; llvm-dev <
>> llvm-dev at lists.llvm.org>
>> *Subject:* Re: [llvm-dev] Expected behavior of lld during LTO for global
>> symbols (Attr Internal/Common)
>>
>>
>>
>> LTO can, but it is linker driven. I confirmed that when it is a common
>> symbol the resolution indicates that the symbol is exported, and when I add
>> an initializer so that it is a def we no longer think it is exported and
>> are able to internalize. So this seems to be due to a change in what the
>> linker is telling LTO. I would have to dig in the debugger to confirm, but
>> perhaps lld is now indicating that it might be used by a regular obj? I.e.
>> in BitcodeCompiler::add.
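
To make the "linker driven" part concrete, here is a toy model in C of the
kind of decision being described. It is only a sketch under simplifying
assumptions: ToyResolution and toy_can_internalize are hypothetical names,
the two fields are loosely modeled on the per-symbol resolution flags the
linker hands to LTO, and the real logic in LTO::addModuleToGlobalRes and
the later internalization step takes more into account than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-symbol resolution as reported by the
   linker. Illustrative only; not LLVM's actual data structure. */
struct ToyResolution {
    bool prevailing;             /* this copy of the symbol was chosen */
    bool visible_to_regular_obj; /* a non-bitcode input may reference it */
};

/* In this toy model a symbol may be given internal linkage only if its
   prevailing copy is in bitcode and nothing outside the LTO unit can
   see it. */
bool toy_can_internalize(const struct ToyResolution *r) {
    assert(r != NULL);
    return r->prevailing && !r->visible_to_regular_obj;
}
```

In this model, the regression corresponds to the linker reporting
visible_to_regular_obj as true for a common symbol where it previously
reported false for the equivalent defined symbol. Whether that is what
actually happens in lld is exactly what needs to be confirmed in the
debugger.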
>>
>>
>>
>> Teresa
>>
>>
>>
>> On Tue, Jun 11, 2019 at 5:48 AM Rui Ueyama <ruiu at google.com> wrote:
>>
>> Looks like this is indeed related to r360841.
>>
>>
>>
>> In C, there are distinctions between declarations, definitions and
>> tentative definitions. Global variables declared with "extern" are
>> declarations. Global variables that don't have "extern" and have
>> initializers are definitions. If global variables have neither "extern" nor
>> initializers, they are called tentative definitions.
>>
>>
>>
>> Common symbols represent tentative definitions.
>>
>>
>>
>> Tentative definitions get special treatment in the linker. Usually, if you
>> define the same symbol in two object files, the linker reports an error.
>> However, common symbols are allowed to be duplicated. Two or more common
>> symbols with the same name are merged and then placed in the .bss section,
>> so that they will be zero-initialized at runtime.
>>
>>
>>
>> So, a global variable defined as `struct Node* head` is actually a common
>> symbol.
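
The three cases can be put side by side in a short C sketch. The variable
names other than head are made up for illustration. Whether the tentative
definition really ends up as a common symbol in the object file depends on
the compiler's -fcommon/-fno-common setting; the behavior discussed in
this thread assumes -fcommon is in effect.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
    struct Node *prev;
};

/* Declaration: "extern" and no initializer, so no storage is reserved
   here. It is never referenced below, so nothing needs to define it. */
extern struct Node *declared_elsewhere;

/* Definition: no "extern", and it has an initializer. */
struct Node *defined_head = NULL;

/* Tentative definitions: neither "extern" nor an initializer. C allows
   the same tentative definition to be repeated within one translation
   unit. */
struct Node *head;
struct Node *head;

/* Objects with static storage duration start out zero-initialized,
   whether they were defined with an initializer or only tentatively. */
void check_initial_values(void) {
    assert(defined_head == NULL);
    assert(head == NULL);
}
```

Compiling this file with -fcommon and inspecting the object file with nm
is one way to see which of these symbols the compiler emits as common.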
>>
>>
>>
>> I'm not sure why LTO cannot internalize common symbols though. Teresa, is
>> this expected?
>>
>>
>>
>> On Mon, Jun 10, 2019 at 11:06 PM Teresa Johnson <tejohnson at google.com>
>> wrote:
>>
>> My guess is that it is due to lld change r360841 on that date (Introduce
>> CommonSymbol). +Rui for comments.
>>
>>
>>
>> On Mon, Jun 10, 2019 at 4:45 AM Mani, Suresh via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>
>>
>>
>>
>> Hi ,
>>
>>
>>
>> I have an issue during the LTO phase of the LLVM compiler, which is as follows:
>>
>>
>>
>>
>>
>> File t3.c
>>
>> ---------
>>
>>
>>
>>
>>
>> #include <stdio.h>
>>
>> #include <stdlib.h>
>>
>>
>>
>> // A linked list node
>>
>> struct Node {
>>
>>     int data;
>>
>>     struct Node* next;
>>
>>     struct Node* prev;
>>
>> };
>>
>>
>>
>> struct Node* head;
>>
>>
>>
>> /* Given a reference (pointer to pointer) to the head of a list
>>
>> and an int, inserts a new node on the front of the list. */
>>
>> void push(struct Node** head_ref, int new_data)
>>
>> {
>>
>>     struct Node* new_node = (struct Node*)malloc(sizeof(struct Node));
>>
>>
>>
>>     new_node->data = new_data;
>>
>>
>>
>>     new_node->next = (*head_ref);
>>
>>     new_node->prev = NULL;
>>
>>
>>
>>     if ((*head_ref) != NULL)
>>
>>         (*head_ref)->prev = new_node;
>>
>>
>>
>>     (*head_ref) = new_node;
>>
>> }
>>
>>
>>
>>
>>
>> // This function prints contents of linked list starting from the given node
>>
>> void printList(struct Node* node)
>>
>> {
>>
>>     struct Node* last = NULL;  // stays NULL if the list is empty
>>
>>     printf("\nTraversal in forward direction \n");
>>
>>     while (node != NULL) {
>>
>>         printf(" %d ", node->data);
>>
>>         last = node;
>>
>>         node = node->next;
>>
>>     }
>>
>>
>>
>>     printf("\nTraversal in reverse direction \n");
>>
>>     while (last != NULL) {
>>
>>         printf(" %d ", last->data);
>>
>>         last = last->prev;
>>
>>     }
>>
>> }
>>
>>
>>
>>
>>
>> /* Driver program to test above functions*/
>>
>> int main()
>>
>> {
>>
>>
>>
>>     head = NULL;
>>
>>     push(&head, 7);
>>
>>     push(&head, 1);
>>
>>     push(&head, 4);
>>
>>
>>
>>     printList(head);
>>
>>
>>
>>     return 0;
>>
>> }
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Compiler invocation:
>>
>> --------------------
>>
>>
>>
>> clang -flto -fuse-ld=lld -O3 t3.c -o a.out
>>
>>
>>
>>
>>
>> Expected behavior during LTO:
>>
>> ------------------------------
>>
>>
>>
>> The compiler optimization during LTO needs to figure out that the variable
>> "head" is not referenced by any precompiled object or library.
>>
>> Until May 16, 2019, the variable "head" had internal linkage, as follows:
>>
>>
>>
>> @head = internal global %struct.Node* null, align 8
>>
>>
>>
>> And the compiler was rightly able to recognize that "head" is not
>> referenced by any external precompiled object or library.
>>
>>
>>
>> But after May 16, 2019, the linkage of "head" changed as follows:
>>
>>
>>
>> @head = common dso_local global %struct.Node* null, align 8
>>
>>
>>
>>
>>
>> I am not sure whether this is the correct behavior.
>>
>>
>>
>> If this is the correct behavior, can you please let me know how the
>> compiler could figure out that the variable "head" is not referenced by any
>> external precompiled object or library?
>>
>>
>>
>>
>>
>> Thanks
>>
>> M Suresh
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>>
>>
>> --
>>
>> Teresa Johnson |
>>
>>  Software Engineer |
>>
>>  tejohnson at google.com |
>>
>>
>>
>>
>