[llvm-dev] Expected behavior of lld during LTO for global symbols (Attr Internal/Common)

Teresa Johnson via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 25 06:24:53 PDT 2019


On Mon, Jun 24, 2019 at 1:08 AM Rui Ueyama <ruiu at google.com> wrote:

> The direct cause of this issue is that lld previously converted common
> symbols to defined symbols before passing input files to LTO, whereas
> after r360841 they are passed to LTO as common symbols. Making lld work
> as before is easy: we can convert common symbols to defined symbols
> before LTO again. Here is a patch that does that; I confirmed that it
> restores the original behavior for the reported issue.
>
> The question is why LTO cannot internalize common symbols under some
> conditions. It looks like LTO can internalize them if the only inputs are
> bitcode files, but cannot if a DSO is also present, even if the DSO
> doesn't contain any symbols. I don't fully understand what is going
> on yet. I'll try to investigate tomorrow.
>

LTO doesn't do anything special for common symbols when it processes the
symbol resolutions. It looks like this got fixed in D63752/r364273, which
takes a different approach from the patch below. From that patch's
description, it seems that LLD was incorrectly marking these symbols as
VisibleToRegularObj? That would explain the LTO behavior.
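
As a rough illustration of why that flag matters, here is a minimal C sketch
of the decision. It is a simplified model, not LLVM's code: the struct and
the function can_internalize are hypothetical, and the fields only mirror
the resolution bits discussed in this thread.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the per-symbol resolution a linker
   hands to LTO. Field names are illustrative, not LLVM's API. */
struct resolution {
    bool prevailing;             /* this copy of the symbol was selected */
    bool visible_to_regular_obj; /* a native object or DSO may reference it */
};

/* In this model, a symbol can be internalized only if it is the selected
   copy and nothing outside the LTO unit may reference it. */
bool can_internalize(struct resolution r)
{
    return r.prevailing && !r.visible_to_regular_obj;
}
```

In this model, a common symbol that the linker wrongly reports as visible to
a regular object can never be internalized, which matches the behavior
reported in this thread.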

Teresa


> diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
> index 008a6cd7954..d9deddbf357 100644
> --- a/lld/ELF/Driver.cpp
> +++ b/lld/ELF/Driver.cpp
> @@ -1789,6 +1789,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
>    if (!Config->Relocatable)
>      Symtab->scanVersionScript();
>
> +  // Replace common symbols with regular symbols, so that common
> +  // symbols in input object files appear as regular symbols in .bss
> +  // in the output.
> +  replaceCommonSymbols();
> +
>    // Do link-time optimization if given files are LLVM bitcode files.
>    // This compiles bitcode files into real object files.
>    //
> @@ -1798,6 +1803,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
>    if (errorCount())
>      return;
>
> +  // LTO may have introduced new common symbols, so convert them
> +  // to regular defined symbols.
> +  if (!BitcodeFiles.empty())
> +    replaceCommonSymbols();
>    // If -thinlto-index-only is given, we should create only "index
>    // files" and not object files. Index file creation is already done
>    // in addCombinedLTOObject, so we are done if that's the case.
> @@ -1879,7 +1889,6 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
>    if (!Config->Relocatable)
>      InputSections.push_back(createCommentSection());
>
> -  // Replace common symbols with regular symbols.
> -  replaceCommonSymbols();
>
>    // Do size optimizations: garbage collection, merging of SHF_MERGE sections
>
> On Fri, Jun 21, 2019 at 8:39 PM Rui Ueyama <ruiu at google.com> wrote:
>
>> Let me investigate.
>>
>> On Fri, Jun 21, 2019 at 5:08 PM Mani, Suresh <Suresh.Mani at amd.com> wrote:
>>
>>> Thanks for the info Teresa,
>>>
>>>
>>>
>>> Regards
>>>
>>> M Suresh
>>>
>>>
>>>
>>> *From:* Teresa Johnson <tejohnson at google.com>
>>> *Sent:* Thursday, June 20, 2019 7:15 PM
>>> *To:* Mani, Suresh <Suresh.Mani at amd.com>
>>> *Cc:* Rui Ueyama <ruiu at google.com>; llvm-dev <llvm-dev at lists.llvm.org>
>>> *Subject:* Re: [llvm-dev] Expected behavior of lld during LTO for
>>> global symbols (Attr Internal/Common)
>>>
>>>
>>>
>>>
>>> I haven't had a chance to look, but as mentioned, the linker resolution
>>> for the symbol says it is exported, which explains the LTO-side behavior.
>>> Someone from the linker side will probably need to see what changed in
>>> the symbol info they give LTO after that patch. If you want, you can debug
>>> lld's BitcodeCompiler::add to see what is different in the Resols
>>> array that gets passed to LTO for that symbol, or what else is different
>>> in the Sym used to generate the resolution. Both of those are examined in
>>> LTO::addModuleToGlobalRes when we note that the symbol is external.
>>>
>>>
>>>
>>> Teresa
>>>
>>>
>>>
>>> On Thu, Jun 20, 2019 at 2:36 AM Mani, Suresh <Suresh.Mani at amd.com>
>>> wrote:
>>>
>>> Hi Teresa,
>>>
>>>
>>>
>>> Can you please let me know if there is any update on this issue.
>>>
>>>
>>>
>>> Thanks
>>>
>>> M Suresh
>>>
>>>
>>>
>>> *From:* Teresa Johnson <tejohnson at google.com>
>>> *Sent:* Tuesday, June 11, 2019 7:23 PM
>>> *To:* Rui Ueyama <ruiu at google.com>
>>> *Cc:* Mani, Suresh <Suresh.Mani at amd.com>; llvm-dev <
>>> llvm-dev at lists.llvm.org>
>>> *Subject:* Re: [llvm-dev] Expected behavior of lld during LTO for
>>> global symbols (Attr Internal/Common)
>>>
>>>
>>>
>>>
>>> LTO can, but it is linker driven. I confirmed that when it is a common
>>> symbol the resolution indicates that the symbol is exported, and when I add
>>> an initializer so that it is a def we no longer think it is exported and
>>> are able to internalize. So this seems to be due to a change in what the
>>> linker is telling LTO. I would have to dig in the debugger to confirm, but
>>> perhaps lld is now indicating that it might be used by a regular obj? I.e.
>>> in BitcodeCompiler::add.
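
For reference, the "add an initializer" experiment described above amounts
to something like the following sketch. The helper head_is_null is only
there for illustration and is not part of the original test case.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
    struct Node *prev;
};

/* With an explicit initializer this is a definition rather than a
   tentative definition, so it is not emitted as a common symbol. */
struct Node *head = NULL;

/* Illustrative helper (not in the original test case). */
int head_is_null(void)
{
    return head == NULL;
}
```

Compiling with -fno-common should have a similar effect without touching
the source, since tentative definitions are then emitted as ordinary
zero-initialized definitions.
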
>>>
>>>
>>>
>>> Teresa
>>>
>>>
>>>
>>> On Tue, Jun 11, 2019 at 5:48 AM Rui Ueyama <ruiu at google.com> wrote:
>>>
>>> Looks like this is indeed related to r360841.
>>>
>>>
>>>
>>> In C, there are distinctions between declarations, definitions and
>>> tentative definitions. Global variables declared with "extern" are
>>> declarations. Global variables that don't have "extern" and have
>>> initializers are definitions. If global variables have neither "extern" nor
>>> initializers, they are called tentative definitions.
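
The three cases can be shown side by side in a small C sketch. The variable
and function names here are made up for illustration.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
    struct Node *prev;
};

extern int declared_only;     /* declaration: no storage is allocated here */
int defined_with_init = 42;   /* definition: it has an initializer */
struct Node *tentative_head;  /* tentative definition: no extern, no initializer */

/* A tentative definition still starts out zero-initialized. */
int tentative_is_null(void)
{
    return tentative_head == NULL;
}

int defined_value(void)
{
    return defined_with_init;
}
```

Whether the tentative definition becomes a common symbol in the object file
depends on the compiler's -fcommon / -fno-common setting.
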
>>>
>>>
>>>
>>> Common symbols represent tentative definitions.
>>>
>>>
>>>
>>> Tentative definitions get special treatment in the linker. Usually, if you
>>> define the same symbol in two object files, the linker reports an error.
>>> However, common symbols are allowed to be duplicated. Two or more common
>>> symbols are merged and then placed in the .bss section, so that they will
>>> be zero-initialized at runtime.
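
The merging rule can be modeled roughly as below. This is a simplified
sketch of conventional ELF linker behavior, not lld's actual code: the
types and merge_symbols are hypothetical, and it ignores alignment, weak
symbols, and shared libraries.

```c
#include <stddef.h>

enum sym_kind { SYM_COMMON, SYM_DEFINED };

/* Hypothetical, minimal symbol record for illustration. */
struct sym {
    enum sym_kind kind;
    size_t size;
};

/* Merges two same-named symbols. Returns 0 and writes the result to *out,
   or returns -1 for a duplicate-definition error. */
int merge_symbols(struct sym a, struct sym b, struct sym *out)
{
    if (a.kind == SYM_DEFINED && b.kind == SYM_DEFINED)
        return -1;                      /* two real definitions: error */
    if (a.kind == SYM_DEFINED) {        /* a definition overrides a common */
        *out = a;
        return 0;
    }
    if (b.kind == SYM_DEFINED) {
        *out = b;
        return 0;
    }
    *out = (a.size >= b.size) ? a : b;  /* two commons: keep the larger */
    return 0;
}
```
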
>>>
>>>
>>>
>>> So, a global variable defined as `struct Node* head` is actually a
>>> common symbol.
>>>
>>>
>>>
>>> I'm not sure why LTO cannot internalize common symbols though. Teresa,
>>> is this expected?
>>>
>>>
>>>
>>> On Mon, Jun 10, 2019 at 11:06 PM Teresa Johnson <tejohnson at google.com>
>>> wrote:
>>>
>>> My guess is that it is due to lld change r360841 on that date (Introduce
>>> CommonSymbol). +Rui for comments.
>>>
>>>
>>>
>>> On Mon, Jun 10, 2019 at 4:45 AM Mani, Suresh via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>
>>>
>>>
>>>
>>> Hi ,
>>>
>>>
>>>
>>> I have an issue during the LTO phase of the LLVM compiler, which is as follows:
>>>
>>>
>>>
>>>
>>>
>>> File t3.c
>>>
>>> ---------
>>>
>>>
>>>
>>>
>>>
>>> #include <stdio.h>
>>>
>>> #include <stdlib.h>
>>>
>>>
>>>
>>> // A linked list node
>>>
>>> struct Node {
>>>
>>>     int data;
>>>
>>>     struct Node* next;
>>>
>>>     struct Node* prev;
>>>
>>> };
>>>
>>>
>>>
>>> struct Node* head;
>>>
>>>
>>>
>>> /* Given a reference (pointer to pointer) to the head of a list
>>>
>>> and an int, inserts a new node on the front of the list. */
>>>
>>> void push(struct Node** head_ref, int new_data)
>>>
>>> {
>>>
>>>     struct Node* new_node = (struct Node*)malloc(sizeof(struct Node));
>>>
>>>
>>>
>>>     new_node->data = new_data;
>>>
>>>
>>>
>>>     new_node->next = (*head_ref);
>>>
>>>     new_node->prev = NULL;
>>>
>>>
>>>
>>>     if ((*head_ref) != NULL)
>>>
>>>         (*head_ref)->prev = new_node;
>>>
>>>
>>>
>>>     (*head_ref) = new_node;
>>>
>>> }
>>>
>>>
>>>
>>>
>>>
>>> // This function prints contents of linked list starting from the given node
>>>
>>> void printList(struct Node* node)
>>>
>>> {
>>>
>>>     struct Node* last = NULL;
>>>
>>>     printf("\nTraversal in forward direction \n");
>>>
>>>     while (node != NULL) {
>>>
>>>         printf(" %d ", node->data);
>>>
>>>         last = node;
>>>
>>>         node = node->next;
>>>
>>>     }
>>>
>>>
>>>
>>>     printf("\nTraversal in reverse direction \n");
>>>
>>>     while (last != NULL) {
>>>
>>>         printf(" %d ", last->data);
>>>
>>>         last = last->prev;
>>>
>>>     }
>>>
>>> }
>>>
>>>
>>>
>>>
>>>
>>> /* Driver program to test above functions*/
>>>
>>> int main()
>>>
>>> {
>>>
>>>
>>>
>>>     head = NULL;
>>>
>>>     push(&head, 7);
>>>
>>>     push(&head, 1);
>>>
>>>     push(&head, 4);
>>>
>>>
>>>
>>>     printList(head);
>>>
>>>
>>>
>>>     return 0;
>>>
>>> }
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Compiler invocation:
>>>
>>> --------------------
>>>
>>>
>>>
>>> clang -flto -fuse-ld=lld -O3 t3.c -o a.out
>>>
>>>
>>>
>>>
>>>
>>> Expected behavior during LTO:
>>>
>>> ------------------------------
>>>
>>>
>>>
>>> During LTO, the compiler needs to figure out that the variable
>>> "head" is not referenced by any precompiled object or library.
>>>
>>> Until May 16, 2019, the variable "head" had internal linkage, as follows:
>>>
>>>
>>>
>>> @head = internal global %struct.Node* null, align 8
>>>
>>>
>>>
>>> And the compiler was rightly able to recognize that "head" is not
>>> referenced by any external precompiled object or library.
>>>
>>>
>>>
>>> But after May 16, 2019, the linkage of "head" changed as follows:
>>>
>>>
>>>
>>> @head = common dso_local global %struct.Node* null, align 8
>>>
>>>
>>>
>>>
>>>
>>> I am not sure whether this is the correct behavior.
>>>
>>>
>>>
>>> If this is the correct behavior, can you please let me know how the
>>> compiler can figure out that the variable "head" is not referenced by any
>>> external precompiled object or library?
>>>
>>>
>>>
>>>
>>>
>>> Thanks
>>>
>>> M Suresh
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Teresa Johnson |
>>>
>>>  Software Engineer |
>>>
>>>  tejohnson at google.com |
>>>
>>>
>>>
>>>
>>

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |