[llvm-dev] Expected behavior of lld during LTO for global symbols (Attr Internal/Common)

Rui Ueyama via llvm-dev llvm-dev at lists.llvm.org
Mon Jun 24 01:25:08 PDT 2019


I posted a minimal test case to reproduce the issue to
https://bugs.llvm.org/show_bug.cgi?id=41978.

On Mon, Jun 24, 2019 at 5:16 PM Mani, Suresh <Suresh.Mani at amd.com> wrote:

> Sure Rui, Thanks for the update and investigation.
>
>
>
> Regards
>
> M Suresh
>
>
>
> *From:* Rui Ueyama <ruiu at google.com>
> *Sent:* Monday, June 24, 2019 1:38 PM
> *To:* Mani, Suresh <Suresh.Mani at amd.com>
> *Cc:* Teresa Johnson <tejohnson at google.com>; llvm-dev <
> llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] Expected behavior of lld during LTO for global
> symbols (Attr Internal/Common)
>
>
>
> [CAUTION: External Email]
>
> The direct cause of this issue is that lld previously converted common
> symbols to defined symbols before passing input files to LTO, whereas
> after r360841 they are passed to LTO as common symbols. Restoring the old
> behavior is easy: we can convert common symbols to defined symbols before
> LTO runs, as before. Here is a patch that does that; I confirmed that it
> restores the original behavior for the reported issue.
>
>
>
> The question is why LTO cannot internalize common symbols under some
> conditions. It looks like LTO can internalize them if the inputs are only
> bitcode files, but not if any DSO is also among the inputs, even if the
> DSOs don't contain any symbols. I don't fully understand what is going
> on yet. I'll try to investigate tomorrow.
>
>
>
> diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
> index 008a6cd7954..d9deddbf357 100644
> --- a/lld/ELF/Driver.cpp
> +++ b/lld/ELF/Driver.cpp
> @@ -1789,6 +1789,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
>    if (!Config->Relocatable)
>      Symtab->scanVersionScript();
>
> +  // Replace common symbols with regular symbols, so that common
> +  // symbols in input object files appear as regular symbols in .bss
> +  // in the output.
> +  replaceCommonSymbols();
> +
>    // Do link-time optimization if given files are LLVM bitcode files.
>    // This compiles bitcode files into real object files.
>    //
> @@ -1798,6 +1803,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
>    if (errorCount())
>      return;
>
> +  // LTO may have introduced new common symbols, so convert them
> +  // to regular defined symbols.
> +  if (!BitcodeFiles.empty())
> +    replaceCommonSymbols();
>    // If -thinlto-index-only is given, we should create only "index
>    // files" and not object files. Index file creation is already done
>    // in addCombinedLTOObject, so we are done if that's the case.
> @@ -1879,7 +1889,6 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
>    if (!Config->Relocatable)
>      InputSections.push_back(createCommentSection());
>
> -  // Replace common symbols with regular symbols.
> -  replaceCommonSymbols();
>
>    // Do size optimizations: garbage collection, merging of SHF_MERGE sections
>
>
>
> On Fri, Jun 21, 2019 at 8:39 PM Rui Ueyama <ruiu at google.com> wrote:
>
> Let me investigate.
>
>
>
> On Fri, Jun 21, 2019 at 5:08 PM Mani, Suresh <Suresh.Mani at amd.com> wrote:
>
> Thanks for the info Teresa,
>
>
>
> Regards
>
> M Suresh
>
>
>
> *From:* Teresa Johnson <tejohnson at google.com>
> *Sent:* Thursday, June 20, 2019 7:15 PM
> *To:* Mani, Suresh <Suresh.Mani at amd.com>
> *Cc:* Rui Ueyama <ruiu at google.com>; llvm-dev <llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] Expected behavior of lld during LTO for global
> symbols (Attr Internal/Common)
>
>
>
> I haven't had a chance to look, but as mentioned, the linker resolution
> for the symbol says it is exported, which explains the LTO-side behavior.
> Someone from the linker side will probably need to see what changed, after
> that patch, in the symbol info lld gives to LTO. If you want, you can debug
> lld's BitcodeCompiler::add to see what is different in the Resols array
> entry for that symbol that gets passed to LTO, or what else is different in
> the Sym used to generate the resolution. Both of those are examined in
> LTO::addModuleToGlobalRes when we note that the symbol is external.
>
>
>
> Teresa
>
>
>
> On Thu, Jun 20, 2019 at 2:36 AM Mani, Suresh <Suresh.Mani at amd.com> wrote:
>
> Hi Teresa,
>
>
>
> Can you please let me know if there is any update on this issue?
>
>
>
> Thanks
>
> M Suresh
>
>
>
> *From:* Teresa Johnson <tejohnson at google.com>
> *Sent:* Tuesday, June 11, 2019 7:23 PM
> *To:* Rui Ueyama <ruiu at google.com>
> *Cc:* Mani, Suresh <Suresh.Mani at amd.com>; llvm-dev <
> llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] Expected behavior of lld during LTO for global
> symbols (Attr Internal/Common)
>
>
>
> LTO can, but it is linker driven. I confirmed that when it is a common
> symbol the resolution indicates that the symbol is exported, and when I add
> an initializer so that it is a def we no longer think it is exported and
> are able to internalize. So this seems to be due to a change in what the
> linker is telling LTO. I would have to dig in the debugger to confirm, but
> perhaps lld is now indicating that it might be used by a regular obj? I.e.
> in BitcodeCompiler::add.
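
To make the reasoning above concrete, here is a toy model in C of the
decision being described. It is only a sketch: the struct and function names
are hypothetical and are not LLVM's or lld's API. The two flags loosely
stand in for the kind of per-symbol information the linker hands to LTO.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-symbol resolution that the
   linker passes to LTO. This is not the real LLVM data structure. */
struct toy_resolution {
    bool prevailing;             /* this copy of the symbol was chosen by the linker */
    bool visible_to_regular_obj; /* linker says non-bitcode code may reference it */
};

/* LTO may only turn a symbol into an internal one when the linker has
   promised that nothing outside the LTO unit can see it. */
bool toy_can_internalize(struct toy_resolution r)
{
    return r.prevailing && !r.visible_to_regular_obj;
}
```

In this model, the behavior reported in the thread corresponds to the linker
setting the "visible" flag for a common symbol but not for an initialized
definition, so only the latter can be internalized.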
>
>
>
> Teresa
>
>
>
> On Tue, Jun 11, 2019 at 5:48 AM Rui Ueyama <ruiu at google.com> wrote:
>
> Looks like this is indeed related to r360841.
>
>
>
> In C, there are distinctions between declarations, definitions and
> tentative definitions. Global variables declared with "extern" are
> declarations. Global variables that don't have "extern" and have
> initializers are definitions. If global variables have neither "extern" nor
> initializers, they are called tentative definitions.
>
>
>
> Common symbols represent tentative definitions.
>
>
>
> Tentative definitions get special treatment in the linker. Usually, if you
> define the same symbol in two object files, the linker reports an error.
> However, common symbols are allowed to be duplicated. Two or more common
> symbols with the same name are merged and then placed in the .bss section,
> so that they will be zero-initialized at runtime.
>
>
>
> So, a global variable defined as `struct Node* head` is actually a common
> symbol.
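
The three kinds of global described above can be put side by side in a small
C sketch. The names declared_only, defined_here and head_is_null are
hypothetical and used only for illustration. Whether the tentative
definition is actually emitted as a common symbol depends on the compiler's
-fcommon / -fno-common setting.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
    struct Node *prev;
};

extern int declared_only;  /* declaration: "extern", no storage reserved here */
int defined_here = 42;     /* definition: no "extern", has an initializer */
struct Node *head;         /* tentative definition: neither "extern" nor initializer */
struct Node *head;         /* repeating a tentative definition in one file is legal C */

/* Returns 1 if the tentatively defined variable was zero-initialized. */
int head_is_null(void)
{
    return head == NULL;
}
```

Because declared_only is never referenced, the sketch links without any
other file providing it.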
>
>
>
> I'm not sure why LTO cannot internalize common symbols though. Teresa, is
> this expected?
>
>
>
> On Mon, Jun 10, 2019 at 11:06 PM Teresa Johnson <tejohnson at google.com>
> wrote:
>
> My guess is that it is due to lld change r360841 on that date (Introduce
> CommonSymbol). +Rui for comments.
>
>
>
> On Mon, Jun 10, 2019 at 4:45 AM Mani, Suresh via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
>
>
> Hi,
>
>
>
> I have an issue during the LTO phase of the LLVM compiler, which is as follows:
>
>
>
>
>
> File t3.c
>
> ---------
>
>
>
>
>
> #include <stdio.h>
>
> #include <stdlib.h>
>
>
>
> // A linked list node
>
> struct Node {
>
>     int data;
>
>     struct Node* next;
>
>     struct Node* prev;
>
> };
>
>
>
> struct Node* head;
>
>
>
> /* Given a reference (pointer to pointer) to the head of a list
>
> and an int, inserts a new node on the front of the list. */
>
> void push(struct Node** head_ref, int new_data)
>
> {
>
>     struct Node* new_node = (struct Node*)malloc(sizeof(struct Node));
>
>
>
>     new_node->data = new_data;
>
>
>
>     new_node->next = (*head_ref);
>
>     new_node->prev = NULL;
>
>
>
>     if ((*head_ref) != NULL)
>
>         (*head_ref)->prev = new_node;
>
>
>
>     (*head_ref) = new_node;
>
> }
>
>
>
>
>
> // This function prints contents of linked list starting from the given node
>
> void printList(struct Node* node)
>
> {
>
>     struct Node* last = NULL;
>
>     printf("\nTraversal in forward direction \n");
>
>     while (node != NULL) {
>
>         printf(" %d ", node->data);
>
>         last = node;
>
>         node = node->next;
>
>     }
>
>
>
>     printf("\nTraversal in reverse direction \n");
>
>     while (last != NULL) {
>
>         printf(" %d ", last->data);
>
>         last = last->prev;
>
>     }
>
> }
>
>
>
>
>
> /* Driver program to test above functions*/
>
> int main()
>
> {
>
>
>
>     head = NULL;
>
>     push(&head, 7);
>
>     push(&head, 1);
>
>     push(&head, 4);
>
>
>
>     printList(head);
>
>
>
>     return 0;
>
> }
>
>
>
>
>
>
>
>
>
> Compiler invocation:
>
> --------------------
>
>
>
> clang -flto -fuse-ld=lld -O3 t3.c -o a.out
>
>
>
>
>
> Expected behavior during LTO:
>
> ------------------------------
>
>
>
> The compiler optimization during LTO needs to figure out that variable
> "head" is not referenced by any precompiled object or library.
>
> Until May 16, 2019, variable "head" had the internal attribute, as follows:
>
>
>
> @head = internal global %struct.Node* null, align 8
>
>
>
> And the compiler was rightly able to recognize that "head" is not referenced
> by any external precompiled object or library.
>
>
>
> But after May 16, 2019, the attribute of "head" changed as follows:
>
>
>
> @head = common dso_local global %struct.Node* null, align 8
>
>
>
>
>
> I am not sure whether this is the correct behavior.
>
>
>
> If this is the correct behavior, can you please let me know how the
> compiler could figure out that variable "head" is not referenced by any
> external precompiled object or library?
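
One source-level option, assuming "head" is only ever used inside t3.c, is to
give it internal linkage explicitly with "static". Clang then emits the
variable as an internal global regardless of what the linker reports. The
following is a minimal sketch, not a fix for the lld behavior itself;
push_front and list_length are hypothetical helper names.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
    struct Node *prev;
};

/* "static" gives head internal linkage, so no other object file or
   library can refer to it by name. */
static struct Node *head;

/* Inserts a new node at the front of the list. Returns 0 on success,
   -1 if allocation fails. */
int push_front(int new_data)
{
    struct Node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return -1;
    new_node->data = new_data;
    new_node->next = head;
    new_node->prev = NULL;
    if (head != NULL)
        head->prev = new_node;
    head = new_node;
    return 0;
}

/* Counts the nodes reachable from head. */
int list_length(void)
{
    int n = 0;
    for (const struct Node *p = head; p != NULL; p = p->next)
        n++;
    return n;
}
```

This does not apply when other translation units really do need to share the
variable.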
>
>
>
>
>
> Thanks
>
> M Suresh
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson at google.com