[llvm-dev] Support for zero flag ELF section groups in LLVM IR

Reid Kleckner via llvm-dev llvm-dev at lists.llvm.org
Thu Feb 11 14:04:07 PST 2021


We are already using LLVM IR comdat groups for the same purpose, linker GC
association, on COFF. I think we just need a flag that marks an ELF comdat
group as not actually being common data that the linker should deduplicate
by name, i.e. a zero flag group. See how Windows ASan uses comdat groups on
internal globals for metadata registration:

$ cat t.cpp
int f();
static int gv = f();

$ clang -S t.cpp --target=x86_64-windows-msvc -o - -emit-llvm -fsanitize=address
...
$gv = comdat noduplicates
...
@gv = internal global { i32, [60 x i8] } zeroinitializer, comdat, align 32
...
@__asan_global_gv = private global { i64, i64, i64, i64, i64, i64, i64, i64 } {
    i64 ptrtoint ({ i32, [60 x i8] }* @gv to i64), i64 4, i64 64,
    i64 ptrtoint ([3 x i8]* @___asan_gen_.1 to i64),
    i64 ptrtoint ([6 x i8]* @___asan_gen_ to i64), i64 1,
    i64 ptrtoint ({ [6 x i8]*, i32, i32 }* @___asan_gen_.3 to i64), i64 -1
  }, section ".ASAN$GL", comdat($gv), align 64, !associated !0


We are using the "noduplicates" comdat flag here, but @gv has internal
linkage, and COFF linkers merge symbols, not section group names, so this
code does what we want it to. Maybe it would make more sense if we used
some kind of portable flag, like "internal" or "unique" on the comdat group
to indicate that the group doesn't participate in merging. On COFF, we'd
have the limitation that this feature only works for comdat groups named
after internal linkage globals, but on ELF, the group could have any name.
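
To make the merging difference concrete, here is a toy C++ model. It is not
LLVM or linker code: the GroupKind enum, the NoMerge kind (standing in for
the "internal"/"unique" flag floated above), and the section names are all
made up for illustration. It only sketches the idea that a comdat group is
deduplicated by name while a non-merging group never is.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model only. These types are hypothetical; they are not LLVM or
// linker APIs.
enum class GroupKind {
  Comdat,   // merged by group name across input files
  NoMerge,  // hypothetical "internal"/"unique" flag: never merged by name
};

struct SectionGroup {
  std::string Name;
  GroupKind Kind;
  std::string ObjectFile;             // input file that defined the group
  std::vector<std::string> Sections;  // members kept or dropped together
};

// Returns the groups that survive name-based merging. The first Comdat
// group with a given name wins; later ones with that name are dropped as
// a whole. NoMerge groups are always kept, whatever their name.
std::vector<SectionGroup>
resolveGroups(const std::vector<SectionGroup> &Inputs) {
  std::set<std::string> SeenComdatNames;
  std::vector<SectionGroup> Kept;
  for (const SectionGroup &G : Inputs) {
    if (G.Kind == GroupKind::Comdat &&
        !SeenComdatNames.insert(G.Name).second)
      continue;  // duplicate comdat: discard the whole group
    Kept.push_back(G);
  }
  return Kept;
}

// Two object files each define a comdat group "f" and a non-merging
// group "gv" (illustrative names and sections).
std::vector<SectionGroup> exampleInputs() {
  return {
      {"f", GroupKind::Comdat, "a.o", {".text.f"}},
      {"gv", GroupKind::NoMerge, "a.o", {".data.gv", ".meta.gv"}},
      {"f", GroupKind::Comdat, "b.o", {".text.f"}},
      {"gv", GroupKind::NoMerge, "b.o", {".data.gv", ".meta.gv"}},
  };
}
```

Calling resolveGroups(exampleInputs()) keeps one copy of "f" (from a.o) and
both copies of "gv", which is the behavior we would want for groups attached
to internal globals that happen to share a name across translation units.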

You could rename the Comdat class to Group or SectionGroup or something,
but I'm not sure there's much value in it. The terminology as it is makes
sense for COFF, if not for ELF. ELF distinguishes between comdat section
groups and non-comdat section groups, whereas MSVC and clang-cl use the
IMAGE_SCN_LNK_COMDAT section flag and the IMAGE_COMDAT_SELECT_ASSOCIATIVE
selection type to implement these types of groups.

Then, there's the cost of churning the textual IR spellings and method
names. We have the freedom to change these things, but we should
acknowledge that it does create work for ourselves and others. IMO, it is
worth living with COFF-centric naming of an IR feature to avoid paying
these costs. However, I am probably biased, as I have been calling this
idea of a group of sections that travel together a "comdat" for a while now.

On Thu, Feb 11, 2021 at 12:00 AM Petr Hosek via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> D95851 introduces support for zero flag ELF section groups to LLVM. LLVM
> already supports COMDAT sections, which in ELF are a special type of ELF
> section groups. These are generally useful to enable linker GC where you
> want a group of sections to always travel together, that is to be either
> retained or discarded as a whole, but without the COMDAT semantics. Other
> ELF assemblers and linkers already support zero flag ELF section groups and
> this change helps us reach feature parity.
>
> An open question is how to best represent these in LLVM IR.
>
> We represent COMDAT sections as global variables and other global
> variables can be included in COMDAT sections, see
> https://llvm.org/docs/LangRef.html#comdats for details.
>
> We want to capture the fact that COMDAT sections are a special type of ELF
> section groups and we also want to preserve the existing syntax and API for
> backwards compatibility, but also because other formats like COFF support
> COMDAT sections, but not section groups.
>
> Our proposal is to introduce ELF section groups as a new type of global
> variable akin to COMDAT sections. We would extend the language by changing:
>
>   [, comdat[($name)]]
>
> when declaring a global variable to:
>
>   [, \(group[($name)] | [group] comdat[($name)]\)]
>
> When it comes to C++ API, we would introduce Group as a superclass of
> Comdat:
>
>   class Group {
>     StringRef getName() const;
>   };
>   class Comdat : public Group {
>     ...
>   };
>   class GlobalObject : public GlobalValue {
>     ...
>     bool hasGroup();
>     Group *getGroup();
>     void setGroup(Group G);
>     // has/get/setComdat functions re-implemented in terms of
>     // has/get/setGroup
>     ...
>   };
>
> Does this make sense? Can anyone think of a better representation?