[llvm-dev] RFC - a proposal to support additional symbol metadata in ELF object files in the ARM compiler

Christof Douma via llvm-dev llvm-dev at lists.llvm.org
Wed May 1 05:22:49 PDT 2019


Hi Todd.

As you and Peter mentioned, there are indeed toolchains that allow placement at a specific location from within the C/C++ source code, using attributes or similar. I always wonder whether such an extension is worth the effort. There are downsides, such as the non-standard ways of communicating this information to the linker and having two different places that control the location of things (the linker sources and the compiler sources). I would love to understand more about what is problematic in the more common placement approaches that are already available.

The conceptual model I follow is that the C/C++ source describes the semantics of the program, and the linker sources (LD scripts or similar, depending on the toolchain in use) describe the placement of the program on the system/device. This gives rise to two widely used ways of placing objects that work without any non-standard extensions:

* Define a variable in C/C++ in a dedicated section that the linker can place individually (the 'section' attribute in the compiler, and regular section placement in the linker).
* Define a symbol in the linker at a certain place and use an extern declaration in C/C++. At this point you can either take its address (commonly used) or use it as a regular object (less common).
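
As a minimal sketch of both approaches (assuming an ELF target, GCC or Clang, and GNU ld or lld; the names cfgdata, board_revision and in_cfgdata are made up for illustration), the object below goes into its own section and the helper reads symbols that the linker defines. To stay linkable without a linker script, the sketch relies on the __start_<section>/__stop_<section> symbols that these linkers create for sections whose names are valid C identifiers. On a real device the symbol would more likely come from an assignment in the linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Method 1: give the object a dedicated section. A linker script can then
   place the "cfgdata" section wherever the device requires.
   "used" keeps the object even if nothing in this file references it. */
__attribute__((section("cfgdata"), used))
uint32_t board_revision = 4;

/* Method 2: declare symbols that the linker defines.
   Assumption: GNU ld or lld, which provide these two symbols automatically
   because "cfgdata" is a valid C identifier. */
extern uint32_t __start_cfgdata[];
extern uint32_t __stop_cfgdata[];

/* Only the addresses of the linker-defined symbols are used here,
   which is the common way of consuming them. */
int in_cfgdata(const void *p)
{
    uintptr_t a;

    assert(p != NULL);
    a = (uintptr_t)p;
    return a >= (uintptr_t)__start_cfgdata && a < (uintptr_t)__stop_cfgdata;
}
```

Calling in_cfgdata(&board_revision) should report that the object lies inside the section, while the address of a local variable should not.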

I am very interested to hear what the weaknesses of these methods are, to understand the need for a 'location' attribute.

Thanks,
Christof

On 30/04/2019, 16:51, "llvm-dev on behalf of Peter Smith via llvm-dev" <llvm-dev-bounces at lists.llvm.org on behalf of llvm-dev at lists.llvm.org> wrote:

    On Tue, 30 Apr 2019 at 16:17, Snider, Todd via llvm-dev
    <llvm-dev at lists.llvm.org> wrote:
    >
    >
    >
    > Hello All,
    >
    >
    >
    > In ARM embedded applications, there are some compilers that support useful function and variable attributes that help the compiler communicate information about symbols to downstream object consumers (i.e. linkers).
    >
    >
    >
    > One such attribute is the “location” attribute. This attribute can be applied to a global or local static data object or a function to indicate to the linker that the definition of the data object or function should be placed at a specific address in memory.
    >
    >
    >
    > For example, in the following code:
    >
    >
    >
    > #include <stdio.h>
    >
    > extern int a;
    > int a __attribute__((location(0x1000))) = 4;
    >
    > struct bstruct
    > {
    >     int f1;
    >     int f2;
    > };
    >
    > struct bstruct b __attribute__((location(0x1004))) = {10, 12};
    > double c __attribute__((location(0x1010))) = 1.0;
    > char d[] __attribute__((location(0x2000))) = {1, 2, 3, 4};
    > void foo(double x) __attribute((location(0x4000)));
    >
    > void foo(double x) { printf("%f\n", x); }
    >
    >
    >
    > A location attribute has been applied to several data objects and the function “foo”. The compiler would then encode information into the compiled object file that tells the downstream linker about these memory placement constraints on the data objects and function.
    >
    >
    >
    > Without extending the ELF object format, how would this work?
    >
    >
    >
    > I propose to encode metadata information about a symbol in special absolute symbols, “__sym_attr_metadata.<int>”, that the linker can recognize when scanning the symbol table for an incoming object file. In an ELF symbol table entry:
    >
    >
    >
    > typedef struct {
    >        Elf32_Word     st_name;
    >        Elf32_Addr     st_value;
    >        Elf32_Word     st_size;
    >        unsigned char  st_info;
    >        unsigned char  st_other;
    >        Elf32_Half     st_shndx;
    > } Elf32_Sym;
    >
    > typedef struct {
    >        Elf64_Word     st_name;
    >        unsigned char  st_info;
    >        unsigned char  st_other;
    >        Elf64_Half     st_shndx;
    >        Elf64_Addr     st_value;
    >        Elf64_Xword    st_size;
    > } Elf64_Sym;
    >
    >
    >
    > The st_size and st_value fields could be used to represent attribute information about a given symbol:
    >
    >
    >
    > The st_size field can be split into an attribute ID and a symbol index for the symbol that the attribute applies to:
    >
    >     attribute ID: bits 0..7
    >     symbol index: bits 8..31
    >
    > The st_value field can contain the value associated with the attribute (i.e. the address argument of a location attribute)
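
The split of st_size described above could be packed and unpacked along these lines. This is only a sketch: the function and enumerator names are hypothetical, the IDs follow the base set proposed in the RFC, and a 32-bit st_size (Elf32_Sym) is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values 1 and 2 are the base set from the RFC. */
enum { SYM_ATTR_USED = 1, SYM_ATTR_LOCATION = 2 };

/* Pack st_size: attribute ID in bits 0..7, symbol index in bits 8..31. */
uint32_t sym_attr_pack(uint8_t attr_id, uint32_t sym_index)
{
    assert(sym_index <= 0xFFFFFFu); /* only 24 bits are available */
    return (sym_index << 8) | attr_id;
}

/* Extract the attribute ID from the low 8 bits. */
uint8_t sym_attr_id(uint32_t st_size)
{
    return (uint8_t)(st_size & 0xFFu);
}

/* Extract the index of the symbol the attribute applies to. */
uint32_t sym_attr_index(uint32_t st_size)
{
    return st_size >> 8;
}
```

For a location attribute on the symbol at index 5, sym_attr_pack(SYM_ATTR_LOCATION, 5) yields 0x502, and st_value would carry the requested address.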
    >
    >
    >
    > If the compiler is generating assembly code, a new directive similar to the .eabi_attribute can be used:
    >
    >
    >
    >         .symbol_attribute <symbol name>, <attribute kind>, <attribute value>
    >
    >
    >
    > Where:
    >
    > symbol name - will unambiguously identify the symbol that the attribute/value pair applies to
    > attribute kind - is an unsigned integer between 1 and 255 that specifies the kind of attribute to be applied to the symbol
    >     - I propose a starting base set of 2 attribute IDs: used (1), location (2)
    >     - the compiler will emit the integer constant that identifies the attribute kind
    > attribute value - a value that is appropriate for the specified attribute kind
    >
    >
    >
    > Thoughts? Comments? Concerns?
    >

    Hello Todd,

    Thanks for bringing this up, I've got a few comments for you based on
    the implementation of a similar attribute in another Embedded Compiler
    (http://infocenter.arm.com/help/topic/com.arm.doc.dui0472m/chr1359124981140.html).
     In that case it was __attribute__((at(address))) but the name is not
    that important.

    The communication with the linker in that case was via section name
    and not symbol. From memory, at(<address>) translated to a section
    name of .ARM.__at_<address>. For us this had some advantages:
    - We could use __attribute__((section(".ARM.__at_<address>"))) when
    the compiler didn't support the attribute, and it needed no support
    in the assembler. This wasn't ideal, as it is nice to be able to use
    expressions for the address, but it gets you most of the way there.
    - In practice you'd likely need a separate section for each variable
    to avoid problems at link time. For example, if you had two variables
    with non-contiguous locations you'd most likely not want them in the
    same section, so this mapped quite well to something similar to
    __attribute__((section(name))).
    - We did find some properties of __attribute__((section("name")))
    inconvenient, especially that variables would come out as SHT_PROGBITS
    when in many cases the user wanted SHT_NOBITS (memory mapped
    peripheral); our custom attribute fixed that.
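
To illustrate the section-name route, here is a sketch of how a tool could build such a name and how a linker could recover the address from it. The 0x%08lX spelling of the address and both function names are assumptions made for the example, not the format used by the Arm toolchain.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define AT_PREFIX ".ARM.__at_"

/* Producer side: build the section name for an address.
   Returns 0 on success, -1 if the buffer is too small. */
int at_section_name(char *buf, size_t len, uint32_t addr)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, AT_PREFIX "0x%08lX", (unsigned long)addr);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Linker side: if the name carries the prefix and a well-formed number,
   store the address and return 1; otherwise return 0. */
int at_section_addr(const char *name, uint32_t *addr)
{
    size_t plen = strlen(AT_PREFIX);
    char *end;
    unsigned long v;

    if (strncmp(name, AT_PREFIX, plen) != 0 || name[plen] == '\0')
        return 0;
    v = strtoul(name + plen, &end, 0); /* base 0 accepts the 0x prefix */
    if (*end != '\0' || v > 0xFFFFFFFFul)
        return 0;
    *addr = (uint32_t)v;
    return 1;
}
```

Because the address travels in the section name, an ordinary section such as .data is simply rejected by at_section_addr and handled by the normal placement rules.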

    If you used a section name rather than a symbol then you may not need
    any backend changes and it would generalise over all ELF targets.
    Linker support is another question entirely though.

    Peter

    >
    >
    > The anticipated next steps would be to add support for the location attribute and update the ARM/ELF LLVM back-end to support encoding the used attribute with the new mechanism.
    >
    >
    >
    > ~ Todd Snider
    >
    >
    >
    > Code Generation Tools Group
    >
    > Texas Instruments Incorporated
    >
    >
    >
    >
    >
    > _______________________________________________
    > LLVM Developers mailing list
    > llvm-dev at lists.llvm.org
    > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

