[cfe-dev] [llvm-dev] [RFC][ARM] Add support for embedded position-independent code (ROPI/RWPI)

Fri Dec 4 09:45:01 PST 2015

> On Fri, Dec 04, 2015 at 01:46:13PM -0000, Oliver Stannard via llvm-dev
> wrote:
> > In addition to passing the command-line options through to the backend,
> > clang must be changed to work around a limitation in these modes: since
> > there is no dynamic loader, if a variable is initialised to the address
> > of a global value, its initial value is not known at static link time.
> > For example:
> >
> >   extern int i;
> >   int *p = &i; // Initial value unknown at static link time
> >
> > SysV-style PIC solves this by having the dynamic linker fix up any
> > relocations on the data segment. Since these modes are trying to avoid
> > the need for a dynamic linker, we instead have the compiler emit code to
> > initialise these variables at startup time. These initialisers are
> > expected to be rare, so the dynamic initialisers will be smaller than
> > the equivalent dynamic linker plus relocation and symbol tables.
> 
> You don't need a full-blown dynamic linker to handle that, just that the
> linker creates output that can be appropriately referenced by the init
> code.

That works fine for references in writable data, but not for references in read-only data, which a dynamic linker can't change. There are a few different ways this can be solved:
 1) The patch I proposed, which makes const data writable and inserts dynamic initialisers.
 2) Just move const data needing initialisation to an RW section in the compiler, and have a dynamic loader which adjusts it (based on tables generated by the linker). This would provide the same behaviour as option 1, but I don't think it would reduce the clang patch that much. 
 3) Don't make any change in the compiler. The linker can detect that const data needs dynamic relocations, and emit an error, so the user can change their code to make the data non-const. The linker can't make this change automatically, as there is already compiled code which accesses it, and making it non-const changes the way it should be addressed.
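
To make option 1 concrete, here is a rough hand-written sketch of what a dynamic initialiser amounts to. It is only an illustration under assumptions, not the code the patch actually emits: the names `p_storage` and `init_p` are hypothetical, and the call to `init_p` stands in for whatever startup mechanism the toolchain uses.

```c
/* A global whose run-time address is not known at static link time. */
int i = 42;

/*
 * The user's source would have been:
 *
 *     int *const p = &i;
 *
 * In this sketch of option 1 the object is no longer placed in read-only
 * data. It lives in writable storage and is filled in at startup.
 * p_storage is a hypothetical name for that writable backing store.
 */
int *p_storage;

/* Hypothetical stand-in for the compiler-emitted dynamic initialiser.
 * It must run once, before any code reads p_storage. */
void init_p(void)
{
    p_storage = &i;
}
```

After `init_p()` has run, `p_storage` holds the run-time address of `i`, and no relocation on the data segment was needed.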

> I don't think that dynamic initialisers will work correctly at
> all, since you can access "i" in a separate module that doesn't know
> about the initialiser at all.

This is only a problem when a const initialiser has to be placed in a read-write section, as other translation units will access it incorrectly. I've added a warning when this happens.

> Consider taking a look at how most dynamic linkers operate themselves in the
> ELF world. One of the first things they do is relocate themselves by
> processing their own relocation table and applying the fixups. This
> doesn't involve symbol tables at all, just patching up addresses.

That sounds like the same thing as options 2 and 3 above, right? I think the main difference in the embedded world is that code and RO data are stored in ROM or flash which are impossible or slow to overwrite, and minimising the amount of RAM used is desirable. Also, since this isn't being used for actual dynamic linking but just for a few static initialisers, the dynamic loader would be an unnecessary increase in code size.
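
For reference, the kind of fixup pass being described can be sketched in a few lines of C. This is a simplified model under assumptions, not any real loader's code: the function name `apply_relative_fixups` and the table layout (a plain array of word indices into the image) are hypothetical, and real ELF relocation entries carry more information than this.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Add `delta` (the difference between the run-time and link-time base
 * address) to every word in `image` that the table names. No symbol
 * lookup is involved, only address patching.
 *
 * Hypothetical layout: offsets[k] is an index into `image`, in words.
 */
void apply_relative_fixups(uintptr_t *image,
                           const size_t *offsets,
                           size_t count,
                           uintptr_t delta)
{
    for (size_t k = 0; k < count; ++k)
        image[offsets[k]] += delta;
}
```

Note that the loop writes to every word it patches, which is why this approach only works for data that ends up in RAM.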

> As such, I don't think such a transformation belongs in clang.

Fair enough. I posted this as an RFC because it is quite different to what already exists in clang, and option 3 above shows that the backend change can still be used without the majority of the clang patch.

Oliver