[llvm-dev] [cfe-dev] -fpic ELF default: reclaim some -fno-semantic-interposition optimization opportunities?

Sun Jun 6 14:22:45 PDT 2021

On Sun, Jun 06, 2021 at 10:50:41AM -0700, Fāng-ruì Sòng wrote:
> On Sun, Jun 6, 2021 at 7:08 AM Joerg Sonnenberger via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
> >
> > On Sat, Jun 05, 2021 at 06:08:57PM -0700, Fāng-ruì Sòng via llvm-dev wrote:
> > > On 2021-06-06, Joerg Sonnenberger wrote:
> > > > On Fri, Jun 04, 2021 at 03:26:53PM -0700, Fāng-ruì Sòng via llvm-dev wrote:
> > > > > Fixing the last point is actually easy: let -fno-pic use GOT when
> > > > > taking the address of a non-definition function.
> > > >
> > > > I'd far prefer to have an attribute to explicitly say that the address
> > > > of a given symbol should always be computed indirectly (e.g. via GOT).
> > > > That gives the explicit control necessary for libraries without
> > > > penalizing the larger executables like clang.
> > > >
> > > > Joerg
> > >
> > > Taking the address (in code) of a non-definition function is rare,
> > > rarer after optimization. At least when building clang, I cannot find
> > > any penalizing.
> >
> > I was not talking about just functions. I can't even think of a case
> > where pointer equality for function pointers matters. But the case I
> > care far more about is being able to avoid copy relocations for global
> > variables and that's the same problem (loading the address of a symbol).
> >
> > Joerg
> 
> On the Clang side, `-fno-pic -fno-direct-access-external-data` uses
> GOT to access a default visibility global variable today.
> If all TUs use this option and assembly files do the right thing, copy
> relocations can be avoided.

Most code in the wild doesn't use visibility flags and would be
penalized by that. An attribute would allow explicitly opting out of
direct access for system headers and other libraries.

> I know some folks prefer eliminating copy relocations for ABI and
> security reasons.
> I deliberately make the scope narrow to functions because functions
> are where we can improve performance.

For functions there are two cases: "unnamed" address use and "named"
address use. Kind of similar to what we have already for global
variables on whether they can be merged or not. Unnamed as in "I don't
care if it is the canonical address", so the linker is free to introduce
a PLT slot. This works fine on all architectures and without any
penalties if the binding is local. There might be some flag needed here
because the glibc implementation of the dynamic linker wants to do some
wonky fixup on the PLT, but that's a glibc specific issue and outside
the scope of LLVM. For the named address use we do care about the
canonical address and that's where the distinction of attributed vs
default assumption makes a difference: loading a pointer from the GOT vs
doing a (PC relative) address load. On i386 the former didn't have
patchable relocation support for a long time and I'm not sure it exists
nowadays, i.e. a relocation that allows the linker to relax the mov
into an lea. It can be even more involved on other archs where address
computations take several instructions, like Sparc. The attribute
infrastructure here is the same as
would be needed for global variables and those are where the more
expensive issues are. Copy relocations e.g. for a constant array can be
arbitrarily expensive and are an ABI maintenance nightmare, so finally
having a cheap way to avoid them would be a great step forward.
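
To make the cases above concrete, here is a small C sketch. The names
(lib_table, lib_func and the three helpers) are made up for
illustration; in a real setup lib_table and lib_func would be defined
in a shared library, but they are defined locally here so the file
builds on its own. The comments describe what I would expect a typical
ELF toolchain to do, not the output of any particular compiler version.

```c
#include <stddef.h>

/* Stand-ins for symbols that would normally come from a shared library. */
const int lib_table[4] = {1, 2, 3, 5};
int lib_func(int x) { return x + 1; }

typedef int (*fn_t)(int);

/* "Unnamed" use: only the call matters, not the address identity.
   The linker is free to route this through a PLT slot. */
int call_only(int x) { return lib_func(x); }

/* "Named" use: the pointer value escapes and may be compared, so the
   canonical address is needed. This is where loading the pointer from
   the GOT and a direct (PC relative) address load differ. */
fn_t take_address(void) { return lib_func; }

/* Data access: with direct access from a non-PIC executable, a
   library-defined lib_table would typically need a copy relocation,
   which bakes sizeof(lib_table) into the executable. Going through
   the GOT avoids that. */
int read_table(size_t i) { return lib_table[i]; }
```

Only take_address() and read_table() are affected by the attribute vs.
default-assumption question; call_only() works either way.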

My proposal would be an attribute that names the "owner" of the
implementation as a string, plus a matching clang option to select a
non-default owner (e.g. __attribute__((definedby("libc"))) and
-fdefining=libc), with the empty string as the default, meaning the
main binary.
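
A rough sketch of how that could look in a header, and of the decision
the compiler would make. Everything here is hypothetical: neither the
definedby attribute nor -fdefining= exists in Clang, so DEFINED_BY is a
placeholder macro that expands to nothing, and libfoo, foo_table and
foo_lookup are made-up names. needs_indirect_access() is only my
reading of the rule (compare the declared owner with the owner the
current unit is built for), not a settled design.

```c
#include <string.h>

/* HYPOTHETICAL: stands in for __attribute__((definedby(owner))).
   No such attribute exists today, so this expands to nothing. */
#define DEFINED_BY(owner)

/* What a library header might declare under the proposal. */
DEFINED_BY("libfoo") extern const int foo_table[4];
DEFINED_BY("libfoo") int foo_lookup(unsigned i);

/* Definitions, as they would appear in a unit built with the
   (equally hypothetical) -fdefining=libfoo option. */
const int foo_table[4] = {10, 20, 30, 40};
int foo_lookup(unsigned i) { return foo_table[i % 4]; }

/* Model of the access decision: returns 1 if the symbol's address
   should be loaded indirectly (e.g. via GOT), 0 if direct access is
   fine. The empty string means "the main binary". */
int needs_indirect_access(const char *symbol_owner, const char *defining)
{
    return strcmp(symbol_owner, defining) != 0;
}
```

With this reading, the main binary (defining owner "") would use the GOT
for foo_table and so need no copy relocation, while libfoo itself keeps
direct access to its own symbols.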

Joerg