[llvm-dev] Zero length function pointer equality

David Chisnall via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 24 02:41:48 PDT 2020


On 24/07/2020 01:46, David Blaikie via llvm-dev wrote:
> I believe C++ requires that all functions have a distinct address (ie:
> &f1 != &f2) and LLVM optimizes code on this basis (assert(f1 == f2)
> gets optimized into an unconditional assertion failure)
> 
> But these zero length functions can end up with identical addresses.
> 
> I'm unaware of anything in the C++ spec (or the LLVM langref) that
> would allow distinct functions to have identical
> addresses - so should we do something about this in the LLVM backend?
> add a little padding? a nop instruction? (if we're adding an
> instruction anyway, perhaps we might as well make it an int3?)

This is also a problem with identical function merging in the linker,
which link.exe does quite aggressively.  The special case of zero-length
functions seems less common than the more general case of merging, but in
both cases you end up with a single implementation in the binary
whose address is shared by two symbols.  For example, consider the
following trivial program:

#include <stdio.h>

int a()
{
    return 42;
}

int b()
{
    return 42;
}

int main()
{
    printf("a == b? %d\n", a == b);
    return 0;
}

Compiled with cl.exe /Gy, this prints:

a == b? 1

Given that functions are immutable, it's a somewhat odd decision at the
abstract machine level to assume that they have an identity that is
distinct from their value (though it can simplify debugging: backtraces
in Windows executables are sometimes quite confusing when you see
a call into a function that is structurally correct but nominally
incorrect).

Given that link.exe can happily violate this guarantee in the general
case, I'm not too concerned that LLVM can violate it in the special
case.  From the perspective of a programmer, I'm not sure what kind of
logic would be broken by function equality returning true when two
functions with different names but identical behaviour are compared.  I'm
curious if you have any examples.
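
The only shape of code I can think of that might notice is something
that uses a function's address as a key.  The following is a purely
hypothetical sketch, not taken from any real code base: the handler_fn
type, the registry functions, and the on_open / on_close / on_error
handlers are all names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type and registry; none of these names come from
   a real library. */
typedef int (*handler_fn)(void);

#define MAX_HANDLERS 8

static handler_fn handlers[MAX_HANDLERS];
static size_t handler_count = 0;

/* Register a handler; returns 0 on success, -1 if the table is full. */
int register_handler(handler_fn fn)
{
    assert(fn != NULL);
    if (handler_count == MAX_HANDLERS)
        return -1;
    handlers[handler_count++] = fn;
    return 0;
}

/* Remove every entry whose address compares equal to fn.
   Returns the number of entries removed. */
size_t unregister_handler(handler_fn fn)
{
    size_t kept = 0, removed = 0;
    for (size_t i = 0; i < handler_count; i++) {
        if (handlers[i] == fn)
            removed++;
        else
            handlers[kept++] = handlers[i];
    }
    handler_count = kept;
    return removed;
}

/* Number of handlers currently in the table. */
size_t registered_handlers(void)
{
    return handler_count;
}

/* Two handlers with identical bodies: candidates for merging. */
int on_open(void)  { return 42; }
int on_close(void) { return 42; }

/* A handler with a different body, which cannot be merged with the others. */
int on_error(void) { return -1; }
```

If on_open and on_close are both registered and have distinct
addresses, unregister_handler(on_open) removes one entry.  If the two
have been merged into a single address, the same call removes both,
because the comparison in the loop can no longer tell them apart.
Whether that counts as broken depends on whether the caller cares
about anything beyond the handlers' behaviour.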

David


