[cfe-dev] [llvm-dev] Zero length function pointer equality

Tue Sep 14 08:05:16 PDT 2021

Seems we also have a tangentially related issue with zero length variables
too: https://bugs.llvm.org/show_bug.cgi?id=51848 fwiw. (will likely require
a distinct solution, so I don't think there's much overlap here in terms of
design/implementation - just conceptually related with some similar bugs in
user code that this can create)

On Wed, Sep 8, 2021 at 2:21 PM David Blaikie <dblaikie at gmail.com> wrote:

> On Wed, Sep 8, 2021 at 1:20 PM Reid Kleckner <rnk at google.com> wrote:
>
>> IMO TrapUnreachable goes farther than most users want: for every noreturn
>> function call (abort, assert_fail, cxa_throw), it would add a trap
>> instruction.
>>
>
> Yep, there's a separate target option to turn that off:
> NoTrapAfterNoreturn. TrapUnreachable == true && NoTrapAfterNoreturn == ture
> is the mode I was considering in that direction.
>
>
>> Those can be common, so that seems like a fairly high cost. And, it
>> doesn't prevent the optimizer from exploiting UB to fold away a conditional
>> branch to unreachable. If users want a mode with less UB, I think it would
>> be more principled for the frontend to emit traps before every unreachable
>> instruction, and to suppress the use of the LLVM noreturn function
>> attribute.
>>
>
> I think that's a separate option/mode - at least that's what clang does at
> -O0. Not sure if you can ask for it with other flags too.
>
>
>> I think it would be cheap and non-controversial to emit traps instead of
>> zero-length functions, which are really just labels pointing at unrelated
>> code. Various targets and object file formats already require this behavior
>> (MachO for atomization, others for other reasons). I think it would be
>> reasonable to make this behavior standard to all platforms.
>>
>
> Yeah - I'm OK with that too - though my attempt to implement it stalled
> out a bit (others with more knowledge/grit here might have more luck).
> Specifically I think that was the AMDGPU bit where it has to have functions
> attributed it a certain way to emit a trap instruction - and then I ran
> into a case where even inspecting (maybe the IR) instruction to check if it
> was zero length (to then add the attribute it needed) was insufficient
> since there were some cases where it became zero length... oh, yeah, a call
> to a null pointer, I think - just got elided entirely by the AMDGPU backend
> or something, I think.
>
> - Dave
>
>
>>
>> On Tue, Sep 7, 2021 at 5:24 AM Hans Wennborg via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Did anything come of this?
>>>
>>> The Linux folks (cc Nick) keep running into issues where a function
>>> which ends with unreachable can fall through to an unrelated function,
>>> which confuses some machine code analyser they run (e.g.
>>> https://github.com/ClangBuiltLinux/linux/issues/1440).
>>>
>>> It seems TrapUnreachable would avoid such issues.
>>>
>>> On Fri, Mar 19, 2021 at 10:11 PM David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>> Just writing it down in this thread - this issue's been discussed a bit
>>>> in this bug: https://bugs.llvm.org/show_bug.cgi?id=49599
>>>>
>>>> And yeah, I'm considering adopting MachO's default (TrapUnreachable +
>>>> NoTrapOnNoreturn) as the default for LLVM (will require some design
>>>> discussion, no doubt) since it seems to capture most of the functionality
>>>> desired. Maybe there are some cases where we have extra unreachables that
>>>> could've otherwise been optimized away/elided, but hopefully nothing
>>>> drastic.
>>>>
>>>> (some platforms still need the trap-on-noreturn - Windows+AArch64 and
>>>> maybe Sony, etc - happy for some folks to opt into that). I wonder whether
>>>> TrapUnreachable shouldn't even be an option anymore though, if it becomes
>>>> load bearing for correctness - or should it become a fallback option - "no
>>>> trap unreachable" maybe means nop instead of trap, in case your target
>>>> can't handle a trap sometimes (I came across an issue with AMDGPU not being
>>>> able to add traps to functions that it isn't expecting - the function needs
>>>> some special attribute to have a trap in it - but I guess it can be updated
>>>> to add that attribute if the function has an unreachable in it (though then
>>>> it has to recreate the no-trap-on-noreturn handling too when deciding
>>>> whether to add the attribute... ))
>>>>
>>>> On Mon, Jul 27, 2020 at 9:20 AM Robinson, Paul <paul.robinson at sony.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> > -----Original Message-----
>>>>> > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Hans
>>>>> > Wennborg via llvm-dev
>>>>> > Sent: Monday, July 27, 2020 9:11 AM
>>>>> > To: David Blaikie <dblaikie at gmail.com>
>>>>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>; Clang Dev <
>>>>> cfe-dev at lists.llvm.org>
>>>>> > Subject: Re: [llvm-dev] [cfe-dev] Zero length function pointer
>>>>> equality
>>>>> >
>>>>> > On Sat, Jul 25, 2020 at 3:40 AM David Blaikie <dblaikie at gmail.com>
>>>>> wrote:
>>>>> > >
>>>>> > > Looks perfect to me!
>>>>> > >
>>>>> > > well, a couple of questions: Why a noop, rather than int3/ud2/etc?
>>>>> >
>>>>> > Probably no reason.
>>>>>
>>>>> FTR there is TargetOptions.TrapUnreachable, which some targets turn on
>>>>> (for X86 it's on for MachO and PS4), this turns 'unreachable' into ud2.
>>>>> Clearly it covers more than "empty" functions but is probably the kind
>>>>> of thing you're looking for.
>>>>> --paulr
>>>>>
>>>>> >
>>>>> > > Might be worth using the existing code that places such an
>>>>> instruction
>>>>> > > when building at -O0?
>>>>> >
>>>>> > I wasn't aware of that. Does it happen for all functions (e.g. I
>>>>> think
>>>>> > I got pulled into this due to functions with the naked attribute)?
>>>>> >
>>>>> > > & you mention that this causes problems on Windows - but ICF done
>>>>> by
>>>>> > > the Windows linker does not cause such problems? (I'd have thought
>>>>> > > they'd result in the same situation - two functions described as
>>>>> being
>>>>> > > at the same address?) is there a quick summary of why those two
>>>>> cases
>>>>> > > turn out differently?
>>>>> >
>>>>> > The case that we hit was that the Control Flow Guard table of
>>>>> > addresses in the binary ended up listing the same address twice,
>>>>> which
>>>>> > the loader didn't expect. It may be that the linker took care to
>>>>> avoid
>>>>> > that for ICF (if two ICF'd functions got the same address, only list
>>>>> > it once in the CFG table) but still didn't handle the "empty
>>>>> function"
>>>>> > problem.
>>>>> >
>>>>> > > On Fri, Jul 24, 2020 at 6:17 AM Hans Wennborg <hans at chromium.org>
>>>>> wrote:
>>>>> > > >
>>>>> > > > Maybe we can just expand this to always apply:
>>>>> > https://reviews.llvm.org/D32330
>>>>> > > >
>>>>> > > > On Fri, Jul 24, 2020 at 2:46 AM David Blaikie via cfe-dev
>>>>> > > > <cfe-dev at lists.llvm.org> wrote:
>>>>> > > > >
>>>>> > > > > LLVM can produce zero length functions from cases like this
>>>>> (when
>>>>> > > > > optimizations are enabled):
>>>>> > > > >
>>>>> > > > > void f1() { __builtin_unreachable(); }
>>>>> > > > > int f2() { /* missing return statement */ }
>>>>> > > > >
>>>>> > > > > This code is valid, so long as the functions are never called.
>>>>> > > > >
>>>>> > > > > I believe C++ requires that all functions have a distinct
>>>>> address
>>>>> > (ie:
>>>>> > > > > &f1 != &f2) and LLVM optimizes code on this basis (assert(f1
>>>>> == f2)
>>>>> > > > > gets optimized into an unconditional assertion failure)
>>>>> > > > >
>>>>> > > > > But these zero length functions can end up with identical
>>>>> addresses.
>>>>> > > > >
>>>>> > > > > I'm unaware of anything in the C++ spec (or the LLVM langref)
>>>>> that
>>>>> > > > > would indicate that would allow distinct functions to have
>>>>> identical
>>>>> > > > > addresses - so should we do something about this in the LLVM
>>>>> > backend?
>>>>> > > > > add a little padding? a nop instruction? (if we're adding an
>>>>> > > > > instruction anyway, perhaps we might as well make it an int3?)
>>>>> > > > >
>>>>> > > > > (I came across this due to DWARF issues with zero length
>>>>> functions &
>>>>> > > > > thinking about if/how this should be supported)
>>>>> > > > > _______________________________________________
>>>>> > > > > cfe-dev mailing list
>>>>> > > > > cfe-dev at lists.llvm.org
>>>>> > > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>> > _______________________________________________
>>>>> > LLVM Developers mailing list
>>>>> > llvm-dev at lists.llvm.org
>>>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20210914/78e1b70e/attachment.html>