[cfe-dev] Propagating llvm.assume across function calls to enhance de-virtualization

Thu Nov 12 15:16:01 PST 2015

Resending my email because I sent it from the wrong mailbox.

2015-11-13 0:04 GMT+01:00 Piotr Padlewski <piotrekpad at gmail.com>:

> There are other things left:
> 1. Adding a ubsan check to detect UB caused by in-place (placement) new.
> 2. Upgrading GVN to optimize based on !invariant.group across BBs.
> 3. Adding something like "nocapture-global" that would say that the
> pointer is not captured by a global, but may be captured, for example, by
> returning it from the function.
> This is important because this is exactly what invariant.group.barrier
> does, and right now, because it is not nocapture, emitting this intrinsic
> may remove nocapture from the function in which we emit
> invariant.group.barrier.
> 4. Fixing the compile-time regression caused by the many assume
> instructions (emitted after constructor calls). I don't remember which pass
> it was, but there was one piece of code with huge complexity, and we
> couldn't improve it enough to make this change imperceptible. I am not sure
> what a good solution would be; maybe there should be some other assume-like
> intrinsic for the large number of assumes that caused the problem.
> 5. Maybe adding logic that removes all invariant.group stuff when doing
> LTO with a module that was not compiled with -fstrict-vtable-pointers.
>
> These are in addition to the things Richard mentioned.
>
> Piotr
>
> 2015-11-12 23:40 GMT+01:00 Richard Smith via cfe-dev <
> cfe-dev at lists.llvm.org>:
>
>> On Thu, Nov 12, 2015 at 2:24 PM, Geoff Berry via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>> Hi All,
>>>
>>>
>>>
>>> I have a two-part de-virtualization enhancement that I’m considering
>>> working on and am looking for any feedback on how feasible it is.  The two
>>> parts are:
>>>
>>>
>>>
>>> 1. llvm: Extending inter-procedural SCCP (or some other IPO module pass)
>>> to propagate llvm.assume calls across function calls.  The basic idea
>>> would be to collect the set of assumptions for each argument at each
>>> call site and compute the intersection across all call sites, then
>>> duplicate the computations for the intersected assumptions in the callee.
>>> The reason I’m starting with SCCP is that it already keeps track of when
>>> all of a function’s possible call sites are known, and it already merges
>>> values in a lattice.
>>>
>> Given that we use !invariant.group loads when loading vptrs, what
>> additional value do you think you can get from this? An example of a case
>> where you could do better than the current approach of
>> -fstrict-vtable-pointers with this technique would help a lot in
>> understanding this.
>>
>>> 2. clang: Emitting llvm.assume vtable load sequences for each
>>> global variable with virtual functions referenced inside a function.  This
>>> is similar to what is currently done for local variables and would produce
>>> more vtable load assumptions to be propagated by (1).
>>>
>> Given that it's valid to placement new another object on top of a global,
>> there are some limits on what we can do here -- we can only emit these
>> assumption loads at places in the code where we know the original vptr is
>> present. For instance, we can do this at any point where we emit a member
>> access or member function call on an object of known dynamic type (whether
>> it's local or global), but we cannot do so when such an object is passed by
>> reference into a function or when its address is taken (those operations
>> don't require the object to be within its lifetime).
>>
>>> Related to (2), does anyone know what the status is of enabling clang’s
>>> -fstrict-vtable-pointers by default?  Are there known issues with this code
>>> that would need to be resolved as well?
>>>
>>
>> There are two known issues:
>>
>> 1) At the IR level (but not at the object code level), it introduces an
>> ABI break: for LTO, all modules must be built with the same setting of the
>> flag or the necessary invariant barriers may be missing, resulting in
>> incorrect devirtualization in rare cases. (If you try to LTO modules with
>> different settings for the flag, we trap the problem and issue an error.)
>>
>> 2) Not all optimization passes have been updated to understand
>> @llvm.invariant.group.barrier, and as such, inserting it can sometimes
>> result in a pessimization when optimization passes are unable to correctly
>> reason about it. Thus the flag may degrade performance.
>>
>> Plus, of course, it can cause existing code that breaks the language
>> rules to start misbehaving (as with any of the -fstrict-* flags that
>> optimize on UB).
>>
>>
>>> Thanks,
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Geoff Berry
>>>
>>> Employee of Qualcomm Innovation Center, Inc.
>>>
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
>>> Linux Foundation Collaborative Project
>>>
>>>
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>
>>>
>>
>>
>>
>
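To make the call-site intersection idea from Geoff's item (1) concrete, here is a minimal sketch in plain C++. It does not use any LLVM API: assumptions are modeled as opaque strings, and the names `AssumptionSet`, `CallSite` and `intersectAcrossCallSites` are hypothetical, chosen only for illustration. A real pass would work on IR values and would also have to prove that every call site is known.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: an assumption is an opaque string such as
// "vptr(p) == vtable(A)". Nothing here corresponds to a real LLVM type.
using AssumptionSet = std::set<std::string>;
// One call site: element i holds the assumptions known for argument i.
using CallSite = std::vector<AssumptionSet>;

// Returns, per argument, the assumptions that hold at every call site.
// Precondition: all call sites pass the same number of arguments.
std::vector<AssumptionSet>
intersectAcrossCallSites(const std::vector<CallSite>& sites) {
    if (sites.empty()) return {};  // no known callers: nothing to propagate
    std::vector<AssumptionSet> result = sites.front();
    for (std::size_t s = 1; s < sites.size(); ++s) {
        assert(sites[s].size() == result.size());
        for (std::size_t arg = 0; arg < result.size(); ++arg) {
            AssumptionSet common;
            std::set_intersection(result[arg].begin(), result[arg].end(),
                                  sites[s][arg].begin(), sites[s][arg].end(),
                                  std::inserter(common, common.begin()));
            result[arg] = std::move(common);
        }
    }
    return result;
}
```

Only assumptions that survive the intersection for an argument could be duplicated in the callee; an assumption that holds at just one call site is dropped.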
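Richard's point that another object may be placement-new'ed on top of a global can be illustrated with a small sketch. The types `Shape`, `Triangle` and `Square` and the helper functions are hypothetical, and the example assumes both derived types have the same size and alignment and trivial destructors, so that reusing the global's storage is well defined.

```cpp
#include <cassert>
#include <new>

struct Shape { virtual int sides() const { return 0; } };
struct Triangle : Shape { int sides() const override { return 3; } };
struct Square : Shape { int sides() const override { return 4; } };

// The storage of the global must be able to hold the replacement object.
static_assert(sizeof(Square) == sizeof(Triangle) &&
              alignof(Square) == alignof(Triangle),
              "replacement object must fit the global's storage");

Triangle g_shape;  // global whose dynamic type starts out as Triangle

// Ends the Triangle's lifetime by constructing a Square in its storage.
// Returns the pointer produced by placement new, which refers to the Square.
Shape* replaceGlobalWithSquare() {
    return new (&g_shape) Square;
}

// Virtual call through a reference of static type Shape.
int callThrough(const Shape& s) {
    return s.sides();
}
```

Calling `callThrough(g_shape)` before the replacement dispatches to `Triangle::sides`, while calling it on the pointer returned by `replaceGlobalWithSquare()` dispatches to `Square::sides`. This is why a vptr assumption for a global is only safe at points where the original object is known to still be alive.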