[LLVMdev] Proposal to improve vzeroupper optimization strategy

Sean Silva chisophugis at gmail.com
Fri Sep 20 17:07:43 PDT 2013


Is it realistic to worry about performance of vectorized code that does PIC
calls into a non-vectorized sin() in libc? Maybe there's an example other
than sin() that is more realistic?

-- Sean Silva


On Fri, Sep 20, 2013 at 7:11 PM, Eli Friedman <eli.friedman at gmail.com> wrote:

> On Fri, Sep 20, 2013 at 2:58 PM, Gao, Yunzhong <
> yunzhong_gao at playstation.sony.com> wrote:
>
>> Hi Eli,
>>
>> Thanks for the feedback. Please see below.
>> - Gao.
>>
>> From: Eli Friedman [mailto:eli.friedman at gmail.com]
>> Sent: Thursday, September 19, 2013 12:31 PM
>> To: Gao, Yunzhong
>> Cc: llvmdev at cs.uiuc.edu
>> Subject: Re: [LLVMdev] Proposal to improve vzeroupper optimization
>> strategy
>>
>> > This is essentially equivalent to "don't insert vzeroupper anywhere", as
>> > far as I can tell. (The case of SSE instructions without a v- prefixed
>> > equivalent is rare enough we can separate it from this discussion.)
>>
>> So will you be interested in a patch that disables vzeroupper by default?
>>
>
> A patch which adds a switch/LLVM IR function attribute to disable
> vzeroupper would be fine.  A patch that disables vzeroupper on your
> platform would be fine (assuming the target triple is distinguishable).
> Turning off vzeroupper by default on all platforms is not fine.
>
>
>> I implemented this possibly over-engineered solution in our local tree to
>> work around some bad instruction selection issues in the LLVM backend. When
>> benchmarking our game code, I noticed that sometimes legacy SSE
>> instructions were selected despite the existence of an AVX equivalent, in
>> which case the vzeroupper instruction was needed. And it is much easier to
>> detect the existence of a vzeroupper instruction than to detect every
>> single legacy SSE instruction.
>>
>> The instruction selection issues were later fixed in our tree (patches to
>> be submitted later), at least for the handful of games I tested on. So a
>> simple change to just disable vzeroupper by default will be acceptable to
>> us as well.
>>
>> > The reason we need vzeroupper in the first place is because we can't assume
>> > other functions won't use legacy SSE instructions; for example, on most
>> > systems, calling sin() will use legacy SSE instructions.  I mean, if you can
>> > make some unusual guarantee about your platform, it might make sense to
>> > disable vzeroupper generation in general, but it simply doesn't make sense
>> > on most platforms.
>>
>> I am confused by this point. By "most systems," do you have in mind a
>> platform where the sin() function was compiled by gcc but the application
>> code was compiled by clang?
>>
>
> On, for example, OS X, AVX is not enabled by default, so the sin()
> function uses legacy SSE instructions.  Users can still turn on AVX in
> their applications.
>
> -Eli
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
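
For readers following the thread, the situation Eli describes can be sketched in C. This is only an illustration under stated assumptions, not code from anyone's tree: `legacy_scale` is a hypothetical stand-in for a library routine built without AVX (the role sin() plays in the discussion), and `sum8_then_call` and `sum8_dispatch` are made-up names. The AVX function clears the upper halves of the YMM registers with `_mm256_zeroupper()` before calling into the non-AVX code. The sketch relies on the GCC/Clang `target` attribute and `__builtin_cpu_supports`, so it assumes one of those compilers.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

/* Hypothetical stand-in for a library routine built without AVX
 * (the role sin() plays in the thread). When this file is built
 * without -mavx, x86-64 compilers typically emit legacy SSE scalar
 * code for it. */
__attribute__((noinline))
static double legacy_scale(double x) {
    return x * 0.5;
}

#if SKETCH_X86
/* AVX is enabled for this one function only (GCC/Clang attribute). */
__attribute__((target("avx")))
static double sum8_then_call(const float *a, const float *b) {
    __m256 v = _mm256_add_ps(_mm256_loadu_ps(a), _mm256_loadu_ps(b));
    float out[8];
    _mm256_storeu_ps(out, v);

    double total = 0.0;
    for (int i = 0; i < 8; ++i)
        total += out[i];

    /* Clear the upper 128 bits of the YMM registers before calling
     * code that may use legacy SSE encodings. */
    _mm256_zeroupper();
    return legacy_scale(total);
}
#endif

/* Takes the AVX path only when the CPU reports AVX support;
 * otherwise falls back to a plain scalar loop. */
static double sum8_dispatch(const float *a, const float *b) {
#if SKETCH_X86
    if (__builtin_cpu_supports("avx"))
        return sum8_then_call(a, b);
#endif
    double total = 0.0;
    for (int i = 0; i < 8; ++i)
        total += a[i] + b[i];
    return legacy_scale(total);
}
```

With a = {1, 2, ..., 8} and b set to all ones, `sum8_dispatch(a, b)` returns 22.0 on either path. The explicit `_mm256_zeroupper()` call is there only to make the transition visible; a compiler that inserts vzeroupper automatically would normally place one before the call on its own, which is the behavior being debated above.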