[llvm-dev] LLVM struct, alloca, SROA and the entry basic block

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Wed Sep 9 15:38:17 PDT 2015


Seems reasonable.  Do you have commit access or should I submit on your 
behalf?

On 09/08/2015 11:39 AM, Benoit Belley wrote:
> How about:
>
>     --- a/docs/Frontend/PerformanceTips.rst
>     +++ b/docs/Frontend/PerformanceTips.rst
>     @@ -19,20 +19,32 @@ Avoid loads and stores of large aggregate type
>      ================================================
>      LLVM currently does not optimize well loads and stores of large
>     :ref:`aggregate
>      types <t_aggregate>` (i.e. structs and arrays).  As an
>     alternative, consider
>      loading individual fields from memory.
>      Aggregates that are smaller than the largest (performant) load or
>     store
>      instruction supported by the targeted hardware are well
>     supported.  These can
>      be an effective way to represent collections of small packed fields.
>     +Issue alloca in the entry basic block
>     +=======================================
>     +
>     +Issue alloca instructions in the entry basic block of a function.
>     Also, issue
>     +them before any call instructions. Call instructions might get
>     inlined into
>     +multiple basic blocks. The end result is that a following alloca
>     instruction
>     +would no longer be in the entry basic block afterward.
>     +
>     +The SROA (Scalar Replacement Of Aggregates) pass only attempts to
>     eliminate
>     +alloca instructions that are in the entry basic block. Subsequent
>     optimization
>     +passes rely on such alloca instructions having been eliminated.
>     +
>      Prefer zext over sext when legal
>      ==================================
>      On some architectures (X86_64 is one), sign extension can involve
>     an extra
>      instruction whereas zero extension can be folded into a load.
>      LLVM will try to
>      replace a sext with a zext when it can be proven safe, but if you
>     have
>      information in your source language about the range of a integer
>     value, it can
>      be profitable to use a zext rather than a sext.
>      Alternatively, you can :ref:`specify the range of the value using
>     metadata
>
>
> Benoit
>
> *Benoit Belley*
>
> Sr Principal Developer
>
> M&E-Product Development Group
>
> *MAIN* +1 514 393 1616
>
> *DIRECT* +1 438 448 6304
>
> *FAX* +1 514 393 0110
>
> Twitter <http://twitter.com/autodesk>
>
> Facebook <https://www.facebook.com/Autodesk>
>
> *Autodesk, Inc.*
>
> 10 Duke Street
>
> Montreal, Quebec, Canada H3C 2L7
>
> www.autodesk.com <http://www.autodesk.com/>
>
>
> From: <mehdi.amini at apple.com> on behalf of Mehdi Amini
> <mehdi.amini at apple.com>
> Date: Tuesday, September 8, 2015 13:27
> To: Benoit Belley <benoit.belley at autodesk.com>
> Cc: Philip Reames <listmail at philipreames.com>,
> "llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] LLVM struct, alloca, SROA and the entry basic
> block
>
>     Hi,
>
>>     On Sep 8, 2015, at 10:11 AM, Benoit Belley via llvm-dev
>>     <llvm-dev at lists.llvm.org> wrote:
>>
>>     From: Philip Reames <listmail at philipreames.com>
>>     Date: Tuesday, September 8, 2015 12:50
>>     To: Benoit Belley <benoit.belley at autodesk.com>,
>>     "llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>
>>     Subject: Re: [llvm-dev] LLVM struct, alloca, SROA and the entry
>>     basic block
>>
>>>     On 09/08/2015 07:21 AM, Benoit Belley via llvm-dev wrote:
>>>>     Hi everyone,
>>>>
>>>>     We have noticed that the SROA pass will only eliminate ‘alloca’
>>>>     instructions if those are located in the entry basic block of a
>>>>     function.
>>>>
>>>>     As a general recommendation, should the LLVM IR emitted by
>>>>     our compiler always place ‘alloca’ instructions in the entry
>>>>     basic block? (I couldn’t find any recommendations concerning
>>>>     this matter.)
>>>     Yes.
>>
>>
>>     Thanks Phil. Should this be mentioned somewhere in the
>>     documentation? As a footnote in the LLVM Language Reference
>>     Manual, maybe?
>
>
>     This sounds like a candidate for:
>     http://llvm.org/docs/Frontend/PerformanceTips.html ?
>
>>     Mehdi
>
>
>
>>     As a note, I have also found out that alloca instructions should
>>     be placed before any call instructions, as these can get inlined,
>>     and then the original alloca is no longer located in the
>>     entry basic block!
>>
>>>>     In addition, we have noticed that the MemCpy pass will attempt
>>>>     to copy an LLVM struct using moves that are as large as possible.
>>>>     For example, a struct of 3 floats is copied using a 64-bit and
>>>>     a 32-bit move. It is therefore important that such a struct be
>>>>     aligned on an 8-byte boundary, not just 4 bytes! Otherwise, one
>>>>     runs the risk of triggering store-forwarding failures and the
>>>>     resulting pipeline stalls (which we did encounter, really badly,
>>>>     in one of our internal performance benchmarks).
>>>     This sounds like a bug to me.  We shouldn't be using the large
>>>     load/stores without knowing they're aligned or that unaligned
>>>     access is fast on a particular target.  Where this is best fixed
>>>     (memcpy, store lowering?) I don't know.
>>
>>     I’ll send out a test case. Maybe that will help.
>>
>>>
>>>>
>>>>     Are there any guidelines for specifying the alignment of LLVM
>>>>     structs allocated by alloca instructions? Is rounding the
>>>>     structure size to the next power of 2 a good strategy?
>>>>     Will the MemCpy pass issue moves of up to 64 bytes on AVX-512
>>>>     capable processors?
>>>>
>>>>     Cheers,
>>>>     Benoit
>>>>
>>>>     _______________________________________________
>>>>     LLVM Developers mailing list
>>>>     llvm-dev at lists.llvm.org
>>>>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>
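
To make the recommendation in the quoted patch concrete, here is a minimal LLVM IR sketch. The function names, the %struct.Vec3 type and @helper are made up for illustration, and the pointer syntax is the typed-pointer form in use at the time of this thread (newer LLVM releases spell every pointer type as `ptr`). The first function keeps its alloca in the entry block, ahead of the call; the second has the shape that, per the discussion above, SROA does not attempt to eliminate.

```llvm
; Hypothetical 3-float struct, as in the example discussed in the thread.
%struct.Vec3 = type { float, float, float }

; Hypothetical external callee that might later be inlined.
declare void @helper()

; Recommended shape: the alloca is in the entry block and precedes the call.
define float @entry_block_alloca(i1 %cond) {
entry:
  %v = alloca %struct.Vec3, align 8
  %x = getelementptr inbounds %struct.Vec3, %struct.Vec3* %v, i32 0, i32 0
  store float 1.0, float* %x
  call void @helper()
  br i1 %cond, label %then, label %done

then:
  store float 2.0, float* %x
  br label %done

done:
  %r = load float, float* %x
  ret float %r
}

; Shape to avoid: the alloca sits in a non-entry block.
define float @late_alloca(i1 %cond) {
entry:
  call void @helper()
  br i1 %cond, label %then, label %done

then:
  %v = alloca %struct.Vec3, align 8
  %x = getelementptr inbounds %struct.Vec3, %struct.Vec3* %v, i32 0, i32 0
  store float 2.0, float* %x
  %r = load float, float* %x
  br label %done

done:
  %res = phi float [ 0.0, %entry ], [ %r, %then ]
  ret float %res
}
```

Running SROA over both functions with opt (`-sroa` in releases of that era, `-passes=sroa` with the new pass manager) and comparing the output is one way to check how a given LLVM version treats each alloca.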

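The documentation text quoted in the patch suggests loading individual fields rather than whole aggregates. Below is a sketch of what that could look like for the 3-float struct from the original question, again with hypothetical names and typed-pointer syntax. Each access is an ordinary 4-byte float load or store, so the copy itself does not depend on the struct being aligned more strictly than 4 bytes. Whether later passes merge these accesses back into wider moves is not something this sketch can promise.

```llvm
; Hypothetical 3-float struct (12 bytes, natural alignment 4).
%struct.Vec3 = type { float, float, float }

; Copy %src to %dst one field at a time instead of as a whole aggregate.
define void @copy_vec3(%struct.Vec3* %dst, %struct.Vec3* %src) {
entry:
  ; Addresses of the three source fields.
  %s0 = getelementptr inbounds %struct.Vec3, %struct.Vec3* %src, i32 0, i32 0
  %s1 = getelementptr inbounds %struct.Vec3, %struct.Vec3* %src, i32 0, i32 1
  %s2 = getelementptr inbounds %struct.Vec3, %struct.Vec3* %src, i32 0, i32 2

  ; Addresses of the three destination fields.
  %d0 = getelementptr inbounds %struct.Vec3, %struct.Vec3* %dst, i32 0, i32 0
  %d1 = getelementptr inbounds %struct.Vec3, %struct.Vec3* %dst, i32 0, i32 1
  %d2 = getelementptr inbounds %struct.Vec3, %struct.Vec3* %dst, i32 0, i32 2

  ; Scalar loads followed by scalar stores, each 4 bytes wide.
  %f0 = load float, float* %s0, align 4
  %f1 = load float, float* %s1, align 4
  %f2 = load float, float* %s2, align 4
  store float %f0, float* %d0, align 4
  store float %f1, float* %d1, align 4
  store float %f2, float* %d2, align 4
  ret void
}
```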