[llvm-dev] LLVM struct, alloca, SROA and the entry basic block

Benoit Belley via llvm-dev llvm-dev at lists.llvm.org
Tue Sep 8 11:39:56 PDT 2015


How about:

--- a/docs/Frontend/PerformanceTips.rst
+++ b/docs/Frontend/PerformanceTips.rst
@@ -19,20 +19,32 @@ Avoid loads and stores of large aggregate type
 ================================================

 LLVM currently does not optimize well loads and stores of large :ref:`aggregate
 types <t_aggregate>` (i.e. structs and arrays).  As an alternative, consider
 loading individual fields from memory.

 Aggregates that are smaller than the largest (performant) load or store
 instruction supported by the targeted hardware are well supported.  These can
 be an effective way to represent collections of small packed fields.

+Issue alloca in the entry basic block
+=======================================
+
+Issue alloca instructions in the entry basic block of a function. Also, issue
+them before any call instructions. Call instructions might get inlined into
+multiple basic blocks. The end result is that a following alloca instruction
+would no longer be in the entry basic block afterward.
+
+The SROA (Scalar Replacement Of Aggregates) pass only attempts to eliminate
+alloca instructions that are in the entry basic block. Subsequent optimization
+passes rely on such alloca instructions having been eliminated.
+
 Prefer zext over sext when legal
 ==================================

 On some architectures (X86_64 is one), sign extension can involve an extra
 instruction whereas zero extension can be folded into a load.  LLVM will try to
 replace a sext with a zext when it can be proven safe, but if you have
 information in your source language about the range of a integer value, it can
 be profitable to use a zext rather than a sext.

 Alternatively, you can :ref:`specify the range of the value using metadata

Benoit

Benoit Belley
Sr Principal Developer
M&E-Product Development Group

MAIN +1 514 393 1616
DIRECT +1 438 448 6304
FAX +1 514 393 0110

Twitter<http://twitter.com/autodesk>
Facebook<https://www.facebook.com/Autodesk>

Autodesk, Inc.
10 Duke Street
Montreal, Quebec, Canada H3C 2L7
www.autodesk.com<http://www.autodesk.com/>



From: Mehdi Amini <mehdi.amini at apple.com>
Date: Tuesday, September 8, 2015 13:27
To: Benoit Belley <benoit.belley at autodesk.com>
Cc: Philip Reames <listmail at philipreames.com>, llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] LLVM struct, alloca, SROA and the entry basic block

Hi,

On Sep 8, 2015, at 10:11 AM, Benoit Belley via llvm-dev <llvm-dev at lists.llvm.org> wrote:

From: Philip Reames <listmail at philipreames.com>
Date: Tuesday, September 8, 2015 12:50
To: Benoit Belley <benoit.belley at autodesk.com>, llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] LLVM struct, alloca, SROA and the entry basic block

On 09/08/2015 07:21 AM, Benoit Belley via llvm-dev wrote:
Hi everyone,

We have noticed that the SROA pass will only eliminate ‘alloca’ instructions if those are located in the entry basic block of a function.

As a general recommendation, should the LLVM IR emitted by our compiler always place ‘alloca’ instructions in the entry basic block? (I couldn’t find any recommendations concerning this matter.)
Yes.


Thanks Phil. Should this be mentioned somewhere in the documentation? As a footnote in the LLVM Language Reference manual maybe?


This sounds like a candidate for: http://llvm.org/docs/Frontend/PerformanceTips.html ?

—
Mehdi



As a note, I have also found out that alloca instructions should be placed before any call instructions: once such a call is inlined, it can expand into multiple basic blocks, and an alloca that followed it is then no longer in the entry basic block!



In addition, we have noticed that the MemCpy pass will attempt to copy an LLVM struct using moves that are as large as possible. For example, a struct of 3 floats is copied using a 64-bit and a 32-bit move. It is therefore important that such a struct be aligned on an 8-byte boundary, not just 4 bytes! Otherwise, one runs the risk of triggering store-forwarding failures and the resulting pipeline stalls (which we did encounter, really badly, in one of our internal performance benchmarks).
This sounds like a bug to me.  We shouldn't be using the large load/stores without knowing they're aligned or that unaligned access is fast on a particular target.  Where this is best fixed (memcpy, store lowering?) I don't know.
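
To make the alignment point concrete, here is a minimal C sketch (plain C, not LLVM IR, and not taken from the thread). The names `vec3`, `vec3_a8` and `is_aligned8` are hypothetical, and the sizes mentioned in the comments assume a typical ABI where `float` is 4 bytes with 4-byte alignment. It only shows how the default layout of a 3-float struct differs from one that is over-aligned to 8 bytes; it says nothing about what any particular LLVM pass will emit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 3-float struct, standing in for the one described above.
 * Assumption: float is 4 bytes, so the struct is 12 bytes and only
 * 4-byte aligned by default. */
struct vec3 {
    float x, y, z;
};

/* Same fields, but the first member is over-aligned to 8 bytes.
 * The whole struct then has 8-byte alignment, and its size is padded
 * from 12 up to 16 bytes (under the same assumption). */
struct vec3_a8 {
    _Alignas(8) float x;
    float y, z;
};

/* Returns nonzero if an 8-byte access starting at p would be
 * naturally aligned. */
static int is_aligned8(const void *p)
{
    assert(p != NULL);
    return ((uintptr_t)p % 8) == 0;
}
```

With the default layout, an object of type `struct vec3` may legally sit at an address that is 4 modulo 8, so a 64-bit access to its first two floats can straddle an 8-byte boundary. Any `struct vec3_a8` object satisfies `is_aligned8`.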

I’ll send out a test case. Maybe that will help.



Are there any guidelines for specifying the alignment of LLVM structs allocated by alloca instructions? Is rounding the structure size down to the next power of 2 a good strategy? Will the MemCpy pass issue moves of up to 64 bytes on AVX-512 capable processors?
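
For readers who want to experiment with the rounding strategy asked about above, here is a minimal C sketch. It rests on one reading of the question: the alignment is the largest power of 2 that does not exceed the struct size, so that a 12-byte struct gets the 8-byte alignment mentioned earlier. The helper names `pow2_floor` and `alloca_align_guess`, and the `max_align` cap, are hypothetical. This is not an LLVM API, and the thread does not confirm that this is a recommended strategy.

```c
#include <assert.h>
#include <stddef.h>

/* Largest power of two that is <= size (size must be nonzero).
 * Comparing against size / 2 avoids overflow when shifting. */
static size_t pow2_floor(size_t size)
{
    assert(size > 0);
    size_t a = 1;
    while (a <= size / 2)
        a <<= 1;
    return a;
}

/* Hypothetical helper: candidate alignment for an alloca of `size`
 * bytes, capped at `max_align` (assumed to be a power of two, e.g.
 * the widest move the target can issue). */
static size_t alloca_align_guess(size_t size, size_t max_align)
{
    assert(max_align > 0 && (max_align & (max_align - 1)) == 0);
    size_t a = pow2_floor(size);
    return a < max_align ? a : max_align;
}
```

For a struct of 3 floats (12 bytes), `alloca_align_guess(12, 64)` yields 8, which matches the 64-bit move in the example above. The cap is there because aligning beyond the widest move the target uses would only waste stack space.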

Cheers,
Benoit





_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

