[cfe-dev] Disable integer promotion (Dilan Manatunga via cfe-dev)

Dilan Manatunga via cfe-dev cfe-dev at lists.llvm.org
Mon May 30 18:15:12 PDT 2016


Hi,

Thanks everyone for their suggestions!

@David
I tried your method first, but it led to compilation errors for invalid
bitcasts for some of the custom intrinsics I had added to my backend. I
wasn't sure why and didn't investigate it too much.
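
For reference, the C rule behind this suggestion can be checked with a small
sketch. It is an illustration only, not code from David's mail or from clang:
the function names are made up, and all it shows is that int8_t operands are
converted to int before the add, which is why the width of int (IntWidth on
the clang side) decides the width of the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of the type in which a + b is evaluated for int8_t operands.
   sizeof only inspects the type of its operand; nothing is executed. */
size_t size_of_int8_sum(void) {
    int8_t a = 1;
    int8_t b = 2;
    return sizeof(a + b);
}

/* The sum is computed at int width, so 100 + 100 does not wrap at 8 bits. */
int promoted_sum(int8_t a, int8_t b) {
    return a + b;
}

/* Self-check of the two observations above. */
void check_promotion(void) {
    assert(size_of_int8_sum() == sizeof(int));
    assert(promoted_sum(100, 100) == 200);
}
```

The first function returns sizeof(int), not 1. With a narrower int the same
source would presumably be evaluated at that narrower width, which I take to
be the idea behind the suggestion.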

@Norman
Thanks for the paper and slides you sent. They were useful in giving me
some ideas on how to solve it.

@James
Thanks for giving me architectures which already dealt with the issue. I
had checked NVPTX and ARM actually for scalars, and when I saw they didn't
do anything I decided to ask the question. My fault for not checking if X86
handled the issue.
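
The reason this kind of narrowing is legal for a plain add can be checked
exhaustively. The sketch below is my own model, not LLVM code: add8_ripple
stands in for an 8-bit adder that never sees more than 8 bits, and
add32_then_trunc mimics the sext / add i32 / trunc sequence from my example.
Both names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an 8-bit adder: ripple-carry over 8 bits, carry-out discarded. */
uint8_t add8_ripple(uint8_t a, uint8_t b) {
    uint8_t sum = 0;
    unsigned carry = 0;
    for (int bit = 0; bit < 8; ++bit) {
        unsigned x = (a >> bit) & 1u;
        unsigned y = (b >> bit) & 1u;
        sum = (uint8_t)(sum | ((x ^ y ^ carry) << bit));
        carry = (x & y) | (carry & (x ^ y));
    }
    return sum;
}

/* Model of: sext i8 -> i32, add i32, trunc i32 -> i8.
   The inputs are i8 bit patterns; (v ^ 0x80) - 0x80 sign-extends them. */
uint8_t add32_then_trunc(uint8_t a_bits, uint8_t b_bits) {
    int32_t x = (int32_t)(a_bits ^ 0x80u) - 0x80;
    int32_t y = (int32_t)(b_bits ^ 0x80u) - 0x80;
    int32_t z = x + y;      /* range is -256..254, cannot overflow */
    return (uint8_t)z;      /* keeps the low 8 bits */
}

/* Counts the input pairs on which the two models disagree. */
unsigned count_add_mismatches(void) {
    unsigned mismatches = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            if (add8_ripple((uint8_t)a, (uint8_t)b) !=
                add32_then_trunc((uint8_t)a, (uint8_t)b)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}

/* Self-check: the two models agree on all 65536 input pairs. */
void check_add_equivalence(void) {
    assert(count_add_mismatches() == 0);
}
```

This only compares the resulting bit patterns. Whether the nsw flag can be
kept on the narrow add is a separate question, since a + b can leave the i8
range even though the i32 add cannot overflow.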

Again, thanks everyone.
-Dilan

On Mon, May 30, 2016 at 1:05 PM James Molloy <james at jamesmolloy.co.uk>
wrote:

> Hi,
>
> I'm on vacation at the moment with only a phone to reply on but...
>
> TruncateToMinimalBitwidths is, as you point out, only for vectorisation.
> There are three cases:
>
> 1. i8 and i16 are never supported on the target.
> 2. i8 and i16 are supported for vectors but not for scalars (ARM).
> 3. i8 and i16 are supported for scalars and vectors (x86).
>
> (3) is handled by simplifyDemandedBits in the instruction combiner, so it
> will work on scalars and vectors but will only ever truncate promotions if
> the smaller integer operation is valid on the target.
>
> (2) is handled by truncateToMinimalBitwidths where we need to, as part of
> the vectorisation profitability analysis, determine what the loop will look
> like after vectorisation. We insert trunc and ext nodes simply as a
> shortcut - we could elide the promotions as you do, but there are corner
> cases that make it a bit awkward so we just add more casts and let a
> cleanup pass (instcombine) remove them intelligently.
>
> I hope this answers your queries a bit more? Both of these should be
> kicking in already if your target advertises i8 as being legal for scalars
> or vectors, so I would check your target transform info to ensure the
> legality hook is returning true when it should.
>
> Cheers
>
> James
> On Mon, 30 May 2016 at 16:11, Norman Rink <norman.rink at tu-dresden.de>
> wrote:
>
>> Hi all,
>>
>> I realize this is potentially only tangent to the ongoing discussion, but
>> does anyone have significant experience with how integer promotion
>> interacts with vectorization? When I looked into this interaction, I did
>> not have the time to conduct a careful analysis, but I have reason to
>> believe that integer promotion can get in the way of vectorization, thereby
>> limiting its benefits. Can anyone comment? Thanks.
>>
>> Best,
>>
>> Norman
>>
>>
>> From: "Martin J. O'Riordan" <martin.oriordan at movidius.com>
>> Organization: Movidius Ltd.
>> Date: Monday 30 May 2016 14:53
>> To: 'James Molloy' <james at jamesmolloy.co.uk>, 'David Majnemer' <
>> david.majnemer at gmail.com>, 'Dilan Manatunga' <manatunga at gmail.com>
>> Cc: 'Clang Dev' <cfe-dev at lists.llvm.org>, Norman Rink <
>> norman.rink at tu-dresden.de>
>> Subject: RE: [cfe-dev] Disable integer promotion (Dilan Manatunga via
>> cfe-dev)
>>
>> Hi James and thanks for pointing out the existence of this
>> transformation, we were quite unaware of it.
>>
>>
>>
>> As it happens, I am highly allergic to re-invention and avoid doing so
>> whenever possible; the only reason an already overburdened team of 2
>> developers will re-invent is because they are unaware of an existing
>> solution which is not difficult given the scope and complexity of LLVM.
>>
>>
>>
>> So far as I can tell, ‘truncateToMinimalBitwidths’ is always enabled, so
>> it is not a target specific selection and our target should automatically
>> reap the rewards of this optimisation pass.  I certainly cannot find a
>> switch to enable or disable it.  But in fact we are not seeing anywhere
>> near the benefits we would expect.
>>
>>
>>
>> void InnerLoopVectorizer::truncateToMinimalBitwidths() {
>>   // For every instruction `I` in MinBWs, truncate the operands, create a
>>   // truncated version of `I` and reextend its result. InstCombine runs
>>   // later and will remove any ext/trunc pairs.
>>
>>
>>
>> This appears to only run on inner-loops, and it appears to insert
>> narrowings/truncations and subsequent widenings/extendings into the IR
>> chains.
>>
>>
>>
>> The DataLayout for our target includes “-n8:16:32”, so it should see the
>> benefits of optimisations for multiple native integer support.  We also
>> provide both 32-bit SIMD and 128-bit SIMD native support.
>>
>>
>>
>> The pass that we wrote is quite different.  It is run as a machine pass
>> prior to loop-unrolling and vectorisation, and instead of pre-truncating
>> and post-extending IR chains, it removes the existing pre-extending and
>> post-truncating that brackets a sequence of IR operations if it can prove
>> that the outcome is the same.  The results are actually very good and match
>> what our expectations are from such a transformation, which makes me wonder
>> “why does ‘truncateToMinimalBitwidths’ not already produce comparable
>> results?”.
>>
>>
>>
>> Our observations are that with the new pass, a significant majority of
>> vectorised code showed some improvement, with results as high as 40X faster
>> than without.  Of the small number of tests that regressed in performance,
>> adding a ‘#pragma clang unroll_count(N)’ eliminated the loss.  This
>> could probably be eliminated too by better tuning of the cost-models.
>>
>>
>>
>> The re-invention is inadvertent, but in any event our new pass appears to
>> provide considerable additional performance improvements that are not
>> currently happening with the stock LLVM transformations.
>>
>>
>>
>> I will have to contrive some tests to see why ‘truncateToMinimalBitwidths’
>> is not already doing this, and if there is something that we have done
>> wrong in our target that is breaking it, I will happily revert to an
>> existing solution.
>>
>>
>>
>>             MartinO
>>
>>
>>
>> *From:* James Molloy <james at jamesmolloy.co.uk>
>> *Sent:* 28 May 2016 19:58
>> *To:* Martin.ORiordan at movidius.com; David Majnemer; Dilan Manatunga
>> *Cc:* Clang Dev; Norman Rink
>> *Subject:* Re: [cfe-dev] Disable integer promotion (Dilan Manatunga via
>> cfe-dev)
>>
>>
>>
>> Hi,
>>
>> X86 has native support for i8 and i16. Aarch64 and ARM have native i8 and
>> i16 vector operations that are lowered and analysed using
>> truncateToMinimalBitwidths in LoopVectorize. Similarly for scalar code on
>> x86 truncation is done in instcombine.
>>
>> Why do you need to reinvent this?
>>
>> Cheers,
>>
>> James
>>
>> On Sat, 28 May 2016 at 19:02, Martin J. O'Riordan via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>> Instead of suppressing the integer promotion rules which are part of the
>> ISO C/C++ Standards, we wrote a new pass that analyses the IR to see if the
>> input values and output value were of an integer type that was narrower
>> than the promoted types used in the IR, and if we could prove that the
>> outcome would be identical if the type was unpromoted, then we reduced the
>> IR to use the narrower form.
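
The "prove that the outcome would be identical" condition matters because
not every promoted expression can be narrowed. A minimal sketch of one case
where it cannot, using made-up function names and an 8-bit average as the
example (this is an illustration, not the pass described here):

```c
#include <assert.h>
#include <stdint.h>

/* C semantics: both operands are promoted to int, so the ninth bit of the
   sum survives until the shift. */
uint8_t average_promoted(uint8_t a, uint8_t b) {
    return (uint8_t)((a + b) >> 1);
}

/* Model of narrowing the add to 8 bits first: the carry is lost before
   the shift, so large inputs give a different answer. */
uint8_t average_narrowed(uint8_t a, uint8_t b) {
    uint8_t sum8 = (uint8_t)(a + b);
    return (uint8_t)(sum8 >> 1);
}

/* Number of input pairs for which narrowing the add changes the result. */
unsigned count_average_differences(void) {
    unsigned differences = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            if (average_promoted((uint8_t)a, (uint8_t)b) !=
                average_narrowed((uint8_t)a, (uint8_t)b)) {
                ++differences;
            }
        }
    }
    return differences;
}

/* Self-check on one pair whose sum needs nine bits. */
void check_average_example(void) {
    assert(average_promoted(200, 100) == 150);
    assert(average_narrowed(200, 100) == 22);
}
```

The two versions differ exactly when a + b needs a ninth bit, so a narrowing
pass has to rule that out (or see that the extra bit is never observed)
before dropping the extension.
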
>>
>>
>>
>> In our case the motive was to enhance vectorisation because our vector
>> ALU can work with 8-, 16- and 32-bit integers natively, and ‘vXi8’
>> vectors were actually being promoted to multiple ‘v4i32’ vectors,
>> requiring 4 times as many instructions as were necessary, or worse
>> still, were fully scalarized.
>>
>>
>>
>> This pass was presented by my colleague Stephen Rogers in a “Lightning
>> Talk” at the October 2015 LLVM Conference in San Jose and titled “Integer
>> Vector Optimizations and “Usual Arithmetic Conversions””.  I can’t find
>> the paper or slides on the LLVM Meetings page, perhaps these are not
>> archived for Lightning Talks (?), but as they are not large I have attached
>> them here.
>>
>>
>>
>> This approach allowed us to gain the optimisations that are possible with
>> our architecture which supports 8-, 16- and 32-bit native integer
>> computations (scalar and vector), while also respecting the ISO C and C++
>> Standards.  I am a lot more nervous of a front-end switch for this, as it
>> will lead to non-compliant programs, and in the presence of overloading and
>> template-instantiation it could also lead to very different programs, and
>> would recommend that we do not add a front-end switch which alters the
>> semantics of the language in this way.
>>
>>
>>
>> It is my intention to publish this pass if it is of general interest, and
>> since it is target independent there are no particular blocking issues for
>> me (Patents, IP, etc.) to doing so.  I do have to catch-up on the HEAD
>> revision to ensure that it still works correctly, but it was working
>> perfectly at SVN #262824 and it will be a month before I have enough time
>> to catch up on the HEAD revision as we are busy with a product release that
>> takes precedence.
>>
>>
>>
>> All the best,
>>
>>
>>
>>             MartinO
>>
>>
>>
>> *From:* cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] *On Behalf Of *David
>> Majnemer via cfe-dev
>> *Sent:* 27 May 2016 19:55
>> *To:* Dilan Manatunga <manatunga at gmail.com>
>> *Cc:* clang developer list <cfe-dev at lists.llvm.org>; Norman Rink <
>> norman.rink at tu-dresden.de>; cfe-dev-request at lists.llvm.org
>> *Subject:* Re: [cfe-dev] Disable integer promotion (Dilan Manatunga via
>> cfe-dev)
>>
>>
>>
>> You could set IntWidth to 16 or 8 in clang, not unlike what MSP430 does:
>>
>>
>> https://github.com/llvm-mirror/clang/blob/3317d0fa0bd1f5c5adc14bcc6adc2a38acc9064b/lib/Basic/Targets.cpp#L6823
>>
>>
>>
>> On Fri, May 27, 2016 at 10:32 AM, Dilan Manatunga via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>> I need to disable this feature because I am researching architectures
>> where 8-bit or 16-bit adds are preferred to 32-bit. So, integer promotion
>> kinda mucks everything up. I was hoping there was a way in clang to disable
>> it, instead of having to implement an LLVM pass to coalesce unnecessary
>> promotions.
>>
>>
>>
>> Thanks for catching the IR mistake. Should have double checked that. This
>> should be the correct version:
>>
>> int8_t a = 1;
>> int8_t b = 2;
>> int8_t c = a + b;
>>
>> The LLVM IR will be:
>> %x = sext i8 %a to i32
>> %y = sext i8 %b to i32
>> %z = add nsw i32 %x, %y
>> %c = trunc i32 %z to i8
>>
>> Instead, I would like it to simply compile to:
>> %c = add nsw i8 %a, %b
>>
>>
>>
>> -Dilan
>>
>>
>>
>>
>>
>> On Fri, May 27, 2016 at 5:30 AM Norman Rink via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>> Hi Dilan,
>>
>> I would like to second your request for an option to disable integer
>> promotion. What do you need it for?
>>
>> As far as I am aware, there is no such option and the code that implements
>> integer promotion is somewhat scattered across “SemaExpr.cpp”.
>>
>> Also, I think your example code snippet contains a few “i32”s too many. It
>> will be clearer to people what you are looking for if your code example is
>> consistent with your question.
>>
>> Best,
>>
>> Norman
>>
>>
>> >Message: 1
>> >Date: Fri, 27 May 2016 01:50:12 +0000
>> >From: Dilan Manatunga via cfe-dev <cfe-dev at lists.llvm.org>
>> >To: cfe-dev at lists.llvm.org
>> >Subject: [cfe-dev] Disable integer promotion
>> >Message-ID:
>> >       <CAHpgGu4=
>> jFC9ohQQZZMp2NMG3Hw0sE5U4-Lqrgb+6gcXv9SEtQ at mail.gmail.com>
>> >Content-Type: text/plain; charset="utf-8"
>> >
>> >Is there a way to disable integer promotion when performing math
>> >operations? For example, when compiling a statement such as this:
>> >int8_t a  = 1;
>> >int8_t b = 2;
>> >int8_t c = a + b
>> >
>> >The LLVM IR will be:
>> >%x = sext i32 %a to i32
>> >%y = sext i32 %b to i32
>> >%z = add nsw i32 %x, %y
>> >%c = trunc i32 %z to i16
>> >
>> >Instead, it would simply compile to:
>> >$c = add nsw i32 %z, $y
>> >
>> >-Dilan Manatunga
>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>