[llvm-dev] KNL Vectorization with larger vector width

hameeza ahmed via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 24 10:34:19 PDT 2018


Hello,
I need help here. I am able to adjust the vector width through the
WidestRegister value. With number of iterations=31 and vector width=32,
I get <16xi32> and <8xi32> instructions.

However, if I repeat this with number of iterations=63 and vector
width=64, no vector instructions are emitted. I expected it to behave as
before and give <32xi32> and <16xi32> vector instructions.

How can I do this?
What adjustments are needed?
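
The expected mix of vector widths can be written out as plain arithmetic. The
sketch below is only an illustration of that expectation, not a description of
what LLVM's vectorizers do: splitTripCount is a hypothetical helper (not an
LLVM API) that greedily splits a trip count into power-of-two chunk widths,
largest first, capped at a maximum width.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of LLVM: split tripCount into power-of-two
// chunk widths, largest first, never exceeding maxWidth.
std::vector<unsigned> splitTripCount(unsigned tripCount, unsigned maxWidth) {
  assert(maxWidth >= 1 && "maximum width must be at least one element");

  // Round maxWidth down to the nearest power of two.
  unsigned width = 1;
  while (width <= maxWidth / 2)
    width *= 2;

  std::vector<unsigned> widths;
  for (; width >= 1; width /= 2) {
    // Emit as many chunks of this width as still fit.
    while (tripCount >= width) {
      widths.push_back(width);
      tripCount -= width;
    }
  }
  return widths;
}
```

Under this model, 31 iterations with a maximum width of 32 split into chunks
of 16, 8, 4, 2 and 1, and 63 iterations with a maximum width of 64 split into
32, 16, 8, 4, 2 and 1. Which of these chunks a compiler actually emits as
vector instructions, and which it leaves scalar, is a separate question.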

Please help.

I have been trying this but am unable to solve it.

Thank You

On Tue, Jul 24, 2018 at 4:44 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:

> Hello,
> Do I need to change the following function?
>
> unsigned X86TTIImpl::getNumberOfRegisters(bool Vector) {
>   if (Vector && !ST->hasSSE1())
>     return 0;
>
>   if (ST->is64Bit()) {
>     if (Vector && ST->hasAVX512())
>       return 32;
>     return 16;
>   }
>   return 8;
> }
>
> to
>
> if (ST->is2048Bit()) {
>     if (Vector && ST->hasAVX512())
>       return 1024;
>     return 512;
>   }
>   return 256;
>
>
> Please help.
>
> On Tue, Jul 24, 2018 at 5:05 AM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> Thank you.
>> To see the effect, I made the following changes:
>>
>> unsigned X86TTIImpl::getRegisterBitWidth(bool Vector) {
>>   if (Vector) {
>>     if (ST->hasAVX512())
>>       return 65536;
>>
>> Here I changed 512 to 65536. Then in LoopVectorize.cpp I did the following:
>>
>>  assert(MaxVectorSize <= 2048 && "Did not expect to pack so many elements"
>>                                 " into one vector!");
>>
>> I changed 64 to 2048.
>>
>> It runs fine. I can see <2048xi32> or <1024xi64> emitted in the IR.
>>
>> But I cannot see the mix of vector widths that default KNL gives: with
>> iterations=15 we see one <8xi32> and the rest scalar. Here, when I set
>> iterations=2047, I get all scalar. Why is that? Similarly, in Polly I
>> cannot see the mix of vector widths that KNL produces, where it emits
>> <v16i32>, <v8i32>, <v4i32>... Here it should emit recursively, like
>> <v2048i32> <v1024i32> <v512i32> ... <v32i32>.
>>
>> How can I do this?
>>
>> What am I missing here?
>> What further changes do I need to make?
>>
>> Please help.
>>
>> On Tue, Jul 24, 2018 at 1:52 AM, Friedman, Eli <efriedma at codeaurora.org>
>> wrote:
>>
>>> On 7/23/2018 12:40 PM, hameeza ahmed wrote:
>>>
>>>> Thank you. I got it; it was a version issue.
>>>>
>>>> TTI.getRegisterBitWidth(true)
>>>>
>>>> How do I put my target machine info into TTI?
>>>>
>>>
>>> Each target has an implementation, e.g. X86TTIImpl::getRegisterBitWidth.
>>>
>>>
>>> -Eli
>>>
>>> --
>>> Employee of Qualcomm Innovation Center, Inc.
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
>>> Linux Foundation Collaborative Project
>>>
>>>
>>
>
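
The quoted messages touch two different hooks: getNumberOfRegisters returns a
count of registers, while getRegisterBitWidth returns the width of one
register in bits. A minimal sketch of that distinction follows. It uses a
hypothetical VectorTargetModel struct and maxElementsPerVector helper rather
than LLVM's real interface, and it assumes the widest vector is simply the
register width divided by the element width.

```cpp
#include <cassert>

// Hypothetical model of a vector target, not LLVM's TargetTransformInfo.
struct VectorTargetModel {
  unsigned numRegisters;     // how many vector registers exist
  unsigned registerBitWidth; // how wide each register is, in bits
};

// Largest number of elements of the given size that fit in one register.
// Only the register width matters here; the register count plays no part.
inline unsigned maxElementsPerVector(const VectorTargetModel &target,
                                     unsigned elementBits) {
  assert(elementBits != 0 && "element width must be non-zero");
  return target.registerBitWidth / elementBits;
}
```

With 32 registers of 512 bits, this gives 16 elements of i32. With the width
raised to 65536 bits, it gives 2048 elements of i32 and 1024 elements of i64,
which matches the <2048xi32> and <1024xi64> types reported in the quoted
message. Changing the register count in this model does not change either
result.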