[llvm-dev] [X86][AVX512] RFC: make i1 illegal in the Codegen

Martin J. O'Riordan via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 24 05:07:25 PST 2017


I can't comment specifically on the impact on your target and 'i1', but
there is an issue with LLVM that I do have a concern about.

 

> Having a type that can be possibly legalized to two different register
> classes exposes a fundamental limitation of the current instruction
> selection framework

 

One of the problems I have encountered with LLVM is that I *do* want to be
able to legalise and optimise for 2 (or more) register classes for the same
type, and LLVM does not really cope with this well.  But it is not with 'i1'
as scalar versus vector that I run into the limitation, but with small
vectors versus large vectors.

 

In our architecture, we have two register files that can be used for SIMD
operations: one is 32 bits wide and the other is 128 bits wide.  Quite often,
due to register pressure or simply to reduce data movement, I would like to
be able to place something like 'v2i16' or 'v4i8' vectors into either the
32-bit SIMD capable register class or into the low bits of a 128-bit SIMD
capable register class.  I expect that other chip architectures have similar
capabilities.

 

Your statement above is true, but making it illegal means that these kinds
of SIMD transformations also become illegal.

 

            MartinO

 

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Blank,
Guy via llvm-dev
Sent: 24 January 2017 11:54
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] [X86][AVX512] RFC: make i1 illegal in the Codegen

 

Hi All,

 

AVX-512 introduced the K mask registers and masked operations, which make a
natural choice for legalizing vectors of i1.

For example,

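declare <8 x i32> @llvm.masked.gather.v8i32(<8 x i32*>, i32, <8 x i1>, <8 x i32>)

define <8 x i32> @foo(<8 x i32> %a, <8 x i32*> %p) {
  ; Gather through all eight pointers; the constant mask enables every lane.
  %r = call <8 x i32> @llvm.masked.gather.v8i32(<8 x i32*> %p, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>, <8 x i32> undef)
  ret <8 x i32> %r
}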

 

Can be lowered to

 

# BB#0:
kxnorw    %k0, %k0, %k1
vpgatherqd    (,%zmm1), %ymm0 {%k1}
retq
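The lowering above relies on two facts: XNOR of a register with itself yields an all-ones mask, and a masked gather only loads the lanes whose mask bit is set. A minimal C++ sketch of both follows. The names `Mask16`, `kxnorw` (as a C++ function), `masked_gather` and `gather_self_check` are hypothetical stand-ins written for this illustration, not LLVM or intrinsics APIs, and the gather is a plain scalar reference rather than what the hardware does internally.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 16-bit AVX-512 mask (K) register.
using Mask16 = std::uint16_t;

// Models kxnorw: bitwise XNOR of two 16-bit masks.
constexpr Mask16 kxnorw(Mask16 a, Mask16 b) {
    return static_cast<Mask16>(~(a ^ b));
}

// XNOR of a register with itself is all-ones whatever it held before,
// so "kxnorw %k0, %k0, %k1" needs no initialized input.
static_assert(kxnorw(0x1234, 0x1234) == 0xFFFF, "x XNOR x must be all-ones");

// Scalar reference for an 8-lane masked gather: lane i is loaded through
// ptrs[i] when bit i of the mask is set, otherwise the pass-through value
// for that lane is kept.
std::array<std::int32_t, 8> masked_gather(
    const std::array<const std::int32_t*, 8>& ptrs, Mask16 mask,
    std::array<std::int32_t, 8> passthru) {
    for (unsigned i = 0; i < 8; ++i) {
        if (mask & (1u << i)) passthru[i] = *ptrs[i];
    }
    return passthru;
}

// Usage check: with the all-ones mask every lane is loaded.
inline void gather_self_check() {
    std::int32_t data[8] = {10, 11, 12, 13, 14, 15, 16, 17};
    std::array<const std::int32_t*, 8> ptrs{};
    for (unsigned i = 0; i < 8; ++i) ptrs[i] = &data[7 - i];
    const auto r = masked_gather(ptrs, kxnorw(0, 0), {});
    assert(r[0] == 17 && r[7] == 10);
}
```

With a constant all-true mask, as in the IR example, the pass-through operand never contributes, which is consistent with it being `undef` there.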

 

 

Legal vectors of i1 require support for BUILD_VECTOR(i1, i1, .., i1), i1
EXTRACT_VECTOR_ELT(.) and INSERT_VECTOR_ELT(i1, .), so making i1 legal
seemed like a sensible decision, and this is the current state at the top of
trunk.

 

However, making i1 legal affected instruction selection of scalar code as
well. Currently, there are cases where operations producing or consuming
i1's are selected (sub-optimally) to instructions that act on K-regs.

PR28650 <https://llvm.org/bugs/show_bug.cgi?id=28650>  is an example showing
that i1's live-in or live-out of basic-blocks are being selected to K
register classes, even though we don't want this to happen. This problem
does not happen on subtargets without the AVX-512 feature enabled.
The following is the AVX-512 code from the bug report:

 

# BB#0:                                 # %entry
testb        $1, %dil
je        .LBB0_1
# BB#2:                                 # %if
pushq        %rax
callq        bar
                                        # kill: %AL<def> %AL<kill> %EAX<def>
andl        $1, %eax
kmovw        %eax, %k0
addq        $8, %rsp
jmp        .LBB0_3
.LBB0_1:
kxnorw        %k0, %k0, %k0
kshiftrw        $15, %k0, %k0
.LBB0_3:                                # %else
kmovw        %k0, %eax
                                        # kill: %AL<def> %AL<kill> %EAX<kill>
retq

 

The kmovw, kxnorw and kshiftrw instructions here operate on K registers.
They are undesirable because the input code is purely scalar.
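To make the cost concrete, here is a small C++ model of what the K-register instructions in the listing compute. It is only a sketch: `Mask16` and the functions named after the mnemonics are hypothetical stand-ins for a 16-bit K register and its operations, not a real API. The `.LBB0_1` path spends two mask instructions to produce the constant 1, and the other path moves a value that is already in `%eax` into `%k0` only for it to be moved back.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 16-bit K register.
using Mask16 = std::uint16_t;

// kxnorw: bitwise XNOR of two masks.
constexpr Mask16 kxnorw(Mask16 a, Mask16 b) {
    return static_cast<Mask16>(~(a ^ b));
}

// kshiftrw: logical right shift of a 16-bit mask; counts above 15 give zero.
constexpr Mask16 kshiftrw(Mask16 k, unsigned count) {
    return count > 15 ? Mask16{0} : static_cast<Mask16>(k >> count);
}

// The .LBB0_1 path: all-ones, then shift right by 15, leaving only bit 0 set.
constexpr Mask16 lbb0_1_path(Mask16 stale_k0) {
    return kshiftrw(kxnorw(stale_k0, stale_k0), 15);
}

// The %if path followed by %else: andl $1, %eax; kmovw %eax, %k0;
// kmovw %k0, %eax. The value round-trips through %k0 unchanged.
constexpr std::uint32_t if_path_round_trip(std::uint32_t eax) {
    return static_cast<Mask16>(eax & 1u);
}

// The result does not depend on the previous contents of %k0.
static_assert(lbb0_1_path(0x0000) == 1, "materializes i1 true");
static_assert(lbb0_1_path(0xBEEF) == 1, "materializes i1 true");

// Runtime usage check covering both paths.
inline void kreg_self_check() {
    assert(lbb0_1_path(0x5A5A) == 1);
    assert(if_path_round_trip(0xFFFFFF01u) == 1u);
    assert(if_path_round_trip(0xFFFFFF00u) == 0u);
}
```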

 

 

Having a type that can possibly be legalized to two different register
classes exposes a fundamental limitation of the current instruction
selection framework: we cannot always make the right decision about
live-in/live-out i1's because we cannot see beyond the boundary of the
basic-block we are currently visiting. As a side-note, with GlobalISel this
can be solved, since we see the entire use-def chain at the function level.

 

Our initial thought was to write a pass that would run after ISel to
correct bad selections. The pass would examine the use-def chains containing
values that were selected to K-register classes, and, when profitable,
re-assign the values to GPR register classes (and replace the
producing/consuming instructions accordingly). But even with this fix-up
pass, many ISel pattern-matching rules would still fail to match, because
the instruction set acting on GPRs is richer than the instruction set acting
on K-regs. For example, a test trying to match the sbb instruction:

 

define i32 @test2(i32 %x, i32 %y, i32 %res) nounwind uwtable readnone ssp {
entry:
  %cmp = icmp ugt i32 %x, %y
  %dec = sext i1 %cmp to i32
  %dec.res = add nsw i32 %dec, %res
  ret i32 %dec.res
}

 

Generates the following with AVX2:

# BB#0:                                 # %entry
cmpl        %edi, %esi
sbbl        $0, %edx
movl        %edx, %eax
retq

 

While AVX512 produces:

# BB#0:                                 # %entry
xorl        %ecx, %ecx
cmpl        %esi, %edi
movl        $-1, %eax
cmovbel        %ecx, %eax
addl        %edx, %eax
retq
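Both sequences compute the same value; the AVX512 one just takes five instructions before the retq instead of three. The following C++ sketch models the IR and each sequence and checks that they agree. The function names are hypothetical, and unsigned wrap-around arithmetic is used for the addition, so the `nsw` flag (which makes signed overflow poison in the IR) is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of @test2: res + sext(x >u y), i.e. res - 1 when
// x is unsigned-greater than y, otherwise res.
constexpr std::uint32_t test2_ir(std::uint32_t x, std::uint32_t y,
                                 std::uint32_t res) {
    return res + (x > y ? 0xFFFFFFFFu : 0u);
}

// AVX2 sequence: "cmpl %edi, %esi" computes y - x and sets CF when y <u x;
// "sbbl $0, %edx" then computes res - 0 - CF.
constexpr std::uint32_t test2_sbb(std::uint32_t x, std::uint32_t y,
                                  std::uint32_t res) {
    return res - 0u - (y < x ? 1u : 0u);
}

// AVX512 sequence: eax starts as -1, "cmovbel" replaces it with 0 when
// x <=u y, and "addl" adds res.
constexpr std::uint32_t test2_cmov(std::uint32_t x, std::uint32_t y,
                                   std::uint32_t res) {
    return (x <= y ? 0u : 0xFFFFFFFFu) + res;
}

// Usage check: both sequences agree with the IR on boundary values.
inline void test2_self_check() {
    const std::uint32_t samples[] = {0u, 1u, 2u, 0x7FFFFFFFu,
                                     0x80000000u, 0xFFFFFFFFu};
    for (std::uint32_t x : samples)
        for (std::uint32_t y : samples)
            for (std::uint32_t res : samples) {
                assert(test2_sbb(x, y, res) == test2_ir(x, y, res));
                assert(test2_cmov(x, y, res) == test2_ir(x, y, res));
            }
}
```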

 

So we would still end up with cases where instruction selection for scalar
code becomes inferior when the AVX-512 feature is enabled.

 

Finally, we suggest resolving the above issues, which are caused by
legalizing i1, by making i1 illegal. This would make instruction selection of
scalar code identical whether the AVX-512 feature is on or off. As for
supporting BUILD_VECTOR, EXTRACT_VECTOR_ELT and INSERT_VECTOR_ELT, we
believe we can support these operations even when i1 is illegal and the
vectors of i1 *are* legal by using the i8 type instead of i1, as it should
be implicitly truncated/extended to the element type of the vNi1 vectors.
I am now working on a patch that will implement this approach.
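As a rough illustration of the i8-for-i1 idea, the C++ sketch below models a v8i1 value as an 8-bit mask whose build, extract and insert operations take and return i8 scalars. All names are hypothetical and this is not the patch itself. Inputs are truncated to their low bit, as the implicit truncation described above would do. The extracted element is zero-extended here, which is an assumption made only for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a v8i1 value held in a K register.
using Mask8 = std::uint8_t;

// Build a v8i1 from eight i8 scalars; only the low bit of each operand is
// kept, mirroring an implicit truncation from i8 to i1.
inline Mask8 build_v8i1(const std::array<std::uint8_t, 8>& elts) {
    Mask8 m = 0;
    for (unsigned i = 0; i < 8; ++i)
        m = static_cast<Mask8>(m | ((elts[i] & 1u) << i));
    return m;
}

// Extract element idx (assumed < 8) as an i8. Zero-extension is an
// assumption of this sketch.
inline std::uint8_t extract_elt(Mask8 v, unsigned idx) {
    return static_cast<std::uint8_t>((v >> idx) & 1u);
}

// Insert the low bit of an i8 scalar at position idx (assumed < 8).
inline Mask8 insert_elt(Mask8 v, std::uint8_t elt, unsigned idx) {
    const unsigned bit = 1u << idx;
    return static_cast<Mask8>((elt & 1u) ? (v | bit) : (v & ~bit));
}

// Usage check: values other than 0 and 1 are truncated to their low bit.
inline void v8i1_self_check() {
    const Mask8 m = build_v8i1({1, 0, 3, 2, 0xFF, 0, 0, 1});
    assert(m == 0x95);  // bits 0, 2, 4 and 7 are set
    assert(extract_elt(m, 2) == 1);
    assert(insert_elt(m, 1, 1) == 0x97);
}
```

The point of the model is that no scalar i1 value ever appears: every scalar crossing the vector boundary is an i8, so scalar code can be selected exactly as it is without AVX-512.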

 

I would appreciate feedback and comments.

 

Thanks,

Guy

 

 

