[PATCH] D42031: [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8.

Sat Jan 20 10:35:00 PST 2018

craig.topper added a comment.

Is this patch good enough to go in as is? And we file bugs and work on any problems after?

================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1068
+; KNL-NEXT:    kmovw %k0, %eax
 ; KNL-NEXT:    andb $1, %al
 ; KNL-NEXT:    movb $4, %cl
----------------
RKSimon wrote:
> craig.topper wrote:
> > RKSimon wrote:
> > > craig.topper wrote:
> > > > RKSimon wrote:
> > > > > Is the andb necessary? Are we missing some known bits handling here?
> > > > Yeah its unnecessary based on how it gets isel. But I'm not sure the way the DAG is written that we can guarantee it.
> > > > 
> > > > ```
> > > >   t0: ch = EntryToken
> > > >                   t2: v64i8,ch = CopyFromReg t0, Register:v64i8 %0
> > > >                   t4: v64i8,ch = CopyFromReg t0, Register:v64i8 %1
> > > >                 t24: v64i1 = X86ISD::CMPMU t2, t4, Constant:i8<6>
> > > >               t28: v64i1 = X86ISD::KSHIFTR t24, Constant:i8<63>
> > > >             t30: i32 = extract_vector_elt t28, Constant:i64<0>
> > > >           t26: i8 = truncate t30
> > > >         t22: i8 = and t26, Constant:i8<1>
> > > >       t19: i8 = sub Constant:i8<4>, t22
> > > >     t13: i32 = zero_extend t19
> > > >   t16: ch,glue = CopyToReg t0, Register:i32 %eax, t13
> > > >   t17: ch = X86ISD::RET_FLAG t16, TargetConstant:i32<0>, Register:i32 %eax, t16:1
> > > > ```
> > > > 
> > > > I don't think we can make any assumptions about the upper bits of the input to the extract_vector_elt being passed through.
> > > > 
> > > > We could maybe pattern match this out during isel?
> > > > 
> > > > Or we could legalize to a DAG that explicitly contains a KMOVW node instead of extract_vector_elt. Then we could probably put out an AssertZExt.
> > > Couldn't we handle this by adding X86ISD::KSHIFTR handling to X86TargetLowering::computeKnownBitsForTargetNode (with suitable DemandedElts twiddling)?
> > But computeKnownBits for extract_vector_elt won't pass the info through. It doesn't know the bits from the upper elements are going to make it through to the result. (they don't for any other vector type). It would just take the known bits info about element 0 and any extend that to i32.
> OK - sorry I was thinking that the extract_vector_elt disappeared when it became a KSHIFT+KMOVW.
Oh sorry. If we use KMOVW explicitly in the DAG then yes the extract_vector_elt goes away. But it still looks sort of like a vector to scalar bitcast.

The KMOVW will demand all elts from the KSHIFT.  And the KSHIFT won't be able to say that some elts are 0 and the lowest element is unknown will it?

Repository:
  rL LLVM

https://reviews.llvm.org/D42031