[PATCH] D42031: [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8.

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 19 10:16:05 PST 2018


craig.topper added inline comments.


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1068
+; KNL-NEXT:    kmovw %k0, %eax
 ; KNL-NEXT:    andb $1, %al
 ; KNL-NEXT:    movb $4, %cl
----------------
RKSimon wrote:
> craig.topper wrote:
> > RKSimon wrote:
> > > Is the andb necessary? Are we missing some known bits handling here?
> > Yeah its unnecessary based on how it gets isel. But I'm not sure the way the DAG is written that we can guarantee it.
> > 
> > ```
> >   t0: ch = EntryToken
> >                   t2: v64i8,ch = CopyFromReg t0, Register:v64i8 %0
> >                   t4: v64i8,ch = CopyFromReg t0, Register:v64i8 %1
> >                 t24: v64i1 = X86ISD::CMPMU t2, t4, Constant:i8<6>
> >               t28: v64i1 = X86ISD::KSHIFTR t24, Constant:i8<63>
> >             t30: i32 = extract_vector_elt t28, Constant:i64<0>
> >           t26: i8 = truncate t30
> >         t22: i8 = and t26, Constant:i8<1>
> >       t19: i8 = sub Constant:i8<4>, t22
> >     t13: i32 = zero_extend t19
> >   t16: ch,glue = CopyToReg t0, Register:i32 %eax, t13
> >   t17: ch = X86ISD::RET_FLAG t16, TargetConstant:i32<0>, Register:i32 %eax, t16:1
> > ```
> > 
> > I don't think we can make any assumptions about the upper bits of the input to the extract_vector_elt being passed through.
> > 
> > We could maybe pattern match this out during isel?
> > 
> > Or we could legalize to a DAG that explicitly contains a KMOVW node instead of extract_vector_elt. Then we could probably put out an AssertZExt.
> Couldn't we handle this by adding X86ISD::KSHIFTR handling to X86TargetLowering::computeKnownBitsForTargetNode (with suitable DemandedElts twiddling)?
But computeKnownBits for extract_vector_elt won't pass the info through. It doesn't know the bits from the upper elements are going to make it through to the result. (they don't for any other vector type). It would just take the known bits info about element 0 and any extend that to i32.


https://reviews.llvm.org/D42031





More information about the llvm-commits mailing list