[PATCH] D49531: [PowerPC] Enhance the selection(ISD::VSELECT) of vector type

Zixuan Wu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Nov 4 22:15:37 PST 2018


wuzish added inline comments.


================
Comment at: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp:590
+      setOperationAction(ISD::VSELECT, VT, Promote);
+      AddPromotedToType (ISD::VSELECT, VT, MVT::v4i32);
       setOperationAction(ISD::SELECT_CC, VT, Promote);
----------------
efriedma wrote:
> wuzish wrote:
> > efriedma wrote:
> > > wuzish wrote:
> > > > efriedma wrote:
> > > > > I don't think Promote does the right thing here; VectorLegalizer::Promote on a VSELECT just bitcasts the operands/result, and the documentation for ISD::VSELECT say "The condition follows the BooleanContent format of the target."  So DAGCombine might transform it incorrectly in some cases involving v16i8 or v8i16.
> > > > > 
> > > > > Granted, I'm not sure DAGCombine actually does any relevant transforms on the condition of a VSELECT, but it seems like a bad idea to rely on that.
> > > > Well, for v16i8 the initial DAG is like this
> > > > 
> > > > 
> > > > ```
> > > > SelectionDAG has 15 nodes:
> > > >   t0: ch = EntryToken
> > > >         t6: v16i8,ch = CopyFromReg t0, Register:v16i8 %2
> > > >         t8: v16i8,ch = CopyFromReg t0, Register:v16i8 %3
> > > >       **t10: v16i1 = setcc t6, t8, seteq:ch**
> > > >       t2: v16i8,ch = CopyFromReg t0, Register:v16i8 %0
> > > >       t4: v16i8,ch = CopyFromReg t0, Register:v16i8 %1
> > > >     t11: v16i8 = vselect t10, t2, t4
> > > >   t13: ch,glue = CopyToReg t0, Register:v16i8 $v2, t11
> > > >   t14: ch = PPCISD::RET_FLAG t13, Register:v16i8 $v2, t13:1
> > > > ```
> > > > 
> > > > The setcc result type is v16i1, but legalize type phase will change it to v16i8 as you said that follow the BooleanContent format of the target. So it 's like blow.
> > > > 
> > > > 
> > > > ```
> > > >   t0: ch = EntryToken
> > > >         t6: v16i8,ch = CopyFromReg t0, Register:v16i8 %2
> > > >         t8: v16i8,ch = CopyFromReg t0, Register:v16i8 %3
> > > > **      t15: v16i8 = setcc t6, t8, seteq:ch**
> > > >       t2: v16i8,ch = CopyFromReg t0, Register:v16i8 %0
> > > >       t4: v16i8,ch = CopyFromReg t0, Register:v16i8 %1
> > > > **    t16: v16i8 = vselect t15, t2, t4**
> > > >   t13: ch,glue = CopyToReg t0, Register:v16i8 $v2, t16
> > > >   t14: ch = PPCISD::RET_FLAG t13, Register:v16i8 $v2, t13:1
> > > > ```
> > > > 
> > > > What promote does is just make v16i8 to v4i32 consistently or canonically, so that we can just write one single pattern in td file to select `vselect` since the xxsel instruction is suitable for all vector type like v16i8/v8i16 because it's bit selection.
> > > > 
> > > > After legalize operation phase:
> > > > 
> > > > ```
> > > >  SelectionDAG has 19 nodes:
> > > >   t0: ch = EntryToken
> > > >             t6: v16i8,ch = CopyFromReg t0, Register:v16i8 %2
> > > >              t8: v16i8,ch = CopyFromReg t0, Register:v16i8 %3
> > > >            t15: v16i8 = setcc t6, t8, seteq:ch
> > > >  **        t17: v4i32 = bitcast t15**
> > > >            t2: v16i8,ch = CopyFromReg t0, Register:v16i8 %0
> > > > **        t18: v4i32 = bitcast t2**
> > > >            t4: v16i8,ch = CopyFromReg t0, Register:v16i8 %1
> > > > **        t19: v4i32 = bitcast t4**
> > > >  **      t20: v4i32 = vselect t17, t18, t19**
> > > >      t21: v16i8 = bitcast t20
> > > >    t13: ch,glue = CopyToReg t0, Register:v16i8 $v2, t21
> > > >    t14: ch = PPCISD::RET_FLAG t13, Register:v16i8 $v2, t13:1
> > > > ```
> > > The SelectionDAG DAG is a target-independent IR, just like LLVM IR; it has target-independent opcodes with target-independent meanings.  It's not defined quite as formally as LLVM IR, but the concept is the same.  And DAGCombine makes transforms based on that target-independent meaning.  And if the target-independent meaning for an operation is that it returns an undefined value, you can't "extend" it to return something defined just because it would be convenient for your target.  (IIRC the MIPS backend tried to do this with vector shifts, and ended up with miscompiles.)
> > > 
> > > The issue here, specifically, is that ISD::VSELECT is defined as a per-element select, and the result is undefined if any of the elements is not a boolean of the correct form (on PPC, BooleanContent for a vector requires an all-zeros or all-ones value).
> > > 
> > > There are a couple of ways to solve this: one, introduce a target-specific opcode (PPCISD::XXSEL, or something like that, which has the correct semantics).  Two, you could change the promotion logic; promoting a v4i32 vselect to a v16i8 vselect should be fine.
> > > The issue here, specifically, is that ISD::VSELECT is defined as a per-element select, and the result is undefined if any of the elements is not a boolean of the correct form (on PPC, BooleanContent for a vector requires an all-zeros or all-ones value).
> > > 
> > 
> > It's already correct form boolean content in PPC (all-ones) after the type legalization.( That's why the result type of setcc is v16i8 not v16i1). 
> > So all vector type bitcast to v4i32 is just a kind of canonical so that we do not need to write many patterns in td.
> > 
> > 
> No, it's not the correct boolean content: the elements of the condition of a v4i32 VSELECT are required to have value 0x00000000 or 0xFFFFFFFF, but your v4i32 can have elements of the form 0x0000FF00, which is not a boolean.
Yes, it's not a perfect way to simply the pattern matching lazily. 

It's better to write every legal pattern in td file.


Repository:
  rL LLVM

https://reviews.llvm.org/D49531





More information about the llvm-commits mailing list