[PATCH] [PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations

Mon Apr 13 12:42:30 PDT 2015

REPOSITORY
  rL LLVM

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:167
@@ +166,3 @@
+  // Return true iff the given register is a full vector register.
+  bool isVecReg(unsigned Reg) {
+    return (isRegInClass(Reg, &PPC::VSRCRegClass) ||
----------------
wschmidt wrote:
> hfinkel wrote:
> > wschmidt wrote:
> > > hfinkel wrote:
> > > > As a general note, this pass does not handle the case where a register has a subregister index. This should be mostly irrelevant here (we probably don't generate them for these kinds of instructions), but we also don't want to mishandle them should they occur.
> > > > 
> > > > (maybe also give up in lookThruCopyLike should you run into a register with a subreg index too).
> > > I don't believe any special handling is necessary for the current patch.  If a subregister mention is connected, directly or indirectly, to a full register load or store, it will have to be done via an EXTRACT_SUBREG, INSERT_SUBREG, or SUBREG_TO_REG.  But these are operations that kill the optimization anyway (except SUBREG_TO_REG for full 128-bit copies).
> > > 
> > > We don't want to exclude them from the analysis, because we can handle EXTRACT_SUBREG and INSERT_SUBREG by adjusting the subregister number to account for the doubleword swap, and dealing with SUBREG_TO_REG accordingly as well.  I plan to incorporate this into a future patch.
> > > 
> > > However, there may be subtleties in how subregs are handled in LLVM that I'm not familiar with, so please let me know if I'm missing something.
> > > it will have to be done via an EXTRACT_SUBREG, INSERT_SUBREG, or SUBREG_TO_REG. 
> > 
> > No, I think you've misunderstood my comment. MI register operands have have *implicit* subregister indices. These are separate from the subreg pseudo instructions you've named above.
> > 
> > For example, MO.getReg() may equal PPC::CR0, and the operand refers to the whole register if MO.getSubReg() == 0. But, if MO.getReg() == PPC::CR0 and MO.getSubReg() == PPC::sub_eq, then the operand is really referring to PPC::CR0EQ. Now this seems silly for physical registers, but is useful for virtual registers when you know that you want a particular subregister of a virtual register.
> > 
> OK.  Well, I really don't think this can actually happen today for any of the various vector register classes. (If it can, I'd be interested in an example.)  I will add some logic that will just kill the optimization if it ever occurs, but to my knowledge subregs of vector registers are always generated explicitly by operation (as they should be).  I can understand using this for fixed bits of a status register, but it would be really bad practice to use this for vector subregs.
Right; I don't think it can happen today either, but if it does for whatever reason, I'd rather we handle it gracefully instead of crashing/miscompiling. That was my entire point ;)

http://reviews.llvm.org/D8565

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/