[PATCH] [PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations

Bill Schmidt wschmidt at linux.vnet.ibm.com
Fri Apr 10 11:49:05 PDT 2015


Hal, thanks for the review!  Responses to your comments are inline.  Please let me know if I'm off base on any of them.  I'll work on a new version shortly.


REPOSITORY
  rL LLVM

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:21
@@ +20,3 @@
+// The extra xxswapd instructions reduce performance.  This can be
+// particularly bad for vectorized code.  The purpose of this pass
+// is to reduce the number of xxswapd instructions required for
----------------
hfinkel wrote:
> What does "particularly bad for vectorized code" mean? Doesn't this exclusively apply to vectorized code?
> 
I suppose I had meant "autovectorized code" at the time, because the user has no way to anticipate these swaps are going to show up out of nowhere and reduce performance in a performance-enhancing optimization.  However, the sentence really adds nothing, so I'll remove it. ;)

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:126
@@ +125,3 @@
+  // Walk the machine instructions to gather vector usage information.
+  // Return true iff vector mentions are present.
+  bool gatherVectorMentions();
----------------
hfinkel wrote:
> Why don't we just call these "vector instructions"? mentions sounds stilted to me.
That's fine.  "Mentions" is a more or less standard term for defs U kills U uses.  The important thing is to get anything that mentions a vector register, including things that just copy into or out of them.  But in our current ISA, everything that lives in the Category:Altivec and Category:VSX fits this description, so "vector instructions" is equivalent.

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:167
@@ +166,3 @@
+  // Return true iff the given register is a full vector register.
+  bool isVecReg(unsigned Reg) {
+    return (isRegInClass(Reg, &PPC::VSRCRegClass) ||
----------------
hfinkel wrote:
> As a general note, this pass does not handle the case where a register has a subregister index. This should be mostly irrelevant here (we probably don't generate them for these kinds of instructions), but we also don't want to mishandle them should they occur.
> 
> (maybe also give up in lookThruCopyLike should you run into a register with a subreg index too).
I don't believe any special handling is necessary for the current patch.  If a subregister mention is connected, directly or indirectly, to a full register load or store, it will have to be done via an EXTRACT_SUBREG, INSERT_SUBREG, or SUBREG_TO_REG.  But these are operations that kill the optimization anyway (except SUBREG_TO_REG for full 128-bit copies).

We don't want to exclude them from the analysis, because we can handle EXTRACT_SUBREG and INSERT_SUBREG by adjusting the subregister number to account for the doubleword swap, and dealing with SUBREG_TO_REG accordingly as well.  I plan to incorporate this into a future patch.

However, there may be subtleties in how subregs are handled in LLVM that I'm not familiar with, so please let me know if I'm missing something.

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:224
@@ +223,3 @@
+
+  for (MachineFunction::iterator I = MF->begin(); I != MF->end();) {
+    MachineBasicBlock &MBB = *I++;
----------------
hfinkel wrote:
> range-based for?
OK -- I'll rework this and the ones below.

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:284
@@ +283,3 @@
+        // handling for this in the future.
+        SwapVector[VecIdx].IsLoad = 1;
+        break;
----------------
hfinkel wrote:
> Where to we actually reject webs with non-swapping load/stores?
> 
The fact that these are not marked as either IsSwap or IsSwappable causes them to be rejected.  See lines 537-538.

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:307
@@ +306,3 @@
+        // These are fine provided they are moving between full vector
+        // register classes.
+        if (isVecReg(MI->getOperand(0).getReg()) &&
----------------
hfinkel wrote:
> We should add a comment here explaining that this happens because some of the 128-bit VSX registers have 128-bit Altivec sub-registers.
OK.

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:333
@@ +332,3 @@
+      // FIXME: Is there a way we could specify this as a flag on the
+      // instruction in the tblgen description?  That would seem
+      // a better approach for future maintenance, but I don't
----------------
hfinkel wrote:
> Yes, that would be better. If nothing else, we should be able to use the InstrMapping facility to generate a lookup table (this is how we currently keep track of which instructions are record forms, and the mapping between record-form and the non-record-form variants).
> 
I'll look at that, thanks.  There didn't seem to be a way to just create a simple flag that I could see, which would have been nice...

================
Comment at: lib/Target/PowerPC/PPCVSXSwapRemoval.cpp:459
@@ +458,3 @@
+    CopySrcReg = MI->getOperand(1).getReg();
+  else // isSubregToReg()
+    CopySrcReg = MI->getOperand(2).getReg();
----------------
hfinkel wrote:
> add an assert that isSubregToReg() is true.
OK.

http://reviews.llvm.org/D8565

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list