[PATCH] D22632: Wrong code generation for VSX floating-point loads with fast-isel

Thu Jul 21 10:02:30 PDT 2016

uweigand created this revision.
uweigand added reviewers: hfinkel, kbarton.
uweigand added a subscriber: llvm-commits.
Herald added a subscriber: nemanjai.

See https://llvm.org/bugs/show_bug.cgi?id=28630 for a description of the problem.

There were two locations where fast-isel would generate a LFD instruction with a target register class VSFRC instead of F8RC when VSX was enabled.  The first is PPCFastISel::PPCEmitLoad, which had multiple problems:

1.) Typo in the definition of is64VSXLoad:
  bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS;
  bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD;
The second line needs to use isVSFRC (like PPCEmitStore does)

2.) Move creation of resultReg.  Even if the above typo is fixed, IsVSFRC would often be wrong, because it is computed from resultReg -- but in many cases, resultReg was still zero at this point.

3.) Once both the above are fixed, we're now generating a VSX instruction -- but an incorrect one, since generation of an indexed instruction with null index is wrong.  Fixed by copying the code handling the same issue in PPCEmitStore.

The second place is PPCFastISel::PPCMaterializeFP, where we would emit an LFD to load a constant from the literal pool, and use the wrong result register class.  Fixed by hardcoding a F8RC class even on systems supporting VSX.


https://reviews.llvm.org/D22632

Files:
  lib/Target/PowerPC/PPCFastISel.cpp
  test/CodeGen/PowerPC/pr28630.ll

Index: test/CodeGen/PowerPC/pr28630.ll
===================================================================

--- test/CodeGen/PowerPC/pr28630.ll
+++ test/CodeGen/PowerPC/pr28630.ll
@@ -0,0 +1,13 @@
+; RUN: llc -mtriple=powerpc64le-unknown-linux-gnu -verify-machineinstrs -O0 < %s
+
+define double @test() {
+  ret double 1.000000e+00
+}
+
+ at g = common global double 0.000000e+00, align 8
+
+define double @testitd() {
+  %g = load double, double* @g, align 8
+  ret double %g
+}
+
Index: lib/Target/PowerPC/PPCFastISel.cpp
===================================================================
--- lib/Target/PowerPC/PPCFastISel.cpp
+++ lib/Target/PowerPC/PPCFastISel.cpp
@@ -519,21 +519,21 @@
   unsigned IndexReg = 0;
   PPCSimplifyAddress(Addr, UseOffset, IndexReg);
 
+  if (ResultReg == 0)
+    ResultReg = createResultReg(UseRC);
+
   // If this is a potential VSX load with an offset of 0, a VSX indexed load can
   // be used.
   bool IsVSSRC = (ResultReg != 0) && isVSSRCRegister(ResultReg);
   bool IsVSFRC = (ResultReg != 0) && isVSFRCRegister(ResultReg);
   bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS;
-  bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD;
+  bool Is64VSXLoad = IsVSFRC && Opc == PPC::LFD;
   if ((Is32VSXLoad || Is64VSXLoad) &&
       (Addr.BaseType != Address::FrameIndexBase) && UseOffset &&
       (Addr.Offset == 0)) {
     UseOffset = false;
   }
 
-  if (ResultReg == 0)
-    ResultReg = createResultReg(UseRC);
-
   // Note: If we still have a frame index here, we know the offset is
   // in range, as otherwise PPCSimplifyAddress would have converted it
   // into a RegBase.
@@ -579,8 +579,18 @@
       case PPC::LFS:    Opc = IsVSSRC ? PPC::LXSSPX : PPC::LFSX; break;
       case PPC::LFD:    Opc = IsVSFRC ? PPC::LXSDX : PPC::LFDX; break;
     }
-    BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(Opc), ResultReg)
-      .addReg(Addr.Base.Reg).addReg(IndexReg);
+
+    auto MIB = BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc,
+		       TII.get(Opc), ResultReg);
+
+    // If we have an index register defined we use it in the store inst,
+    // otherwise we use X0 as base as it makes the vector instructions to
+    // use zero in the computation of the effective address regardless the
+    // content of the register.
+    if (IndexReg)
+      MIB.addReg(Addr.Base.Reg).addReg(IndexReg);
+    else
+      MIB.addReg(PPC::ZERO8).addReg(Addr.Base.Reg);
   }
 
   return true;
@@ -1907,7 +1917,9 @@
   unsigned Align = DL.getPrefTypeAlignment(CFP->getType());
   assert(Align > 0 && "Unexpectedly missing alignment information!");
   unsigned Idx = MCP.getConstantPoolIndex(cast<Constant>(CFP), Align);
-  unsigned DestReg = createResultReg(TLI.getRegClassFor(VT));
+  const TargetRegisterClass *RC =
+    (VT == MVT::f32) ? &PPC::F4RCRegClass : &PPC::F8RCRegClass;
+  unsigned DestReg = createResultReg(RC);
   CodeModel::Model CModel = TM.getCodeModel();
 
   MachineMemOperand *MMO = FuncInfo.MF->getMachineMemOperand(


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D22632.64917.patch
Type: text/x-patch
Size: 2974 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160721/a1f4ba13/attachment.bin>