[PATCH, RFC] DAG postprocessing phase for PPC

Wed Feb 20 14:30:11 PST 2013

----- Original Message -----
> From: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
> To: llvm-commits at cs.uiuc.edu, hfinkel at anl.gov
> Sent: Monday, February 18, 2013 4:00:00 PM
> Subject: [PATCH, RFC]  DAG postprocessing phase for PPC
> 
> This patch implements the PPCDAGToDAGISel::PostprocessISelDAG virtual
> method to perform post-selection peephole optimizations on the DAG
> representation.
> 
> One optimization is implemented here:  folds to clean up complex
> addressing expressions for thread-local storage and medium code
> model.  It will also be useful for large code model sequences when
> those are added later.  I originally thought about doing this on the
> MI representation prior to register assignment, but it's difficult to
> do effective global dead code elimination at that point.  DCE is
> trivial on the DAG representation.
> 
> A typical example of a candidate code sequence in assembly:
> 
>    addis 3, 2, globalvar@toc@ha
>    addi  3, 3, globalvar@toc@l
>    lwz   5, 0(3)
> 
> When the final instruction is a load or store with an immediate offset
> of zero, the offset from the add-immediate can replace the zero,
> provided the relocation information is carried along:
> 
>    addis 3, 2, globalvar@toc@ha
>    lwz   5, globalvar@toc@l(3)
> 
> Since the addi can in general have multiple uses, we must delete the
> instruction only when its last use is removed.
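
To make the use-count point concrete, here is a toy sketch of the fold. It is not the patch's code and does not use SelectionDAG at all: the Node struct, its fields, and foldAddiIntoLoads are hypothetical names invented for illustration, and relocation expressions are modeled as plain strings. It only shows the bookkeeping: each zero-offset load takes over the addi's displacement and base, and the addi is marked dead once its last user is gone.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model (hypothetical, not LLVM): one entry per instruction.
struct Node {
  std::string Opcode; // "addis", "addi", "lwz", ...
  std::string Disp;   // "0" or a relocation expression such as "sym@toc@l"
  int Base;           // index of the node supplying the base register, or -1
  int NumUses;        // number of nodes that read this node's result
  bool Dead;          // set once the node has no remaining users
};

// Fold an addi's displacement into each zero-offset load that uses it.
// The addi is marked dead only when its use count drops to zero.
void foldAddiIntoLoads(std::vector<Node> &Nodes) {
  for (Node &N : Nodes) {
    if (N.Opcode != "lwz" || N.Disp != "0" || N.Base < 0)
      continue;
    Node &Addi = Nodes[N.Base];
    if (Addi.Opcode != "addi" || Addi.Dead || Addi.Base < 0)
      continue;

    N.Disp = Addi.Disp;          // carry the relocation along
    N.Base = Addi.Base;          // read the addis result directly
    ++Nodes[Addi.Base].NumUses;  // the load is now a user of the addis

    if (--Addi.NumUses == 0) {   // last use removed: the addi is dead
      Addi.Dead = true;
      --Nodes[Addi.Base].NumUses; // a dead addi no longer uses the addis
    }
  }
}
```

With one addis feeding one addi that in turn feeds two zero-offset loads, the first fold leaves the addi alive (one use remains) and the second fold kills it. If the addi also had a non-load user, it would survive both folds.
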
> 
> I'm planning to add another fold shortly that was recently disabled
> in PPCISelLowering.cpp because it was undone by constant folding
> (lowering BUILD_VECTOR into an add of two identical splats).  The DAG
> representation is the right place for the fold, so long as it occurs
> after the constant folding that currently makes it moot.
> 
> I reorganized the target flags in PPC.h to be more useful (flag 2 was
> unused, so I shifted the flags over to make an extra bit available in
> the field used for distinct values).  I think some of the other bits
> used as "flags" should be reconsidered in the future, as some of them
> appear to be mutually exclusive and the encoding could be more
> efficient.
> 
> I'm posting this as an RFC in case there are objections to doing
> this in a new pass, suggestions for a better place to do this, etc.
> If there are no objections, I'll commit it shortly.

This looks good to me.

Can you split this into two commits: one with the changes to PPCMCInstLower.cpp, PPCELFObjectWriter.cpp, PPCInstrInfo.td and PPC.h and a second commit with the peephole logic and associated tests?

Also, purely to assist with my education, can you add a more detailed description of what is going on here:
+    SDValue ImmOpnd = Base.getOperand(1);
+
+    if (ReplaceFlags) {
+      GlobalAddressSDNode *GA = dyn_cast<GlobalAddressSDNode>(ImmOpnd);
+
+      if (GA) {
+        DebugLoc dl = GA->getDebugLoc();
+        const GlobalValue *GV = GA->getGlobal();
+        ImmOpnd = CurDAG->getTargetGlobalAddress(GV, dl, MVT::i64, 0, Flags);
+      } else {
+        ConstantPoolSDNode *CP = dyn_cast<ConstantPoolSDNode>(ImmOpnd);
+        if (CP) {
+          const Constant *C = CP->getConstVal();
+          ImmOpnd = CurDAG->getTargetConstantPool(C, MVT::i64,
+                                                  CP->getAlignment(),
+                                                  0, Flags);
+        }
+      }
+    }
and how this differs from using the original operand value? I see this comment above it:
+      // In some cases (such as TLS) the relocation information
+      // is already in place.
+      ReplaceFlags = false;
but I don't understand exactly what this means.

Thanks again,
Hal

> 
> Thanks!
> Bill
> --
> Bill Schmidt, Ph.D.
> IBM Advance Toolchain for PowerLinux
> IBM Linux Technology Center
> wschmidt at us.ibm.com
> wschmidt at linux.vnet.ibm.com