[llvm-commits] [PATCH]IDIV->DIVB Atom Optimization

Wed Aug 1 11:58:49 PDT 2012

I don't understand what you are suggesting. 

ISD::SDIV, ISD::SREM and ISD::UDIV, ISD::UREM are combined to form ISD::SDIVREM, and ISD::UDIVREM. These are handled by X86ISelDAGToDAG:2301 which takes care of all the special cases by inserting the appropriate movs, cdq/cwd, and X86::DIV or X86::IDIV instructions. A custom inserter is invoked after this, but it's too late because the div/rem operations have been replaced by numerous instructions.

What is wrong with doing this in a backend IR pass?

Tyler

> -----Original Message-----
> From: Evan Cheng [mailto:evan.cheng at apple.com]
> Sent: Wednesday, August 01, 2012 2:18 PM
> To: Nowicki, Tyler
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom Optimization
> 
> 
> On Aug 1, 2012, at 9:18 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com>
> wrote:
> 
> > Thanks for the reply.
> >
> > Lower is too late because DIV gets expanded into 2 MOV, a CDQ or CWD,
> > the DIV or IDIV, and one or two more MOVs during instruction selection
> > on DAG conversion in X86ISelDAGToDAG. Adding
> 
> Not during X86ISelDAGToDAG. You should do this using custom lowering in
> X86ISelLowering. At that time, it just looks like a ISD::SDIV / UDIV node.
> 
> > this optimization after would require a custom inserter to search and remove
> these instructions, and generate new BBs. This is complicated by O2
> optimization which may remove MOVs when possible, by the scheduler on
> Atom which changes the order of instructions to avoid stalls, and by which form
> of the DIV instruction is used. Also, depending on the spills and neighboring
> instructions it is difficult to enumerate all the possible ways this optimization can
> be carried out.
> 
> The right thing to do is to custom lower into a X86ISD target node and select to a
> pseudo instruction. It can be marked with usesCustomInserter = 1 which allows
> you to expand it into the machine instructions and create new BBs if needed.
> 
> >
> > It makes sense to do the optimization as an IR pass because none of this has
> happened yet. Then CSE and scheduling can operate on the new BBs.
> 
> Scheduling doesn't happen until later and machine CSE can handle these
> instructions. Please look at some examples to learn more about how existing
> targets handle complex cases.
> 
> Evan
> 
> >
> > Thanks,
> >
> > Tyler
> >
> >> -----Original Message-----
> >> From: Evan Cheng [mailto:evan.cheng at apple.com]
> >> Sent: Tuesday, July 31, 2012 6:41 PM
> >> To: Nowicki, Tyler
> >> Cc: Eli Friedman; llvm-commits at cs.uiuc.edu
> >> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom Optimization
> >>
> >> Hi Tyler,
> >>
> >> This doesn't look like the right approach. I don't understand why you
> >> are adding a LLVM ir pass to do the expansion. Why can't this be done
> >> at x86 instruction lowering time?
> >>
> >> Evan
> >>
> >> On Jul 31, 2012, at 1:25 PM, "Nowicki, Tyler"
> >> <tyler.nowicki at intel.com>
> >> wrote:
> >>
> >>> Thanks for the review. Good catch on the missing check, I've also
> >>> made use
> >> of DenseMap as suggested. Test cases omitted from the previously
> >> patch are in place again and a test case for the missing check you
> >> found has been added.
> >>>
> >>> Please review my corrections.
> >>>
> >>> Tyler Nowicki
> >>> Intel
> >>>
> >>>> -----Original Message-----
> >>>> From: Eli Friedman [mailto:eli.friedman at gmail.com]
> >>>> Sent: Friday, July 27, 2012 2:17 PM
> >>>> To: Nowicki, Tyler
> >>>> Cc: Sean Silva; llvm-commits at cs.uiuc.edu
> >>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom Optimization
> >>>>
> >>>> On Fri, Jul 27, 2012 at 8:35 AM, Nowicki, Tyler
> >>>> <tyler.nowicki at intel.com>
> >>>> wrote:
> >>>>> This patch has not yet been reviewed. Could someone take a look at it?
> >>>>
> >>>> +    typedef std::pair<Value *, Value *> DivBOperands; //Dividend
> >>>> Value, Divisor Value
> >>>> +    typedef std::pair<PHINode *, PHINode *> DivBPhiNodes;
> >>>> + //Quotient
> >>>> PHI, Remainder Phi
> >>>> +    typedef std::map<DivBOperands, DivBPhiNodes> DivBCacheTy;
> >>>>
> >>>> llvm::DenseMap?
> >>>>
> >>>> +  // Replace operation value with previously generated phi node
> >>>> + DivBPhiNodes Value = CacheI->second;  if (UseDivOp) {
> >>>> +    // Replace all uses of div instruction with quotient phi node
> >>>> +    J->replaceAllUsesWith(Value.first);
> >>>> +  } else {
> >>>> +    // Replace all uses of rem instruction with remainder phi node
> >>>> +    J->replaceAllUsesWith(Value.second);
> >>>> +  }
> >>>>
> >>>> Looks like there's a missing check for UseSignedOp here.
> >>>>
> >>>> The patch doesn't appear to include any testcases.
> >>>>
> >>>> -Eli
> >>> <idivtodivb-
> >> svn.patch>_______________________________________________
> >>> llvm-commits mailing list
> >>> llvm-commits at cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >