[llvm-commits] [PATCH]IDIV->DIVB Atom Optimization

Evan Cheng evan.cheng at apple.com
Mon Sep 3 10:13:18 PDT 2012


Sorry, a few more nitpicks:

+  /// isSlowDivBypassed - Returns true if target has indicated at least one
+  /// type should be bypassed.
+  bool isSlowDivBypassed() const { return BypassSlowDivTypes.size() > 0; }

empty() is faster than size().

+static void insertFastDiv(Function &F,
+                          Function::iterator &I,
+                          BasicBlock::iterator &J,
+                          IntegerType *BypassType,
+                          bool UseDivOp,
+                          bool UseSignedOp,
+                          DivCacheTy &PerBBDivCache)
+{

Two issues. First, please include some function level comments. And second, the '{' should follow the ')' on previous line.

Please go ahead and commit (or have someone commit it for you) after you correct these.

Thanks,

Evan

On Aug 29, 2012, at 11:02 AM, Nowicki, Tyler wrote:

> I hid the implementation details as you suggested, here is the updated patch.
> 
> Tyler
> 
>> -----Original Message-----
>> From: Evan Cheng [mailto:evan.cheng at apple.com]
>> Sent: Monday, August 27, 2012 5:31 PM
>> To: Nowicki, Tyler
>> Cc: llvm-commits at cs.uiuc.edu
>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom Optimization
>> 
>> Sorry, I've been away.
>> 
>> The patch looks alright. However, I'm wondering if it is truly necessary to
>> expose the type definitions in BypassSlowDivision.h. They look like
>> implementation details that client of bypassSlowDivision() shouldn't need to
>> know about.
>> 
>> Evan
>> 
>> On Aug 27, 2012, at 9:17 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com>
>> wrote:
>> 
>>> When you get a chance would you please review the changes.
>>> 
>>> Thanks,
>>> 
>>> Tyler
>>> 
>>>> -----Original Message-----
>>>> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-
>>>> bounces at cs.uiuc.edu] On Behalf Of Nowicki, Tyler
>>>> Sent: Wednesday, August 15, 2012 3:03 PM
>>>> To: Evan Cheng
>>>> Cc: llvm-commits at cs.uiuc.edu
>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom Optimization
>>>> 
>>>> Thanks for the review.
>>>> 
>>>> I made the changes you suggested, here is the latest patch and commit
>>>> message.
>>>> 
>>>> Generic Bypass Slow Div
>>>> -	CodeGenPrepare pass for identifying div/rem ops
>>>> -	Backend specifies the type mapping using addBypassSlowDivType
>>>> -	Enabled only on Intel Atom with O2 for 32-bit -> 8-bit
>>>> -	Replace IDIV with instructions which test its value and use DIVB if the
>>>> value is positive and less than 256.
>>>> -	In the case when the quotient and remainder of a divide are used a
>>>> DIV and a REM instruction will be present in the IR. In the non-Atom
>>>> case they are both lowered to IDIVs and CSE removes the redundant
>>>> IDIV instruction, using the quotient and remainder from the first
>>>> IDIV. However, due to this optimization CSE is not able to eliminate
>>>> redundant IDIV instructions because they are located in different
>>>> basic blocks. This is overcome by calculating both the quotient (DIV)
>>>> and remainder (REM) in each basic block that is inserted by the
>>>> optimization and reusing the result values when a subsequent DIV or REM
>> instruction uses the same operands.
>>>> -	Test cases check for the presence of the optimization when
>>>> calculating either the quotient, remainder,  or both.
>>>> 
>>>> Tyler
>>>> 
>>>>> -----Original Message-----
>>>>> From: Evan Cheng [mailto:evan.cheng at apple.com]
>>>>> Sent: Tuesday, August 14, 2012 5:02 PM
>>>>> To: Nowicki, Tyler
>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom Optimization
>>>>> 
>>>>> +// This file contains the runtime instruction selection
>>>>> +optimization for signed // and unsigned integer division which
>>>>> +selects 8-bit unsigned divide over a // 32-bit signed or unsigned
>>>>> +divided when the dividend and divisor are positive // and less than
>>>>> +256. This is useful for architectures with slow dividers such // as
>>>>> +some models of Intel
>>>> Atom.
>>>>> 
>>>>> Can you make it more generic? Perhaps lower 64-bit div to 32-bit or
>>>>> 32 to 16, etc.? For that matter, we should make the target hook more
>> flexible.
>>>>> 
>>>>> +  if (TLI && TLI->isSmallerDivIsCheaper()) {
>>>>> 
>>>>> Rather than isSmallerDivIsCheaper(), perhaps a hook that take a type
>>>>> and return a smaller type to lower the div to?
>>>>> 
>>>>> +bool BypassSlowDivision(Function &F,
>>>>> +                     Function::iterator &I,
>>>>> +                     DivCacheTy &DivCache)
>>>>> 
>>>>> The first character should be lower case to match the new LLVM
>>>>> naming convention.
>>>>> 
>>>>> Evan
>>>>> 
>>>>> On Aug 13, 2012, at 7:35 AM, "Nowicki, Tyler"
>>>>> <tyler.nowicki at intel.com>
>>>>> wrote:
>>>>> 
>>>>>> My apologies, that was not an SVN patch. Here is the correct patch.
>>>>>> 
>>>>>> Tyler
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-
>>>>>>> bounces at cs.uiuc.edu] On Behalf Of Nowicki, Tyler
>>>>>>> Sent: Monday, August 13, 2012 10:21 AM
>>>>>>> To: Evan Cheng
>>>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom Optimization
>>>>>>> 
>>>>>>> Sorry for the delayed response. This sounds like the right place.
>>>>>>> I've attached a patch for a CodeGenPrep pass that bypasses slow
>>>>>>> division operations when the target indicates that smaller
>>>>>>> divisions are
>>>>> cheaper.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Tyler
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Evan Cheng [mailto:evan.cheng at apple.com]
>>>>>>>> Sent: Sunday, August 05, 2012 7:01 PM
>>>>>>>> To: Nowicki, Tyler
>>>>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom Optimization
>>>>>>>> 
>>>>>>>> Another possibility would be to make this transformation generic.
>>>>>>>> It might make sense to put it in CodeGenPrep under control of
>>>>>>>> some target
>>>>>>> hook.
>>>>>>>> CodeGenPrep is the place where IR is being modified before it
>>>>>>>> goes through codegen and it is allowed to do have some target
>>>>>>>> specific
>>>>>>> knowledge.
>>>>>>>> 
>>>>>>>> Evan
>>>>>>>> 
>>>>>>>> On Aug 5, 2012, at 11:10 AM, Evan Cheng wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Aug 3, 2012, at 3:30 PM, Nowicki, Tyler wrote:
>>>>>>>>> 
>>>>>>>>>> Let me clarify how this optimization works. Instead of
>>>>>>>>>> 
>>>>>>>>>> result = dividend/divisor;
>>>>>>>>>> 
>>>>>>>>>> an application programmer looking for optimal performance,
>>>>>>>>>> targeting
>>>>>>>> Atom CPUs, without this optimization would have to write:
>>>>>>>>>> 
>>>>>>>>>> if (((dividend | divisor) & 0xFF) < 0) { result =
>>>>>>>>>> (char)dividend / (char)divisor; //fast divide } else { result =
>>>>>>>>>> dividend / divisor; //slow divide }
>>>>>>>>> 
>>>>>>>>> Yes I understand. You want a simple division to be expanded into
>>>>>>>>> code that
>>>>>>>> involves multiple BB and control flows.
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> I still want the both the fast and slow divide to go through
>>>>>>>>>> normal isel. But
>>>>>>>> what you are suggesting significantly changes the way DIVs are
>>>>>>>> handled for Atom CPUs, specifically avoiding the UDIVREM/SDIVREM
>>>>> expansion!
>>>>>>>> Are you suggesting that I duplicate this code in a custom
>>>>>>>> inserter
>>>>> function?
>>>>>>>>> 
>>>>>>>>> No, you would custom lower sdiv / udiv / srem / urem nodes so
>>>>>>>>> UDIVREM /
>>>>>>>> SDIVREM code will never be formed.  The custom inserter function
>>>>>>>> create MachineInstr, not SDNodes.
>>>>>>>>> 
>>>>>>>>> Evan
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Tyler
>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Evan Cheng [mailto:evan.cheng at apple.com]
>>>>>>>>>>> Sent: Friday, August 03, 2012 6:07 PM
>>>>>>>>>>> To: Nowicki, Tyler
>>>>>>>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>>>>>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom
>>>>>>>>>>> Optimization
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Aug 3, 2012, at 2:28 PM, Nowicki, Tyler wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> This optimization should not be classified as `instruction
>>>>>>>>>>>> selection'. We're not
>>>>>>>>>>> simply replacing one form of divide with another at
>>>>>>>>>>> compilation time. The optimization adds control flow to use an
>>>>>>>>>>> 8-bit DIV when the values of the operands are determined to be
>>>>>>>>>>> small enough during
>>>>>>>> program execution.
>>>>>>>>>>> 
>>>>>>>>>>> It is instruction selection. This is about selecting the
>>>>>>>>>>> optimal code sequence to implement an operation.
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Several approaches were looked at for this optimization and
>>>>>>>>>>>> the IR pass was
>>>>>>>>>>> found to be the best way to accomplish it. Here are the
>>>>>>>>>>> approaches that we looked at and the reasons they are not
>>>>>>>>>>> suitable for this
>>>>>>>> optimization.
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. Using a custom lowering function Specifying a custom
>>>>>>>>>>>> lowering function for ISD::UDIV/UREM/etc instructions
>>>>>>>>>>> using setOperationAction(...) in X86ISelLowering. However, the
>>>>>>>>>>> lowering functions are limited to using a SelectionDAG.
>>>>>>>>>>> Because the optimization requires several new BBs to be added
>>>>>>>>>>> it is not possible to add it here. Is this the place you were thinking
>> of?
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. Using a custom inserter
>>>>>>>>>>>> Marking appropriate instructions with usesCustomInserter in
>>>>>>>>>>>> tablegen and
>>>>>>>>>>> implementing the optimization by inserting several new
>>>>>>>>>>> MachineBasicBlocks. As said previously ISD::UDIVREM/SDIVREM
>>>> are
>>>>>>>>>>> expanded into multiple instructions, see X86ISelDAGToDAG:2301.
>>>>>>>>>>> Adding it here would require the inserted instructions to be
>>>>>>>>>>> identified, removed, and a similar set of instructions to be
>>>>>>>>>>> added for the fast and slow DIV cases. However, this cannot be
>>>>>>>>>>> easily done because there are no guarantees how
>>>>> UDIVREM/SDIVREM
>>>>>>>>>>> will be
>>>>>>>> expanded
>>>>>>>>>>> or in what order, and inserting the fast and slow DIVs here
>>>>>>>>>>> would
>>>>>>>> essentially require duplicating the UDIVREM/SDIVREM expansion
>> code.
>>>>>>>>>>> 
>>>>>>>>>>> You want to use combination of 1 and 2. Custom lower into a
>>>>>>>>>>> target specific node which will map to a pseudo instruction
>>>>>>>>>>> marked with
>>>>>>>> usesCustomInserter.
>>>>>>>>>>> That allows you to create new BBs. There are lots of examples,
>> e.g.
>>>>>>>> atomic ops.
>>>>>>>>>>> 
>>>>>>>>>>> Evan
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 3. Other approaches
>>>>>>>>>>>> We want the UDIVREM/SDIVREM expansion to operate on the
>>>>>>> original
>>>>>>>>>>>> DIV
>>>>>>>>>>> instruction and the DIV this optimization inserts so it seems
>>>>>>>>>>> like this optimization could be implemented in
>>>>>>>>>>> X86DAGToDAGISel::PreprocessISelDAG. Can we add control flow
>> at
>>>>>>>>>>> this
>>>>>>>> point?
>>>>>>>>>>>> 
>>>>>>>>>>>> Tyler Nowicki
>>>>>>>>>>>> Intel
>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Evan Cheng [mailto:evan.cheng at apple.com]
>>>>>>>>>>>>> Sent: Wednesday, August 01, 2012 4:38 PM
>>>>>>>>>>>>> To: Nowicki, Tyler
>>>>>>>>>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom
>>>>> Optimization
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Aug 1, 2012, at 11:58 AM, "Nowicki, Tyler"
>>>>>>>>>>>>> <tyler.nowicki at intel.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I don't understand what you are suggesting.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> ISD::SDIV, ISD::SREM and ISD::UDIV, ISD::UREM are
>> combined
>>>>>>>>>>>>>> to form
>>>>>>>>>>>>> ISD::SDIVREM, and ISD::UDIVREM. These are handled by
>>>>>>>>>>>>> X86ISelDAGToDAG:2301 which takes care of all the special
>>>>>>>>>>>>> cases by inserting the appropriate movs, cdq/cwd, and
>>>>>>>>>>>>> X86::DIV or X86::IDIV instructions. A custom inserter is
>>>>>>>>>>>>> invoked after this, but it's too late because the div/rem
>>>>>>>>>>>>> operations have been replaced by numerous
>>>>>>>>>>> instructions.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Looks like you don't understand how LLVM instruction works.
>>>>>>>>>>>>> Please spend some time looking at the passes in isel,
>>>>>>>>>>>>> especially how targets do custom lowering. Basically you
>>>>>>>>>>>>> want to lower these ISD nodes before they reach
>> X86ISelDAGToDAG.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> What is wrong with doing this in a backend IR pass?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Because this is a target specific instruction selection
>>>>>>>>>>>>> issue and it doesn't belong in LLVM IR level. We don't want
>>>>>>>>>>>>> arbitrary IR passes to do bits and pieces of optimization.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Evan
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Tyler
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: Evan Cheng [mailto:evan.cheng at apple.com]
>>>>>>>>>>>>>>> Sent: Wednesday, August 01, 2012 2:18 PM
>>>>>>>>>>>>>>> To: Nowicki, Tyler
>>>>>>>>>>>>>>> Cc: llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom
>>>>>>>>>>>>>>> Optimization
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Aug 1, 2012, at 9:18 AM, "Nowicki, Tyler"
>>>>>>>>>>>>>>> <tyler.nowicki at intel.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks for the reply.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Lower is too late because DIV gets expanded into 2 MOV, a
>>>>>>>>>>>>>>>> CDQ or CWD, the DIV or IDIV, and one or two more MOVs
>>>>> during
>>>>>>>>>>>>>>>> instruction selection on DAG conversion in
>>>> X86ISelDAGToDAG.
>>>>>>>>>>>>>>>> Adding
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Not during X86ISelDAGToDAG. You should do this using
>>>> custom
>>>>>>>>>>>>>>> lowering in X86ISelLowering. At that time, it just looks
>>>>>>>>>>>>>>> like a ISD::SDIV /
>>>>>>>>>>> UDIV node.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> this optimization after would require a custom inserter
>>>>>>>>>>>>>>>> to search and remove
>>>>>>>>>>>>>>> these instructions, and generate new BBs. This is
>>>>>>>>>>>>>>> complicated by
>>>>>>>>>>>>>>> O2 optimization which may remove MOVs when possible, by
>>>> the
>>>>>>>>>>>>>>> scheduler on Atom which changes the order of instructions
>>>>>>>>>>>>>>> to avoid stalls, and by which form of the DIV instruction
>>>>>>>>>>>>>>> is
>>>> used.
>>>>>>>>>>>>>>> Also, depending on the spills and neighboring instructions
>>>>>>>>>>>>>>> it is difficult to enumerate all the possible ways this
>>>>>>>>>>>>>>> optimization can be
>>>>>>>> carried out.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The right thing to do is to custom lower into a X86ISD
>>>>>>>>>>>>>>> target node and select to a pseudo instruction. It can be
>>>>>>>>>>>>>>> marked with usesCustomInserter = 1 which allows you to
>>>>>>>>>>>>>>> expand it into the machine
>>>>>>>>>>>>> instructions and create new BBs if needed.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> It makes sense to do the optimization as an IR pass
>>>>>>>>>>>>>>>> because none of this has
>>>>>>>>>>>>>>> happened yet. Then CSE and scheduling can operate on the
>>>>>>>>>>>>>>> new
>>>>>>>> BBs.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Scheduling doesn't happen until later and machine CSE can
>>>>>>>>>>>>>>> handle these instructions. Please look at some examples to
>>>>>>>>>>>>>>> learn more about how existing targets handle complex
>> cases.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Evan
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Tyler
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>> From: Evan Cheng [mailto:evan.cheng at apple.com]
>>>>>>>>>>>>>>>>> Sent: Tuesday, July 31, 2012 6:41 PM
>>>>>>>>>>>>>>>>> To: Nowicki, Tyler
>>>>>>>>>>>>>>>>> Cc: Eli Friedman; llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>>>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom
>>>>>>>>>>>>>>>>> Optimization
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi Tyler,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> This doesn't look like the right approach. I don't
>>>>>>>>>>>>>>>>> understand why you are adding a LLVM ir pass to do the
>>>>>>>>>>>>>>>>> expansion. Why can't this be done at x86 instruction
>>>>>>>>>>>>>>>>> lowering
>>>>>>> time?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Evan
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Jul 31, 2012, at 1:25 PM, "Nowicki, Tyler"
>>>>>>>>>>>>>>>>> <tyler.nowicki at intel.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks for the review. Good catch on the missing check,
>>>>>>>>>>>>>>>>>> I've also made use
>>>>>>>>>>>>>>>>> of DenseMap as suggested. Test cases omitted from the
>>>>>>>>>>>>>>>>> previously patch are in place again and a test case for
>>>>>>>>>>>>>>>>> the missing check you found has been added.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Please review my corrections.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Tyler Nowicki
>>>>>>>>>>>>>>>>>> Intel
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>> From: Eli Friedman [mailto:eli.friedman at gmail.com]
>>>>>>>>>>>>>>>>>>> Sent: Friday, July 27, 2012 2:17 PM
>>>>>>>>>>>>>>>>>>> To: Nowicki, Tyler
>>>>>>>>>>>>>>>>>>> Cc: Sean Silva; llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>>>>>>>> Subject: Re: [llvm-commits] [PATCH]IDIV->DIVB Atom
>>>>>>>>>>>>>>>>>>> Optimization
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Fri, Jul 27, 2012 at 8:35 AM, Nowicki, Tyler
>>>>>>>>>>>>>>>>>>> <tyler.nowicki at intel.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> This patch has not yet been reviewed. Could someone
>>>>> take
>>>>>>>>>>>>>>>>>>>> a look at
>>>>>>>>>>>>> it?
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> +    typedef std::pair<Value *, Value *> DivBOperands;
>>>>>>>>>>>>>>>>>>> + //Dividend
>>>>>>>>>>>>>>>>>>> Value, Divisor Value
>>>>>>>>>>>>>>>>>>> +    typedef std::pair<PHINode *, PHINode *>
>>>>>>>>>>>>>>>>>>> + DivBPhiNodes; //Quotient
>>>>>>>>>>>>>>>>>>> PHI, Remainder Phi
>>>>>>>>>>>>>>>>>>> +    typedef std::map<DivBOperands, DivBPhiNodes>
>>>>>>>>>>>>>>>>>>> + DivBCacheTy;
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> llvm::DenseMap?
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> +  // Replace operation value with previously
>>>>>>>>>>>>>>>>>>> + generated phi node DivBPhiNodes Value =
>>>>>>>>>>>>>>>>>>> + CacheI->second;  if
>>>>>>> (UseDivOp) {
>>>>>>>>>>>>>>>>>>> +    // Replace all uses of div instruction with
>>>>>>>>>>>>>>>>>>> + quotient phi
>>>>>>> node
>>>>>>>>>>>>>>>>>>> +    J->replaceAllUsesWith(Value.first);
>>>>>>>>>>>>>>>>>>> +  } else {
>>>>>>>>>>>>>>>>>>> +    // Replace all uses of rem instruction with
>>>>>>>>>>>>>>>>>>> + remainder phi
>>>>>>>> node
>>>>>>>>>>>>>>>>>>> +    J->replaceAllUsesWith(Value.second);
>>>>>>>>>>>>>>>>>>> +  }
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Looks like there's a missing check for UseSignedOp
>> here.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> The patch doesn't appear to include any testcases.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> -Eli
>>>>>>>>>>>>>>>>>> <idivtodivb-
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>> svn.patch>_______________________________________________
>>>>>>>>>>>>>>>>>> llvm-commits mailing list llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> _______________________________________________
>>>>>>>>> llvm-commits mailing list
>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>> 
>>>>>> <BypassSlowDiv5-svn.patch>
>>> 
> 
> <GenericBypassSlowDiv3-svn.patch>




More information about the llvm-commits mailing list