[llvm-commits] [PATCH][FastMath, InstCombine] Fadd/Fsub optimizations
Eli Friedman
eli.friedman at gmail.com
Mon Dec 17 16:49:32 PST 2012
On Mon, Dec 17, 2012 at 4:34 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
> On 12/17/12 3:49 PM, Eli Friedman wrote:
>>
>> On Fri, Dec 14, 2012 at 10:45 AM, Shuxin Yang <shuxin.llvm at gmail.com>
>> wrote:
>>>
>>> I'd like to resurrect my previous change. See the attachment. It was
>>> revised per
>>> your requests. In addition to that, I change the Coeffficient class a bit
>>> such
>>> that the cost of default constructor is merely a 4-byte-store-zero. This
>>> cost
>>> renders it possible to "abuse" this data structure.
>>>
>>> As to the determinstic sorting, I have not yet figure out a cost
>>> effective
>>> way
>>> for that matter. Note that stable_sort() won't help as all Addends
>>> sharing
>>> the same symbolic value will be folded, we don't care their relative
>>> posittion
>>> before folding.
>>
>> We discussed this in person; it sounded like you have a viable solution.
>
> Yes
>
>
>
>> + bool IsFp;
>> +
>> + // True iff FpValBuf contains a instance of APFloat.
>> + bool BufHasFpVal;
>>
>> How are these different?
>
> "BufHasFPVal" indicate that char-buff is not an array of garbage, there is a
> APFloat instance over there,
> it needs to be destructed finally.
>
> Once BufHasFPVal is set, it won't be reset.
>
> IsFP could be flipped back and forth.
OK.
>> It doesn't look like you addressed my earlier comment about
>> FAddend::drillDownOneStep.
>
> Sorry. I will change the name.
>
>> + Value *V0 = I->getOperand(0);
>> + Value *V1 = I->getOperand(1);
>> + InstQuota = ((!isa<Constant>(V0) && V0->hasOneUse()) &&
>> + (!isa<Constant>(V1) && V1->hasOneUse())) ? 2 : 1;
>>
>> I'm not sure what your logic is here; can you add a comment to explain?
>
> Suppose the input instruction sequence is:
>
> V0 = ...
> V1 = ...
> this-instruction-to-be-simplified = V0 +/- V1 (or other variant).
>
> The output instruction sequence is:
> New1 =
> New2 = ..
>
> If the output is better than the input, it should save at least one
> instruction.
> In light of that rule, if both V0 and V1 have multiple use, the output
> sequence should have
> no more than one instruction.
>
> If neither V0 nor V1 has multi-use, we can afford two instructions in the
> simplfied instruction sequence.
>
> If only of the V0 and V1 has multi-use, the quota is one.
>
> If both V0 and V1 have multi-uses. the quota is one. In this case, we may
> not save instruction,
> but it doesn't make things worse.
>
> e.g.
> V0 = x + y (mult-use)
> V1 = x - y (multi-use)
> if "this instruction" is "R = V0 - V1", the quota for the output
> instruction is 1 instruction. So, we
> get the result "Y + Y", which doesn't reduce the overall instruction #, it
> does increase the # of instruction
> either.
By the same logic, why isn't the quota 3 for the case where nothing is
multi-use? Or are you concerned about the transformation falling into
an infinite loop?
>> +unsigned FAddend::drillDownOneStep
>> + (Value *Val, FAddend &Addend0, FAddend &Addend1) {
>> + Instruction *I = 0;
>> + if (Val == 0 || !(I = dyn_cast<Instruction>(Val)))
>> + return 0;
>>
>> I'm not sure what you're trying to do with this check.
>>
>> + // step 4: Try to optimize Opnd0 + Opnd1_0 [+ Opnd1_1]
>> + if (Opnd1_ExpNum) {
>> + AddendVect AllOpnds;
>> + AllOpnds.push_back(&Opnd0);
>> + AllOpnds.push_back(&Opnd1_0);
>> + if (Opnd1_ExpNum == 2)
>> + AllOpnds.push_back(&Opnd1_1);
>> +
>> + if (Value *R = simplifyFAdd(AllOpnds, 1))
>> + return R;
>> + }
>>
>> This seems redundant: why wouldn't it be handled in "step 3"? Maybe
>> the way drillDownOneStep works should be changed a little?
>>
>
> Nope.
> Consider instruction sequence:
> V0 = X + Y
> V1 = m - V0
> V2 = V1 + V0
>
> Step 3 tries to optimize : X + Y + m + (-1)*V0, which lead to nothing.
> Step 4 tires to optimize V0 + m + (-1)*V0, and get simplified expr
> "m".
Ah.
-Eli
More information about the llvm-commits
mailing list