[llvm-commits] [PATCH][FastMath, InstCombine] Fadd/Fsub optimizations
Shuxin Yang
shuxin.llvm at gmail.com
Wed Dec 12 16:21:56 PST 2012
>> Please advise what I should do next. I personally think it is very difficult
>> to significantly reduce the is.patch.
>> If you believe the *.patch has lots of room to improve in terms of
>> "complexity", could you please pinpoint
>> the code so that I can focus on it.
>>
>> IMHO, I believe pattern matching would only make things even worse.
> (Please read this part carefully; sorry it took me longer than I
> should have to figure this out.) I think it's worth comparing this
> patch to what we do for integer operations. For integer operations,
> we handle the transformations you cited with a combination of
> InstCombiner::SimplifyAssociativeOrCommutative,
> InstCombiner::SimplifyUsingDistributiveLaws, and a few simple
> peepholes which I'm pretty sure are already implemented for fadd.
> Would it be possible to extend those methods instead of implementing
> new infrastructure for floating-point?
I need to re-read these two functions one more time and get back to you.
After I modify Instruction::isAssociative(), these two functions can pick up
some opportunities, but not all that we need.
One weak argument I have right now to differentiate the int and fp
implementations is that FP is way more complicated. If we try to solve int
and fp in a uniform way, we have to slow the int side down a bit.
On the other hand, fp optimization has its own unique features. For instance,
if I optimize 4.0 * x - 3.0 * x => 1.0 * x, we are better off demoting the
1.0 into the integer 1 to expose more opportunities.
Int optimization doesn't need to handle such a situation.
>
> Some more detailed comments below, if you decide to continue in the
> direction of this patch.
>
> + /// This functor works with std::sort to permute addends such that those
> + /// having same symbolic-value are clustered together.
> + struct FAddendCmp {
> + bool operator()(const FAddend *A1, const FAddend *A2) {
> + return A1->getSymVal() < A2->getSymVal();
> + }
> + };
>
> Doing relational comparisons on pointers is generally a bad idea; in
> this instance, it looks like this leads to the code generating the
> simplified addition in random order.
True. I will figure out a way to make it more deterministic.
>
> More generally, it seems like you should have some sort of unified
> approach for the order in which the operations are generated, to
> encourage further simplifications. See
> InstCombiner::SimplifyAssociativeOrCommutative, getComplexity in
> InstCombine.h, etc.
>
> + if (isInt && That.isInt) {
> + int Res = IntVal * (int)That.IntVal;
> + assert(!InsaneIntVal(Res) && "Insane int value");
>
> What prevents you from hitting this assertion? Is this just because
> it treats x * 2.0 differently from x+x? FAddendCoef doesn't really
> seem to reflect that properly.
For now, I don't try to demote an FP coefficient (say 2.0) to an integer
value. Given that there are no more than 4 addends involved in the
optimization, the *INT* coefficient should fall in [-4, 4].
>
> + // The constructor has to initialize a APFloat, which is uncessary for
> + // most addends which have coefficient either 1 or -1. So, the constructor
> + // is expensive. In order to avoid the cost of the constructor, we should
> + // reuse some instances whenever possible. The pre-created instances
> + // FAddCombine::Add[0-5] embodies this idea.
>
> Outdated comment?
Outdated for is.patch. Valid for was.patch.
> + if ((C0 = dyn_cast<ConstantFP>(Opnd0)) && C0->isZero())
> + Opnd0 = 0;
>
> isZero() only matches 0.0, which is probably not what you want here.
As far as I can tell, ConstantFP::isZero() calls APFloat::isZero(), which
returns true if the value is +/- 0.0.
>
> + if (I->getOpcode() == Instruction::FMul) {
> + Value *V0 = I->getOperand(0);
> + Value *V1 = I->getOperand(1);
> + if (ConstantFP *C = dyn_cast<ConstantFP>(V0)) {
> + Addend0.set(C, V1);
> + return 1;
> + }
> +
> + if (ConstantFP *C = dyn_cast<ConstantFP>(V1)) {
> + Addend0.set(C, V0);
> + return 1;
> + }
> + }
>
> ConstantFP values should be canonicalized to the RHS of fmul, so you
> don't need to check for one on the LHS.
>
> + if (I->getOpcode() != Instruction::FAdd &&
> + I->getOpcode() != Instruction::FSub)
> + return 0;
>
> Assert?
You are right; an assert is better, to save compile time, as the condition is
already checked before calling this function.
>
>
> Having two versions of FAddend::drillDownOneStep is kind of confusing;
> do we actually need them both?
One is used to break a given "Value* V" into 1 or 2 addends.
The other one is used to break an addend (say <3.4, V>; NOTE: the coefficient
is not 1) into addends.
Suppose V = X + Y; the two addends we get are 3.4*X and 3.4*Y.
> (Avoiding creating a single FAddend
> seems like a micro-optimization which isn't worthwhile.) If we do
> need both, can you rename one?
I'm not sure I understand. Could you please elaborate?
> There are a bunch of members on FAddCombine which it seems like you
> could just move into FAddCombine::simplify; I suppose that's just
> because you were experimenting?
>
They can be moved into FAddCombine::simplify(), but this function already has
70 LOC. I'd like functions to be reasonably small, to keep the logic clear,
and to let the inliner help me out by inlining the helper functions whenever
and wherever it fits.