[PATCH] [X86] Integer multiplication improvements
Michael Kuperstein
michael.m.kuperstein at intel.com
Sun Feb 1 07:29:27 PST 2015
Thanks, Chandler!
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:24019-24023
@@ +24018,7 @@
+
+ // How many steps will we have to peform to replace the
+ // MUL. We limit this to 3 steps, based on imul latency of 4
+ // (If the latency is equal, this is probably not a win)
+ unsigned Steps = 0;
+ const unsigned MAX_STEPS = 3;
+ // Having to negate or shift is a step
----------------
chandlerc wrote:
> This doesn't really make sense. We shouldn't be hard coding these things here.
>
> - LEA as slower on Atom than on other X86 chips.
> - IMUL can be anything from 6 to 14 cycles on some X86 chips.
>
> We have a schedule, we should consult it for the expected latency of these and use those to drive the limits. (And we should fix the schedule to be correct if it is currently wrong.)
>
> Also, I wouldn't use capitals here. This isn't a macro.
Right, didn't think about Atom, I'll check whether it's worthwhile there, thanks!
The "4" is just a conservative estimate - precisely because it can be anything in a fairly large range (can be below 6 - starts at 4 on HSW). I don't think we model that anywhere, so taking a number from the schedule didn't look like the right thing to do. In any case, I'll take a look at what the schedule currently is, and whether it makes sense.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:24025-24026
@@ +24024,4 @@
+ // Having to negate or shift is a step
+ Steps += (IsNeg ? 1 : 0);
+ Steps += (ShiftAmt ? 1 : 0);
+
----------------
chandlerc wrote:
> Just use the boolean?
I always thought adding a boolean looks odd, but sure.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:24060-24061
@@ -24032,1 +24059,4 @@
+ // Do not add new nodes to DAG combiner worklist.
+ if (MulAmt == 1)
DCI.CombineTo(N, NewMul, false);
+
----------------
chandlerc wrote:
> It's pretty hostile to the DAG to create nodes unless they are absolutely going to be used. It's almost certainly better to do the math in a tight loop first to verify that we'll actually combine this node.
>
> That will also let you use early exit.
Got it.
http://reviews.llvm.org/D7320
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
More information about the llvm-commits
mailing list