[PATCH] D37121: [DivRemHoist] add a pass to move div/rem pairs into the same block (PR31028)

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 6 14:59:23 PDT 2017


spatel added a comment.

In https://reviews.llvm.org/D37121#862515, @hfinkel wrote:

> In https://reviews.llvm.org/D37121#862251, @spatel wrote:
>
> > Patch updated:
> >  As I was rummaging around in the cost models (realizing they're quite wrong for div/rem/mul in their own ways) and trying to bend them to express what I wanted...
> >
> > I decided it would be better to just add a new TTI shim (hasDivRemOp()) to implement the behavior that I initially had when this was a CGP add-on. I realize we're at the edge of IR vs. backend transforms here, so let me know if this is wrong. Based on the existing hooks for things like masked ops, however, this doesn't look out of place to me.
> >
> > I also figured it was best to just replace a non-hoistable rem instruction in-place and avoid getting into the inaccurate cost model mess to decide when hoisting would be the right thing to do. So now the i128 test for x86 shows that optimization instead of just bailing out. I think that's also the right thing to do for all targets that don't have HW divrem (everything in trunk besides x86?).
>
>
> Is this worthwhile only if the target has a combined dev/rem operation? Isn't the multiplication plus subtraction generally cheaper than an independent remainder operation in any case. Maybe, in the latter case, you simply want to do that operational replacement instead of hoisting?


No. Sorry if this wasn't clear - we're doing the replacement for all targets, not just x86. The hoisting of rem is the special case for x86.



================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:452
+  /// would typically be allowed using throughput or size cost models.
+  bool hasDivRemOp(Type *DataType) const;
+
----------------
hfinkel wrote:
> I think that you might as well add a Boolean parameter indicating whether or not this is for a signed or unsigned division (and then, in the X86 implementation, you can test the right ISD opcode for each).
Yes - let me update that; that was just an oversight.


https://reviews.llvm.org/D37121





More information about the llvm-commits mailing list