[llvm-dev] RFC: Moving DAG heuristic-based transforms to MI passes
Andrew V. Tischenko via llvm-dev
llvm-dev at lists.llvm.org
Mon Jan 30 00:42:28 PST 2017
Does it mean I have LGTM for https://reviews.llvm.org/D2685?
And does it mean I should open and fix a new PR related to "worst-case
On 1/28/2017 6:21 PM, Hal Finkel wrote:
> On 01/28/2017 04:19 AM, Andrew V. Tischenko wrote:
>> In fact to commit the change before dealing with worst-case
>> performance is a good idea because here we have 2 different issues.
>> But the main idea of this RFC is an attempt to show the better
>> approach to to these kinds of transformations and to suggest to use
>> this approach in the future.
>> At the same time, I'm trying to explain that this patch is not the
>> performance one because the generated code is almost identical to
>> what we have just now. It is a suggestion to change the strategy in
>> such transformations elaborating. If the community accept this new
>> strategy we're ready to introduce new similar transformations,
>> automate the framework, etc. But of course it will be themes for new
>> RFCs and discussions.
> I'm not sure that your RFC is detailed enough to really get feedback
> from those not already attuned to the situation. As someone who is,
> I'm in favor of moving forward with this. We're already using the
> MachineCombiner in many targets (AArch64, PowerPC, and already X86) to
> perform these kinds of transformations, and while we need work on the
> infrastructure in the near term (fixing MachineCombiner worst-case
> performance issues, TableGen integration, etc.), this is important for
> increasing code quality in cases where there are non-trivial
> latency/throughput tradeoffs.
>> On 1/27/2017 11:56 PM, Hal Finkel wrote:
>>> On 01/27/2017 10:30 AM, Andrew V. Tischenko via llvm-dev wrote:
>>>> All llvm-devs,
>>>> We're going to introduce the new possible implementation for such
>>>> optimizations as reciprocal estimation instead of fdiv. In short
>>>> it's a replacement of fdiv instruction (which is very expensive in
>>>> most of CPUs) with alternative sequence of instructions which is
>>>> usually cheaper but has appropriate precision (see genReciprocalDiv
>>>> in lib/Target/X86/X86InstrInfo.cpp for details). There are other
>>>> similar optimizations like usage of rsqrt, etc. but at the moment
>>>> we're dealing with recip estimation only - see
>>>> https://reviews.llvm.org/D26855 for details.
>>>> The current version of optimization is done at DAG Combiner level
>>>> when we don't know the exact target instructions which will be used
>>>> by CodeGen. As result we don't know the real cost of the
>>>> alternative sequence and can't compare that cost with the cost of
>>>> the single fdiv. As result the decision to select an alternative
>>>> sequence (made on compiler options only) could be wrong because
>>>> modern CPUs introduce very cheap fdiv and we should use it directly.
>>>> We suggest to move the implementation from DAG heuristics to
>>>> MI-scheduler-based transformations (Machine Combiner). At that time
>>>> we know exact target instructions and are able to use
>>>> scheduler-based cost model. This knowledge allows as to select
>>>> proper code sequence for final target code generation.
>>>> A possible disadvantage of the new implementation is compile time
>>>> increasing (as discussed in D26855), but we expect to make
>>>> improvements in that area. For the initial change (reciprocal
>>>> transform), any difference is limited to fast-math compilations.
>>>> Any objections, suggestion, comments?
>>> Are you asking whether is okay to commit the change first and then
>>> look at the MachineCombiner's worst-case performance in followup? In
>>> general, I think that moving to using the MachineCombiner for these
>>> kinds of transformations, where there are complex tradeoffs between
>>> latency, throughput, etc., is the right direction.
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
More information about the llvm-dev