<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial"><div style="margin: 0;">Thanks. I have checked the hook <span style="font-family: arial; white-space: pre-wrap;">DAGCombiner::isMulAddWithConstProfitable</span></div><div style="margin: 0;">And I think the above condition is too aggressive.</div><div style="margin: 0;"><br></div><div style="margin: 0;"><div style="margin: 0;">  // If the add only has one use, this would be OK to do.</div><div style="margin: 0;">  if (AddNode.getNode()->hasOneUse())</div><div style="margin: 0;">    return true;</div><div><br></div><div>Shall we make it to </div></div><div style="margin: 0px;">  if (AddNode.getNode()->hasOneUse() && TargetLowering.isCheaperCommuteAddMul(......))</div><div style="margin: 0px;">    return true;</div><div style="margin: 0;"><br></div><div style="margin: 0;">The virtual hook nethod isCheaperCommuteAddMul will return true by default, but specific targets like arm/riscv can make their own decision.</div><div style="margin: 0;"><br></div><div style="margin: 0;">Just like virtual bool TargetLowering::decomposeMulByConstant</div><div style="margin: 0;"><br></div><div style="margin: 0;">What's your opinion?</div><div style="margin: 0;"><br></div><div style="margin: 0;">Ben</div><div style="margin: 0;"><br></div><div style="position:relative;zoom:1"></div><div id="divNeteaseMailCard"></div><p style="margin: 0;"><br></p><pre><br>At 2020-07-01 02:29:36, "Sam Elliott" <selliott@lowrisc.org> wrote:

>The relevant DAGCombine is controlled by a hook: `DAGCombiner::isMulAddWithConstProfitable`.

>

>It may be that, when optimising for size, this hook should take into account whether `c1*c2` fits into an add immediate, something that the DAGCombiner has access to via TargetLoweringInfo (but does not often use). 

>

>I imagine any change here could have far reaching consequences in terms of introducing changes across lots of targets - I'm not sure what the best approach is to handle that. 

>

>Sam

>

>> On 30 Jun 2020, at 2:08 pm, Sanjay Patel via llvm-dev <llvm-dev@lists.llvm.org> wrote:

>> 

>> Yes - this has been in InstCombine for a long time:

>> https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp#L268

>> 

>> We could say that the canonicalization should be reversed, but that probably uncovers more missing optimizations.

>> 

>> The code size concern is legitimate. For example on x86, gcc asm is 2 bytes smaller on this example:

>> https://godbolt.org/z/GK9FEL

>> 

>> To improve this, we could add a generic transform to DAGCombiner to invert the transform that was done in IR. That transform would only be enabled with a TargetLowering hook that allows targets to decide if the constants or other factors (optimizing for size) make it worthwhile to reorder the ops.

>> 

>> On Mon, Jun 29, 2020 at 11:54 PM Craig Topper via llvm-dev <llvm-dev@lists.llvm.org> wrote:

>> This is likely more of a canonicalization than an optimization. This is done so that if you got the add followed by mul input or the mul followed by add input they would be canonicalized to the same sequence. Maybe not the optimal sequence but at least the same. I didn't check, but I suspect this is happening in InstCombine in the middle end.

>> 

>> ~Craig

>> 

>> 

>> On Mon, Jun 29, 2020 at 8:37 PM Ben Shi via llvm-dev <llvm-dev@lists.llvm.org> wrote:

>> Hi, James,

>> 

>> Thanks for your reply. 

>> 

>> I do not think it is always true, that "mul then add" is faster than "add then mul".

>> 

>> For example, 

>> 

>> A small immediate can be directly encoded in the instruction, but it becomes a larger one after a multiplication, which has to be loaded from the constant pool (extra memory access).

>> 

>> So I wonder, is it possile to prevent it, via changes to the config of the base class TargetLowering，than writing special custom C++ code.

>> 

>> Ben

>> 

>> 

>> 

>> 

>> 

>> 

>> At 2020-06-30 07:46:41, "James Courtier-Dutton" <james.dutton@gmail.com

>> > wrote:

>> >Hi Ben,

>> >

>> >Why do you want to stop it?

>> >"mul then add" is faster than "add then mul".

>> >The result is the same in both cases.

>> >

>> >On Mon, 29 Jun 2020 at 22:11, Ben Shi via llvm-dev

>> ><

>> llvm-dev@lists.llvm.org

>> > wrote:

>> >>

>> >> Hello,

>> >> I have an instruction pattern like

>> >>   %2 = add i32 %0, 25

>> >>   %3 = mul i32 %2, 525

>> >>

>> >> and llvm will optimize it to

>> >> %2 = mul i32 %0, 525

>> >> %3 = add i32 %2, 12525

>> >>

>> >> how to prevent it?

>> >>

>> >> _______________________________________________

>> >> LLVM Developers mailing list

>> >> 

>> llvm-dev@lists.llvm.org

>> 

>> >> 

>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

>> _______________________________________________

>> LLVM Developers mailing list

>> llvm-dev@lists.llvm.org

>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

>> _______________________________________________

>> LLVM Developers mailing list

>> llvm-dev@lists.llvm.org

>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

>> _______________________________________________

>> LLVM Developers mailing list

>> llvm-dev@lists.llvm.org

>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

>

>--

>Sam Elliott

>Software Team Lead

>Senior Software Developer - LLVM and OpenTitan

>lowRISC CIC

</pre></div>