<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
Hi Ahmed,
<div class=""><br class="">
</div>
<div class="">Thanks for the explanation.</div>
<div class=""><br class="">
</div>
<div class="">I think this should be simpler than you think. SDAGBuilder::visitSelect is very conservative currently. It will only look through SplitVectors, and expect to find a Legal or Custom action at the end of the chain.</div>
<div class=""><br class="">
</div>
<div class="">If we teach SDAGBuilder::visitSelect to understand about promoted types, it could understand that f16 would be promoted to f32, and FMINNAN/FMINNUM are legal on f32.</div>
<div class=""><br class="">
</div>
<div class="">If we do this (I’ve noticed a bug where fminnan doesn’t know how to promote itself… whoops), we get:</div>
<div class=""><br class="">
</div>
<div class="">
<div class=""><span class="Apple-tab-span" style="white-space:pre"></span>fcvt<span class="Apple-tab-span" style="white-space:pre">
</span>s1, h1</div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"></span>fcvt<span class="Apple-tab-span" style="white-space:pre">
</span>s0, h0</div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"></span>fminnm<span class="Apple-tab-span" style="white-space:pre">
</span>s0, s0, s1</div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"></span>fcvt<span class="Apple-tab-span" style="white-space:pre">
</span>h0, s0</div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"></span>fcvt<span class="Apple-tab-span" style="white-space:pre">
</span>s0, h0</div>
<div class=""><span class="Apple-tab-span" style="white-space:pre"></span>ret</div>
</div>
<div class=""><br class="">
</div>
<div class="">Which looks about right? So a fairly simple improvement.</div>
<div class=""><br class="">
</div>
<div class="">Cheers,</div>
<div class=""><br class="">
</div>
<div class="">James</div>
<div class=""><br class="">
<div>
<blockquote type="cite" class="">
<div class="">On 18 Aug 2015, at 21:54, Ahmed Bougacha <<a href="mailto:ahmed.bougacha@gmail.com" class="">ahmed.bougacha@gmail.com</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<span style="line-height: normal;" class="">On Tue, Aug 18, 2015 at 12:51 PM, James Molloy<span class="Apple-converted-space"> </span></span><span dir="ltr" style="line-height: normal;" class=""><<a href="mailto:James.Molloy@arm.com" target="_blank" class="">James.Molloy@arm.com</a>></span><span style="line-height: normal;" class=""><span class="Apple-converted-space"> </span>wrote:</span><br class="">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div dir="auto" class="">
<div class="">Hi Ahmed,</div>
<div class=""><br class="">
</div>
<div class="">I'm afraid I'm not quite certain what you mean - could you elaborate?</div>
</div>
</blockquote>
<div class=""><br class="">
</div>
<div class=""><span style="line-height: normal;" class="">I was thinking of something like:</span><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">
<div class="">
<div class="">
<div class="">
<div class="">
<div class="">define double @f(float %a, float %b) #0 {</div>
<div class=""> %cc = fcmp olt float %a, %b</div>
<div class=""> %xa = fpext float %a to double</div>
<div class=""> %xb = fpext float %b to double</div>
<div class=""> %r = select i1 %cc, double %xa, double %xb</div>
<div class=""> ret double %r</div>
<div class="">}</div>
<div class=""><br class="">
</div>
<div class="">attributes #0 = { "no-infs-fp-math"="true" "no-nans-fp-math"="true" "unsafe-fp-math"="true" }</div>
</div>
</div>
</div>
</div>
</div>
</div>
<div class=""><br class="">
</div>
<div class="">for which the previous lowering would have generated something like:</div>
<div class=""><br class="">
</div>
<div class=""> fcvt d0, s0</div>
<div class=""> fcvt d1, s1</div>
<div class=""> fmin d0, d0, d1</div>
<div class=""><br class="">
</div>
<div class="">But we now generate:</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">
<div class="">
<div class=""> fcvt d2, s0</div>
<div class=""> fcvt d3, s1</div>
<div class=""> fcmp s0, s1</div>
<div class=""> fcsel d0, d2, d3, lt</div>
<div class=""><br class="">
</div>
</div>
</div>
</div>
<div class="">For float/double, it's not a problem, because t<span style="line-height: normal;" class="">he now-canonical IR, per InstCombine, would be:</span></div>
<div class=""><br class="">
</div>
<div class="">
<div class="">define double @f(float %a, float %b) #0 {</div>
<div class=""> %cc = fcmp olt float %a, %b</div>
<div class=""> %r.v = select i1 %cc, float %a, float %b</div>
<div class=""> %r = fpext float %r.v to double</div>
<div class=""> ret double %r</div>
<div class="">}</div>
</div>
<div class=""><br class="">
</div>
<div class="">Which we can indeed match. However, for half, fpexts on the fcmp operands are generated by the legalizer, so neither instcombine nor the builder can match those anymore. We basically have a DAG for:</div>
<div class=""><br class="">
</div>
<div class="">define double @f(half %a, half %b) #0 {</div>
<div class=""> %sa = fpext half %a to float</div>
<div class=""> %sb = fpext half %b to float</div>
<div class=""> %cc = fcmp olt float %sa, %sb</div>
<div class=""> %xa = fpext float %sa to double</div>
<div class=""> %xb = fpext float %sb to double</div>
<div class=""> %r = select i1 %cc, double %xa, double %xb</div>
<div class=""> ret double %r</div>
<div class="">}</div>
<div class=""><br class="">
</div>
<div class="">Perhaps we want a similar DAG combine, or even have the builder match the fpext on the select result directly? Not sure either's a good idea.</div>
<div class=""><br class="">
</div>
<div class="">Again, not a big problem by any means, just wondering if we can/should improve this somehow!</div>
<div class=""><span style="line-height: normal;" class=""><br class="">
</span></div>
<div class=""><span style="line-height: normal;" class="">-Ahmed</span></div>
<div class=""><span style="line-height: normal;" class=""> </span><br class="">
</div>
<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div dir="auto" class="">
<div class="">Cheers,</div>
<div class=""><br class="">
</div>
<div class="">James</div>
<div class="">
<div class="">
<div class=""><br class="">
<br class="">
</div>
<div class=""><br class="">
On 18 Aug 2015, at 19:36, Ahmed Bougacha <<a href="mailto:ahmed.bougacha@gmail.com" target="_blank" class="">ahmed.bougacha@gmail.com</a>> wrote:<br class="">
<br class="">
</div>
<blockquote type="cite" class="">
<div class="">
<div dir="ltr" class="">Hey James,
<div class=""><br class="">
</div>
<div class="">If I read this correctly, we can't fold fpext anymore, right? <span style="line-height: normal;" class="">Not a big deal right now, as the main case where I see this being useful is a promoted f16 min/max, and that's not even caught currently.</span></div>
<div class=""><span style="line-height: normal;" class="">Do you think we should do something similar in the builder? If so, let's file a PR!</span></div>
</div>
<div class="gmail_extra"><br clear="all" class="">
<div class="">
<div class="">-Ahmed</div>
</div>
<br class="">
<div class="gmail_quote">On Mon, Aug 17, 2015 at 12:13 AM, James Molloy via llvm-commits<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank" class="">llvm-commits@lists.llvm.org</a>></span><span class="Apple-converted-space"> </span>wrote:<br class="">
<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
Author: jamesm<br class="">
Date: Mon Aug 17 02:13:20 2015<br class="">
New Revision: 245198<br class="">
<br class="">
URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project?rev=245198&view=rev" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project?rev=245198&view=rev</a><br class="">
Log:<br class="">
Remove hand-rolled matching for fmin and fmax.<br class="">
<br class="">
SDAGBuilder now does this all for us.<br class="">
<br class="">
Modified:<br class="">
<span class="Apple-converted-space"> </span>llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp<br class="">
<br class="">
Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp<br class="">
URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=245198&r1=245197&r2=245198&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=245198&r1=245197&r2=245198&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original)<br class="">
+++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Mon Aug 17 02:13:20 2015<br class="">
@@ -393,6 +393,8 @@ AArch64TargetLowering::AArch64TargetLowe<br class="">
setOperationAction(ISD::FROUND, Ty, Legal);<br class="">
setOperationAction(ISD::FMINNUM, Ty, Legal);<br class="">
setOperationAction(ISD::FMAXNUM, Ty, Legal);<br class="">
+ setOperationAction(ISD::FMINNAN, Ty, Legal);<br class="">
+ setOperationAction(ISD::FMAXNAN, Ty, Legal);<br class="">
}<br class="">
<br class="">
setOperationAction(ISD::PREFETCH, MVT::Other, Custom);<br class="">
@@ -484,7 +486,6 @@ AArch64TargetLowering::AArch64TargetLowe<br class="">
<br class="">
setTargetDAGCombine(ISD::SELECT);<br class="">
setTargetDAGCombine(ISD::VSELECT);<br class="">
- setTargetDAGCombine(ISD::SELECT_CC);<br class="">
<br class="">
setTargetDAGCombine(ISD::INTRINSIC_VOID);<br class="">
setTargetDAGCombine(ISD::INTRINSIC_W_CHAIN);<br class="">
@@ -3826,32 +3827,6 @@ SDValue AArch64TargetLowering::LowerSETC<br class="">
}<br class="">
}<br class="">
<br class="">
-/// A SELECT_CC operation is really some kind of max or min if both values being<br class="">
-/// compared are, in some sense, equal to the results in either case. However,<br class="">
-/// it is permissible to compare f32 values and produce directly extended f64<br class="">
-/// values.<br class="">
-///<br class="">
-/// Extending the comparison operands would also be allowed, but is less likely<br class="">
-/// to happen in practice since their use is right here. Note that truncate<br class="">
-/// operations would *not* be semantically equivalent.<br class="">
-static bool selectCCOpsAreFMaxCompatible(SDValue Cmp, SDValue Result) {<br class="">
- if (Cmp == Result)<br class="">
- return (Cmp.getValueType() == MVT::f32 ||<br class="">
- Cmp.getValueType() == MVT::f64);<br class="">
-<br class="">
- ConstantFPSDNode *CCmp = dyn_cast<ConstantFPSDNode>(Cmp);<br class="">
- ConstantFPSDNode *CResult = dyn_cast<ConstantFPSDNode>(Result);<br class="">
- if (CCmp && CResult && Cmp.getValueType() == MVT::f32 &&<br class="">
- Result.getValueType() == MVT::f64) {<br class="">
- bool Lossy;<br class="">
- APFloat CmpVal = CCmp->getValueAPF();<br class="">
- CmpVal.convert(APFloat::IEEEdouble, APFloat::rmNearestTiesToEven, &Lossy);<br class="">
- return CResult->getValueAPF().bitwiseIsEqual(CmpVal);<br class="">
- }<br class="">
-<br class="">
- return Result->getOpcode() == ISD::FP_EXTEND && Result->getOperand(0) == Cmp;<br class="">
-}<br class="">
-<br class="">
SDValue AArch64TargetLowering::LowerSELECT_CC(ISD::CondCode CC, SDValue LHS,<br class="">
SDValue RHS, SDValue TVal,<br class="">
SDValue FVal, SDLoc dl,<br class="">
@@ -9124,75 +9099,6 @@ static SDValue performSelectCombine(SDNo<br class="">
return DAG.getSelect(DL, ResVT, Mask, N->getOperand(1), N->getOperand(2));<br class="">
}<br class="">
<br class="">
-/// performSelectCCCombine - Target-specific DAG combining for ISD::SELECT_CC<br class="">
-/// to match FMIN/FMAX patterns.<br class="">
-static SDValue performSelectCCCombine(SDNode *N, SelectionDAG &DAG) {<br class="">
- // Try to use FMIN/FMAX instructions for FP selects like "x < y ? x : y".<br class="">
- // Unless the NoNaNsFPMath option is set, be careful about NaNs:<br class="">
- // vmax/vmin return NaN if either operand is a NaN;<br class="">
- // only do the transformation when it matches that behavior.<br class="">
-<br class="">
- SDValue CondLHS = N->getOperand(0);<br class="">
- SDValue CondRHS = N->getOperand(1);<br class="">
- SDValue LHS = N->getOperand(2);<br class="">
- SDValue RHS = N->getOperand(3);<br class="">
- ISD::CondCode CC = cast<CondCodeSDNode>(N->getOperand(4))->get();<br class="">
-<br class="">
- unsigned Opcode;<br class="">
- bool IsReversed;<br class="">
- if (selectCCOpsAreFMaxCompatible(CondLHS, LHS) &&<br class="">
- selectCCOpsAreFMaxCompatible(CondRHS, RHS)) {<br class="">
- IsReversed = false; // x CC y ? x : y<br class="">
- } else if (selectCCOpsAreFMaxCompatible(CondRHS, LHS) &&<br class="">
- selectCCOpsAreFMaxCompatible(CondLHS, RHS)) {<br class="">
- IsReversed = true ; // x CC y ? y : x<br class="">
- } else {<br class="">
- return SDValue();<br class="">
- }<br class="">
-<br class="">
- bool IsUnordered = false, IsOrEqual;<br class="">
- switch (CC) {<br class="">
- default:<br class="">
- return SDValue();<br class="">
- case ISD::SETULT:<br class="">
- case ISD::SETULE:<br class="">
- IsUnordered = true;<br class="">
- case ISD::SETOLT:<br class="">
- case ISD::SETOLE:<br class="">
- case ISD::SETLT:<br class="">
- case ISD::SETLE:<br class="">
- IsOrEqual = (CC == ISD::SETLE || CC == ISD::SETOLE || CC == ISD::SETULE);<br class="">
- Opcode = IsReversed ? ISD::FMAXNAN : ISD::FMINNAN;<br class="">
- break;<br class="">
-<br class="">
- case ISD::SETUGT:<br class="">
- case ISD::SETUGE:<br class="">
- IsUnordered = true;<br class="">
- case ISD::SETOGT:<br class="">
- case ISD::SETOGE:<br class="">
- case ISD::SETGT:<br class="">
- case ISD::SETGE:<br class="">
- IsOrEqual = (CC == ISD::SETGE || CC == ISD::SETOGE || CC == ISD::SETUGE);<br class="">
- Opcode = IsReversed ? ISD::FMINNAN : ISD::FMAXNAN;<br class="">
- break;<br class="">
- }<br class="">
-<br class="">
- // If LHS is NaN, an ordered comparison will be false and the result will be<br class="">
- // the RHS, but FMIN(NaN, RHS) = FMAX(NaN, RHS) = NaN. Avoid this by checking<br class="">
- // that LHS != NaN. Likewise, for unordered comparisons, check for RHS != NaN.<br class="">
- if (!DAG.isKnownNeverNaN(IsUnordered ? RHS : LHS))<br class="">
- return SDValue();<br class="">
-<br class="">
- // For xxx-or-equal comparisons, "+0 <= -0" and "-0 >= +0" will both be true,<br class="">
- // but FMIN will return -0, and FMAX will return +0. So FMIN/FMAX can only be<br class="">
- // used for unsafe math or if one of the operands is known to be nonzero.<br class="">
- if (IsOrEqual && !DAG.getTarget().Options.UnsafeFPMath &&<br class="">
- !(DAG.isKnownNeverZero(LHS) || DAG.isKnownNeverZero(RHS)))<br class="">
- return SDValue();<br class="">
-<br class="">
- return DAG.getNode(Opcode, SDLoc(N), N->getValueType(0), LHS, RHS);<br class="">
-}<br class="">
-<br class="">
/// Get rid of unnecessary NVCASTs (that don't change the type).<br class="">
static SDValue performNVCASTCombine(SDNode *N) {<br class="">
if (N->getValueType(0) == N->getOperand(0).getValueType())<br class="">
@@ -9233,8 +9139,6 @@ SDValue AArch64TargetLowering::PerformDA<br class="">
return performSelectCombine(N, DCI);<br class="">
case ISD::VSELECT:<br class="">
return performVSelectCombine(N, DCI.DAG);<br class="">
- case ISD::SELECT_CC:<br class="">
- return performSelectCCCombine(N, DCI.DAG);<br class="">
case ISD::STORE:<br class="">
return performSTORECombine(N, DCI, DAG, Subtarget);<br class="">
case AArch64ISD::BRCOND:<br class="">
<br class="">
<br class="">
_______________________________________________<br class="">
llvm-commits mailing list<br class="">
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank" class="">llvm-commits@lists.llvm.org</a><br class="">
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br class="">
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
<br class="">
</div>
</div>
<font face="Arial" size="2" class="">-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents
to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.<br class="">
<br class="">
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590<br class="">
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782</font></div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
<br>
<font face="Arial" color="Black" size="2">-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents
to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.<br>
<br>
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590<br>
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782<br>
</font>
</body>
</html>