<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Feb 1, 2015, at 3:22 PM, Hal Finkel <<a href="mailto:hfinkel@anl.gov" class="">hfinkel@anl.gov</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">----- Original Message -----</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">From: "Adam Nemet" <<a href="mailto:anemet@apple.com" class="">anemet@apple.com</a>><br class="">To: <a href="mailto:llvm-commits@cs.uiuc.edu" class="">llvm-commits@cs.uiuc.edu</a><br class="">Sent: Thursday, October 23, 2014 7:03:00 PM<br class="">Subject: [llvm] r220540 - [AVX512] FMA support for the 231 variants<br class=""><br class="">Author: anemet<br class="">Date: Thu Oct 23 19:03:00 2014<br class="">New Revision: 220540<br class=""><br class="">URL: <a 
href="http://llvm.org/viewvc/llvm-project?rev=220540&view=rev" class="">http://llvm.org/viewvc/llvm-project?rev=220540&view=rev</a><br class="">Log:<br class="">[AVX512] FMA support for the 231 variants<br class=""><br class="">This is asm/disasm-only support, similar to AVX.<br class=""><br class="">For ISeling the register variant, they are no different from 213<br class="">other than<br class="">whether the multiplication or the addition operand is destructed.<br class=""><br class="">For ISeling the memory variant, i.e. to fold a load, they are no<br class="">different<br class="">than the 132 variant.  The addition operand (op3) in both cases can<br class="">come from<br class="">memory.  Again the only difference is which operand is destructed.<br class=""><br class="">There could be a post-RA pass that would convert a 213 or 132 into a<br class="">231.<br class=""></blockquote><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Hi Adam,</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; 
font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">If I understand the situation correctly, the PPC backend solves the same problem for VSX FMA instructions (for the register-operand case) using the pass in lib/Target/PowerPC/PPCVSXFMAMutate.cpp. The PPCVSXFMAMutate pass runs in between MI scheduling and RA, and mutates the FMA form (from the addend-destructing form to the multiplicand-destructing form when doing so will eliminate a copy). I think that this could be made target-independent pretty easily, and then we could handle the AVX-512 FMA mutation as well. What do you think?</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div><br class=""></div><div>Hi Hal,</div><div><br class=""></div><div>Yes, that’s the same idea.  Thanks for the pointer.  
Copying Elena as well.</div><div><br class=""></div><div>Adam</div><br class=""><blockquote type="cite" class=""><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Thanks again,</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Hal</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: 
normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class="">Part of <<a href="rdar://problem/17082571" class="">rdar://problem/17082571</a>><br class=""><br class="">Modified:<br class="">   llvm/trunk/lib/Target/X86/X86InstrAVX512.td<br class="">   llvm/trunk/test/MC/X86/avx512-encodings.s<br class=""><br class="">Modified: llvm/trunk/lib/Target/X86/X86InstrAVX512.td<br class="">URL:<br class=""><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=220540&r1=220539&r2=220540&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=220540&r1=220539&r2=220540&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/lib/Target/X86/X86InstrAVX512.td (original)<br class="">+++ llvm/trunk/lib/Target/X86/X86InstrAVX512.td Thu Oct 23 19:03:00<br class="">2014<br class="">@@ -3356,40 +3356,44 @@ multiclass avx512_fma3p_rm<bits<8> opc,<br class="">}<br class="">} // Constraints = "$src1 = $dst"<br class=""><br class="">-multiclass avx512_fma3p_forms<bits<8> opc213,<br class="">+multiclass avx512_fma3p_forms<bits<8> opc213, bits<8> opc231,<br class="">                              string OpcodeStr, X86VectorVTInfo VTI,<br class="">                              SDPatternOperator OpNode> {<br class="">  defm v213 : avx512_fma3p_rm<opc213, !strconcat(OpcodeStr, "213",<br class="">  VTI.Suffix),<br class="">                              VTI, OpNode>,<br class="">              EVEX_V512, EVEX_CD8<VTI.EltSize, CD8VF>;<br class="">+<br class="">+  defm v231 : avx512_fma3p_rm<opc231, !strconcat(OpcodeStr, "231",<br class="">VTI.Suffix),<br class="">+                              VTI>,<br class="">+              EVEX_V512, 
EVEX_CD8<VTI.EltSize, CD8VF>;<br class="">}<br class=""><br class="">let ExeDomain = SSEPackedSingle in {<br class="">-  defm VFMADDPSZ    : avx512_fma3p_forms<0xA8, "vfmadd",<br class="">+  defm VFMADDPSZ    : avx512_fma3p_forms<0xA8, 0xB8, "vfmadd",<br class="">                                         v16f32_info, X86Fmadd>;<br class="">-  defm VFMSUBPSZ    : avx512_fma3p_forms<0xAA, "vfmsub",<br class="">+  defm VFMSUBPSZ    : avx512_fma3p_forms<0xAA, 0xBA, "vfmsub",<br class="">                                         v16f32_info, X86Fmsub>;<br class="">-  defm VFMADDSUBPSZ : avx512_fma3p_forms<0xA6, "vfmaddsub",<br class="">+  defm VFMADDSUBPSZ : avx512_fma3p_forms<0xA6, 0xB6, "vfmaddsub",<br class="">                                         v16f32_info, X86Fmaddsub>;<br class="">-  defm VFMSUBADDPSZ : avx512_fma3p_forms<0xA7, "vfmsubadd",<br class="">+  defm VFMSUBADDPSZ : avx512_fma3p_forms<0xA7, 0xB7, "vfmsubadd",<br class="">                                         v16f32_info, X86Fmsubadd>;<br class="">-  defm VFNMADDPSZ   : avx512_fma3p_forms<0xAC, "vfnmadd",<br class="">+  defm VFNMADDPSZ   : avx512_fma3p_forms<0xAC, 0xBC, "vfnmadd",<br class="">                                         v16f32_info, X86Fnmadd>;<br class="">-  defm VFNMSUBPSZ   : avx512_fma3p_forms<0xAE, "vfnmsub",<br class="">+  defm VFNMSUBPSZ   : avx512_fma3p_forms<0xAE, 0xBE, "vfnmsub",<br class="">                                         v16f32_info, X86Fnmsub>;<br class="">}<br class="">let ExeDomain = SSEPackedDouble in {<br class="">-  defm VFMADDPDZ    : avx512_fma3p_forms<0xA8, "vfmadd",<br class="">+  defm VFMADDPDZ    : avx512_fma3p_forms<0xA8, 0xB8, "vfmadd",<br class="">                                         v8f64_info, X86Fmadd>,<br class="">                                         VEX_W;<br class="">-  defm VFMSUBPDZ    : avx512_fma3p_forms<0xAA, "vfmsub",<br class="">+  defm VFMSUBPDZ    : avx512_fma3p_forms<0xAA, 0xBA, "vfmsub",<br class="">                             
            v8f64_info, X86Fmsub>,<br class="">                                         VEX_W;<br class="">-  defm VFMADDSUBPDZ : avx512_fma3p_forms<0xA6, "vfmaddsub",<br class="">+  defm VFMADDSUBPDZ : avx512_fma3p_forms<0xA6, 0xB6, "vfmaddsub",<br class="">                                         v8f64_info, X86Fmaddsub>,<br class="">                                         VEX_W;<br class="">-  defm VFMSUBADDPDZ : avx512_fma3p_forms<0xA7, "vfmsubadd",<br class="">+  defm VFMSUBADDPDZ : avx512_fma3p_forms<0xA7, 0xB7, "vfmsubadd",<br class="">                                         v8f64_info, X86Fmsubadd>,<br class="">                                         VEX_W;<br class="">-  defm VFNMADDPDZ :   avx512_fma3p_forms<0xAC, "vfnmadd",<br class="">+  defm VFNMADDPDZ :   avx512_fma3p_forms<0xAC, 0xBC, "vfnmadd",<br class="">                                         v8f64_info, X86Fnmadd>,<br class="">                                         VEX_W;<br class="">-  defm VFNMSUBPDZ :   avx512_fma3p_forms<0xAE, "vfnmsub",<br class="">+  defm VFNMSUBPDZ :   avx512_fma3p_forms<0xAE, 0xBE, "vfnmsub",<br class="">                                         v8f64_info, X86Fnmsub>,<br class="">                                         VEX_W;<br class="">}<br class=""><br class=""><br class="">Modified: llvm/trunk/test/MC/X86/avx512-encodings.s<br class="">URL:<br class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/X86/avx512-encodings.s?rev=220540&r1=220539&r2=220540&view=diff<br class="">==============================================================================<br class="">--- llvm/trunk/test/MC/X86/avx512-encodings.s (original)<br class="">+++ llvm/trunk/test/MC/X86/avx512-encodings.s Thu Oct 23 19:03:00<br class="">2014<br class="">@@ -4351,3 +4351,27 @@ vextractf32x4  $3, %zmm3, %xmm1 {%k1}<br class="">// CHECK: vextracti64x4 $1<br class="">// CHECK: encoding: [0x62,0x53,0xfd,0xcb,0x3b,0xf4,0x01]<br class="">vextracti64x4  $1, %zmm14, %ymm12 {%k3} {z}<br 
class="">+<br class="">+// CHECK: vfmadd231ps<br class="">+// CHECK: encoding: [0x62,0xb2,0x1d,0x48,0xb8,0xe7]<br class="">+vfmadd231ps %zmm23, %zmm12, %zmm4<br class="">+<br class="">+// CHECK: vfmsub231pd<br class="">+// CHECK: encoding: [0x62,0xe2,0xed,0x48,0xba,0x73,0x08]<br class="">+vfmsub231pd 0x200(%rbx), %zmm2, %zmm22<br class="">+<br class="">+// CHECK: vfmaddsub231ps<br class="">+// CHECK: encoding: [0x62,0xd2,0x65,0x4b,0xb6,0xec]<br class="">+vfmaddsub231ps %zmm12, %zmm3, %zmm5 {%k3}<br class="">+<br class="">+// CHECK: vfmsubadd231pd<br class="">+// CHECK: encoding: [0x62,0x72,0x85,0xc5,0xb7,0xdd]<br class="">+vfmsubadd231pd %zmm5, %zmm31, %zmm11 {%k5}{z}<br class="">+<br class="">+// CHECK: vfnmadd231ps<br class="">+// CHECK: encoding: [0x62,0xf2,0x4d,0x48,0xbc,0xfd]<br class="">+vfnmadd231ps %zmm5, %zmm6, %zmm7<br class="">+<br class="">+// CHECK: vfnmsub231pd<br class="">+// CHECK: encoding: [0x62,0xf2,0xcd,0x48,0xbe,0xfd]<br class="">+vfnmsub231pd %zmm5, %zmm6, %zmm7<br class=""><br class=""><br class="">_______________________________________________<br class="">llvm-commits mailing list<br class="">llvm-commits@cs.uiuc.edu<br class="">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits<br class=""><br class=""></blockquote><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">--<span 
class="Apple-converted-space"> </span></span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Hal Finkel</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Assistant Computational Scientist</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; 
orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Leadership Computing Facility</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Argonne National Laboratory</span></div></blockquote></div><br class=""></body></html>