<div dir="ltr">Both add and mul are commutative, and TableGen tries to create pattern variants for every possible permutation of the operands of commutative operations.<div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Sep 9, 2019 at 9:08 PM Sebastian Pop via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
I implemented a pattern matching of the dot product for arm64<br>
and it seemed to work well for the basic case, i.e.,<br>
<br>
class mulB<SDPatternOperator ldop> :<br>
  PatFrag<(ops node:$Rn, node:$Rm, node:$offset),<br>
          (mul (ldop (add node:$Rn, node:$offset)),<br>
               (ldop (add node:$Rm, node:$offset)))>;<br>
class mulBz<SDPatternOperator ldop> :<br>
  PatFrag<(ops node:$Rn, node:$Rm),<br>
          (mul (ldop node:$Rn), (ldop node:$Rm))>;<br>
<br>
class DotProductI32<Instruction DOT, SDPatternOperator ldop> :<br>
  Pat<(i32 (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 3)),<br>
           (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 2)),<br>
           (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 1)),<br>
                (mulBz<ldop> GPR64sp:$Rn, GPR64sp:$Rm))))),<br>
      (EXTRACT_SUBREG<br>
       (i64 (DOT (DUPv2i32gpr WZR),<br>
                 (v8i8 (LD1Onev8b GPR64sp:$Rn)),<br>
                 (v8i8 (LD1Onev8b GPR64sp:$Rm)))),<br>
       sub_32)>, Requires<[HasDotProd]>;<br>
<br>
  def : DotProductI32<SDOTv8i8, sextloadi8>;<br>
  def : DotProductI32<UDOTv8i8, zextloadi8>;<br>
<br>
When I extended it to 8-element vectors, the time spent in tblgen exploded:<br>
from under 7 seconds (on an A-72) for the AArch64 td files with the above patch<br>
to more than half an hour, at which point I terminated the processes.<br>
<br>
Here are the additional Pat definitions that produce the exponential behavior:<br>
<br>
def VADDV_32 : OutPatFrag<(ops node:$R), (ADDPv2i32 node:$R, node:$R)>;<br>
<br>
class DotProduct2I32<Instruction DOT, SDPatternOperator ldop> :<br>
  Pat<(i32 (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 7)),<br>
           (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 6)),<br>
           (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 5)),<br>
           (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 4)),<br>
           (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 3)),<br>
           (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 2)),<br>
           (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 1)),<br>
                (mulBz<ldop> GPR64sp:$Rn, GPR64sp:$Rm))))))))),<br>
      (EXTRACT_SUBREG<br>
       (VADDV_32<br>
        (i64 (DOT (DUPv2i32gpr WZR),<br>
                  (v8i8 (LD1Onev8b GPR64sp:$Rn)),<br>
                  (v8i8 (LD1Onev8b GPR64sp:$Rm))))),<br>
       sub_32)>, Requires<[HasDotProd]>;<br>
<br>
  def : DotProduct2I32<SDOTv8i8, sextloadi8>;<br>
  def : DotProduct2I32<UDOTv8i8, zextloadi8>;<br>
<br>
A linux-perf profile of the first minute of the llvm-tblgen run shows<br>
that most of the time is spent in isIsomorphicTo:<br>
<br>
  28.25%  llvm-tblgen  llvm-tblgen          [.] llvm::TreePatternNode::isIsomorphicTo<br>
  21.62%  llvm-tblgen  llvm-tblgen          [.] llvm::TypeSetByHwMode::operator==<br>
  15.25%  llvm-tblgen  libc-2.27.so         [.] memcmp<br>
  14.61%  llvm-tblgen  llvm-tblgen          [.] std::__shared_ptr<llvm::TreePatternNode, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<br>
<br>
In call-graph mode, `perf -g` points to GenerateVariants as the source of<br>
most of the calls to isIsomorphicTo:<br>
<br>
+  100.00%     0.00%  llvm-tblgen  llvm-tblgen          [.] main<br>
+  100.00%     0.00%  llvm-tblgen  llvm-tblgen          [.] llvm::TableGenMain<br>
+   99.85%     0.00%  llvm-tblgen  llvm-tblgen          [.] (anonymous namespace)::LLVMTableGenMain<br>
+   99.85%     0.00%  llvm-tblgen  llvm-tblgen          [.] llvm::EmitDAGISel<br>
+   99.85%     0.00%  llvm-tblgen  llvm-tblgen          [.] llvm::CodeGenDAGPatterns::CodeGenDAGPatterns<br>
+   99.46%    98.01%  llvm-tblgen  llvm-tblgen          [.] llvm::CodeGenDAGPatterns::GenerateVariants<br>
     0.38%     0.00%  llvm-tblgen  llvm-tblgen          [.] GenerateVariantsOf<br>
<br>
Sebastian<br>
</blockquote></div>
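<br>
The point about commutative permutations can be made concrete with a rough count. What follows is a minimal Python sketch, not tblgen's actual algorithm: it assumes that the PatFrags are expanded inline, that every two-operand add or mul node can appear in either operand order, and that no duplicate or illegal variants are pruned, so the result is only an upper bound. The names Node, count_orderings, mul_b, mul_bz and dot_pattern are hypothetical helpers invented for this illustration.<br>

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption for this model: only these two operators are commutative.
COMMUTATIVE = {"add", "mul"}


@dataclass(frozen=True)
class Node:
    """A tiny stand-in for a pattern tree node (hypothetical, not tblgen's type)."""
    op: str
    kids: Tuple["Node", ...] = ()


def count_orderings(n: Node) -> int:
    """Upper bound on operand orderings: each commutative binary node doubles the count."""
    total = 1
    for kid in n.kids:
        total *= count_orderings(kid)
    if n.op in COMMUTATIVE and len(n.kids) == 2:
        total *= 2
    return total


def mul_b(offset: str) -> Node:
    """Models mulB: (mul (ld (add Rn, offset)), (ld (add Rm, offset)))."""
    lhs = Node("ld", (Node("add", (Node("Rn"), Node(offset))),))
    rhs = Node("ld", (Node("add", (Node("Rm"), Node(offset))),))
    return Node("mul", (lhs, rhs))


def mul_bz() -> Node:
    """Models mulBz: (mul (ld Rn), (ld Rm))."""
    return Node("mul", (Node("ld", (Node("Rn"),)), Node("ld", (Node("Rm"),))))


def dot_pattern(elements: int) -> Node:
    """Builds the nested add chain used by the dot-product patterns."""
    tree = mul_bz()
    for off in range(1, elements):
        tree = Node("add", (mul_b(str(off)), tree))
    return tree


four = count_orderings(dot_pattern(4))   # models DotProductI32
eight = count_orderings(dot_pattern(8))  # models DotProduct2I32
print(four, eight, eight // four)
```

Under these assumptions the 4-element pattern contains 13 commutative nodes (2^13 orderings) and the 8-element pattern contains 29 (2^29), a factor of 65536 more. Even if tblgen prunes many of these candidates, growth of this kind would be consistent with the jump from a few seconds to more than half an hour.<br>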