[PATCH] D13269: Improved X86-FMA3 mem-folding & coalescing
Vyacheslav Klochkov via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 5 17:07:18 PDT 2015
v_klochkov added a comment.
Hi Quentin,
Thank you for the very useful comments.
I updated the change-set. Please check the additional changes.
Thank you,
Slava
================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3402
@@ +3401,3 @@
+ const unsigned Form231Index = 2;
+ const unsigned LastFormIndex = 2;
+
----------------
qcolombet wrote:
> Maybe put three and adapt the comparison.
Ok, but using 3 instead of 2 required renaming LastFormIndex to FormsNum.
================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3430
@@ +3429,3 @@
+ // (A * b + C) ==> (A * b + C);
+ // FMA132 A, C, b; ==> FMA231 C, A, b;
+ RetOpc = OpcodeGroups[GroupIndex][Form231Index];
----------------
qcolombet wrote:
> I like the comments!
> Keep them, but I believe we can refactor the code a bit.
>
> What you basically have is given the indices, you map each FormIndex to a new FormIndex and you have three choices each time.
> It looks to me like we could do save this table statically like
> static const unsigned Mapping[][3] = {
> // SrcOpIdx1 == 1 && SrcOpIdx2 == 2
> // FMA132 A, C, b; ==> FMA231 C, A, b;
> // FMA213 B, A, c; ==> FMA213 A, B, c;
> // FMA231 C, A, b; ==> FMA132 A, C, b;
> { Form231Index, Form213Index, Form132Index },
> // (SrcOpIdx1 == 1 && SrcOpIdx2 == 3)
> // etc.
> // (SrcOpIdx1 == 2 && SrcOpIdx2 == 3)
> };
> unsigned Case;
> if (SrcOpIdx1 == 1 && SrcOpIdx2 == 2)
> Case = 0;
> else if (SrcOpIdx1 == 1 && SrcOpIdx2 == 3)
> Case = 1;
> else if (SrcOpIdx1 == 2 && SrcOpIdx2 == 3)
> Case = 2;
> else
> assert(“Invalid commutable indices for FMA”); // or return 0;
>
> RetOpc == Mapping[Case][FormIndex];
This is a great idea! I did the corresponding changes.
================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:3406
@@ +3405,3 @@
+ unsigned GroupIndex = 0, FormIndex = FormsNum;
+ for (; GroupIndex < OpcodeGroupsNum && FormIndex == FormsNum; GroupIndex++) {
+ for (FormIndex = 0; FormIndex < FormsNum; FormIndex++) {
----------------
Ok. BTW, this change required a fix for GroupIndex after the loop (do GroupIndex-- once).
http://reviews.llvm.org/D13269
More information about the llvm-commits
mailing list