[PATCH] D62908: [PowerPC] Improve float vector gather codegen

Tue Sep 17 10:06:01 PDT 2019

nemanjai requested changes to this revision.
nemanjai added a comment.

Requesting changes because there is no BE support.

================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:3982
                         (COPY_TO_REGCLASS $A, VSRC), 0))>;
-
+    def : Pat<(v4f32 (build_vector (f32 (load xoaddr:$A)),
+                                   (f32 (load xoaddr:$B)),
----------------
jsji wrote:
> What about BigEndian?
Yes, by all means, we need BE support as well.

================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:3986
+                                   (f32 (load xoaddr:$D)))),
+              (v4f32 (XXPERMDI (XXMRGHW AlignValues.LD32D, AlignValues.LD32C),
+                               (XXMRGHW AlignValues.LD32B, AlignValues.LD32A), 3))>;
----------------
jsji wrote:
> What is the benefit of merging 'AB', 'CD', instead of original 'AC', 'BD' then `vmrgew`?
> 
> `vmrgew` is a 2-cycle ALU instruction, should still be better than the 3-cycle `xxpermdi` here.
We should favour `XXPERMDI` over `VMRGEW` here, since `XXPERMDI` can use the larger VSX register file.
Besides, where does the information about `XXPERMDI` taking 3 cycles come from? It is not listed in the UM, and a similar instruction (`XXSEL`) is a 2-cycle ALU instruction as well.
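
For reference, a rough model of the lane movement in the pattern above. This is only a sketch: it assumes each `AlignValues.LD32x` leaves its loaded word in word 1 of the register (big-endian word numbering), which is a guess since those definitions are not quoted here. The helper `scalar_in_word1` is hypothetical, and `None` stands for a don't-care word.

```python
# Words are listed in big-endian ISA order: index 0 is the most significant word.

def xxmrghw(xa, xb):
    """Merge the two high-order words of each source, interleaved."""
    return [xa[0], xb[0], xa[1], xb[1]]

def xxpermdi(xa, xb, dm):
    """Select one doubleword from each source according to the 2-bit DM field."""
    hi = xa[2:4] if dm & 2 else xa[0:2]
    lo = xb[2:4] if dm & 1 else xb[0:2]
    return hi + lo

def scalar_in_word1(value):
    # Assumption (not taken from the patch): the loaded value sits in word 1,
    # and the remaining words are don't-care.
    return [None, value, None, None]

a, b, c, d = (scalar_in_word1(name) for name in "ABCD")

# Mirrors: XXPERMDI (XXMRGHW LD32D, LD32C), (XXMRGHW LD32B, LD32A), 3
result = xxpermdi(xxmrghw(d, c), xxmrghw(b, a), 3)

# Little-endian element order is the reverse of the big-endian word order.
le_elements = result[::-1]
```

Under that assumption the result is correct for little-endian element order only, which is why a separate BE pattern would be needed.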

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62908/new/

https://reviews.llvm.org/D62908