[PATCH] D20965: [X86][SSE] Add general lowering of nontemporal vector loads

Michael Kuperstein via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 6 11:41:10 PDT 2016


mkuper accepted this revision.
mkuper added a comment.
This revision is now accepted and ready to land.

LGTM


================
Comment at: lib/Target/X86/X86InstrAVX512.td:3347
@@ -3333,1 +3346,3 @@
 
+  def : Pat<(v4f64 (alignednontemporalload addr:$src)),
+            (VMOVNTDQAZ256rm addr:$src)>;
----------------
Is there a reason we support more types for nontemporal loads than for stores, or are the corresponding store patterns just missing?

================
Comment at: test/CodeGen/X86/fast-isel-nontemporal.ll:599
@@ +598,3 @@
+; AVX1:       # BB#0: # %entry
+; AVX1-NEXT:    vmovdqa (%rdi), %ymm0
+; AVX1-NEXT:    retq
----------------
I wonder whether this is better or worse, in practice, than two 128-bit vmovntdqa loads into %xmm registers.
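
For reference, a minimal sketch of the kind of input this question is about. It is a reduced, hypothetical test case, not copied from the patch: the function name and the <4 x i64> element type are assumptions, and the pointer syntax is the typed-pointer form in use at the time of this review.

```llvm
; Hypothetical reduced case: a 32-byte-aligned nontemporal load of a
; 256-bit vector. The !nontemporal metadata node must be a single i32 1.
define <4 x i64> @nt_load_v4i64(<4 x i64>* %src) {
entry:
  %v = load <4 x i64>, <4 x i64>* %src, align 32, !nontemporal !0
  ret <4 x i64> %v
}

!0 = !{i32 1}
```

Compiled with something like `llc -mtriple=x86_64-unknown-unknown -mattr=+avx`, the CHECK lines above show this becoming a plain vmovdqa into %ymm0. The alternative I have in mind would presumably be a vmovntdqa from (%rdi) and another from 16(%rdi), each into an %xmm register, followed by a vinsertf128 to combine the halves; I have not measured either sequence.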


Repository:
  rL LLVM

http://reviews.llvm.org/D20965




