[PATCH] D33728: [X86][SSE] Improve handling of non-temporal aligned loads
Filipe Cabecinhas via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 5 07:07:58 PDT 2017
filcab added a comment.
LGTM with a minor comment nit (if I'm right).
The code expansion is annoying, but the result is closer to the source semantics.
Thanks!
Filipe
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:32303
// For chips with slow 32-byte unaligned loads, break the 32-byte operation
- // into two 16-byte operations.
+ // into two 16-byte operations. Also split non-temporal aligned loads on AVX1
+ // targets as 32-byte loads will lower to regular temporal loads.
----------------
Shouldn't "AVX1" be "pre-AVX2" (or "targets without AVX2") here? I'd expect this to also happen on SSE4.1, which also has 128-bit NT loads.
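To make the point concrete, here is a rough sketch of the instruction-availability argument. It is not the code in X86ISelLowering.cpp. The names X86Level, widestNTLoadBytes and ntLoadPieces are hypothetical, and the model ignores type legalization and everything else the real lowering has to consider. It only encodes that the 16-byte MOVNTDQA arrives with SSE4.1 and the 32-byte form with AVX2.

```cpp
#include <cassert>

// Hypothetical feature levels, ordered so that later entries imply earlier ones.
enum class X86Level { SSE2, SSE41, AVX1, AVX2 };

// Widest aligned non-temporal vector load, in bytes, at a given level.
// MOVNTDQA (16 bytes) is an SSE4.1 instruction; its 32-byte form needs AVX2.
// Returns 0 when no non-temporal vector load is available.
inline unsigned widestNTLoadBytes(X86Level level) {
  if (level >= X86Level::AVX2)
    return 32;
  if (level >= X86Level::SSE41)
    return 16;
  return 0;
}

// Number of non-temporal loads needed to cover an aligned vector load of
// sizeBytes while keeping the non-temporal hint. Returns 0 if the hint
// cannot be kept at all (assumption: the load then becomes a regular one).
inline unsigned ntLoadPieces(X86Level level, unsigned sizeBytes) {
  assert((sizeBytes == 16 || sizeBytes == 32) &&
         "only 16- and 32-byte vectors are modeled");
  unsigned widest = widestNTLoadBytes(level);
  if (widest == 0)
    return 0; // no NT vector load on this target
  if (sizeBytes <= widest)
    return 1; // a single NT load covers the whole vector
  return sizeBytes / widest; // split into widest-sized NT pieces
}
```

Under this model a 32-byte non-temporal load needs two pieces on both SSE4.1 and AVX1, and only AVX2 can keep it as a single load, which is why "targets without AVX2" seems like the more accurate wording.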
Repository:
rL LLVM
https://reviews.llvm.org/D33728