[PATCH] D33728: [X86][SSE] Improve handling of non-temporal aligned loads

Mon Jun 5 09:41:16 PDT 2017

RKSimon added a comment.

> I've wondered for a while if we shouldn't have a converged memop that lifts the alignment restriction if AVX is enabled. I think that would shrink the size of the DAG isel table because AVX and SSE would use the same predicate. The only issue I know of is that the SHA1 instructions use memop but don't have a VEX equivalent documented yet, so they must always obey the alignment requirement even when AVX is enabled.

@craig.topper Yes, it does look like we could start by having memop inherit from vec128load, which would simplify the repeated non-temporal logic I'm about to add; there are probably other cases like that as well. Or we could just go for the SSE/AVX refactor straight away?

It's not perfect, but we could get around the SHA issue by only folding alignedloadv2i64 and adding a couple of extra folding patterns to the SHA instructions for the hasSSEUnalignedMem cases? It'd be messy though...
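To make the converged predicate concrete, here is a rough sketch in plain C++ rather than TableGen. Everything in it is hypothetical: `SubtargetFeatures`, `LoadInfo`, `canFoldVec128Load` and the `hasVexEncoding` flag are illustrative stand-ins, not the actual LLVM fragments. It also assumes that aligned non-temporal loads are kept out of folding so they can be selected as a dedicated non-temporal load.

```cpp
// Hypothetical stand-ins for illustration; these are not LLVM's real types.
struct SubtargetFeatures {
  bool hasAVX;              // VEX encodings are available
  bool hasSSEUnalignedMem;  // legacy SSE memory operands may be unaligned
};

struct LoadInfo {
  unsigned alignment;  // known alignment of the load, in bytes
  bool nonTemporal;    // load carries a non-temporal hint
};

// Sketch of a converged "memop"-style check for a 128-bit load.
// hasVexEncoding is false for instructions with no VEX form, such as the
// SHA case mentioned above, which must keep the alignment rule under AVX.
bool canFoldVec128Load(const SubtargetFeatures &st, const LoadInfo &ld,
                       bool hasVexEncoding) {
  const bool aligned = ld.alignment >= 16;

  // Assumption: an aligned non-temporal load is not folded, so that it can
  // still be matched as a standalone non-temporal load.
  if (ld.nonTemporal && aligned)
    return false;

  // Targets that tolerate unaligned SSE memory operands can always fold.
  if (st.hasSSEUnalignedMem)
    return true;

  // With AVX, the VEX-encoded form lifts the alignment restriction.
  if (st.hasAVX && hasVexEncoding)
    return true;

  // Otherwise the legacy SSE rule applies: only fold 16-byte aligned loads.
  return aligned;
}
```

With a single check like this, the SSE and AVX patterns could share one predicate, and only the instructions without a VEX form would need to pass `hasVexEncoding = false`.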

Repository:
  rL LLVM

https://reviews.llvm.org/D33728