[PATCH] optimize merging of scalar loads for 32-byte vectors [X86, AVX] (PR21710)

Sun Dec 7 04:08:03 PST 2014

>>! In D6536#15, @spatel wrote:
> I think we're ok without testing every type, but this does raise a potential corner case for an AVX-only machine: is it perf worse to use a 32-byte FP store when dealing with ints? Ie, is there a domain-crossing penalty for a store of the 'wrong' type? Would we ever have a 32-byte vector of ints incoming to this code on an AVX-only machine?

The get/setExecutionDomain code should deal with domain crossing of loads/stores as well as bitwise ops. If the incoming AVX1 code has gone to the trouble of loading integers into 256-bit vectors, then we have to assume that it knows what it's doing, hopefully performing only float-domain ops, although shuffles might be an issue.
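
As a rough illustration of why the store itself need not be a problem: AVX1 provides 256-bit forms of vmovaps, vmovapd and vmovdqa, and all three move the same 32 bytes, so a domain-fixing step is free to pick whichever one matches the neighbouring instructions. The sketch below is a toy model of that choice only, not LLVM's actual code; the names Domain and pickAlignedMove256 are made up for this example.

```cpp
#include <cassert>
#include <string>

// Toy model: hypothetical names, not part of LLVM's API.
enum Domain { PackedSingle, PackedDouble, PackedInt, NumDomains };

// Return the 256-bit aligned move mnemonic associated with a domain.
// The three instructions are interchangeable as far as the bytes moved
// are concerned; only the execution domain differs.
std::string pickAlignedMove256(Domain d) {
  static const char *const kMoves[NumDomains] = {"vmovaps", "vmovapd",
                                                 "vmovdqa"};
  assert(d >= 0 && d < NumDomains && "unknown domain");
  return kMoves[d];
}

// A store has no domain preference of its own in this model, so it
// simply follows the domain of the instruction that produced its value.
std::string pickStoreForProducer(Domain producerDomain) {
  return pickAlignedMove256(producerDomain);
}
```

Under this model, a 32-byte value produced by float-domain ops would be stored with pickStoreForProducer(PackedSingle), regardless of whether the IR type was an integer vector.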

REPOSITORY
  rL LLVM

http://reviews.llvm.org/D6536