[PATCH] Use broadcasts to optimize overall size when loading constant splat vectors (x86-64 with AVX or AVX2)
Sanjay Patel
spatel at rotateright.com
Wed Sep 17 15:15:41 PDT 2014
>>! In D5347#18, @spatel wrote:
> Is there a way to use patterns but still distinguish between the conflicting optimization goals of speed and size in that one case?
> Or just let it slide that vpbroadcastq is an extra byte and always use that instruction for v2i64 with AVX2? (That's what was
> happening with my patch anyway.)
I thought I had stumbled into the answer with:
  def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))),
            (VMOVDDUPrm addr:$src)>, Requires<[OptForSize]>;
  def : Pat<(v2i64 (X86VBroadcast (loadi64 addr:$src))),
            (VMOVDDUPrm addr:$src)>, Requires<[OptForSize]>;
But that doesn't change the failing testcase in avx2-broadcast.ll - we're still generating vmovddup even without an OptForSize attribute on the function.
http://reviews.llvm.org/D5347