[PATCH] Use broadcasts to optimize overall size when loading constant splat vectors (x86-64 with AVX or AVX2)

Sanjay Patel spatel at rotateright.com
Wed Sep 17 15:15:41 PDT 2014


>>! In D5347#18, @spatel wrote:
> Is there a way to use patterns but still distinguish between the conflicting optimization goals of speed and size in that one case? 
> Or just let it slide that vpbroadcastq is an extra byte and always use that instruction for v2i64 with AVX2? (That's what was 
> happening with my patch anyway.)

I thought I had stumbled into the answer with:

  def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))),
            (VMOVDDUPrm addr:$src)>, Requires<[OptForSize]>;
  def : Pat<(v2i64 (X86VBroadcast (loadi64 addr:$src))),
            (VMOVDDUPrm addr:$src)>, Requires<[OptForSize]>;

But that doesn't change the failing testcase in avx2-broadcast.ll - we're still generating vmovddup even without an OptForSize attribute on the function.
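
For reference, a pair of test functions along these lines would separate the two cases. This is only a sketch: the function names and the constant are hypothetical and not taken from avx2-broadcast.ll, and it assumes the splat constant is loaded via a broadcast as in this patch. If the Requires<[OptForSize]> predicate were taking effect, only the first function should be eligible for the vmovddup patterns above:

  ; Hypothetical: carries optsize, so the size-oriented patterns may apply.
  define <2 x double> @splat_v2f64_optsize(<2 x double> %x) #0 {
    %add = fadd <2 x double> %x, <double 1.000000e+00, double 1.000000e+00>
    ret <2 x double> %add
  }

  ; Hypothetical: no optsize, so the size-oriented patterns should not apply.
  define <2 x double> @splat_v2f64_speed(<2 x double> %x) {
    %add = fadd <2 x double> %x, <double 1.000000e+00, double 1.000000e+00>
    ret <2 x double> %add
  }

  attributes #0 = { optsize }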

http://reviews.llvm.org/D5347