[PATCH] Use broadcasts to optimize overall size when loading constant splat vectors (x86-64 with AVX or AVX2)

Thu Sep 18 12:58:59 PDT 2014

>>! In D5347#17, @spatel wrote:
>>>! In D5347#14, @rafael wrote:
>> Then yes, the linker will merge them. For ELF the entsize can be any
>> value, not sure if linkers actually merge all possible sizes. We could
>> do a better job at merging these in the IR, but we don't at the
>> moment.
> 
> Thanks, Rafael. I'll stick with the assumption that it's still worthwhile to splat for size then, but I'll add a comment to revisit the optimization if we start merging in IR.

Bug for doing constant pool merging in IR:
http://llvm.org/bugs/show_bug.cgi?id=16711

I added a TODO comment to this patch about multiple loads of the same constant. Because the linker already merges identical constants, it's possible that this patch increases overall size today, but I think that's unlikely in general.

For this patch to be detrimental to size, we would have to generate 2 or more new splat loads (10 new bytes of instructions) of a single 64-bit scalar instead of 1 fused load/op of a 128-bit vector, which saves only 8 data bytes. That's the worst case. For a v8f32, it would take at least 6 loads of the same constant (+30 bytes of splat load instructions) to override the 28 data bytes of savings from using a scalar constant instead of a vector constant.
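
The break-even arithmetic above can be checked with a rough sketch. It assumes, as the figures in this comment imply, that each splat load costs 5 extra instruction bytes and that the constant's data is stored only once after merging. The function name and the constant are hypothetical, introduced only for illustration.

```python
# Assumed cost per splat load, taken from the "2 loads = 10 bytes" figure above.
SPLAT_LOAD_BYTES = 5


def break_even_loads(vector_bytes, scalar_bytes, splat_load_bytes=SPLAT_LOAD_BYTES):
    """Return the smallest number of loads of one constant at which the added
    instruction bytes exceed the data bytes saved by storing a scalar."""
    data_savings = vector_bytes - scalar_bytes
    # Smallest n such that n * splat_load_bytes > data_savings.
    return data_savings // splat_load_bytes + 1


# 128-bit vector replaced by a 64-bit scalar: 8 data bytes saved.
print(break_even_loads(16, 8))
# v8f32 (32 bytes) replaced by a 32-bit scalar: 28 data bytes saved.
print(break_even_loads(32, 4))
```

Under these assumptions the two calls give 2 and 6, matching the counts quoted above.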

http://reviews.llvm.org/D5347