[LLVMdev] unaligned AVX store gets split into two instructions
eli.friedman at gmail.com
Tue Jul 9 22:01:33 PDT 2013
On Tue, Jul 9, 2013 at 9:01 PM, Zach Devito <zdevito at gmail.com> wrote:
> I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads
> on AVX.
> 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as a
> single instruction (details below).
> In a matrix-matrix inner-kernel, I see a ~25% decrease in performance, which
> seems to be due to this.
> Any ideas why this changed? Thanks!
This was intentional; splitting the unaligned 256-bit access into two
128-bit instructions is supposed to be faster. See r172868/r172894.
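
For concreteness, here is a rough source-level sketch of the two forms being
compared, written with AVX intrinsics in C. It is not taken from the commits
above; the function names (load8_single, load8_split, lowerings_agree) are
made up for illustration, and the compiler remains free to pick different
instructions than the intrinsics suggest.

```c
#include <immintrin.h>
#include <string.h>

/* Hypothetical helper: one unaligned 256-bit load (the 3.2-style form). */
__attribute__((target("avx")))
void load8_single(const float *src, float *dst) {
    __m256 v = _mm256_loadu_ps(src);          /* 8 floats in one load */
    _mm256_storeu_ps(dst, v);
}

/* Hypothetical helper: the same 8 floats loaded as two 128-bit halves
 * and combined into one 256-bit value (the split form). */
__attribute__((target("avx")))
void load8_split(const float *src, float *dst) {
    __m128 lo = _mm_loadu_ps(src);            /* floats 0..3 */
    __m128 hi = _mm_loadu_ps(src + 4);        /* floats 4..7 */
    __m256 v  = _mm256_castps128_ps256(lo);   /* lo goes in the low half */
    v = _mm256_insertf128_ps(v, hi, 1);       /* hi goes in the high half */
    _mm256_storeu_ps(dst, v);
}

/* Returns 1 if both forms produce the same 8 floats from a source
 * pointer that is only guaranteed 4-byte alignment. */
int lowerings_agree(void) {
    float buf[17];
    float a[8], b[8];
    for (int i = 0; i < 17; ++i)
        buf[i] = (float)i;
    load8_single(buf + 1, a);
    load8_split(buf + 1, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Both helpers read the same 32 bytes, so the results are identical; only the
number of memory operations differs. Running this needs a CPU with AVX.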
Adding Nadav in case he has anything more to say.