[llvm] r276416 - [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 (reapplied)

Chandler Carruth via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 10 00:58:15 PDT 2016


On Fri, Jul 22, 2016 at 7:06 AM Simon Pilgrim via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> Author: rksimon
> Date: Fri Jul 22 08:58:44 2016
> New Revision: 276416
>
> URL: http://llvm.org/viewvc/llvm-project?rev=276416&view=rev
> Log:
> [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128
> (reapplied)
>
> As reported on PR26235, we don't currently make use of the
> VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to
> load+splat a 128-bit vector to both lanes of a 256-bit vector.
>
> This patch enables lowering from subvector insertion/concatenation
> patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 /
> llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.
>
> We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load
> repeated constants as well (similar to how we already do for scalar
> broadcasts).
>
> Reapplied with fix for PR28657 - removed intrinsic definitions (clang
> companion patch to be be submitted shortly).
>

I fixed a bug in the Clang emission in r278202 where it explicitly aligned
the operand at 16 bytes, and it turns out the auto upgrade code here:


>        Value *Load = Builder.CreateLoad(VT, Op);
>

seems like it might have the same bug. I don't recall what the "deduced"
alignment is for a <double, double> but I suspect that it is 16-byte
alignment. If so, this would miscompile things during auto-upgrading.

I've made this an explicitly unaligned load much like we do in Clang for
unaligned SSE and AVX loads in r278203. Does that make sense to you? Would
you prefer a different approach?

One issue I have with that patch is that I can't test it becuase we only
have assembly tests for autoupgrade.

I think we really need to have the autoupgrade test file check both the
*IR* after auto-upgrade runs and the assembly so that we can write precise
regression tests when we fix an IR-level bug even if the asm doesn't
change. Is that something you could look into as you're working on this
area?


Anyways, thanks for all the awesome improvements to the intrinsic lowering!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160810/fa9907f3/attachment.html>


More information about the llvm-commits mailing list