[llvm] r276416 - [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 (reapplied)

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 11 02:21:48 PDT 2016


> On 10 Aug 2016, at 08:58, Chandler Carruth <chandlerc at gmail.com> wrote:
> 
> On Fri, Jul 22, 2016 at 7:06 AM Simon Pilgrim via llvm-commits <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
> Author: rksimon
> Date: Fri Jul 22 08:58:44 2016
> New Revision: 276416
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=276416&view=rev <http://llvm.org/viewvc/llvm-project?rev=276416&view=rev>
> Log:
> [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 (reapplied)
> 
> As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector.
> 
> This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.
> 
> We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts).
> 
> Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be be submitted shortly).
> 
> I fixed a bug in the Clang emission in r278202 where it explicitly aligned the operand at 16 bytes, and it turns out the auto upgrade code here:
>  
>        Value *Load = Builder.CreateLoad(VT, Op);
> 
> seems like it might have the same bug. I don't recall what the "deduced" alignment is for a <double, double> but I suspect that it is 16-byte alignment. If so, this would miscompile things during auto-upgrading.
> 
> I've made this an explicitly unaligned load much like we do in Clang for unaligned SSE and AVX loads in r278203. Does that make sense to you? Would you prefer a different approach?

This looks fine to me.

No alignment of vbroadcastf128/vbroadcasti128 seems to work fine on the Jaguar, Excavator and SB machines I’ve tested. I have no way to test the AVX512 equivalents.

> One issue I have with that patch is that I can't test it becuase we only have assembly tests for autoupgrade.
> 
> I think we really need to have the autoupgrade test file check both the *IR* after auto-upgrade runs and the assembly so that we can write precise regression tests when we fix an IR-level bug even if the asm doesn't change. Is that something you could look into as you're working on this area?

I see what you mean - have twin test runs on the upgrade files - one outputs + tests the upgraded IR, the second (as now) outputs + tests the final assembly. I’ll take a look at best approaches for this.

We don’t seem to have much in the way of best practices for auto upgrade at all - for instance how long should these upgrades stay in code AND there is no way to indicate when an upgrade was added (short of svn log trawling) so that we could easily weed out old code when the time comes. Should auto-upgrades fire some sort of -Wdeprecated-intrinsic warning?

> Anyways, thanks for all the awesome improvements to the intrinsic lowering!

An seemingly never ending struggle…..

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160811/33f3c6fb/attachment.html>


More information about the llvm-commits mailing list