[llvm-commits] [llvm] r142152 - in /llvm/trunk: lib/CodeGen/SelectionDAG/ test/CodeGen/ARM/ test/CodeGen/CellSPU/ test/CodeGen/X86/

Tue Oct 18 10:56:14 PDT 2011

Owen, 

Currently the LLVM type legalizer is not context-aware. I think that in one of the bug reports, Duncan mentioned a general plan to unify the type legalizer and the operation legalizer. That would allow us to make context-aware legalization decisions. Maybe we can discuss this at the next dev meeting. Thank you again for looking at this.

Nadav

-----Original Message-----
From: Owen Anderson [mailto:resistor at mac.com] 
Sent: Tuesday, October 18, 2011 18:52
To: Anton Korobeynikov
Cc: Rotem, Nadav; llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] [llvm] r142152 - in /llvm/trunk: lib/CodeGen/SelectionDAG/ test/CodeGen/ARM/ test/CodeGen/CellSPU/ test/CodeGen/X86/

On Oct 18, 2011, at 5:19 AM, Anton Korobeynikov wrote:

> Hi Nadav,
> 
>> I discussed the legalization of <2 x i16> stores on ARM with Anton. As you mentioned, i16 is illegal on ARM and it is not possible to scalarize the store in the Legalizer.
>> This was the main reason for moving the legalization of vector memory ops into LegalizeVectorOps.
> Well, this is a completely different story. Your question was about
> trunc-stores, but here it seems to prevent an important codegen sequence.
> 
>> I agree that in some cases promoting the elements in the vector is less efficient than widening the number of elements. However, generally 'promotion' is a better strategy. I am mostly interested in code generation for auto-vectorized IR. Which workloads are you mostly interested in? Maybe we can discuss the optimizations these workloads need.
> On NEON you can do pretty efficient vector manipulations via shuffles
> (e.g. any shuffle of 4 elements can be codegen'ed in 5 or fewer
> instructions, usually 2-3). This is really important for ARM.
> 
> In the meantime I'd suggest adding a target-specific "vector select
> strategy" flag, so that a target can choose how to deal with all this,
> and making sure your code is disabled for e.g. ARM and CellSPU.

I had an extended discussion with Dan about this yesterday, and we came to two conclusions:

1) The particular testcase in question is really weird, and probably isn't a great representative of code we care about.  As long as we don't generate something completely braindead for it, we're OK with it being regressed if this is an overall win.

2) Determining a policy for vector-widening vs. element-widening is really, really hard, and also context dependent.  In this particular case, we want to widen the vector because a later immutable use (the store) is expressed in terms of i16's.  However, it's easy to imagine other circumstances where the later use would prefer the element-widening approach.
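
To make the two options concrete for the <2 x i16> case discussed above: element-widening (promotion) keeps the element count and yields <2 x i32>, while vector-widening keeps the element size and yields <4 x i16>. Below is a minimal C++ sketch of that arithmetic. The names VecTy, promoteElements and widenVector are hypothetical and are not LLVM APIs, and the 64-bit register width is an assumption chosen to match a NEON D register.

```cpp
#include <cassert>

// Hypothetical model of a vector type; this is not an LLVM class.
struct VecTy {
    unsigned numElts;  // number of elements
    unsigned eltBits;  // bits per element
    unsigned totalBits() const { return numElts * eltBits; }
};

// Element-widening ("promotion"): keep the element count and grow each
// element until the vector fills the register.
VecTy promoteElements(VecTy v, unsigned regBits) {
    assert(v.numElts != 0 && regBits % v.numElts == 0);
    return VecTy{v.numElts, regBits / v.numElts};
}

// Vector-widening: keep the element size and add elements until the
// vector fills the register. The extra elements carry no meaningful data.
VecTy widenVector(VecTy v, unsigned regBits) {
    assert(v.eltBits != 0 && regBits % v.eltBits == 0);
    return VecTy{regBits / v.eltBits, v.eltBits};
}
```

With regBits = 64, VecTy{2, 16} promotes to {2, 32} and widens to {4, 16}. The widened form keeps i16 elements, so a later store of i16 values needs no truncation, whereas the promoted form would need a truncating store. That is the kind of context dependence described in point 2.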

--Owen