PATCH: WIP SLPVectorize: Enable vectorization of allocas

Nadav Rotem nrotem at apple.com
Fri Oct 25 11:18:39 PDT 2013


I see the point that Hal and Chandler made about ExtractElement being the canonical form. If we can canonicalize this pattern (in InstCombine) into ExtractElement with a dynamic index, then we should probably do it. We can probably undo this canonicalization during isel without generating worse code. I prefer to perform this kind of canonicalization inside InstCombine, without TTI. Hal, why do you suspect that lowering a dynamic ExtractElement would result in worse code than if we had not canonicalized it?
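
For readers following the thread, the two forms under discussion can be sketched in C++ roughly as follows. This is an illustration only and is not taken from Tom's patch: the function names and the v4f typedef are hypothetical, and the second function relies on the GCC/Clang vector_size extension, which a compiler would typically represent as a vector value read through a dynamically indexed ExtractElement.

```cpp
#include <cstddef>

// Form 1: a small local array that is written with constant indices and
// read with a dynamic index. Before any canonicalization, the array is
// typically an alloca (a stack slot) with four stores and one load.
// Hypothetical name, for illustration only.
float pick_via_stack(float a, float b, float c, float d, std::size_t idx) {
    float tmp[4];
    tmp[0] = a;
    tmp[1] = b;
    tmp[2] = c;
    tmp[3] = d;
    return tmp[idx & 3];  // mask keeps the dynamic index in bounds
}

// Form 2: the same computation with the four values held in a vector and
// the element selected by a dynamic subscript. Assumes the GCC/Clang
// vector_size extension; v4f is a hypothetical typedef.
typedef float v4f __attribute__((vector_size(16)));

float pick_via_extract(float a, float b, float c, float d, std::size_t idx) {
    v4f v = {a, b, c, d};
    return v[idx & 3];    // dynamic element read from a vector value
}
```

Both functions return the same result for every index; the question in this thread is only which form should be canonical in the IR and how cheaply each target can lower the second one.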


On Oct 25, 2013, at 1:14 AM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
>> Hi Tom,
>> 
>> I am not sure that it is a good idea to generate ExtractElement
>> instructions with dynamic indices at IR level. 
> 
> I think that we should handle this decision using TTI. ExtractElement with a dynamic index, in theory, seems like it should be the preferred canonical form. In practice, I suspect that preferring it without target input will lead to hard-to-fix regressions. On my target, I can generate dynamic EEs in a relatively inexpensive way (one integer op, one shuffle generation and one shuffle instruction), but for other targets, dynamic EEs are always expanded through stack-slot load/stores. In the latter case, these extra load/stores may not be removable by DAGCombine (even if they might otherwise be in this case) because the EE's input may have been moved into some predecessor BB.
> 
> -Hal
> 
>> I think that this
>> kind of pattern should be matched during instruction selection.
>> Assuming that you can pattern match it during isel, you are still
>> left with the alloca.  Allocas are lowered into stack slots and I
>> think that we should be able to move stack slots that are only
>> stored to.
>> 
>> Thanks,
>> Nadav
>> 
>> 
>> On Oct 24, 2013, at 9:02 PM, Tom Stellard <tom at stellard.net> wrote:
>> 
>>> Hi,
>>> 
>>> As a follow up to this discussion:
>>> http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-October/066780.html
>>> 
>>> I put together a very simple patch that begins to implement the
>>> transformation mentioned in the llvm-dev thread.  The patch is
>>> incomplete and is mostly comments with questions about how to do
>>> certain things, but it does work for the simple test case included
>>> in the patch.
>>> 
>>> I'd appreciate any feedback people can give me on this patch and
>>> the questions posed in the comments.
>>> 
>>> Thanks,
>>> Tom
>>> <vectorize-alloca.diff>
>> 
>> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
