[PATCH] [CodeGen] Add hooks/combine to form vector extloads, and enable it on X86.

Ahmed Bougacha ahmed.bougacha at gmail.com
Fri Jan 9 13:10:48 PST 2015


http://reviews.llvm.org/D6904, squashed. Let me know if you prefer
them split up.
- Ahmed


On Fri, Jan 9, 2015 at 12:29 PM, Chandler Carruth <chandlerc at google.com> wrote:
> Could you post the patches with Phabricator? That makes it so much easier to
> review.
>
> On Fri, Jan 9, 2015 at 10:27 AM, Ahmed Bougacha <ahmed.bougacha at gmail.com>
> wrote:
>>
>> Hi all,
>>
>> As the last change in my extload series, here are 3 (WIP) patches to
>> actually form extloads on vector types.
>> They used to be disabled, because "None of the supported targets knows
>> how to perform load and  sign extend on vectors in one instruction."
>>
>> The first patch enables the combine on legal vectors, but hides it
>> behind a profitability callback.
>> For instance, on ARM, several instructions have folded extload forms,
>> so it's not always beneficial to create an extload node (and trying to
>> match extloads is a whole 'nother can of worms).
>>
>> The second patch adds a combine to fold extloads of illegal
>> (splittable) vector types, replacing them directly with multiple
>> smaller extloads.  I'm not a big fan of this kind of
>> pseudo-legalization in combines, but I tried the alternative: form
>> illegal extloads, and later try to split them up.  The problem is
>> that you sometimes generate extloads that can't be split up, but
>> that would have had a valid ext+load expansion.  At vector-op
>> legalization time, it's too late to generate this kind of thing, so
>> it's better to just avoid creating egregiously illegal nodes.
>>
>>
>> Finally, the last patch enables all of this, unconditionally, on X86.
>>
>> Note that the splitting combine is happy with "custom" extloads.  As
>> is, this bypasses the actual custom lowering, and just unrolls the
>> extload.  But from what I've seen, this is still much better than the
>> current custom lowering, which does some kind of unrolling at the end
>> anyway (see for instance load_sext_4i8_to_4i64 on SSE2, and the added
>> FIXME).
>>
>> Also note that there's a regression in the widen_load-2.ll test, where
>> we can no longer fold the load. I'll look into that later.
>>
>>
>> Anyway: as can be seen from the nice testcase cleanups, there's
>> something to be done here.  The combines feel a bit dirty, but I don't
>> see a better alternative.  Finally, I didn't see changes on the
>> testsuite (SSE2 X86-64; I'll try SSE4.1 and AVX2 as well).
>>
>> Feedback heartily welcome!
>>
>> Thanks,
>>
>> - Ahmed
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
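
To make the splitting idea from the second patch a bit more concrete,
here is a rough standalone sketch.  It is not the code from the patch
and uses no LLVM APIs: ExtLoadPiece, splitExtLoad and the LegalVecBits
parameter are hypothetical names, and "legal" is simplified to "the
extended result fits in one vector register".  It only shows the shape
of the decision: either produce smaller extloads, or bail out early so
the plain ext+load form is kept.

```cpp
#include <cassert>
#include <vector>

// One piece of a split extending load (hypothetical model, not an LLVM type).
struct ExtLoadPiece {
  unsigned ByteOffset; // offset of this piece from the original load address
  unsigned NumElts;    // number of elements loaded and extended by this piece
};

// Split an extending load of NumElts elements, each widened from SrcBits to
// DstBits, into pieces whose extended result fits in LegalVecBits.
// Returns an empty vector when no clean split exists; the caller would then
// leave the original ext+load alone instead of forming an illegal extload.
// Assumption: legality is modeled purely by register width.
std::vector<ExtLoadPiece> splitExtLoad(unsigned NumElts, unsigned SrcBits,
                                       unsigned DstBits,
                                       unsigned LegalVecBits) {
  std::vector<ExtLoadPiece> Pieces;

  // Only handle byte-sized source elements and real extensions.
  if (NumElts == 0 || SrcBits == 0 || SrcBits % 8 != 0 || DstBits <= SrcBits)
    return Pieces;

  // The result already fits in one register: a single extload is enough.
  if (NumElts * DstBits <= LegalVecBits) {
    Pieces.push_back({0, NumElts});
    return Pieces;
  }

  // Number of extended elements that fit in one legal register.
  unsigned EltsPerPiece = LegalVecBits / DstBits;
  if (EltsPerPiece == 0 || NumElts % EltsPerPiece != 0)
    return Pieces; // not splittable: keep the ext+load form

  // Each piece reads EltsPerPiece narrow elements from memory.
  unsigned BytesPerPiece = EltsPerPiece * (SrcBits / 8);
  for (unsigned I = 0, E = NumElts / EltsPerPiece; I != E; ++I)
    Pieces.push_back({I * BytesPerPiece, EltsPerPiece});
  return Pieces;
}
```

For instance, with 128-bit registers, a sext load from <8 x i8> to
<8 x i32> would come out of this model as two <4 x i8> to <4 x i32>
extloads at byte offsets 0 and 4, whereas a 3-element i8 to i64
extload returns no pieces and is left as ext+load.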
