[PATCH] D30529: [GlobalISel] Enable specifying how to legalize non-power-of-2 size types.

Tue Mar 14 04:18:40 PDT 2017

kristof.beyls added a comment.

In https://reviews.llvm.org/D30529#699569, @dsanders wrote:

> I haven't read the actual code yet, but I've got a couple questions and a comment based on the description and the conversation so far.

Thanks for the comments - they're very useful!

>> Another major change is that getAction no longer returns a single action, but
>>  returns a sequence of actions, as legalizing non-power-of-2 types may need
>>  multiple actions. For example: findLegalAction({G_REM, 13}) should return
>> 
>> [(WidenScalar, 32), (Lower, 32)], indicating to first widen the s13
>>  scalar to 32 bits, and to then lower it, assuming the setAction on SREM
>>  was something like:
>>  setAction({G_REM, LLT:scalar(1)},
>> 
>>   {{1, WidenScalar},  // bit sizes [ 1, 31[
>>    {32, Lower},       // bit sizes [32, 33[
>>    {33, NarrowScalar} // bit sizes [65, +inf[
>>   });
> 
> Does findLegalAction() need to return a sequence here? I'm thinking that it could simply be called twice:
> 
>   iterator I = findLegalAction({G_REM, 13}); // *I == (WidenScalar, 32)
>   iterator J = findLegalAction({G_REM, 32}); // *J == (Lower, 32), I could also be given as an argument to speed up the search

I think both options are doable (calling multiple times like your example above, or returning all the legalization steps in on go like the patch currently does).
I don't have a very strong preference on one versus the other - at the moment, just returning the full sequence of legalization steps seemed a little bit conceptually cleaner to me.
I think that the decision on which option to take probably should be done based on which one is the higher-performing one.
My expectation is that the linear or binary search through the vector might be the slowest part, and therefore gathering all legalization steps in one go may be most efficient.
But of course, if you'd keep an iterator and pass it on between different findLegalAction calls, maybe the performance difference wouldn't be big.

I'm also assuming that we'll end up caching the results to findLegalAction queries per function to speed this up, and then the speed difference may be completely irrelevant.

> Also, given that the 2nd argument to setAction() describes all the bit sizes, is the bit-size of the LLT::scalar(1) still required for something?

The only relevant part is that it's a LLT::scalar and not an LLT::pointer.
I could get rid of it by having separate setScalarAction/setPointerAction functions. The idea crossed my mind before.
Probably that will make the specifications easier to read, as I agree that the LLT::scalar(1) is a bit confusing.
I'll look into changing that.

>>> Would this allow merging data? 
>>  > When sub-arch specific decisions are concerned, having a way to override a base default case would reduce the amount of code on both table-gen and c++ parts. 
>>  > For example, we could have a default catch-all case like {1,widen; 32,legal; 33,unsupp}, and later on, based on sub-arch decisions, increment legality, lib calls, etc.
>>  With the patch as is, you indeed need to re-specify the info for all bit sizes.
>>  It could be that I pushed the current patch way too far in breaking existing APIs and if I turned back the patch to only change the internal representations used in LegalizerInfo.cpp,
>>  this would work. So, the idea being that tablegen info and targets specify for which data types an action can be taken that doesn't require changing bitsize, such as Legal, Lower, Libcall.
>>  And then the target-independent logic can hopefully make decisions on most/all operations on how to adapt types to different bitsizes when that's needed.
>>  I'll look into that.
> 
> One way to allow mergable data in your current API is with an 'Inherit' action like so:
> 
>   setAction({G_ADD, LLT::scalar(1)}, {{1, Inherit}, {33, WidenScalar}, {64, Legal}, {65, NarrowScalar}});
> 
> This would keep existing actions for sizes 1-32, and replace the actions for size 33 and up.

That sounds like a promising idea!
It seems to have the nice quality that you could check that "Inherit" covers all but the largest bitsize specified before, so that asserts can protect users of the API from accidentally overwriting earlier specifications.
Or in other words, checking that when using "Inherit", you only extend the specification details, not re-specify earlier specifications. I just think it'd be nice to have those kinds of asserts as on architectures with many different subTargets, it's probably easy to make some mistake somewhere.
My assumption here is that most subTarget extensions will make more bitsizes natively supported/legal, rather than fewer.
This probably needs a bit more experimenting/going through several examples to see how it would play out in practice.

> One other possibility is to move the specification to tablegen, and have it figure out an array layout that is cheap to configure. For example, it could decide to take
> 
>   {{1, WidenScalar}, {32, Legal}, {33, NarrowScalar}
>   {{1, WidenScalar}, {32, Legal}, {33, WidenScalar}, {64, Legal}, {65, NarrowScalar}} // if 64-bit supported
> 
> and create a default array like so:
> 
>   {{1, WidenScalar}, {32, Legal}, {33, NarrowScalar}, {64, NarrowScalar}, {65, NarrowScalar}}
> 
> so that when 64-bit support is enabled it can just replace elements 2 and 3 with:
> 
>   {33, WidenScalar}, {64, Legal}

Moving this kind of information into tablegen currently seems a bit of overkill to me - but maybe I'm not getting the full reason why it might be beneficial.
As to the idea where you can replace elements in a fixed-size array: my assumption so far is that the creation of these data structures will not be on the critical path, so not worth optimizing for that. But I could be wrong of course.
Unless the technique would have benefits beyond making the creation of the data structures faster?

https://reviews.llvm.org/D30529