[PATCH/RFC] fold zext(setcc) into a larger setcc

Matt Arsenault Matthew.Arsenault at amd.com
Mon Jan 26 17:33:35 PST 2015


On 01/26/2015 04:57 PM, Fiona Glaser wrote:
>
>> On Jan 26, 2015, at 4:51 PM, Matt Arsenault 
>> <Matthew.Arsenault at amd.com <mailto:Matthew.Arsenault at amd.com>> wrote:
>>
>> On 01/26/2015 04:35 PM, Fiona Glaser wrote:
>>> On an out-of-tree target, IR similar to the attached causes the 
>>> emission of the following instructions (pseudocode):
>>>
>>> setcc dst16, cc
>>> zext dst32, dst16
>>>
>>> The attached patch fixes it to be the expected:
>>>
>>> setcc dst32, cc
>>>
>>> because the out-of-tree target supports 32-bit setcc.
>>
>> Does it also support 16-bit setcc result types or return different 
>> sized types for different compared types? I don't think this patch is 
>> generally correct. It isn't correct to blindly promote the setcc 
>> type, especially with any given promotion type. At the very least 
>> this would only be valid to combine zext if the target's boolean 
>> content is ZeroOrOneBooleanContent etc. setccs should only be created 
>> with a type from getSetCCResultType, and any random extended type you 
>> give it probably won't work
>>
>> -Matt
>
> What’s the correct way to do this then? Should I only perform the 
> merge if the target supports the operation (i.e. never do it to a 
> non-legal type)? That feels slightly weird given that it seemed like 
> every other canonicalization done in that function supports non-legal 
> types, so I’d at least like to know what the reason for that 
> inconsistency is.
>
> I was trying to model the optimization after the code that immediately 
> follows it, which already assumes that the setCC resolves to a 0 or 1:
>
>      // aext(setcc x,y,cc) -> select_cc x, y, 1, 0, cc
>      SDValue SCC =
>        SimplifySelectCC(SDLoc(N), N0.getOperand(0), N0.getOperand(1),
>
> Fiona

The aext -> select_cc here is OK, but in your patch it will combine zext 
(setcc) -> setcc. On a target that uses ZeroOrNegativeOneBooleanContent, 
that would produce a wider type with 0 upper bits which is not the same 
as the all 1 bits the larger compare result would give. There are also a 
lot of pre-existing bugs that mishandle this now that still assume 0/1 
setcc results.

I guess I don't understand why this wouldn't already be happening during 
legalization if getSetCCResultType gives i32. I would not expect two 
different result types for the same setcc to ever work.

-Matt

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150126/f300bd5b/attachment.html>


More information about the llvm-commits mailing list