[PATCH/RFC] fold zext(setcc) into a larger setcc
Matt Arsenault
Matthew.Arsenault at amd.com
Mon Jan 26 17:33:35 PST 2015
On 01/26/2015 04:57 PM, Fiona Glaser wrote:
>
>> On Jan 26, 2015, at 4:51 PM, Matt Arsenault
>> <Matthew.Arsenault at amd.com <mailto:Matthew.Arsenault at amd.com>> wrote:
>>
>> On 01/26/2015 04:35 PM, Fiona Glaser wrote:
>>> On an out-of-tree target, IR similar to the attached causes the
>>> emission of the following instructions (pseudocode):
>>>
>>> setcc dst16, cc
>>> zext dst32, dst16
>>>
>>> The attached patch fixes it to be the expected:
>>>
>>> setcc dst32, cc
>>>
>>> because the out-of-tree target supports 32-bit setcc.
>>
>> Does it also support 16-bit setcc result types or return different
>> sized types for different compared types? I don't think this patch is
>> generally correct. It isn't correct to blindly promote the setcc
>> type, especially with any given promotion type. At the very least
>> this would only be valid to combine zext if the target's boolean
>> content is ZeroOrOneBooleanContent etc. setccs should only be created
>> with a type from getSetCCResultType, and any random extended type you
>> give it probably won't work
>>
>> -Matt
>
> What’s the correct way to do this then? Should I only perform the
> merge if the target supports the operation (i.e. never do it to a
> non-legal type)? That feels slightly weird given that it seemed like
> every other canonicalization done in that function supports non-legal
> types, so I’d at least like to know what the reason for that
> inconsistency is.
>
> I was trying to model the optimization after the code that immediately
> follows it, which already assumes that the setCC resolves to a 0 or 1:
>
> // aext(setcc x,y,cc) -> select_cc x, y, 1, 0, cc
> SDValue SCC =
> SimplifySelectCC(SDLoc(N), N0.getOperand(0), N0.getOperand(1),
>
> Fiona
The aext -> select_cc here is OK, but in your patch it will combine zext
(setcc) -> setcc. On a target that uses ZeroOrNegativeOneBooleanContent,
that would produce a wider type with 0 upper bits which is not the same
as the all 1 bits the larger compare result would give. There are also a
lot of pre-existing bugs that mishandle this now that still assume 0/1
setcc results.
I guess I don't understand why this wouldn't already be happening during
legalization if getSetCCResultType gives i32. I would not expect two
different result types for the same setcc to ever work.
-Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150126/f300bd5b/attachment.html>
More information about the llvm-commits
mailing list