[PATCH][X86] Prevent the creation of redundant ops for sadd and ssub with overflow.

Juergen Ributzka juergen at apple.com
Thu Jan 23 22:22:12 PST 2014


Hi Nadav,

I opened and closed that can of worms very quickly today - thanks for git stash. I first tried to solve it the other way around and remove all INCs and DECs from the backend and only generate ADDs.
I think we are currently only generating INCs for EmitTest, LowerXALUO and now also in LowerBRCOND, so it won’t be to difficult to remove them. In the end I didn’t do it, because I didn’t know
of an easy way during ISEL that would tell me if it is safe to convert an ADD + 1 to an INC. I fully agree with you and I would also prefer a pass that cleans up afterwards if we want to reclaim the extra
bytes we wasted, but I opted for the quick fix instead (for now) ;-)

Should I open a bug on bugzilla for this?

Thanks

Cheers,
Juergen


On Jan 23, 2014, at 9:47 PM, Nadav Rotem <nrotem at apple.com> wrote:

> Hi Juergen, 
> 
> Thanks for working on this. The motivation for this is clear and I think that it would be great if we could generate better code for overflow checks. The patch looks great, but I was wondering why you decided to convert the ADD to INC and not the other way around. Are there lots of other places that generate INC instructions in the x86 backend? 
> 
> Maybe I am opening a can of worms here, but I wanted to mention that I think that we should be generating more ADDs and less INCs. INCs are smaller to encode, but create a dependency on the flags register because (the CF is not set). I think that if we really cared about the performance-size tradeoffs of ADD vs INC then we should solve this in a later pass that can take scheduling into account.  In any case, your patch looks great, and if we want to address the INC problem one day we can also search-and-replace this code. 
> 
> Thanks,
> Nadav
> 
> 
> On Jan 23, 2014, at 9:13 PM, Juergen Ributzka <juergen at apple.com> wrote:
> 
>> Hi @ll,
>> Hi Nadav,
>> 
>> the use of the sadd.with.overflow (ssub.with.overflow) intrinsic on X86 creates suboptimal code. In the X86 backend the intrinsic itself might be lowered to a different X86 specific instruction than the conditional branch that depends on the overflow result. For example a simple loop counter with overflow check would get expanded to two different instructions. The intrinsic lowering would create an INC instruction, because the loop counter is incremented by 1. The conditional branch lowering on the other side wouldn’t optimize for this special case and just emit and ADD instruction. This prevented the SelectionDAG from recognizing and removing one of the redundant additions. This patch simply modifies the conditional branch lowering to generate the same code as the intrinsic lowering for sadd/ssub would do.
>> 
>> Cheers,
>> Juergen
>> 
>> <0001-X86-Prevent-the-creation-of-redundant-ops-for-sadd-a.patch>
> 





More information about the llvm-commits mailing list