[PATCH] Fix PR16938 - rewrite CSE heuristic for same block uses

Wed Apr 30 11:12:21 PDT 2014

On Tue, Apr 29, 2014 at 5:32 PM, Andrew Trick <atrick at apple.com> wrote:
>
> On Apr 29, 2014, at 12:38 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>> Ping :)
>>
>> -eric
>>
>> On Tue, Apr 22, 2014 at 4:21 PM, Eric Christopher <echristo at gmail.com> wrote:
>>> Hi Evan,
>>>
>>> In taking a look at 16938 it seems that what you want is to check the
>>> uses of the original register in all of the basic blocks and then make
>>> sure that the new CSE'd expression is in one of them rather than the
>>> inverse?
>>>
>>> This will fix the PR mentioned above and no regressions anywhere. I
>>> ran the performance tests and didn't see anything outside of the noise
>>> as well.
>>>
>>> Thoughts?
>
> I can guess what the old heuristic was doing, but we don’t have a test case for it so I’m only speculating.

Right. It seems like (after our discussion yesterday) that the old
heursitic was looking for occasions when we already had a split range
and refusing to CSE if we'd unify one of those ranges.

>
> Your new heuristic is totally different. For your tests it would be just as good to simply delete the original heuristic. So we’re left with the question, why does your new heuristic do anything good (it doesn’t match the comments and it’s nonobvious)? Do we have any test case or performance results that show goodness?
>

I'd only tried to rewrite the heuristic to make a profitability
assessment on "should we do this", but I agree that it doesn't
necessarily make any sense there.

> In this situation, I would delete the old heuristic, see what regresses, then try to fix it with a new one.
>

Conveniently for me, that was my original patch that I'd already done
all of this with as well. The first and second heuristics still have
some use - as in there are testcases that begin to legitimately fail
in the backend due to additional register spills so it looks like some
of the heuristics are still required. I'll try to see if there's a
case where we will cse and then fail to split a live range in a way
that matters here, but I'm not confident - there's a lot of dependence
on further code and coming up with a testcase isn't always easy.

Thanks for the look and helping me understand the comment. I'll
probably update the comment to be a little more clear to me in the
meantime.

-eric