[PATCH] Check Dest Register Liveness using MRI in CondOpt pass
Pete Cooper
peter_cooper at apple.com
Tue Jun 2 13:23:40 PDT 2015
Hi all
We’ve discovered that this change (which was meant to be a compile-time improvement only and so NFC) is changing behavior. I’d like to get an idea of how you would all like to proceed.
The behavioral difference isn’t in CondOpt itself, but in the next pass, AArch64ConditionalCompares. CondOpt ran first, and AArch64ConditionalCompares was also implicitly using the results of LiveIntervals via MachineTraceMetrics.
On bzip2 decompress, this patch has resulted in AArch64ConditionalCompares *not* performing a transformation because the metrics suggest it’s not profitable. As a result, performance improves by almost 6% on that benchmark.
The specific difference is the branch on line 871 of AArch64ConditionalCompares. Prior to this patch, ResDepth and HeadDepth were such that the transformation was considered profitable. After the change, that branch finds it’s not profitable. Looking into where the data for that branch comes from, MachineTraceMetrics is returning different depths due to the presence of kill/dead flags.
The worrying thing here is that MachineTraceMetrics is implicitly relying on LiveIntervals. It doesn’t need it, but silently giving different results before and after LI runs isn’t ideal. I don’t know whether to document that, or to call it a bug.
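To illustrate how a small shift in the reported depths can flip the decision, here is a toy sketch. It is not the code on line 871: the struct, the function name and the exact comparison are assumptions made for illustration, and only the names ResDepth and HeadDepth come from the pass.

```cpp
#include <cassert>

// Hypothetical stand-in for the depth figures a trace analysis reports.
// This is NOT an LLVM type; it only models the two numbers discussed above.
struct ToyDepths {
  unsigned HeadDepth; // depth associated with the head block
  unsigned ResDepth;  // resource-limited depth of the trace
};

// Assumed shape of the profitability test: convert only if the
// resource depth does not exceed the head depth.
bool looksProfitable(const ToyDepths &D) {
  return D.ResDepth <= D.HeadDepth;
}
```

With a check of this shape, depths that are off by a single cycle (for example because kill/dead flags changed how operands were counted) are enough to move a block from "convert" to "don't convert".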
I can think of a few solutions, comments welcome on where to go from here.
1 - Just revert this patch, it wasn’t NFC as it was meant to be.
2 - Make AArch64ConditionalCompares require LiveIntervals. This is an incremental improvement over what we had before as we’ve at least removed one pass being dependent on LI.
3 - Teach MachineTraceMetrics to use MRI as this patch did for CondOpt. This will only work for virtual register dead def checks, so it may not be suitable, given that physical register dead defs and kill flags will still differ depending on LI.
4 - Leave things as they are, since we actually have a performance improvement, and file a PR to try to work out why AArch64ConditionalCompares can actually slow things down, i.e., reevaluate its heuristics in light of this data.
BTW, I’ve CCed Gerolf, Yi, and Michael, who did most of the work discovering the change here. They and I are happy to help work out what’s going on and how to proceed. We can try to provide more data, debug dumps, and IR if needed.
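For readers less familiar with the distinction behind options 2 and 3, the following toy model contrasts a flag-based dead check with a use-list check. The types and function names are made up for illustration and are not LLVM classes. In LLVM the corresponding queries would be along the lines of MachineOperand::isDead() and MachineRegisterInfo::use_nodbg_empty(), though which MRI query the patch actually uses is an assumption here.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct ToyDef {
  unsigned Reg;  // virtual register number defined by the instruction
  bool DeadFlag; // set only if a liveness analysis has annotated the def
};

struct ToyRegInfo {
  // UseCount[Reg] is the number of non-debug uses of Reg in the function.
  std::vector<unsigned> UseCount;
  bool hasNoUses(unsigned Reg) const { return UseCount[Reg] == 0; }
};

// Flag-based check: only meaningful once some analysis has set the flags.
bool isDeadByFlag(const ToyDef &D) { return D.DeadFlag; }

// Use-list check: needs no prior analysis, but in this model it only
// covers virtual registers, which have a single tracked use list.
bool isDeadByUseList(const ToyDef &D, const ToyRegInfo &RI) {
  return RI.hasNoUses(D.Reg);
}
```

In this model, a def with no uses is reported dead by the use-list check even when no analysis has set its flag, whereas the flag-based check gives different answers depending on whether the analysis has run. That is the kind of before/after difference described above.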
Thanks,
Pete
> On Apr 22, 2015, at 11:08 AM, Pete Cooper <peter_cooper at apple.com> wrote:
>
> Thanks Chad and Zhaoshi.
>
> r235532.
>
> Thanks,
> Pete
>> On Apr 22, 2015, at 11:02 AM, Chad Rosier <mcrosier at codeaurora.org> wrote:
>>
>> Zhaoshi gave your patch a LGTM. Go for it. Thanks, Pete.
>>
>> From: Pete Cooper [mailto:peter_cooper at apple.com]
>> Sent: Wednesday, April 22, 2015 1:04 PM
>> To: mcrosier at codeaurora.org
>> Cc: zhaoshiz at codeaurora.org; apazos at codeaurora.org; Tim Northover; Jiangning Liu; sdmitrouk at accesssoftek.com; llvm-commits
>> Subject: Re: [PATCH] Check Dest Register Liveness using MRI in CondOpt pass
>>
>>
>>> On Apr 22, 2015, at 8:04 AM, Chad Rosier <mcrosier at codeaurora.org> wrote:
>>>
>>> Pete,
>>> I’ve pinged Zhaoshi and Ana using our internal emails. Hopefully, they can provide some feedback shortly.
>> Thanks Chad. Sounds good.
>>
>> Pete
>>
>>>
>>> Chad
>>>
>>> From: Pete Cooper [mailto:peter_cooper at apple.com]
>>> Sent: Tuesday, April 21, 2015 6:51 PM
>>> To: mcrosier at codeaurora.org
>>> Cc: zhaoshiz at codeaurora.org; apazos at codeaurora.org; Tim Northover; Jiangning Liu; sdmitrouk at accesssoftek.com; llvm-commits
>>> Subject: [PATCH] Check Dest Register Liveness using MRI in CondOpt pass
>>>
>>> Hi Chad
>>>
>>> http://reviews.llvm.org/D6048 (everyone CCed here was CCed on it) added register liveness checking to the CondOpt pass. It does this with
>>>
>>>> if (I->getOperand(0).isDead())
>>>
>>> However, this requires that some kind of liveness has been run beforehand, so the pass had to require live intervals. Looking at the pass manager dump, we then invalidated live intervals shortly after CondOpt because other passes don’t preserve it.
>>>
>>> Attached is a patch which removes the dependency on LiveIntervals by checking if the def has any uses. If it has no uses then it was going to be marked dead anyway, so this is equivalent to your original code. I’ve verified that this passes ‘make check’ and specifically stepped through the examples in combine-comparisons-by-cse.ll in lldb to ensure the behavior was the same.
>>>
>>> When building a bitcode containing all of llc with/without this change, it results in one less run of LiveIntervals per MF and saves 4s out of 80s total compile time.
>>>
>>> Thanks,
>>> Pete
>