[PATCH] D23253: [X86] Generalized transformation of `definstr gr8; movzx gr32, gr8` to `xor gr32, gr32; definstr gr8`

Quentin Colombet via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 22 15:14:28 PDT 2016


> On Aug 22, 2016, at 2:52 PM, bryant <3.14472+reviews.llvm.org at gmail.com> wrote:
> 
> bryant added a comment.
> 
> In https://reviews.llvm.org/D23253#522534, @qcolombet wrote:
> 
>> Hi,
> 
> 
> Thanks for actually reviewing this.
> 
>> What is the performance impact (both runtime and compile time) of the pass?
> 
> 
> Run-time impact, according to IACA, would be 1% to 40% (`pre-ra-sched.ll`) faster. Never worse. I'll measure the compile-time perf tonight. Do you have any suggestions on the preferred way to do this?

Yes, do the compile-time and runtime measurements on the llvm test-suite (see my next answer).

> 
>> You list results for test/CodeGen/X86, which is not usually what we use for perform testing. Could you try on the LLVM test suite?
> 
> 
> My measurements have thus far centered around the performance of generated code. Since this pass only ever runs when -march=x86/-64, there would be no difference in the rest of the arches, improvements or otherwise. If you really want, I could run the rest of the suite but the result would be a bunch of passes and xfails. What sort of results do you really want me to record here?

I meant the LLVM test-suite:
http://llvm.org/svn/llvm-project/test-suite/trunk

Not the tests from other targets :).

The listed repository has a bunch of benchmark that we use for performance comparison.
http://llvm.org/docs/TestingGuide.html

> 
>> Moreover, like Michael said, we should try to merge this pass with something else. A possible approach would be a peephole like optimization where we can plug more patterns.
> 
> 
> As I've mentioned before, this depends on virtual register LiveInterval information that is lost after the rewrite pass.

I don’t see why this is a problem. We have other means to compute liveness and if we really need to run that pass at this specific point, we could have such a mode in that hypothetical peephole.
By the way, have you consider running the pass after rewrite and use LivePhysReg utility?
I believe that should work and probably make the code simpler.

> 
>> Cheers,
> 
>> -Quentin
> 
> 
> 
> 
> 
> ================
> Comment at: lib/CodeGen/CalcSpillWeights.cpp:46-48
> @@ -45,5 +45,5 @@
> // Return the preferred allocation register for reg, given a COPY instruction.
> -static unsigned copyHint(const MachineInstr *mi, unsigned reg,
> -                         const TargetRegisterInfo &tri,
> -                         const MachineRegisterInfo &mri) {
> +unsigned VirtRegAuxInfo::copyHint(const MachineInstr *mi, unsigned reg,
> +                                  const TargetRegisterInfo &tri,
> +                                  const MachineRegisterInfo &mri) {
>   unsigned sub, hreg, hsub;
> ----------------
> qcolombet wrote:
>> This seems wrong to export this interface.
>> External users should rely on MachineRegisterInfo::getSimpleHint.
>> 
>> What is the problem in using MachineRegisterInfo::getSimpleHint?
> In cases where there is more than one hint, getSimpleHint only returns one and misses the rest.

Each virtual register has only one hint in the current scheme. I actually don’t see how you would get different results with copyHint (this is what is used to populate those hints). Could you elaborate on the cases you saw differences?

Thanks,
-Quentin

> 
> 
> Repository:
>  rL LLVM
> 
> https://reviews.llvm.org/D23253
> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160822/153f32ec/attachment.html>


More information about the llvm-commits mailing list