[llvm-commits] [llvm] r161152 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/PeepholeOptimizer.cpp lib/Target/X86/X86InstrInfo.cpp lib/Target/X86/X86InstrInfo.h test/CodeGen/X86/2012-05-19-avx2-store.ll test/CodeGen/X86/break-sse-dep.ll test/CodeGen/X86/fold-load.ll test/CodeGen/X86/fold-pcmpeqd-1.ll test/CodeGen/X86/sse-minmax.ll test/CodeGen/X86/vec_compare.ll

Jakob Stoklund Olesen stoklund at 2pi.dk
Thu Aug 2 14:22:59 PDT 2012


On Aug 2, 2012, at 12:49 PM, Michael Liao <michael.liao at intel.com> wrote:

>> And then teach X86InstrInfo::breakPartialRegDependency() to unfold the load instead of inserting an xorps dependency breaking instruction:
>> 
>>  xorps %xmm1, %xmm1
>>  sqrtsd (…), %xmm1
> 
> In fact, this is what I wanted to suggest for breaking the partial register
> stall. The xorps idiom is better than movsd + sqrtsd: it saves 1 byte of
> instruction encoding and is handled much more efficiently by OOO processors.
> 
> If no one is working on that, I will start developing a machine pass to
> break this kind of partial register stall.

No need, ExecutionDepsFix.cpp already does that. See X86InstrInfo::getPartialRegUpdateClearance() and breakPartialRegDependency().

AFAICT, the only thing missing is that hasPartialRegUpdate() doesn't yet know about the 'rm' versions of the instructions.
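
To make the gap concrete, here is a rough standalone sketch of the idea, not the actual X86InstrInfo code. The opcode names and the clearance threshold of 16 are made up for illustration, and the real hooks take a MachineInstr rather than a bare opcode. The point is only that the 'rm' forms need to be reported as partial updates too, so that the clearance check can decide whether a dependency-breaking xorps is worthwhile:

```cpp
// Hypothetical opcode names, for illustration only. The real opcodes come
// from the generated X86 instruction tables.
enum class Opcode { SqrtSD_rr, SqrtSD_rm, CvtSI2SD_rr, CvtSI2SD_rm, AddSD_rr };

// True for instructions that write only the low part of an XMM register and
// leave the rest untouched, which creates a false dependency on the old value.
// The '_rm' (load-folded) forms are the ones the message says are not yet
// recognized.
constexpr bool hasPartialRegUpdate(Opcode Opc) {
  return Opc == Opcode::SqrtSD_rr || Opc == Opcode::SqrtSD_rm ||
         Opc == Opcode::CvtSI2SD_rr || Opc == Opcode::CvtSI2SD_rm;
}

// Assumed threshold: how many instructions should separate the partial write
// from the previous write to the same register before the false dependency
// is considered harmless.
constexpr unsigned kRequiredClearance = 16;

// Clearance is the number of instructions since the destination register was
// last written. If it is too small, a dependency-breaking instruction such as
// xorps would be inserted in front of the partial update.
constexpr bool shouldBreakDependency(Opcode Opc, unsigned Clearance) {
  return hasPartialRegUpdate(Opc) && Clearance < kRequiredClearance;
}
```

With this sketch, shouldBreakDependency(Opcode::SqrtSD_rm, 3) is true, while an instruction that genuinely reads its destination (AddSD_rr here) is never treated as a partial update.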

We also need to do something about AVX instructions where the superfluous dependency is explicit in an extra operand. I think they just all get xmm0 currently.

/jakob




