[llvm-commits] [llvm] r161152 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/PeepholeOptimizer.cpp lib/Target/X86/X86InstrInfo.cpp lib/Target/X86/X86InstrInfo.h test/CodeGen/X86/2012-05-19-avx2-store.ll test/CodeGen/X86/break-sse-dep.ll test/CodeGen/X86/fold-load.ll test/CodeGen/X86/fold-pcmpeqd-1.ll test/CodeGen/X86/sse-minmax.ll test/CodeGen/X86/vec_compare.ll
Jakob Stoklund Olesen
stoklund at 2pi.dk
Thu Aug 2 14:22:59 PDT 2012
On Aug 2, 2012, at 12:49 PM, Michael Liao <michael.liao at intel.com> wrote:
>> And then teach X86InstrInfo::breakPartialRegDependency() to unfold the load instead of inserting an xorps dependency breaking instruction:
>>
>> xorps %xmm1, %xmm1
>> sqrtsd (…), %xmm1
>
> In fact, this is what I wanted to suggest for breaking the partial
> register stall. The xorps idiom is better than movsd + sqrtsd: it saves
> 1 byte of instruction encoding and is handled much more efficiently by
> OOO processors.
>
> If no one is working on that, I would start developing a machine pass
> to break this kind of partial register stall.
No need, ExecutionDepsFix.cpp already does that. See X86InstrInfo::getPartialRegUpdateClearance() and breakPartialRegDependency().
AFAICT, the only thing missing is that hasPartialRegUpdate() doesn't yet know about the 'rm' versions of the instructions.
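For anyone unfamiliar with the clearance idea, here is a rough, self-contained sketch of it. This is not the ExecutionDepsFix.cpp code: ToyInstr, breakPartialDeps, the register numbering, and the (%rdi) operand in the usage note are all made up for illustration. It only assumes that a partial-update instruction should get an xorps in front of it when its destination register was written fewer than some preferred number of instructions earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction: one destination register plus a flag saying
// whether the instruction writes only part of it (e.g. a scalar SSE op).
struct ToyInstr {
    std::string text;
    int dstReg;
    bool partialUpdate;
};

// Illustrative only. Walks a block and inserts "xorps reg, reg" before a
// partial-update instruction when the last write to its destination is
// closer than `clearance` instructions in the emitted stream.
std::vector<ToyInstr> breakPartialDeps(const std::vector<ToyInstr>& block,
                                       int numRegs, int clearance) {
    assert(numRegs > 0 && clearance >= 0);
    // Assumption: registers not yet written in this block were defined
    // "far away", so they never trigger a dependency break.
    const int farAway = -1000000;
    std::vector<int> lastDef(static_cast<std::size_t>(numRegs), farAway);

    std::vector<ToyInstr> out;
    for (const ToyInstr& mi : block) {
        assert(mi.dstReg >= 0 && mi.dstReg < numRegs);
        const int pos = static_cast<int>(out.size());
        const int distance = pos - lastDef[mi.dstReg];
        if (mi.partialUpdate && distance < clearance) {
            // Too close to the previous def: break the false dependency.
            const std::string r = "%xmm" + std::to_string(mi.dstReg);
            out.push_back({"xorps " + r + ", " + r, mi.dstReg, false});
        }
        // Record where this instruction lands in the emitted stream.
        lastDef[mi.dstReg] = static_cast<int>(out.size());
        out.push_back(mi);
    }
    return out;
}
```

With a clearance of 16, an addsd into %xmm1 immediately followed by "sqrtsd (%rdi), %xmm1" gets an xorps placed between the two. A sqrtsd into a register that has not been written in the block is left alone.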
We also need to do something about AVX instructions where the superfluous dependency is explicit in an extra operand. I think they just all get xmm0 currently.
/jakob