[llvm-commits] [llvm] r91672 - in /llvm/trunk: lib/Target/X86/X86.td lib/Target/X86/X86InstrInfo.cpp lib/Target/X86/X86InstrInfo.td lib/Target/X86/X86InstrSSE.td lib/Target/X86/X86Subtarget.cpp lib/Target/X86/X86Subtarget.h test/CodeGen/X86/break-sse-dep.ll

Chris Lattner clattner at apple.com
Mon Dec 21 11:13:24 PST 2009


On Dec 21, 2009, at 11:05 AM, Evan Cheng wrote:

>> Unless there is a reason to have this, I'd prefer to not have it clutter up the td files.  I can't imagine a reasonable (non-scalarizing) implementation of SSE that wouldn't have this issue.
> 
> Really? It was a big surprise to Dan and me (and the engineer who noticed this) that unfolding the load actually breaks the register dependency. It's not documented anywhere in the public Intel manual.

It was also surprising to me, but it makes perfect sense in retrospect.  If a scalar load zeros the top of the register, it "obviously" has no dependence on the top bits coming in.
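
To make that reasoning concrete, here is a small portable C model of the register semantics. It is only a sketch under assumptions: the thread does not name a specific instruction, so cvtss2sd is used here as a representative scalar SSE op, and xmm_t, movss_load, convert_folded and convert_unfolded are hypothetical names invented for illustration (they are not LLVM or intrinsic APIs). The model tracks lane contents only, not pipeline timing.

```c
#include <stdint.h>
#include <string.h>

/* Toy model of a 128-bit XMM register as four 32-bit lanes (lane[0] is lowest). */
typedef struct { uint32_t lane[4]; } xmm_t;

/* Model of "movss xmm, m32": loads the low 32 bits and zeros the upper 96.
   The result does not depend on the previous register contents at all. */
static xmm_t movss_load(const float *mem) {
    xmm_t r = { {0, 0, 0, 0} };
    memcpy(&r.lane[0], mem, sizeof(float));
    return r;
}

/* Model of "cvtss2sd dst, src": writes the low 64 bits of dst and
   preserves the upper 64 bits, so the result depends on the old dst. */
static xmm_t cvtss2sd(xmm_t dst, float src) {
    double d = (double)src;
    memcpy(&dst.lane[0], &d, sizeof d);
    return dst;
}

/* Helpers to read the low lane(s) back out of the model register. */
static float low_float(xmm_t r) {
    float f;
    memcpy(&f, &r.lane[0], sizeof f);
    return f;
}

static double low_double(xmm_t r) {
    double d;
    memcpy(&d, &r.lane[0], sizeof d);
    return d;
}

/* Folded form (op with a memory operand): the upper lanes of whatever
   was previously in the destination flow through to the result. */
static xmm_t convert_folded(xmm_t old_dst, const float *mem) {
    return cvtss2sd(old_dst, *mem);
}

/* Unfolded form (load, then reg-reg op): the only register input is the
   freshly loaded value, so no stale register contents are involved. */
static xmm_t convert_unfolded(const float *mem) {
    xmm_t tmp = movss_load(mem);
    return cvtss2sd(tmp, low_float(tmp));
}
```

Note that convert_unfolded takes no old register value as a parameter at all, which is the point: in this model the load is what severs the link to the previous contents of the register.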

>> For example, you didn't add this flag to any of the AMD chips.
> 
> That's intentional. I don't have an AMD machine to try it on and I did not want to introduce a regression.

I'm almost certain they have the same issue.

> 
> I don't really care that much whether it's a subtarget feature. If no one pipes up soon, I'll remove it.


Thanks!

-Chris


