<div dir="ltr"><div>I found a bug in X86InstrInfo::foldMemoryOperandImpl that incorrectly folds a zero-extending 64-bit load from the stack into an addpd, which is a SIMD add of two packed double-precision floating-point values.</div><div><br></div><div><div>The machine IR looks like this before peephole optimization:</div><div><br></div><div><span class="" style="white-space:pre">        </span>%vreg21&lt;def&gt; = MOVSDrm &lt;fi#0&gt;, 1, %noreg, 0, %noreg; mem:LD8[%7](align=16)(tbaa=&lt;badref&gt;) VR128:%vreg21</div><div><span class="" style="white-space:pre">  </span>%vreg23&lt;def,tied1&gt; = ADDPDrr %vreg20&lt;tied0&gt;, %vreg21; VR128:%vreg23,%vreg20,%vreg21</div><div><br></div><div>After the optimization, MOVSDrm is folded into ADDPDrm:</div><div><br></div><div><span class="" style="white-space:pre">    </span>%vreg23&lt;def,tied1&gt; = ADDPDrm %vreg20&lt;tied0&gt;, &lt;fi#0&gt;, 1, %noreg, 0, %noreg; mem:LD8[%7](align=16)(tbaa=&lt;badref&gt;) VR128:%vreg23,%vreg20</div></div><div><br></div><div>ADDPDrm loads a full 128-bit value from memory and adds it to %vreg20, whereas the original MOVSDrm loaded only 64 bits and zeroed the upper half of the register, so the folded instruction reads 64 bits that were never part of the load.</div><div><br></div><div>X86InstrInfo::foldMemoryOperandImpl already has logic that prevents this from happening (near line 4544); however, that check is not performed for loads from stack objects. The attached patch factors this logic out into a new function and uses it to check that loads from stack slots are not zero-extending loads.</div><div><br></div><div>This fixes rdar://problem/18236850.</div></div>
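<div><br></div><div>To illustrate the kind of check involved, here is a minimal, self-contained sketch. It is not the code from the attached patch and does not use LLVM's real types: the Opcode enum and the functions loadSizeInBytes, foldedMemSizeInBytes and isSafeToFold are hypothetical names invented for this example. The assumed rule is that a load may only be folded into a user whose memory form reads no more bytes than the load itself did.</div>

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for machine opcodes (not LLVM's enums).
enum class Opcode { MOVSSrm, MOVSDrm, MOVAPDrm, ADDPDrr, ADDSDrr };

// Number of bytes a load opcode actually reads from memory.
// Returns 0 for opcodes that are not loads in this toy model.
unsigned loadSizeInBytes(Opcode Op) {
  switch (Op) {
  case Opcode::MOVSSrm:  return 4;  // scalar float, upper bits zeroed
  case Opcode::MOVSDrm:  return 8;  // scalar double, upper bits zeroed
  case Opcode::MOVAPDrm: return 16; // full 128-bit vector load
  default:               return 0;
  }
}

// Number of bytes the memory form of a register-register user would read
// if one of its operands were replaced by a memory reference.
// Returns 0 for opcodes this toy model cannot fold into.
unsigned foldedMemSizeInBytes(Opcode Op) {
  switch (Op) {
  case Opcode::ADDPDrr: return 16; // ADDPDrm reads 128 bits
  case Opcode::ADDSDrr: return 8;  // ADDSDrm reads 64 bits
  default:              return 0;
  }
}

// Folding is rejected when the folded instruction would read more memory
// than the original load did. The same test applies whether the load comes
// from a stack slot or from any other address.
bool isSafeToFold(Opcode Load, Opcode User) {
  unsigned LoadBytes = loadSizeInBytes(Load);
  unsigned UserBytes = foldedMemSizeInBytes(User);
  if (LoadBytes == 0 || UserBytes == 0)
    return false; // unknown combination: be conservative
  return LoadBytes >= UserBytes;
}

// Small usage example mirroring the case described in this mail.
void runFoldExamples() {
  assert(!isSafeToFold(Opcode::MOVSDrm, Opcode::ADDPDrr)); // the reported bug
  assert(isSafeToFold(Opcode::MOVSDrm, Opcode::ADDSDrr));
  assert(isSafeToFold(Opcode::MOVAPDrm, Opcode::ADDPDrr));
}
```

<div>In this model the MOVSDrm/ADDPDrr pair from the IR above is rejected, while folding the same load into a scalar add is still allowed. Keeping the test in a single function is what lets every folding path, including the stack-slot one, share it.</div>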