Just a small addition that I will investigate tomorrow: getting an i32 from memory into an xmm element becomes much simpler if my transform emits a pinsrd for that case, so I will change it to use pinsrd for the i32-from-memory case.

  Filipe

http://reviews.llvm.org/D3475
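
For reference, a minimal C sketch of the operation under discussion, not the code from the patch: the SSE4.1 intrinsic _mm_insert_epi32 replaces one 32-bit lane of an xmm value, and with the scalar coming from memory a compiler will typically (though it is not required to) select pinsrd with a memory operand. The names insert_lane2_from_mem and demo_lane are hypothetical, and the target attribute assumes GCC or Clang on x86.

```c
#include <smmintrin.h>  /* SSE4.1 intrinsics, including _mm_insert_epi32 */
#include <stdint.h>

/* Hypothetical helper: load an i32 from memory and place it in lane 2 of v.
 * The lane index must be a compile-time constant. The target attribute
 * (GCC/Clang assumption) enables SSE4.1 for this function only. */
__attribute__((target("sse4.1")))
static __m128i insert_lane2_from_mem(__m128i v, const int32_t *p)
{
    return _mm_insert_epi32(v, *p, 2);
}

/* Hypothetical demo: build a vector with lanes 0..3 = 1, 2, 3, 4, overwrite
 * lane 2 with a value read from memory, and return the requested lane. */
int32_t demo_lane(int lane)
{
    const int32_t from_mem = 42;
    int32_t out[4];

    /* _mm_set_epi32 takes lanes from highest to lowest. */
    __m128i v = _mm_set_epi32(4, 3, 2, 1);

    v = insert_lane2_from_mem(v, &from_mem);
    _mm_storeu_si128((__m128i *)out, v);
    return out[lane];
}
```

The other three lanes are left untouched, which is why a single insert is enough here and no separate scalar load plus shuffle is needed.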