sebpop added a comment. The result takes one extra packet, which is a performance regression on Hexagon. I think this is because the copies are sunk post-RA, and there doesn't seem to be a propagation pass to remove the extra transfer r2=r0. https://reviews.llvm.org/D41463