[PATCH] D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A

Thu Oct 11 12:13:51 PDT 2018

craig.topper added a comment.

I'm seeing a regression on Goldmont and Silvermont CPUs in an RGB-to-CMYK conversion benchmark in 32-bit mode. What I've observed is that the 3 subtracts in the code now all have the same LHS register. An x86 subtract instruction overwrites its LHS operand, so we have to copy that register before each subtract. We're in 32-bit mode, so our only 8-bit register choices are %al, %bl, %cl, %dl, %ah, %bh, %ch, %dh. Silvermont and Goldmont handle partial register writes to the high and low 8-bit registers badly. Unlike on Sandy Bridge, Haswell, and Skylake, the high and low registers aren't renamed independently. I tried promoting everything to 32 bits to avoid the partial registers, but that was actually worse somehow.
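
For reference, here is a minimal sketch of the fold named in the patch title, checked exhaustively at 8 bits with signed min/max. It is not taken from the patch: the function names are hypothetical, and the choice of 8-bit signed values is an assumption made only to keep the check small. Both sides are compared modulo 2^8, the way i8 arithmetic wraps.

```cpp
#include <algorithm>
#include <cstdint>

// Checks ~A - min(~A, O) == max(A, ~O) - A for every pair of 8-bit
// signed values, comparing the results modulo 2^8.
// (Hypothetical helper, for illustration only.)
bool minFoldHoldsForAllI8() {
  for (int a = -128; a <= 127; ++a) {
    for (int o = -128; o <= 127; ++o) {
      const int8_t A = static_cast<int8_t>(a);
      const int8_t O = static_cast<int8_t>(o);
      const int8_t NotA = static_cast<int8_t>(~A);
      const int8_t NotO = static_cast<int8_t>(~O);
      // Before the fold: both operands are built from ~A.
      const uint8_t Before = static_cast<uint8_t>(NotA - std::min(NotA, O));
      // After the fold: A itself is the subtrahend.
      const uint8_t After = static_cast<uint8_t>(std::max(A, NotO) - A);
      if (Before != After)
        return false;
    }
  }
  return true;
}

// Same check for the mirrored form: ~A - max(~A, O) == min(A, ~O) - A.
bool maxFoldHoldsForAllI8() {
  for (int a = -128; a <= 127; ++a) {
    for (int o = -128; o <= 127; ++o) {
      const int8_t A = static_cast<int8_t>(a);
      const int8_t O = static_cast<int8_t>(o);
      const int8_t NotA = static_cast<int8_t>(~A);
      const int8_t NotO = static_cast<int8_t>(~O);
      const uint8_t Before = static_cast<uint8_t>(NotA - std::max(NotA, O));
      const uint8_t After = static_cast<uint8_t>(std::min(A, NotO) - A);
      if (Before != After)
        return false;
    }
  }
  return true;
}
```

The check only confirms that the two forms compute the same value; it says nothing about which form produces better code on a given target.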

@dmgreen what target are you using?

Repository:
  rL LLVM

https://reviews.llvm.org/D52177