[PATCH] D13740: Catch combine opportunities for redundant imuls

Tue Oct 20 09:59:08 PDT 2015

spatel added a comment.

Thanks, Zia. Sadly, the scalar case won't fire on x86-64 even after r250560 because of the sign extensions (sexts), but that can be a follow-on patch.

Just a couple of nitpicks, otherwise LGTM.

I think Simon was concerned with the non-splat vector case though, so I'll let him give this another look if that scenario needs a different test case.

================
Comment at: test/CodeGen/X86/combine-multiplies.ll:92
@@ +91,3 @@
+; Again, we want to make sure we don't generate two different multiplies.
+; We should have a single multiple for "v1 * {22, 22, 22, 22}" (made up of two
+; pmuludq instructions), followed by two adds. Without this optimization, we'd
----------------
multiple -> multiply

================
Comment at: test/CodeGen/X86/combine-multiplies.ll:119
@@ +118,3 @@
+; Function Attrs: nounwind
+define void @foo_splat(<4 x i32> %v1) "target-cpu"="pentium4" {
+entry:
----------------
Would a "-mattr=sse2" on the RUN line constrain this enough? I prefer to specify necessary attributes rather than CPU models as proxies for those attributes.
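
Something along these lines is what I have in mind. This is only a sketch: the triple is my assumption and is not taken from the patch, and I've written the attribute in the explicit `+sse2` form. With it, the per-function "target-cpu"="pentium4" attribute could presumably be dropped.

```llvm
; Hypothetical test header: request SSE2 explicitly rather than implying it via a CPU model.
; The i686 triple below is an assumption, not taken from the patch.
; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+sse2 | FileCheck %s
```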

http://reviews.llvm.org/D13740