[PATCH] D55604: [AggressiveInstCombine] convert rotate with guard branch into funnel shift (PR34924)

Wed Dec 12 12:15:42 PST 2018

nikic added inline comments.

================
Comment at: lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:95
+    return Intrinsic::not_intrinsic;
+  };
+
----------------
spatel wrote:
> nikic wrote:
> > Rather than matching the rotate pattern here, would it be possible to leave the conversion to fsh to InstCombine (as you say, this is planned anyway), and only do the removal of the zero check here? Or is general canonicalization in InstCombine still blocked on some optimization issues?
> There's no clear limit on the power of aggressive-instcombine, but I'm assuming that we should follow the same rules as regular instcombine. Ie, you should be able to transfer anything in here to instcombine if you don't care about compile-time.
> 
> So we defer CFG changes to other passes (simplifycfg in particular) and just focus on logical transforms. This one blurs the lines of course, but I don't think we should be trying to alter the branch rather than the phi here.
> 
> The only remaining optimization issue that I'm aware of for funnel shift canonicalization is addressed by D55563.
I'm not suggesting to alter the branches here, but rather to split this transform into a two step process. First InstCombine converts the `(x << y) | (x >> (bw - y))` into `fshl(x, x, y)` (which is legal independently of the null check for rotates, but not general funnel shifts) and then here we only handle the `y == 0 ? x : fshl(x, x, y)`  pattern, rather than the shift-based variant.

However, after reading https://bugs.llvm.org/show_bug.cgi?id=34924, maybe this would not even be necessary? In one of the comments at the end you mention that if the rotate is represented as fsh, then simplifycfg already converts this pattern into a select. Together with the fsh+select simplification you already added in instsimplify, wouldn't that mean that a combination of instcombine+simplifycfg+instsimplify will already take care of the whole pattern (if we canonicalize the shl/ashr/or sequence to fsh in instcombine)?

So the process would go like this:

```
define i32 @rotl(i32 %a, i32 %b) {
entry:
  %cmp = icmp eq i32 %b, 0
  br i1 %cmp, label %end, label %rotbb

rotbb:
  %sub = sub i32 32, %b
  %shr = lshr i32 %a, %sub
  %shl = shl i32 %a, %b
  %or = or i32 %shr, %shl
  br label %end

end:
  %cond = phi i32 [ %or, %rotbb ], [ %a, %entry ]
  ret i32 %cond
}
// -instcombine
define i32 @rotl(i32 %a, i32 %b) {
entry:
  %cmp = icmp eq i32 %b, 0
  br i1 %cmp, label %end, label %rotbb

rotbb:
  %or = call i32 @llvm.fshl(i32 %a, i32 %a, i32 %b)
  br label %end

end:
  %cond = phi i32 [ %or, %rotbb ], [ %a, %entry ]
  ret i32 %cond
}
// -simplifycfg
define i32 @rotl(i32 %a, i32 %b) {
entry:
  %cmp = icmp eq i32 %b, 0
  %or = call i32 @llvm.fshl(i32 %a, i32 %a, i32 %b)
  %cond = select i1 %cmp, i32 %or, %a
  ret i32 %cond
}
// -instsimplify
define i32 @rotl(i32 %a, i32 %b) {
entry:
  %or = call i32 @llvm.fshl(i32 %a, i32 %a, i32 %b)
  ret i32 %or
}
```

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55604/new/

https://reviews.llvm.org/D55604