[llvm-dev] InstCombine: Narrow switch instructions using known bits
Akira Hatanaka via llvm-dev
llvm-dev at lists.llvm.org
Wed Oct 10 12:46:02 PDT 2018
On Fri, Sep 21, 2018 at 7:02 AM Bakhvalov, Denis via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hello,
>
>
>
> I found one case where narrowing switch instructions transformation in
> InstCombine produces worse code.
>
> Let's suppose that I have such code:
>
>
>
> $ cat a.c
>
> void foo();
>
> void bar();
>
> void zoo();
>
>
>
> void my_func(unsigned int a) {
>
> unsigned char b = a & 0xF;
>
> switch (b) {
>
> case 0: foo(); break;
>
> case 1: bar(); break;
>
> case 2: foo(); break;
>
> case 3: foo(); break;
>
> case 4: foo(); break;
>
> case 5: bar(); break;
>
> case 6: foo(); break;
>
> case 7: foo(); break;
>
> case 8: bar(); break;
>
> case 9: foo(); break;
>
> case 10: foo(); break;
>
>
>
> default: zoo();
>
> }
>
> }
>
>
>
> Using recent clang:
>
>
>
> $ clang -O3 -S -c a.c -o a.s
>
> I have the following assembly in the beginning of my_func:
>
> # bad case
>
> movl %edi, %eax
>
> andb $15, %al
>
> cmpb $10, %al
>
> ja .LBB0_9 # jump to the default
> case
>
>
>
> andl $15, %edi
>
> jmpq *.LJTI0_0(,%rdi,8) # go to jump table
>
>
>
> I found that if I disable switch shrinking like shown below:
>
>
>
> $ git diff
>
> diff --git a/lib/Transforms/InstCombine/InstructionCombining.cpp
> b/lib/Transforms/InstCombine/InstructionCombining.cpp
>
> index 5d5a9b2..3682b88 100644
>
> --- a/lib/Transforms/InstCombine/InstructionCombining.cpp
>
> +++ b/lib/Transforms/InstCombine/InstructionCombining.cpp
>
> @@ -2429,6 +2429,8 @@ Instruction
> *InstCombiner::visitSwitchInst(SwitchInst &SI) {
>
> return &SI;
>
> }
>
>
>
> + return nullptr;
>
> +
>
> KnownBits Known = computeKnownBits(Cond, 0, &SI);
>
> unsigned LeadingKnownZeros = Known.countMinLeadingZeros();
>
> unsigned LeadingKnownOnes = Known.countMinLeadingOnes();
>
>
>
> I get better assembly (there is no additional MOV and AND instructions):
>
> # good case
>
> andl $15, %edi
>
> cmpl $10, %edi
>
> ja .LBB0_9
>
>
>
> jmpq *.LJTI0_0(,%rdi,8)
>
>
>
> This transformation was introduced in the commit:
>
> commit 4eb03123dfda2de88a84852834845678833c8c36
>
> Author: Akira Hatanaka <ahatanaka at apple.com>
>
> Date: Thu Oct 16 06:00:46 2014 +0000
>
> Reapply r219832 - InstCombine: Narrow switch instructions using known
> bits.
>
>
>
> From IR point of view, after the ‘opt –codegenprepare’ the difference is
> that in good case we have simple ‘and’ operation (all calculations are made
> in i32). In the bad case there is 'trunc' to i4 and then ‘zext’ to i8.
>
>
>
> During instruction selection we expand switch into a jump table. In the
> bad case we use 2 copies of the value that we are switching on. First is in
> i8 that we use to determine whether we should jump to default case. The
> second is in i64, which we use for calculating address in the jump table.
> In the good case they were combined.
>
>
>
> But there is still one thing that I don’t understand. What is the bad case
> that this transformation (narrowing switch instructions) was supposed to
> fix, i.e. does this transformation still make sense?
>
>
>
I couldn't find the original test case, but the patch was committed to fix
a switch over a 128-bit value that was causing llvm to generate suboptimal
code in some cases. I'm not sure whether this optimization is still
necessary today.
> Best regards,
> Denis Bakhvalov.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20181010/2e6bb86f/attachment.html>
More information about the llvm-dev
mailing list