[PATCH] D108977: [AArch64] Support target specific isSuitableForJumpTable

JinGu Kang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 2 04:37:58 PDT 2021


jaykang10 added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10753
+  // scores.
+  if (NumCases <= 16)
+    return false;
----------------
dmgreen wrote:
> Where did the number 16 come from? Just from benchmarks? Unless there are jump tables in hot areas of the code this is more likely to be measuring the knock-on effects of changing code alignment than anything actually to do with the jump tables. It just turns into noise that doesn't stand up over time as other changes get made by the compiler.
> 
> It would likely be best to at least have it as an option. Some of the tests below could then keep small testing jumptables.
Thanks for the comment, @dmgreen.

As you know, the AArch64 target generates a jump table as below.
```
// %bb.0:                               // %entry
	sub	w8, w0, #1
	cmp	w8, #3
	b.hi	.LBB0_6
// %bb.1:                               // %entry
	adrp	x9, .LJTI0_0
	add	x9, x9, :lo12:.LJTI0_0
	adr	x10, .LBB0_2
	ldrb	w11, [x9, x8]
	add	x10, x10, x11, lsl #2
	br	x10
...
.LJTI0_0:
	.byte	(.LBB0_2-.LBB0_2)>>2
	.byte	(.LBB0_3-.LBB0_2)>>2
	.byte	(.LBB0_4-.LBB0_2)>>2
	.byte	(.LBB0_5-.LBB0_2)>>2
```
Let's say we consider only the number of instructions rather than cycle and latency information. The jump table sequence above is usually 9 instructions. If we do not generate a jump table, there is one `cmp` and one `branch` instruction per case. This means that for a switch instruction with 4 cases, the jump table could be slower than the combination of `cmp` and `branch` instructions. In other cases too, if one of the `cmp` and `branch` pairs near the beginning is taken, the jump table could be slower. If we also consider each architecture's cycle information, deciding the number becomes more complex...
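To make the counting concrete, here is a rough sketch of that instruction-count model. It assumes a linear chain of compares (one `cmp` plus one conditional branch per tested case) and ignores cycles, latency and branch prediction. The names `kJumpTableInsts`, `chainInstsExecuted` and `jumpTableExecutesFewer` are made up for illustration and are not part of the patch.

```cpp
#include <cstdint>

// Instructions executed on the jump table path in the listing above:
// sub, cmp, b.hi, adrp, add, adr, ldrb, add, br.
constexpr std::uint64_t kJumpTableInsts = 9;

// Assumption: each tested case costs one cmp and one conditional branch.
constexpr std::uint64_t kInstsPerCompare = 2;

// Instructions executed by a linear compare-and-branch chain when the
// matching case is the (hitIndex + 1)-th one tested (hitIndex is 0-based).
constexpr std::uint64_t chainInstsExecuted(std::uint64_t hitIndex) {
  return kInstsPerCompare * (hitIndex + 1);
}

// True when the jump table path executes fewer instructions than the chain
// for a hit at position hitIndex. Pure instruction counting, nothing else.
constexpr bool jumpTableExecutesFewer(std::uint64_t hitIndex) {
  return kJumpTableInsts < chainInstsExecuted(hitIndex);
}

// With 4 cases, even the last case tested costs 8 instructions in the chain,
// so the 9-instruction jump table path never executes fewer.
static_assert(chainInstsExecuted(3) == 8, "4th case: 4 * (cmp + branch)");
static_assert(!jumpTableExecutesFewer(3), "jump table is not ahead at 4 cases");
```

Under this model the jump table only comes out ahead once the matching case is the fifth or later one tested, which is why a hit on an early case favours the compare chain regardless of the total number of cases.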

At the moment, the AArch64 target uses the default `setMinimumJumpTableEntries(4)`, but that does not take into account each cluster, which might already be optimized for adjacent cases with the same destination block. It could be good to implement an AArch64-specific `isSuitableForJumpTable`.

I agree with you. It would be better to have an option to tune it.
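As a rough sketch of what a tunable threshold could look like, the check from the hunk above (`if (NumCases <= 16) return false;`) can take its limit as a parameter. This is a standalone model and not the patch itself: `kDefaultCaseThreshold` and `passesCaseCountCheck` are hypothetical names, and the way the value would be exposed as a compiler option is not shown.

```cpp
#include <cstdint>

// Hypothetical default; 16 mirrors the constant in the hunk under review.
constexpr std::uint64_t kDefaultCaseThreshold = 16;

// Models only the case-count check: a switch passes when it has more cases
// than the threshold. Any other suitability criteria are out of scope here.
constexpr bool passesCaseCountCheck(
    std::uint64_t NumCases,
    std::uint64_t Threshold = kDefaultCaseThreshold) {
  return NumCases > Threshold;
}

// Default behaviour matches the hard-coded check.
static_assert(!passesCaseCountCheck(16), "16 cases are rejected by default");
static_assert(passesCaseCountCheck(17), "17 cases pass by default");
```

With a knob like this, tests that check the structure of small jump tables could lower the threshold, for example `passesCaseCountCheck(4, 3)`, instead of having to grow their switches.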

Let me think about this patch more carefully.


================
Comment at: llvm/test/CodeGen/AArch64/jump-table.ll:6
 
 define i32 @test_jumptable(i32 %in) {
 
----------------
dmgreen wrote:
> Any of these test checking for the structure of jump tables are probably best left testing jump tables, perhaps with an option to overwrite the default.
Yep, I agree with you.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108977/new/

https://reviews.llvm.org/D108977


