[PATCH] D18890: [AArch64] add SSA Load Store optimization pass

Jongwon Lee via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 14 03:58:19 PDT 2016


JongwonLee marked 16 inline comments as done.
JongwonLee added a comment.

In http://reviews.llvm.org/D18890#398755, @junbuml wrote:

> Hi Jongwon
>  Thanks you for the update with the additional alias check. However, as you merge up the second load/store to the first load/stores, I believe you need to check if there is any instruction in between the first and second load/store, which may alias with the second load/store, not just for the first store and the second load.
>
> I'm also curious how this pass was motivated. Did you see any performance gain with this change?
>
> Please, see my inline comments for minor issues.


Hi junbum,
Thanks for the comments.

I think the alias check is needed for (first store, second load) and (second store, first load). I think we don't need to check any other instructions except 4 instructions (2 loads, 2 stores) because the first store is the only use instruction of the first load and the second store is also the only use instruction of the second load.

  (1) reg1 = load [mem1]
  (2) reg2 = load [mem1 +4]
  (3) store reg1, [mem2]
  (4) store reg2, [mem2 + 4]
  
  (3) is the only use instruction of reg1
  (4) is the only use instruction of reg2

This work is motivated from some test cases. Even if the work doesn't get the performance gain for all cases, it has positive effect for some cases. I'll show the data when ready.


================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:182-184
@@ +181,5 @@
+    if (Offset % 2)
+      return UnscaledOffset < 256 && UnscaledOffset >= -256;
+    else
+      return UnscaledOffset <= 16380 && UnscaledOffset >= 0;
+  }
----------------
junbuml wrote:
> Is there any programmatic way to express the meaning of constants you use here instead of directly using them here? 
Fixed the code like the below.

#define MAX_UNSCALED_OFFSET 255
#define MIN_UNSCALED_OFFSET 256
...
 return UnscaledOffset <= MAX_UNSCALED_OFFSET && UnscaledOffset >= MIN_UNSCALED_OFFSET;
...

Is that O.K?

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:330
@@ +329,3 @@
+  // operand.
+  if (!getLdStBaseOp(FirstMI).isReg() || !getLdStOffsetOp(FirstMI).isImm())
+    return false;
----------------
junbuml wrote:
> Should it be assert () because you specifically handle AArch64::STURWi, AArch64::STRWui ? 
Some test cases shows that this condition is not satisfied when opcode is AArch64::STURWi or AArch64::STRWui.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:349
@@ +348,3 @@
+  // operand.
+  if (!getLdStBaseOp(&DefInst).isReg() || !getLdStOffsetOp(&DefInst).isImm())
+    return false;
----------------
junbuml wrote:
> Same here. I think it should be assert () because you specifically handle AArch64::LDRWui, AArch64::LDURWi ? 
Some test cases shows that this condition is not satisfied when opcode is AArch64::LDRWui or AArch64::LDURWui.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:369
@@ +368,3 @@
+    // operand.
+    if (!getLdStBaseOp(MI).isReg() || !getLdStOffsetOp(MI).isImm())
+      return false;
----------------
junbuml wrote:
> Maybe assert() here as you specifically detect STURWi and STRWui.
Some test cases shows that this condition is not satisfied when opcode is AArch64::STURWi or AArch64::STRWui.

================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:413-420
@@ +412,10 @@
+
+      // If consecutive loads/stores are found, the followings are satisfied.
+      //  dst reg of first load = src reg of first store
+      //  dst reg of second load = src reg of second store
+      if (getLdStRegOp(consecutive_loads[0]).getReg() !=
+              getLdStRegOp(consecutive_stores[0]).getReg() ||
+          getLdStRegOp(consecutive_loads[1]).getReg() !=
+              getLdStRegOp(consecutive_stores[1]).getReg())
+        return false;
+
----------------
junbuml wrote:
> Do we need to check this again? Or maybe assert() ?
We need to check this. For example,


```
reg2 = load [mem1]
reg1 = load [mem1 +4]
store reg1, [mem2]
store reg2, [mem2 +4]
```

If we don't check, we will get the following code.

```
reg2 = wide-load [mem1]
wide-store reg1, [mem2]
```

Resultant code gets to use reg1 without definition.



================
Comment at: lib/Target/AArch64/AArch64SSALoadStoreOptimizer.cpp:424
@@ +423,3 @@
+      // instruction.
+      if (instrAliased(consecutive_stores[0], consecutive_loads[1])) {
+        DEBUG(dbgs() << "The first store instruction:\n    ");
----------------
junbuml wrote:
> Don't you also need to check  if there is any instruction in between the first and second load, which is aliased with the second loads?  The same check should be performed for stores.
I think the alias check is needed for (first store, second load) and (second store, first load). I think we don't need to check any other instructions except 4 instructions (2 loads, 2 stores) because the first store is the only use instruction of the first load and the second store is also the only use instruction of the second load.


```
(1) reg1 = load [mem1]
(2) reg2 = load [mem1 +4]
(3) store reg1, [mem2]
(4) store reg2, [mem2 + 4]

(3) is the only use instruction of reg1
(4) is the only use instruction of reg2
```




http://reviews.llvm.org/D18890





More information about the llvm-commits mailing list