<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/80616">80616</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[LICM] Missed optimization due to missing alignment information
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
lucic71
</td>
</tr>
</table>
<pre>
LICM cannot hoist a load instruction if the align attribute is missing from the pointer argument; see `%0 = load ptr, ptr %this, align 8` below.
```
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define i1 @_ZN1A3fooEii(ptr dereferenceable(8) %this, i1 %exitcond.not) { ;<-- align attribute is missing
entry:
br label %while.cond
while.cond: ; preds = %while.body, %entry
%indvars.iv = phi i64 [ %indvars.iv.next4, %while.body ], [ 0, %entry ]
br i1 %exitcond.not, label %cleanup, label %while.body
while.body: ; preds = %while.cond
%indvars.iv.next4 = add i64 %indvars.iv, 1
%0 = load ptr, ptr %this, align 8 ; <-- instruction that is not hoisted
%1 = load i32, ptr %0, align 4
%cmp2 = icmp eq i32 %1, 0
br i1 %cmp2, label %cleanup, label %while.cond
cleanup: ; preds = %while.body, %while.cond
%cmp = icmp slt i64 %indvars.iv, 0
ret i1 %cmp
}
```
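For readers less familiar with LICM, here is a minimal C++ analogue of the transformation in question. The types `Node` and `A` and both function names are hypothetical and are not taken from the original source. It illustrates one point: once hoisted, the load runs even when the loop body never executes, so the optimizer has to prove that the pointer is both dereferenceable and sufficiently aligned before moving it.

```cpp
// Hypothetical types; `A::node` plays the role of the pointer loaded from %this.
struct Node { int value; };
struct A { Node *node; };

// Before hoisting: self->node is reloaded on every iteration.
int sumReload(const A *self, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        const Node *node = self->node;  // loop-invariant load
        sum += node->value;
    }
    return sum;
}

// After hoisting: the load sits in the preheader and executes
// unconditionally, even when n == 0 and the body never runs.
int sumHoisted(const A *self, int n) {
    const Node *node = self->node;  // speculated load
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += node->value;
    }
    return sum;
}
```

Both functions return the same value for any valid `self`; the difference is only in when the load of `self->node` is performed.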
If we add the align attribute to the pointer, as in `define i1 @_ZN1A3fooEii(ptr align 8 dereferenceable(8) %this, i1 %exitcond.not) {`, the load instruction is successfully hoisted to entry. See https://godbolt.org/z/PKzxMWonG (IR with and without align on the left, results of -O2 on the right).
The load is not hoisted because the following condition in isAligned fails: BA, the known alignment of the pointer (1), is less than Alignment, the alignment required by the load instruction (8):
https://github.com/llvm/llvm-project/blob/900e7cbfdee09c94d022e4dae923b3c7827f95e3/llvm/lib/Analysis/Loads.cpp#L32
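The comparison described above can be modelled with a small standalone sketch. This is a simplified model and not LLVM's code: the function name and the plain integer parameters are hypothetical, and the real isAligned may check more than this single comparison.

```cpp
// Hypothetical simplified model of the comparison between BA and Alignment.
// knownAlign:    alignment that can be proven for the pointer (BA in the text)
// requiredAlign: alignment stated on the load instruction (Alignment in the text)
constexpr bool pointerSatisfiesLoadAlignment(unsigned long long knownAlign,
                                             unsigned long long requiredAlign) {
    return knownAlign >= requiredAlign;
}

// Without the align attribute the pointer is only known to be 1-byte aligned,
// so an `align 8` load is rejected.
static_assert(!pointerSatisfiesLoadAlignment(1, 8), "missing align: check fails");

// With `align 8` on the argument the check passes.
static_assert(pointerSatisfiesLoadAlignment(8, 8), "align 8: check passes");
```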
However, for architectures such as x86 this is not optimal, because unaligned loads are allowed. To check this, we can compare the assembly output of the hoisted version when the alignment of the load is 8 and when it is 1. There is no difference between the two versions; see https://godbolt.org/z/MqPTb5ch7
The same argument can be made for all architectures that support unaligned loads.
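A source-level way to try the same experiment is sketched below. It assumes a GCC or Clang toolchain (for `__builtin_memcpy`) and a target where `unsigned long long` is 8 bytes; the function name is hypothetical. On x86-64 such a function is typically lowered to a single ordinary load, the same instruction that would be used for an aligned access.

```cpp
// Reads 8 bytes from an address that carries no alignment guarantee.
// Copying through __builtin_memcpy keeps the access well-defined even
// when `p` is not 8-byte aligned.
unsigned long long loadUnaligned64(const unsigned char *p) {
    unsigned long long v;
    __builtin_memcpy(&v, p, sizeof v);
    return v;
}
```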
cc: @nunoplopes
</pre>