<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/101899>101899</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Pessimization in SROA
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
ChayimFriedman2
</td>
</tr>
</table>
<pre>
Given the following LLVM IR (minimized from [real Rust code](https://github.com/rust-lang/rust/issues/128214)):
```llvm
define void @after_all_unwrap(ptr dead_on_unwind noalias nocapture noundef writable writeonly sret([16 x i8]) align 8 dereferenceable(16) %_0, ptr noalias noundef nonnull align 4 %data.0, i64 noundef %data.1, ptr noalias nocapture noundef readonly align 8 dereferenceable(16) %indices) {
start:
%_8 = alloca [16 x i8], align 8
%_3 = alloca [16 x i8], align 8
%0 = icmp eq i8 2, 2
br i1 %0, label %bb1, label %bb2
bb1:
%1 = load i64, ptr %indices, align 8
%2 = getelementptr inbounds i32, ptr %data.0, i64 %1
%3 = getelementptr inbounds i8, ptr %indices, i64 8
%4 = load i64, ptr %3, align 8
%5 = getelementptr inbounds i32, ptr %data.0, i64 %4
store ptr %2, ptr %_8, align 8
%slot.sroa.4.0._0.sroa_idx.i = getelementptr inbounds i8, ptr %_8, i64 8
store ptr %5, ptr %slot.sroa.4.0._0.sroa_idx.i, align 8
call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 8 dereferenceable(16) %_3, ptr noundef nonnull align 8 dereferenceable(16) %_8, i64 16, i1 false)
br label %bb3
bb2:
%6 = getelementptr inbounds i8, ptr %_3, i64 8
store i8 0, ptr %6, align 8
store ptr null, ptr %_3, align 8
br label %bb3
bb3:
%8 = load ptr, ptr %_3, align 8
%9 = icmp eq ptr %8, null
br i1 %9, label %unwrap, label %exit
unwrap:
call void @llvm.trap()
unreachable
exit: ; preds = %bb3
call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 8 dereferenceable(16) %_0, ptr noundef nonnull align 8 dereferenceable(16) %_3, i64 16, i1 false)
ret void
}
```
LLVM transforms it into:
```llvm
define void @after_all_unwrap(ptr dead_on_unwind noalias nocapture noundef writable writeonly sret([16 x i8]) align 8 dereferenceable(16) %_0, ptr noalias noundef nonnull align 4 %data.0, i64 noundef %data.1, ptr noalias nocapture noundef readonly align 8 dereferenceable(16) %indices) local_unnamed_addr #0 {
%0 = load i64, ptr %indices, align 8
%1 = getelementptr inbounds i8, ptr %indices, i64 8
%2 = load i64, ptr %1, align 8
%3 = getelementptr inbounds i32, ptr %data.0, i64 %0
%4 = getelementptr inbounds i32, ptr %data.0, i64 %2
%5 = ptrtoint ptr %4 to i64
%_8.sroa.2.9.extract.shift = lshr i64 %5, 8
%_8.sroa.2.9.extract.trunc = trunc nuw i64 %_8.sroa.2.9.extract.shift to i56
%_8.sroa.2.8.extract.trunc = trunc i64 %5 to i8
store ptr %3, ptr %_0, align 8
%_3.sroa.4.0._0.sroa_idx = getelementptr inbounds i8, ptr %_0, i64 8
store i8 %_8.sroa.2.8.extract.trunc, ptr %_3.sroa.4.0._0.sroa_idx, align 8
%_3.sroa.5.0._0.sroa_idx = getelementptr inbounds i8, ptr %_0, i64 9
store i56 %_8.sroa.2.9.extract.trunc, ptr %_3.sroa.5.0._0.sroa_idx, align 1
ret void
}
attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(argmem: readwrite) }
```
Instead of the clearly more efficient:
```llvm
define void @after_all_unwrap(ptr dead_on_unwind noalias nocapture noundef writable writeonly sret([16 x i8]) align 8 dereferenceable(16) %_0, ptr noalias noundef nonnull align 4 %data.0, i64 noundef %data.1, ptr noalias nocapture noundef readonly align 8 dereferenceable(16) %indices) local_unnamed_addr #0 {
%0 = getelementptr inbounds i8, ptr %indices, i64 8
%1 = load i64, ptr %0, align 8
%2 = getelementptr inbounds i32, ptr %data.0, i64 %1
%3 = load i64, ptr %indices, align 8
%4 = getelementptr inbounds i32, ptr %data.0, i64 %3
store ptr %4, ptr %_0, align 8
%_3.sroa.3.0._0.sroa_idx = getelementptr inbounds i8, ptr %_0, i64 8
store ptr %2, ptr %_3.sroa.3.0._0.sroa_idx, align 8
ret void
}
attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(argmem: readwrite) }
```
SROA is the pass that inserts the truncs and shifts, and once they are there, later passes cannot get rid of them even after discovering that the condition is always true.
Please note that the order of passes is **not** the important point: I deliberately arranged the example so that SROA runs before we notice that the condition is true. In the real case, the condition does not always hold, but SROA still pessimizes the output.
In my real case, SROA does not really matter, since the value is going to be written into the return pointer anyway (as it is here); but even if SROA cannot see that, it should not pessimize the code.
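To make the redundancy concrete: assuming a little-endian target (the IR above does not state a data layout, so this is an assumption), the `i8` store at offset 8 followed by the `i56` store at offset 9 writes the same eight bytes as a single pointer-sized store at offset 8. A minimal Python sketch of that equivalence, with hypothetical helper names that are not part of LLVM:
```python
import struct

def store_whole(ptr_bits: int) -> bytes:
    """Bytes written by one 8-byte little-endian store of the pointer."""
    return struct.pack("<Q", ptr_bits)

def store_split(ptr_bits: int) -> bytes:
    """Bytes written by the split form in the pessimized output."""
    low = ptr_bits & 0xFF   # trunc i64 %5 to i8, stored at offset 8
    high = ptr_bits >> 8    # lshr i64 %5, 8, then trunc to i56, stored at offset 9
    return bytes([low]) + high.to_bytes(7, "little")

# An arbitrary 64-bit pattern standing in for the pointer value.
sample = 0x00007FFDEADBEEF0
assert store_split(sample) == store_whole(sample)
```
So the shift, the two truncs, and the second store do no extra work compared with the plain `store ptr` in the expected output.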
Godbolt: https://godbolt.org/z/rr15o7Keb
</pre>